The Python Book
 
dataframe dropnull
20190202

Filter a dataframe to retain rows with non-null values

E.g. you want only the rows where the 'population' column has a non-null value.

In short

df=df[df['population'].notnull()]
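
The same filter can probably also be written with dropna and its subset argument. This spelling is not part of the original note; the two-row dataframe below is made up purely for illustration. A minimal sketch:

```python
import numpy as np
import pandas as pd

# tiny illustrative dataframe (hypothetical, only to compare both spellings)
df = pd.DataFrame({'asciiname': ['Isle of Skye', 'Wawa'],
                   'population': [9232, np.nan]})

# boolean-mask filter, as used in this note
kept_mask = df[df['population'].notnull()]

# dropna restricted to the 'population' column
kept_dropna = df.dropna(subset=['population'])
```

Both variables should hold the same rows. Without subset, dropna drops a row when any column is null, which is usually not what you want here.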

Alternatively, replace the null values with something (here 0):

df['population']=df['population'].fillna(0) 

In detail

import numpy as np
import pandas as pd

# set up the dataframe
data=[[ 'Isle of Skye',         9232, 124 ],
      [ 'Vieux-Charmont',     np.nan, 320 ],
      [ 'Indian Head',          3844,  35 ],
      [ 'Cihua',              np.nan, 178 ],
      [ 'Miasteczko Slaskie',   7327, 301 ],
      [ 'Wawa',               np.nan,   7 ],
      [ 'Bat Khela',           46079, 673 ]]

df=pd.DataFrame(data, columns=['asciiname','population','elevation'])
 
# display the dataframe
df

            asciiname  population  elevation
0        Isle of Skye      9232.0        124
1      Vieux-Charmont         NaN        320
2         Indian Head      3844.0         35
3               Cihua         NaN        178
4  Miasteczko Slaskie      7327.0        301
5                Wawa         NaN          7
6           Bat Khela     46079.0        673


# retain only the rows where population has a non-null value
df=df[df['population'].notnull()]

# display the filtered dataframe
df

            asciiname  population  elevation
0        Isle of Skye      9232.0        124
2         Indian Head      3844.0         35
4  Miasteczko Slaskie      7327.0        301
6           Bat Khela     46079.0        673
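
The fillna alternative from the short version can be sketched on the same data. This is only an illustration: the variable name filled is an assumption, and 0 as the replacement value is taken from the short version above, not a recommendation.

```python
import numpy as np
import pandas as pd

# same data as in the detailed example
data = [['Isle of Skye',         9232, 124],
        ['Vieux-Charmont',     np.nan, 320],
        ['Indian Head',          3844,  35],
        ['Cihua',              np.nan, 178],
        ['Miasteczko Slaskie',   7327, 301],
        ['Wawa',               np.nan,   7],
        ['Bat Khela',           46079, 673]]

df = pd.DataFrame(data, columns=['asciiname', 'population', 'elevation'])

# work on a copy so the original dataframe stays untouched
filled = df.copy()

# replace the null populations with 0; all 7 rows are kept
filled['population'] = filled['population'].fillna(0)
```

Unlike the notnull filter, this keeps every row, so a population of 0 then means "unknown" rather than "nobody lives there".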
 
Notes by Willem Moors. Generated on momo:/home/willem/sync/20151223_datamungingninja/pythonbook at 2019-07-31 19:22