modnet.featurizers.utils module

modnet.featurizers.utils.clean_df(df, drop_allnan=True)

Cleans a dataframe by dropping missing values, replacing NaNs and infinities, and selecting only columns containing numerical data.

Parameters:
  • df (pd.DataFrame) – the dataframe to clean.

  • drop_allnan (bool) – if True, clean_df removes features (columns) whose values are all NaN.

Returns:

the cleaned dataframe.

Return type:

pandas.DataFrame
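
The behaviour described above can be illustrated with a minimal, standalone pandas sketch. This is not the modnet implementation: clean_df_sketch is a hypothetical stand-in written only from the description, the order of the steps is a guess, and the replacement value of 0 for NaNs and infinities is an assumption, since the docstring does not say what they are replaced with.

```python
import numpy as np
import pandas as pd


def clean_df_sketch(df: pd.DataFrame, drop_allnan: bool = True) -> pd.DataFrame:
    """Hypothetical stand-in for clean_df, written from its description only."""
    if drop_allnan:
        # Drop features (columns) in which every value is NaN.
        df = df.dropna(axis=1, how="all")
    # Keep only columns containing numerical data.
    df = df.select_dtypes(include="number")
    # Assumption: remaining NaNs and infinities are replaced with 0.
    df = df.replace([np.inf, -np.inf], np.nan).fillna(0)
    return df


raw = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0],
        "b": [np.nan, np.nan, np.nan],  # fully NaN feature
        "c": [np.inf, 2.0, -np.inf],
        "label": ["x", "y", "z"],       # non-numerical column
    }
)

cleaned = clean_df_sketch(raw)                       # columns: a, c
kept_all = clean_df_sketch(raw, drop_allnan=False)   # columns: a, b, c
```

With drop_allnan=True the fully NaN column b is removed before any replacement happens. With drop_allnan=False it survives and, under the assumption above, ends up filled with zeros.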