modnet.featurizers.featurizers module¶
- class modnet.featurizers.featurizers.MODFeaturizer(n_jobs=None, drop_allnan=True)¶
Bases: ABC
Base class for multiple featurization across structure, composition and sites.
Child classes must provide iterables of matminer featurizer objects to be applied to the structure, composition and sites of the structures in the input dataframe.
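The pattern described above, where a base class reads class-level iterables of featurizers that child classes supply, can be sketched without any external dependencies. This is a toy analogue for illustration only, not MODNet's implementation: `ToyFeaturizerBase`, `ToyPreset`, `count_characters` and `count_uppercase` are hypothetical names, plain functions stand in for matminer featurizer objects, and a list of dicts stands in for the input dataframe.

```python
from abc import ABC
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical stand-in for a matminer featurizer: a function that maps
# one input value to a dict of named features.
Featurizer = Callable[[str], Dict[str, float]]


def count_characters(formula: str) -> Dict[str, float]:
    """Toy 'composition' featurizer: length of the formula string."""
    return {"n_chars": float(len(formula))}


def count_uppercase(formula: str) -> Dict[str, float]:
    """Toy 'composition' featurizer: number of upper-case letters."""
    return {"n_upper": float(sum(ch.isupper() for ch in formula))}


class ToyFeaturizerBase(ABC):
    """Illustrative base class; NOT the MODFeaturizer implementation."""

    # Child classes override this with an iterable of featurizers.
    composition_featurizers: Optional[Iterable[Featurizer]] = None

    def featurize(self, rows: List[Dict[str, str]]) -> List[Dict[str, float]]:
        """Apply every configured featurizer to each row's 'composition'."""
        featurized = []
        for row in rows:
            features: Dict[str, float] = {}
            for featurizer in self.composition_featurizers or ():
                features.update(featurizer(row["composition"]))
            featurized.append(features)
        return featurized


class ToyPreset(ToyFeaturizerBase):
    """A child class only has to declare which featurizers to run."""

    composition_featurizers = (count_characters, count_uppercase)


toy_features = ToyPreset().featurize(
    [{"composition": "NaCl"}, {"composition": "H2O"}]
)
```

The design choice this illustrates is that the featurization loop lives once in the base class, so a new preset is a short declaration rather than a new implementation.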
- composition_featurizers¶
Optional iterable of featurizers to apply to the ‘composition’ column (which will be generated if missing).
- Type:
Optional[Iterable[matminer.featurizers.base.BaseFeaturizer]]
- oxid_composition_featurizers¶
Optional iterable of featurizers to apply to the ‘composition_oxid’ column generated by the CompositionToOxidComposition converter.
- Type:
Optional[Iterable[matminer.featurizers.base.BaseFeaturizer]]
- structure_featurizers¶
Optional iterable of featurizers to apply to the structure as SiteStatsFingerprint objects. Uses the site_stats attribute to determine which statistics are calculated.
- Type:
Optional[Iterable[matminer.featurizers.base.BaseFeaturizer]]
- site_stats¶
Iterable of string statistic names to be used by the SiteStatsFingerprint objects.
- Type:
Tuple[str]
- featurizer_mode¶
Whether to apply all featurizers at once (“multi”), i.e., parallelising over structures, or one at a time (“single”), i.e., parallelising over featurizers.
- Type:
Initialise the MODFeaturizer object with a requested number of threads to use during featurization.
- Parameters:
- set_n_jobs(n_jobs)¶
Set the number of threads to pass to matminer for featurizer initialisation.
- featurize(df)¶
Run all of the preset featurizers on the input dataframe.
- Parameters:
df (pandas.DataFrame) – the input dataframe with a "structure" column containing pymatgen Structure objects.
- Returns:
The featurized DataFrame.
- Return type:
- featurize_composition(df)¶
Decorate input pandas.DataFrame of structures with composition features from matminer, specified by the MODFeaturizer preset. Currently applies the set of all matminer composition features.
- Parameters:
df (pandas.DataFrame) – the input dataframe with a "structure" column containing pymatgen Structure objects.
- Returns:
the decorated DataFrame, or an empty DataFrame if no composition/oxidation featurizers exist for this class.
- Return type:
- featurize_structure(df)¶
Decorate input pandas.DataFrame of structures with structural features from matminer, specified by the MODFeaturizer preset. Currently applies the set of all matminer structure features.
- Parameters:
df (pandas.DataFrame) – the input dataframe with a "structure" column containing pymatgen Structure objects.
- Returns:
the decorated DataFrame.
- Return type:
- featurize_site(df, aliases=None)¶
Decorate input pandas.DataFrame of structures with site features, specified by the MODFeaturizer preset.
- Parameters:
df (pandas.DataFrame) – the input dataframe with a "structure" column containing pymatgen Structure objects.
aliases (Optional[Dict[str, str]]) – optional dictionary to map matminer output column names to new aliases, mostly used for backwards-compatibility.
- Returns:
the decorated DataFrame.
- Return type:
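The effect of the aliases argument of featurize_site, a mapping from output column names to replacement names, can be sketched in plain Python. This is a minimal illustration of the renaming idea only. It does not show how MODNet applies aliases internally, and the helper `apply_aliases` and the column names used are hypothetical.

```python
from typing import Dict, List, Optional


def apply_aliases(
    columns: List[str], aliases: Optional[Dict[str, str]] = None
) -> List[str]:
    """Rename columns found in `aliases`; leave all others unchanged."""
    if not aliases:
        return list(columns)
    return [aliases.get(name, name) for name in columns]


# Hypothetical column names, for illustration only.
columns = ["NewFingerprint mean value", "structure"]
aliases = {"NewFingerprint mean value": "OldFingerprint|mean"}

renamed = apply_aliases(columns, aliases)
```

Columns absent from the mapping pass through untouched, which is what makes such a mapping convenient for backwards-compatibility: only the names that changed need to be listed.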