modnet.featurizers.presets.matminer_all_2023 module

This submodule contains the MatminerAll2023Featurizer class and its composition-only subclass, CompositionOnlyMatminerAll2023Featurizer.

class modnet.featurizers.presets.matminer_all_2023.MatminerAll2023Featurizer(fast_oxid=False, continuous_only=False)

Bases: MODFeaturizer

A “kitchen-sink” featurizer for the features implemented in matminer at the time of its creation (matminer v0.8.0, from late 2022/early 2023).

Follows the same philosophy as the DeBreuck2020Featurizer, but many features have changed their underlying matminer implementation, definition and behaviour since that featurizer was created. The featurizer list has also been updated to include all available featurizers.

Creates the featurizer and imports all featurizer functions.

Parameters:
  • fast_oxid (bool) – Whether to use the accelerated oxidation state parameters within pymatgen when constructing features. These parameters constrain oxidation states so that all sites with the same species in a structure have the same oxidation state (recommended when featurizing structures with large unit cells).

  • continuous_only (bool) – Whether to keep only the features that are continuous with respect to the composition (only for composition featurizers). Discontinuous features may lead to discontinuities in the model predictions.
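The remark about discontinuous features can be illustrated with a minimal, library-free sketch. It compares a fraction-weighted mean of an elemental property with a "property of the most abundant element" statistic for a binary composition. The element labels, the property values and both helper functions are hypothetical and chosen only for illustration; which features the preset actually classifies as discontinuous is not stated here.

```python
# Hypothetical elemental property values (illustration only, not real data).
PROP = {"A": 1.0, "B": 3.0}


def binary(x):
    """Return the fractional composition A_x B_(1-x) as a dict."""
    return {"A": x, "B": 1.0 - x}


def weighted_mean(fractions, prop):
    """Fraction-weighted mean: varies smoothly with the composition."""
    return sum(frac * prop[el] for el, frac in fractions.items())


def most_abundant(fractions, prop):
    """Property of the most abundant element: jumps when the majority flips."""
    top = max(fractions, key=fractions.get)
    return prop[top]


# Compare both statistics just below and just above x = 0.5.
for x in (0.49, 0.51):
    comp = binary(x)
    print(x, round(weighted_mean(comp, PROP), 3), most_abundant(comp, PROP))
```

Across x = 0.5 the weighted mean changes only slightly, whereas the second statistic switches from the value for B to the value for A. A model trained on such a feature can inherit that jump in its predictions.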

load_featurizers()
featurize_composition(df)

Applies the preset composition featurizers to the input dataframe, renames some fields and cleans the output dataframe.

featurize_structure(df)

Applies the preset structural featurizers to the input dataframe, renames some fields and cleans the output dataframe.

featurize_site(df)

Applies the preset site featurizers to the input dataframe, renames some fields and cleans the output dataframe.
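As a rough sketch of how the three methods above might be called side by side, the helper below applies each requested featurize_<level> method to the same input and collects the outputs by level. The helper featurize_levels and the stub class are hypothetical and exist only so that the example runs without MODNet installed. With the real class, df would be a dataframe in whatever layout the featurizer expects, which is not specified here.

```python
from typing import Any, Dict, Iterable

# Levels documented above: featurize_composition, featurize_structure, featurize_site.
LEVELS = ("composition", "structure", "site")


def featurize_levels(featurizer: Any, df: Any, levels: Iterable[str] = LEVELS) -> Dict[str, Any]:
    """Call featurize_<level>(df) for each requested level (hypothetical helper)."""
    results = {}
    for level in levels:
        method = getattr(featurizer, f"featurize_{level}")
        results[level] = method(df)
    return results


class StubFeaturizer:
    """Hypothetical stand-in that mimics only the three method names."""

    def featurize_composition(self, df):
        return f"composition({df})"

    def featurize_structure(self, df):
        return f"structure({df})"

    def featurize_site(self, df):
        return f"site({df})"


out = featurize_levels(StubFeaturizer(), "df", levels=("composition", "site"))
print(out)

# With MODNet installed, the stub could be swapped for the real preset, e.g.:
#   from modnet.featurizers.presets.matminer_all_2023 import MatminerAll2023Featurizer
#   featurizer = MatminerAll2023Featurizer(fast_oxid=True)
```

Each method receives the original input rather than the output of the previous one. How the per-level outputs are combined afterwards is left open in this sketch.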

class modnet.featurizers.presets.matminer_all_2023.CompositionOnlyMatminerAll2023Featurizer(continuous_only=False, oxidation_featurizers=False, fast_oxid=False)

Bases: MatminerAll2023Featurizer

This subclass simply disables structure and site-level features from the main MatminerAll2023Featurizer class.

This should yield identical results to the original 2020 version.
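One common way to express "a subclass that disables some feature levels" is to let the parent constructor populate every featurizer list and then clear the unwanted ones. The toy classes below sketch that pattern. All class and attribute names are hypothetical, and this is not claimed to be how MODNet implements the subclass.

```python
class ToyFullFeaturizer:
    """Hypothetical parent holding one featurizer list per level."""

    def __init__(self):
        self.composition_featurizers = ["comp_a", "comp_b"]
        self.structure_featurizers = ["struct_a"]
        self.site_featurizers = ["site_a"]

    def active_levels(self):
        """Return the names of the levels that still have featurizers."""
        levels = {
            "composition": self.composition_featurizers,
            "structure": self.structure_featurizers,
            "site": self.site_featurizers,
        }
        return [name for name, feats in levels.items() if feats]


class ToyCompositionOnlyFeaturizer(ToyFullFeaturizer):
    """Hypothetical subclass that keeps only the composition level."""

    def __init__(self):
        super().__init__()
        # Clear the levels this subclass should not compute.
        self.structure_featurizers = []
        self.site_featurizers = []


print(ToyFullFeaturizer().active_levels())
print(ToyCompositionOnlyFeaturizer().active_levels())
```

Because the subclass only empties lists after calling the parent constructor, the composition featurizers stay identical to those of the parent.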

Creates the featurizer and imports all featurizer functions.

Parameters:
  • fast_oxid (bool) – Whether to use the accelerated oxidation state parameters within pymatgen when constructing features. These parameters constrain oxidation states so that all sites with the same species in a structure have the same oxidation state (recommended when featurizing structures with large unit cells).

  • continuous_only (bool) – Whether to keep only the features that are continuous with respect to the composition (only for composition featurizers). Discontinuous features may lead to discontinuities in the model predictions.

  • oxidation_featurizers (bool) –