imputr.imputers.autoimputer

Module Contents

Classes

AutoImputer

Automatic imputation class that implements the RandomForest strategy

class imputr.imputers.autoimputer.AutoImputer(data: pandas.DataFrame, predefined_order: Dict[str, int] = None, predefined_strategies: Dict[str, Dict] = None, predefined_datatypes: Dict[str, Union[str, imputr.domain.DataType]] = None, include_non_missing: bool = False)

Bases: imputr.imputers._base._BaseImputer

Automatic imputation class that uses the RandomForest strategy as its main imputation method. It can be configured to use other strategies for specific columns and to follow a custom imputation order.

Variables
  • predefined_order (Dict[int, str] (optional)) – Dictionary of column names and their order for imputation. Keys must be consecutive integers starting from zero: 0, 1, 2, and so on.

  • strategies (Dict[str, Dict] (optional)) – Dictionary mapping column names to strategy kwargs.

  • predefined_datatypes (Dict[str, Union[str, DataType]] (optional)) – Dictionary with column names as keys and, as values, the data types as specified in the Column constructor.

Parameters
  • data (pd.DataFrame) – The dataframe to be imputed.

  • predefined_order (Dict[int, str] (optional)) – Dictionary of column names and their order for imputation. Keys must be consecutive integers starting from zero: 0, 1, 2, and so on.

  • predefined_strategies (Dict[str, Dict] (optional)) – Dictionary mapping column names to strategy kwargs.

  • predefined_datatypes (Dict[str, Union[str, DataType]] (optional)) – Dictionary with column names as keys and, as values, the data types as specified in the Column constructor.

  • include_non_missing (bool (optional)) – Flag indicating whether strategies are also fitted for columns without missing values. Defaults to False.
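
The key constraint on predefined_order (consecutive integers starting from zero) can be illustrated with a small sketch. The helpers build_predefined_order and has_valid_order_keys are hypothetical and not part of imputr, and the column names are made up. The sketch assumes the Dict[int, str] form given in the parameter description. The commented-out constructor call is an assumption based only on the signature shown above.

```python
from typing import Dict, List


def build_predefined_order(columns: List[str]) -> Dict[int, str]:
    """Map positions 0, 1, 2, ... to column names (hypothetical helper)."""
    return dict(enumerate(columns))


def has_valid_order_keys(order: Dict[int, str]) -> bool:
    """Return True if the keys are exactly 0, 1, ..., len(order) - 1 (hypothetical helper)."""
    return set(order) == set(range(len(order)))


# Example column names are illustrative only.
order = build_predefined_order(["age", "income", "city"])
print(order)  # {0: 'age', 1: 'income', 2: 'city'}

# Assumed usage, based only on the constructor signature documented above:
# imputer = AutoImputer(data=df, predefined_order=order, include_non_missing=False)
```

Checking the keys before constructing the imputer catches gaps such as {0: "age", 2: "city"} early, although how AutoImputer itself reacts to such input is not stated here.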

strategies: Dict[str, imputr.strategy._base._BaseStrategy]
ordered_columns: List[imputr.domain.Column]
included_columns: List[imputr.domain.Column]