imputr.imputers.autoimputer
Module Contents
Classes
AutoImputer: Automatic imputation class that implements the RandomForest strategy
- class imputr.imputers.autoimputer.AutoImputer(data: pandas.DataFrame, predefined_order: Dict[str, int] = None, predefined_strategies: Dict[str, Dict] = None, predefined_datatypes: Dict[str, Union[str, imputr.domain.DataType]] = None, include_non_missing: bool = False)
Bases: imputr.imputers._base._BaseImputer
Automatic imputation class that implements the RandomForest strategy as its main imputation method. It can be configured to use other strategies for specific columns and to follow a custom imputation order.
- Variables
predefined_order (Dict[int, str] (optional)) – Dictionary of column names and their order for imputation. Keys must be consecutive integers starting from zero (0, 1, 2, ...).
strategies (Dict[str, Dict] (optional)) – Dictionary mapping each column name to its strategy kwargs.
predefined_datatypes (Dict[str, Union[str, DataType]] (optional)) – Dictionary with column names as keys and, as values, the data types as specified in the Column constructor.
- Parameters
data (pd.DataFrame) – The dataframe which undergoes imputation.
predefined_order (Dict[int, str] (optional)) – Dictionary of column names and their order for imputation. Keys must be consecutive integers starting from zero (0, 1, 2, ...).
predefined_strategies (Dict[str, Dict] (optional)) – Dictionary mapping each column name to its strategy kwargs.
predefined_datatypes (Dict[str, Union[str, DataType]] (optional)) – Dictionary with column names as keys and, as values, the data types as specified in the Column constructor.
include_non_missing (bool (optional)) – Flag indicating whether strategies should also be fitted for columns without missing values. Defaults to False.
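The key rule for predefined_order can be checked before the dictionary is handed to the imputer. The following is a minimal sketch, not part of imputr: validate_predefined_order is a hypothetical helper, and it assumes the Dict[int, str] orientation given in the parameter description (position as key, column name as value). The class signature is annotated Dict[str, int], so confirm which orientation your installed version expects.

```python
from typing import Dict, List


def validate_predefined_order(order: Dict[int, str], columns: List[str]) -> List[str]:
    """Hypothetical helper (not an imputr function).

    Returns the column names in imputation order, or raises ValueError
    if the keys are not 0, 1, 2, ... or a column name is unknown.
    """
    expected_keys = list(range(len(order)))
    if sorted(order) != expected_keys:
        raise ValueError(f"keys must be {expected_keys}, got {sorted(order)}")
    unknown = [name for name in order.values() if name not in columns]
    if unknown:
        raise ValueError(f"unknown columns: {unknown}")
    return [order[position] for position in expected_keys]


# Illustrative column names only; they do not come from the imputr docs.
columns = ["age", "income", "city"]
ordered = validate_predefined_order({0: "income", 1: "age"}, columns)
print(ordered)  # ['income', 'age']
```

A dictionary such as {1: "age", 2: "income"} would be rejected by this check because its keys do not start at zero.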
- strategies: Dict[str, imputr.strategy._base._BaseStrategy]
- ordered_columns: List[imputr.domain.Column]
- included_columns: List[imputr.domain.Column]
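To illustrate what the include_non_missing flag implies for which columns get a fitted strategy, here is a rough sketch in plain Python. It is an assumption about the behaviour described in the parameter list, not imputr's actual implementation: select_columns_to_fit is a hypothetical helper, a dictionary of lists stands in for the DataFrame, and None marks a missing value.

```python
from typing import Dict, List, Optional


def select_columns_to_fit(
    data: Dict[str, List[Optional[float]]], include_non_missing: bool = False
) -> List[str]:
    """Hypothetical helper (not an imputr function).

    Returns the names of columns that would have a strategy fitted:
    columns containing a missing value, or every column when
    include_non_missing is True.
    """
    return [
        name
        for name, values in data.items()
        if include_non_missing or any(value is None for value in values)
    ]


# Illustrative data only; None stands in for a missing cell.
data = {"age": [31.0, None, 45.0], "income": [52000.0, 48000.0, 61000.0]}
default_selection = select_columns_to_fit(data)
full_selection = select_columns_to_fit(data, include_non_missing=True)
print(default_selection)  # ['age']
print(full_selection)     # ['age', 'income']
```

With the default of False, only the column containing a missing value is selected; setting the flag to True selects every column.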