imputr
Package Contents
Classes
AutoImputer | Automatic imputation class that implements the RandomForest strategy |
MeanImputer | Simple imputation class that uses average imputation |
- class imputr.AutoImputer(data: pandas.DataFrame, predefined_order: Dict[str, int] = None, predefined_strategies: Dict[str, Dict] = None, predefined_datatypes: Dict[str, Union[str, imputr.domain.DataType]] = None, include_non_missing: bool = False)
Bases: imputr.imputers._base._BaseImputer
Automatic imputation class that implements the RandomForest strategy as its main imputation method. Can be configured to use other strategies for specific columns and a custom imputation order.
- Variables
predefined_order (Dict[int, str] (optional)) – Dictionary of column names and their imputation order. Keys must be consecutive integers starting from zero: 0, 1, 2, and so on.
strategies (Dict[str, Dict] (optional)) – Dictionary mapping each column name to its strategy kwargs.
predefined_datatypes (Dict[str, Union[str, DataType]] (optional)) – Dictionary with column names as keys and, as values, the data types as specified in the Column constructor.
- Parameters
data (pd.DataFrame) – The DataFrame to impute.
predefined_order (Dict[int, str] (optional)) – Dictionary of column names and their imputation order. Keys must be consecutive integers starting from zero: 0, 1, 2, and so on.
predefined_strategies (Dict[str, Dict] (optional)) – Dictionary mapping each column name to its strategy kwargs.
predefined_datatypes (Dict[str, Union[str, DataType]] (optional)) – Dictionary with column names as keys and, as values, the data types as specified in the Column constructor.
include_non_missing (bool (optional)) – Flag indicating whether strategies should also be fitted for columns without missing values. Defaults to False.
- strategies: Dict[str, imputr.strategy._base._BaseStrategy]
- ordered_columns: List[imputr.domain.Column]
- included_columns: List[imputr.domain.Column]
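The rule that the keys of predefined_order run 0, 1, 2 with no gaps can be checked before the dictionary is handed to an imputer. The sketch below is not part of imputr: `ordered_column_names` is a hypothetical helper, and it assumes the Dict[int, str] form given in the parameter description (the constructor signature annotates the argument as Dict[str, int], so the library's actual expectation should be confirmed before relying on this).

```python
from typing import Dict, List


def ordered_column_names(predefined_order: Dict[int, str]) -> List[str]:
    """Hypothetical helper (not an imputr function).

    Returns the column names in imputation order and raises ValueError
    unless the keys are exactly 0, 1, ..., n-1.
    """
    expected = list(range(len(predefined_order)))
    actual = sorted(predefined_order)  # sorted() on a dict sorts its keys
    if actual != expected:
        raise ValueError(f"order keys must be {expected}, got {actual}")
    return [predefined_order[position] for position in expected]


# Assumed column names, for illustration only.
order = {0: "age", 1: "income", 2: "city"}
print(ordered_column_names(order))  # ['age', 'income', 'city']
```

A dictionary such as {0: "age", 2: "city"} skips position 1 and would raise ValueError here.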
- class imputr.MeanImputer(data: pandas.DataFrame, predefined_order: Dict[str, int] = None, predefined_strategies: Dict[str, Dict] = None, predefined_datatypes: Dict[str, Union[str, imputr.domain.DataType]] = None, include_non_missing: bool = False)
Bases: imputr.imputers._base._BaseImputer
Simple imputation class that uses average imputation as its main imputation method. Uses the mode for categorical columns and the mean for continuous columns. Can be configured to use other strategies for specific columns and a custom imputation order.
- Parameters
data (pd.DataFrame) – The DataFrame to impute.
predefined_order (Dict[int, str] (optional)) – Dictionary of column names and their imputation order. Keys must be consecutive integers starting from zero: 0, 1, 2, and so on.
predefined_strategies (Dict[str, Dict] (optional)) – Dictionary mapping each column name to its strategy kwargs.
predefined_datatypes (Dict[str, Union[str, DataType]] (optional)) – Dictionary with column names as keys and, as values, the data types as specified in the Column constructor.
include_non_missing (bool (optional)) – Flag indicating whether strategies should also be fitted for columns without missing values. Defaults to False.
- predefined_order: Dict[str, int]
- predefined_strategies: Dict[str, Dict]
- strategies: Dict[str, imputr.strategy._base._BaseStrategy]
- ordered_columns: List[imputr.domain.Column]
- include_non_missing: bool
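To make "mode for categorical and mean for continuous columns" concrete, here is a minimal standard-library sketch of that idea. It does not use imputr and is not its implementation: plain Python lists stand in for DataFrame columns, None marks a missing value, and `impute_column` and `impute_table` are hypothetical names chosen for illustration.

```python
from statistics import mean, mode
from typing import Dict, List, Set, Union

Cell = Union[float, str, None]  # None marks a missing value


def impute_column(values: List[Cell], categorical: bool) -> List[Cell]:
    """Fill missing cells with the mode (categorical) or the mean (continuous)."""
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to learn a fill value from; return the column unchanged.
        return list(values)
    fill = mode(observed) if categorical else mean(observed)
    return [fill if v is None else v for v in values]


def impute_table(table: Dict[str, List[Cell]], categorical: Set[str]) -> Dict[str, List[Cell]]:
    """Impute every column; names listed in `categorical` use the mode."""
    return {name: impute_column(col, name in categorical) for name, col in table.items()}


# Assumed toy data, for illustration only.
data = {"age": [30.0, None, 50.0], "city": ["Oslo", "Oslo", None]}
result = impute_table(data, categorical={"city"})
print(result)  # {'age': [30.0, 40.0, 50.0], 'city': ['Oslo', 'Oslo', 'Oslo']}
```

The sketch takes the set of categorical columns explicitly; in the class above, that role is described as belonging to predefined_datatypes.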