Change Log
0.4.2
Fixed
- pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv(): bucket_fraction argument behavior.
Changed
- pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv(): The return value changed from dict[dfs_iv] (a Spark dataframe) to dict[df_iv] (a pandas dataframe).
0.4.1
Fixed
- pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv(): behavior with num_features and cat_features arguments.
0.4.0
Added
- Added the pyspark_ds_toolbox.ml.feature_selection.information_value module and all its functionalities:
  - feature_selection_with_iv()
  - compute_woe_iv()
  - WeightOfEvidenceComputer()
0.3.4
Breaking Changes
- pyspark_ds_toolbox.ml.data_prep.features_vector.get_features_vector: Now returns a list with pyspark indexers, encoders and assemblers, to be used with pipelines.
- pyspark_ds_toolbox.ml.classification.baseline_classifiers.py: Models are now returned as pipelines.
0.3.3
Changed
- pyspark_ds_toolbox.ml.classification.baseline_binary_classfiers now has an mlflow_experiment_name argument.
Fixed
- pyspark_ds_toolbox.ml.feature_importance.native_spark.
0.3.2
Changed
- Functionalities from module pyspark_ds_toolbox.wrangling were refactored into pyspark_ds_toolbox.wrangling.reshape.py and pyspark_ds_toolbox.wrangling.data_quality.py;
- Functionalities from module pyspark_ds_toolbox.ml.data_prep were refactored into pyspark_ds_toolbox.ml.data_prep.class_weights.py and pyspark_ds_toolbox.ml.data_prep.features_vector.py.
0.3.1
Changed
- Module pyspark_ds_toolbox.ml.classification.baseline_binary_classfiers now also returns feature scores.
0.3.0
Added
- Module pyspark_ds_toolbox.ml.feature_importance with the functions:
  - extract_features_score()
Changed
- Module pyspark_ds_toolbox.ml.shap_values became pyspark_ds_toolbox.ml.feature_importance.shap_values
0.2.0
Added
- Module pyspark_ds_toolbox.ml.classification 
Changed
- Module pyspark_ds_toolbox.ml.eval became pyspark_ds_toolbox.ml.classification.eval