Change Log
0.4.2
Fixed
pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv(): bucket_fraction argument behavior.
Changed
pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv(): Return changed from dict[dfs_iv], a Spark DataFrame, to dict[df_iv], a pandas DataFrame.
0.4.1
Fixed
pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv(): behavior with num_features and cat_features arguments.
0.4.0
Added
Added the pyspark_ds_toolbox.ml.feature_selection.information_value module and all its functionalities: feature_selection_with_iv(), compute_woe_iv(), WeightOfEvidenceComputer()
0.3.4
Breaking Changes
pyspark_ds_toolbox.ml.data_prep.features_vector.get_features_vector: Now returns a list with pyspark indexers, encoders and assemblers, to be used with pipelines.
pyspark_ds_toolbox.ml.classification.baseline_classifiers.py: Models are now returned as pipelines.
0.3.3
Changed
pyspark_ds_toolbox.ml.classification.baseline_binary_classfiers has a mlflow_experiment_name argument.
Fixed
pyspark_ds_toolbox.ml.feature_importance.native_spark.
0.3.2
Changed
Functionalities from module pyspark_ds_toolbox.wrangling were refactored into pyspark_ds_toolbox.wrangling.reshape.py and pyspark_ds_toolbox.wrangling.data_quality.py;
Functionalities from module pyspark_ds_toolbox.ml.data_prep were refactored into pyspark_ds_toolbox.ml.data_prep.class_weights.py and pyspark_ds_toolbox.ml.data_prep.features_vector.py.
0.3.1
Changed
Module pyspark_ds_toolbox.ml.classification.baseline_binary_classfiers now also returns feature scores.
0.3.0
Added
Module pyspark_ds_toolbox.ml.feature_importance with the functions: extract_features_score()
Changed
Module pyspark_ds_toolbox.ml.shap_values became pyspark_ds_toolbox.ml.feature_importance.shap_values
0.2.0
Added
Module pyspark_ds_toolbox.ml.classification
Changed
Module pyspark_ds_toolbox.ml.eval became pyspark_ds_toolbox.ml.classification.eval