Change Log
0.4.2
Fixed
pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv(): bucket_fraction argument behavior.
Changed
pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv(): The returned dict entry changed from dict[dfs_iv], a Spark DataFrame, to dict[df_iv], a pandas DataFrame.
0.4.1
Fixed
pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv(): behavior with the num_features and cat_features arguments.
0.4.0
Added
Added the pyspark_ds_toolbox.ml.feature_selection.information_value module and all its functionality:
feature_selection_with_iv()
compute_woe_iv()
WeightOfEvidenceComputer()
0.3.4
Breaking Changes
pyspark_ds_toolbox.ml.data_prep.features_vector.get_features_vector: Now returns a list with pyspark indexers, encoders and assemblers, to be used with pipelines.
pyspark_ds_toolbox.ml.classification.baseline_classifiers.py: Models are now returned as pipelines.
0.3.3
Changed
pyspark_ds_toolbox.ml.classification.baseline_binary_classfiers now has an mlflow_experiment_name argument.
Fixed
pyspark_ds_toolbox.ml.feature_importance.native_spark.
0.3.2
Changed
Functionalities from module pyspark_ds_toolbox.wrangling were refactored into pyspark_ds_toolbox.wrangling.reshape.py and pyspark_ds_toolbox.wrangling.data_quality.py;
Functionalities from module pyspark_ds_toolbox.ml.data_prep were refactored into pyspark_ds_toolbox.ml.data_prep.class_weights.py and pyspark_ds_toolbox.ml.data_prep.features_vector.py.
0.3.1
Changed
Module pyspark_ds_toolbox.ml.classification.baseline_binary_classfiers now also returns feature scores.
0.3.0
Added
Module pyspark_ds_toolbox.ml.feature_importance with the functions:
extract_features_score()
Changed
Module pyspark_ds_toolbox.ml.shap_values became pyspark_ds_toolbox.ml.feature_importance.shap_values
0.2.0
Added
Module pyspark_ds_toolbox.ml.classification
Changed
Module pyspark_ds_toolbox.ml.eval became pyspark_ds_toolbox.ml.classification.eval
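
Code that has to run against releases on both sides of a module move like the one above can try the new path first and fall back to the old one. The following is a minimal sketch, not part of the package: the helper import_first_available and the constant EVAL_MODULE_CANDIDATES are hypothetical names introduced here for illustration, and only the two module paths come from this change log.

```python
import importlib
from types import ModuleType
from typing import Sequence


def import_first_available(candidates: Sequence[str]) -> ModuleType:
    """Return the first module in `candidates` that can be imported.

    Hypothetical helper (not provided by pyspark_ds_toolbox): candidates are
    tried in order, so list the newest module path first.
    """
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
            errors.append(f"{name}: {exc}")
    raise ImportError(
        "none of the candidate modules could be imported:\n" + "\n".join(errors)
    )


# Module paths taken from the 0.2.0 entry: new location first, old location second.
EVAL_MODULE_CANDIDATES = (
    "pyspark_ds_toolbox.ml.classification.eval",
    "pyspark_ds_toolbox.ml.eval",
)
```

With the package installed, import_first_available(EVAL_MODULE_CANDIDATES) would return whichever of the two modules exists in that release. The same pattern could be applied to the shap_values move listed under 0.3.0 by passing those two paths instead.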