# Change Log

## 0.4.2

### Fixed
* `pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv()`: `bucket_fraction` argument behavior.

### Changed
* `pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv()`: The returned `dict[dfs_iv]`, a spark dataframe, is now `dict[df_iv]`, a pandas dataframe.

## 0.4.1

### Fixed
* `pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv()`: behavior with `num_features` and `cat_features` arguments.

## 0.4.0

### Added

* Added the `pyspark_ds_toolbox.ml.feature_selection.information_value` module and all its functionalities
    * `feature_selection_with_iv()`
    * `compute_woe_iv()`
    * `WeightOfEvidenceComputer()`

## 0.3.4

### Breaking Changes

* `pyspark_ds_toolbox.ml.data_prep.features_vector.get_features_vector`: Now returns a list with pyspark indexers, encoders and assemblers, to be used with pipelines.
* `pyspark_ds_toolbox.ml.classification.baseline_classifiers.py`: Models are now returned as pipelines.

## 0.3.3

### Changed

* `pyspark_ds_toolbox.ml.classification.baseline_binary_classfiers` now has a `mlflow_experiment_name` argument.


### Fixed

* `pyspark_ds_toolbox.ml.feature_importance.native_spark`.

## 0.3.2

### Changed

* Functionalities from module `pyspark_ds_toolbox.wrangling` were refactored into `pyspark_ds_toolbox.wrangling.reshape.py` and `pyspark_ds_toolbox.wrangling.data_quality.py`;
* Functionalities from module `pyspark_ds_toolbox.ml.data_prep` were refactored into `pyspark_ds_toolbox.ml.data_prep.class_weights.py` and `pyspark_ds_toolbox.ml.data_prep.features_vector.py`.

## 0.3.1

### Changed

* Module `pyspark_ds_toolbox.ml.classification.baseline_binary_classfiers` now also returns feature scores.

## 0.3.0

### Added

* Module `pyspark_ds_toolbox.ml.feature_importance` with the functions:
    * `extract_features_score()`

### Changed

* Module `pyspark_ds_toolbox.ml.shap_values` became `pyspark_ds_toolbox.ml.feature_importance.shap_values`


## 0.2.0

### Added

* Module `pyspark_ds_toolbox.ml.classification`

### Changed

* Module `pyspark_ds_toolbox.ml.eval` became `pyspark_ds_toolbox.ml.classification.eval`

## 0.1.4

### Changed

* [fix] Class `pyspark_ds_toolbox.stats.association.Association` can now properly receive only numerical or only categorical features.


## 0.1.3

### Added

* CHANGELOG.md file

### Changed

* The `pyspark` dependency is now `>=3.2`.
* Class `pyspark_ds_toolbox.stats.association.Association` now uses `pyspark.pandas.frame.DataFrame` instead of `databricks.koalas.frame.DataFrame`.