ds_toolbox.ml package

Submodules

ds_toolbox.ml.evaluator module

Evaluator

This module contains utilities for evaluating machine learning models.

ds_toolbox.ml.evaluator.binary_classifier_metrics(df_prediction: Union[pyspark.sql.dataframe.DataFrame, pandas.core.frame.DataFrame], col_target: str, col_prediction: str, spark: Optional[pyspark.sql.session.SparkSession] = None, max_mem: int = 3, n_cores: int = 2) → dict

Computes evaluation metrics for a binary classification result stored in a pandas or Spark DataFrame.

Args:

df_prediction (Union[pyspark.sql.dataframe.DataFrame, pd.DataFrame]): DataFrame with observed and predicted values.

col_target (str): Column name of the ground truth class.

col_prediction (str): Column name of the predicted class.

spark (Union[pyspark.sql.session.SparkSession, None], optional): Spark session where the computation will take place. If None, a local session is created. Defaults to None.

max_mem (int, optional): Maximum memory to be allocated to Spark. Defaults to 3.

n_cores (int, optional): Number of cores to be allocated to Spark. Defaults to 2.

Raises:

Exception: Errors.

Returns:

dict: Dict with: confusion matrix, accuracy, f1 score, precision, recall, auroc, aupr.
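To make the returned metrics concrete, the following is a minimal, standard-library sketch of how the confusion matrix, accuracy, precision, recall and F1 score relate to one another for hard 0/1 predictions. It is not the ds_toolbox implementation: the function name binary_metrics_sketch, the dictionary keys and the sample labels are all hypothetical, and AUROC and AUPR are left out to keep the sketch short.

```python
from typing import Dict, Sequence


def binary_metrics_sketch(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, object]:
    """Illustrative only: hypothetical helper, not part of ds_toolbox."""
    # Count the four cells of the confusion matrix (positive class = 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    return {
        # Rows are the true class (0, 1); columns are the predicted class (0, 1).
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels for illustration.
metrics = binary_metrics_sketch([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(metrics)
```

With the package installed, the documented function would presumably be called along the lines of binary_classifier_metrics(df_prediction=df, col_target="target", col_prediction="prediction"), where df and both column names are placeholders.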

Module contents