PerMetrics: A Framework of Performance Metrics for Machine Learning Models

License: GPL v3

PerMetrics is a Python library of performance metrics for machine learning models. It aims to implement all performance metrics for problems such as regression, classification, and clustering, so that users in any field can access them as quickly as possible. It currently provides 111 metrics: 47 for regression, 20 for classification, and 44 for clustering.

Citation Request

Please include these citations if you plan to use this library:

@software{nguyen_van_thieu_2023_8220489,
  author       = {Nguyen Van Thieu},
  title        = {PerMetrics: A Framework of Performance Metrics for Machine Learning Models},
  month        = aug,
  year         = 2023,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.3951205},
  url          = {https://github.com/thieu1995/permetrics}
}

Installation

Install the current PyPI release:

$ pip install permetrics

After installation, you can import PerMetrics like any other Python module:

$ python
>>> import permetrics
>>> permetrics.__version__

Example

The example below shows the typical way to use the library: create an evaluator once from y_true and y_pred, then request several metrics by name, such as root mean squared error (RMSE) and mean absolute error (MAE).

from permetrics import RegressionMetric

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

evaluator = RegressionMetric(y_true, y_pred)
results = evaluator.get_metrics_by_list_names(["RMSE", "MAE", "MAPE", "R2", "NSE", "KGE"])
print(results["RMSE"])
print(results["KGE"])
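To make the numbers easier to interpret, here is a minimal sketch that computes RMSE and MAE by hand for the same data, using only the textbook definitions and the standard library. It does not call PerMetrics, and the variable names are illustrative; it is meant as a sanity check of what the two metric names stand for, not as a description of the library's internals.

```python
import math

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Per-sample errors (true minus predicted).
errors = [t - p for t, p in zip(y_true, y_pred)]

# RMSE: square root of the mean squared error.
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

# MAE: mean of the absolute errors.
mae = sum(abs(e) for e in errors) / len(errors)

print(round(rmse, 4))  # 0.6124
print(mae)             # 0.5
```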

If your y_true and y_pred data have multiple columns and you want one result per output column, pass multi_output="raw_values". PerMetrics supports this in three ways:

import numpy as np
from permetrics import RegressionMetric

y_true = np.array([[0.5, 1], [-1, 1], [7, -6]])
y_pred = np.array([[0, 2], [-1, 2], [8, -5]])

evaluator = RegressionMetric(y_true, y_pred)

## The 1st way
results = evaluator.get_metrics_by_dict({
  "RMSE": {"multi_output": "raw_values"},
  "MAE": {"multi_output": "raw_values"},
  "MAPE": {"multi_output": "raw_values"},
})

## The 2nd way
results = evaluator.get_metrics_by_list_names(
  list_metric_names=["RMSE", "MAE", "MAPE", "R2", "NSE", "KGE"],
  list_paras=[{"multi_output": "raw_values"},] * 6
)

## The 3rd way
result01 = evaluator.RMSE(multi_output="raw_values")
result02 = evaluator.MAE(multi_output="raw_values")

More complicated cases can be found in the examples folder. You can also read the documentation for more detailed installation instructions, explanations, and examples.

Contributing

There are many ways to contribute to the development of PerMetrics, and you are welcome to join in! For example, you can report problems or make feature requests on the issues page. Before contributing, please check the guidelines in the CONTRIBUTING.md file.

Official channels

Note

  • Currently, frameworks use the notations R, R2, and R^2 inconsistently, which causes widespread misunderstanding.

  • Please read the file R-R2-Rsquared.docx to understand the differences between them and why there is such confusion.

  • My recommendation is to denote the Coefficient of Determination as COD or R2, while the squared Pearson's Correlation Coefficient should be denoted as R^2 or RSQ (as in Excel software).
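The difference between the two quantities can be seen in a small worked example. The sketch below uses the standard textbook formulas rather than PerMetrics itself, and the variable names cod and rsq are chosen here only to match the notation recommended above. The predictions are perfectly correlated with the targets but are twice as large, so the two measures disagree sharply.

```python
import math

y_true = [1, 2, 3, 4]
y_pred = [2, 4, 6, 8]  # perfectly correlated with y_true, but biased

mean_true = sum(y_true) / len(y_true)
mean_pred = sum(y_pred) / len(y_pred)

# Coefficient of Determination: 1 - SS_res / SS_tot.
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
cod = 1 - ss_res / ss_tot

# Squared Pearson's Correlation Coefficient.
cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
var_true = sum((t - mean_true) ** 2 for t in y_true)
var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
rsq = (cov / math.sqrt(var_true * var_pred)) ** 2

print(cod)  # -5.0
print(rsq)  # 1.0
```

The squared correlation reports a perfect linear relationship, while the Coefficient of Determination is strongly negative because the predictions are far from the actual values. This is why the two should not share one name.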


Developed by: Thieu @ 2023