Confusion Matrix

Also known as an error matrix, a confusion matrix is a two-dimensional table that makes it possible to visualize the performance of a classification algorithm. It summarizes the predictions made on a classification problem: the predicted classes are compared with the actual values, and the counts of correct and incorrect predictions are broken down by class.

This matrix shows where the classification model gets confused when making predictions. It reveals not only how many errors are being made but also which type of error each one is.

In addition to machine learning, the confusion matrix is used in statistics, data mining, and artificial intelligence more broadly, as well as in the evaluation of certain medical examinations. In general, it speeds up the analysis of statistical data and makes the results easier to interpret by presenting them visually.

How to calculate a Confusion Matrix?

To compute a confusion matrix, you need a test dataset or a validation dataset for which the expected results are known. A prediction is then made for each row of that dataset.

The matrix records the number of correct and incorrect predictions for each class, organized by expected result and by prediction. Here, each row of the table corresponds to a predicted class and each column corresponds to an actual class. Some tools use the transposed layout, so check the convention before reading a matrix.
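The counting described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `confusion_matrix`, the "spam"/"ham" labels, and the sample data are all made up for the example, and the layout follows the convention used in this article (rows are predicted classes, columns are actual classes).

```python
def confusion_matrix(actual, predicted, labels):
    """Count predictions per (predicted, actual) pair.

    Rows are predicted classes and columns are actual classes,
    matching the layout described in this article.
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[p]][index[a]] += 1
    return matrix


# Hypothetical test dataset: expected results and model predictions.
labels = ["spam", "ham"]
actual = ["spam", "spam", "ham", "ham", "spam", "ham"]
predicted = ["spam", "ham", "ham", "spam", "spam", "ham"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"predicted {label}: {row}")
```

The values on the diagonal are the correct predictions; every other cell counts one specific kind of mistake.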

Each prediction is counted in the row of its predicted class, under the column of its actual class. In a two-class problem there are four possible outcomes: a correct positive prediction is a "true positive", a correct negative prediction is a "true negative", an incorrect positive prediction is a "false positive", and an incorrect negative prediction is a "false negative".
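For a two-class problem, the four outcomes can be tallied directly. The sketch below assumes labels encoded as 1 (positive) and 0 (negative); the helper name `binary_outcomes` and the sample data are hypothetical and serve only to illustrate the definitions.

```python
def binary_outcomes(actual, predicted, positive=1):
    """Tally true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical expected results and predictions (1 = positive, 0 = negative).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

outcomes = binary_outcomes(actual, predicted)
print(outcomes)
```

With rows as predicted classes and columns as actual classes, these four counts fill the matrix as [[TP, FP], [FN, TN]].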

The advantage of these matrices is that they are simple to read and understand. They let you quickly see how a model performs and spot patterns in its errors that may suggest which settings to change. A confusion matrix also extends to classification problems with three or more classes by adding one row and one column per class.
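As a rough illustration of the multi-class case, the sketch below builds a 3 x 3 matrix with the standard library's `collections.Counter`. The class names and data are invented for the example, and rows are again predicted classes while columns are actual classes.

```python
from collections import Counter

# Hypothetical three-class problem.
labels = ["cat", "dog", "bird"]
actual = ["cat", "dog", "bird", "cat", "dog", "bird", "cat"]
predicted = ["cat", "dog", "cat", "cat", "bird", "bird", "dog"]

# Count how often each (predicted, actual) pair occurs.
counts = Counter(zip(predicted, actual))

# One row per predicted class, one column per actual class.
matrix3 = [[counts[(p, a)] for a in labels] for p in labels]

# Correct predictions sit on the diagonal.
correct = sum(matrix3[i][i] for i in range(len(labels)))

for label, row in zip(labels, matrix3):
    print(f"predicted {label}: {row}")
print(f"correct: {correct} of {len(actual)}")
```

Off-diagonal cells show which classes are mistaken for which; here, for instance, one actual bird was predicted as a cat.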

