How to Calculate Confusion Matrix in Excel

Deepanshu Bhalla Add Comment

This tutorial explains how to compute confusion matrix in Excel, along with examples.

Please ensure you have the following two columns in Excel:

  1. Binary Dependent Variable (0 / 1)
  2. Predicted probabilities

Let's say you have data for dependent variable in cells A2:A100 and probabilities in cells B2:B100.

Sample Data

The confusion matrix is shown in the table below for your reference.

Predicted Negative(0)Predicted Positive(1)
Actual Negative(0)True Negative (TN)False Positive (FP)
Actual Positive(1)False Negative (FN)True Positive (TP)

Step 1 : Classify Predictions

The first step is to create a column to classify predictions. Let's say probability cutoff is 0.5 which means assigning 1 if the predicted probability is 0.5 or higher, otherwise 0. Enter the following formula in cell C2.

=IF(B2 >= 0.5, 1, 0)

Drag the formula down to the last row (i.e C100) to cover all rows.

Classify Predictions

Step 2 : Count Positives and Negatives

True Positives, True Negatives, False Negatives and False Positives
True Positives

Enter the following formula in cell D2.

=COUNTIFS(A2:A100, 1, C2:C100, 1)
True Negatives

Enter the following formula in cell E2.

=COUNTIFS(A2:A100,0, C2:C100, 0)
False Negatives

Enter the following formula in cell F2.

=COUNTIFS(A2:A100,1, C2:C100, 0)
False Positives

Enter the following formula in cell G2.

=COUNTIFS(A2:A100,0, C2:C100, 1)

Step 3 : Calculate Confusion Matrix

Calculating Confusion Matrix in Excel
True Positive Rate (TPR) / Recall

The True Positive Rate is also referred to as Recall in the confusion matrix. Enter the following formula in cell H2.

=D2 / (D2 + F2)
False Positive Rate (FPR)

Enter the following formula in cell I2.

=G2 / (G2 + E2)
True Negative Rate (TPR)

Enter the following formula in cell J2.

=E2/(E2+G2)
False Negative Rate (FNR)

Enter the following formula in cell K2.

=F2 / (F2 + D2)
Accuracy

Enter the following formula in cell L2.

=(D2+E2)/COUNT(C2:C100)
Precision

Enter the following formula in cell M2.

=D2/(D2+G2)
F1 Score

F1 score is the harmonic mean of precision and recall. It performs well on a dataset having low event rate.

=2 * (H2 * M2) / (H2 + M2)
Excel Macro to Calculate Confusion Matrix

The following VBA code is a user-defined function that automates all the steps shown above to compute confusion matrix.


Function ConfusionMatrix(predictions As Range, actuals As Range, threshold As Double) As Variant
    
    Dim n As Long
    Dim tpr As Double, fpr As Double, tnr As Double, fnr As Double, accuracy As Double
    Dim sorted_indices() As Long
    Dim i As Long

    ' Check if the predictions and actuals ranges have the same number of elements
    If predictions.Count <> actuals.Count Then
        MsgBox "Predictions and actuals must have the same number of elements."
        Exit Function
    End If

    n = predictions.Count
    ReDim sorted_indices(1 To n)

    ' Sort predictions and actuals in descending order of predictions
    sorted_indices = SortIndicesDescending(predictions)

    Dim num_pos As Long, num_neg As Long
    num_pos = WorksheetFunction.CountIf(actuals, 1)
    num_neg = n - num_pos

    If num_pos = 0 Or num_neg = 0 Then
        MsgBox "There must be both positive and negative actual values."
        Exit Function
    End If

    Dim tp_count As Long, fp_count As Long, tn_count As Long, fn_count As Long
    tp_count = 0
    fp_count = 0
    tn_count = 0
    fn_count = 0

    ' Calculate TP, FP, TN, FN for the specified threshold
    For i = 1 To n
        If predictions.Cells(sorted_indices(i)).Value >= threshold Then
            If actuals.Cells(sorted_indices(i)).Value = 1 Then
                tp_count = tp_count + 1
            Else
                fp_count = fp_count + 1
            End If
        Else
            If actuals.Cells(sorted_indices(i)).Value = 1 Then
                fn_count = fn_count + 1
            Else
                tn_count = tn_count + 1
            End If
        End If
    Next i

    tpr = Round(tp_count / num_pos, 4)
    fpr = Round(fp_count / num_neg, 4)
    fnr = Round(fn_count / num_pos, 4)
    tnr = Round(tn_count / num_neg, 4)
    accuracy = Round((tp_count + tn_count) / n, 4)

    ' Display TPR, FPR, TNR, and FNR in one message box
    ConfusionMatrix = "Accuracy: " & accuracy & ", " & _
           "True Positive Rate: " & tpr & ", " & _
           "False Positive Rate: " & fpr & ", " & _
           "True Negative Rate: " & tnr & ", " & _
           "False Negative Rate: " & fnr

End Function

How to Use
  1. In Excel, open the VBA editor by pressing Alt + F11 keyboard shortcut key.
  2. Select Insert > Module to create a module.
  3. Paste the above VBA code in the module.
  4. Close the VBA editor by clicking the 'X' in the top-right corner.
  5. In any cell, enter this formula :
    =ConfusionMatrix(B2:B100,A2:A100,0.5)
    It follows this syntax - =ConfusionMatrix(probability_range,dependent_variable_range,cutoff)
Related Posts
Spread the Word!
Share
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.

Post Comment 0 Response to "How to Calculate Confusion Matrix in Excel"
Next →