Confusion Matrix and Cyber Attacks
What is a Confusion Matrix?
A confusion matrix is a performance measurement technique for machine learning classification. It is a table that summarizes how well a classification model performs on a set of test data for which the true values are known. The matrix itself is very simple, but its related terminology can be a little confusing.
The confusion matrix visualizes the accuracy of a classifier by comparing the actual and predicted classes.
True Positive (TP)
- The predicted value matches the actual value
- The actual value was positive and the model predicted a positive value
True Negative (TN)
- The predicted value matches the actual value
- The actual value was negative and the model predicted a negative value
False Positive (FP)
- The predicted value does not match the actual value
- The actual value was negative but the model predicted a positive value
- Also known as the Type 1 error
False Negative (FN)
- The predicted value does not match the actual value
- The actual value was positive but the model predicted a negative value
- Also known as the Type 2 error
You can compute the accuracy from the confusion matrix: Accuracy = (TP + TN) / (TP + TN + FP + FN).
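To make the calculation concrete, here is a minimal sketch using scikit-learn; the y_true and y_pred arrays are made-up example labels, not output from any real model.

```python
# Minimal sketch: build a confusion matrix and compute accuracy with scikit-learn.
# The labels below are made-up examples (1 = positive, 0 = negative).
from sklearn.metrics import confusion_matrix, accuracy_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]   # actual classes
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]   # classes predicted by the model

# For binary labels {0, 1}, confusion_matrix returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")
print(f"Accuracy: {accuracy:.2f}")
print(f"accuracy_score check: {accuracy_score(y_true, y_pred):.2f}")
```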
Precision vs. Recall
Precision tells us how many of the cases predicted as positive actually turned out to be positive: Precision = TP / (TP + FP).
Recall tells us how many of the actual positive cases we were able to predict correctly with our model: Recall = TP / (TP + FN).
EXAMPLE:
Precision is a useful metric in cases where False Positives are a higher concern than False Negatives.
Precision is important in music or video recommendation systems, e-commerce websites, etc. Wrong results could lead to customer churn and be harmful to the business.
Recall is a useful metric in cases where a False Negative is a higher concern than a False Positive.
Recall is important in medical cases where it doesn’t matter whether we raise a false alarm but the actual positive cases should not go undetected!
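As a sketch of how the two metrics are computed in practice, scikit-learn exposes both directly; the labels below are the same kind of made-up examples used above, not real data.

```python
# Minimal sketch of precision vs. recall on made-up example labels.
from sklearn.metrics import precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]   # actual classes
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]   # predicted classes

# Precision = TP / (TP + FP): of all predicted positives, how many were truly positive
precision = precision_score(y_true, y_pred)
# Recall = TP / (TP + FN): of all actual positives, how many did we catch
recall = recall_score(y_true, y_pred)

print(f"Precision: {precision:.2f}")
print(f"Recall:    {recall:.2f}")
```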
Confusion Matrix in Cyber Attacks
True-Positive (TP): Correctly classify an anomalous sample as an attack.
True-Negative (TN): Correctly classify a non-attack sample as an ordinary instance.
False-Positive (FP): Incorrectly classify an ordinary sample as an anomalous instance.
False-Negative (FN): Incorrectly classify an attack sample as an ordinary instance.
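As a hypothetical sketch, the same computation applies when the two classes are attack and ordinary traffic; the labels below are invented for illustration and do not come from any real intrusion detection dataset.

```python
# Hypothetical intrusion-detection example: 1 = attack, 0 = ordinary traffic.
# All labels are invented for illustration only.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0]   # ground truth: which samples were attacks
y_pred = [0, 1, 1, 0, 0, 1, 0, 0, 1, 0]   # the detector's predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(f"Attacks correctly flagged (TP):         {tp}")
print(f"Ordinary traffic correctly passed (TN): {tn}")
print(f"False alarms (FP):                      {fp}")  # ordinary traffic flagged as attack
print(f"Missed attacks (FN):                    {fn}")  # attacks classified as ordinary
```

Here a false alarm (FP) wastes an analyst's time, while a missed attack (FN) leaves the network exposed, which is why both rates matter.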
Reducing false negatives and false positives is a major research problem, as both have very negative effects on the overall security of networks.
Intrusion detection remains critical for network security, and machine learning-based approaches have given a major boost to finding novel attacks. The application of multiple classifiers, i.e. hybrid systems and ensemble learning methods, has further increased the accuracy of attack detection techniques in recent years, but the rate of false positives and false negatives still needs to be addressed.