
I have two binary images: one is the output produced by my algorithm and the other is the ground truth. For these two images I calculated the numbers of true positives, false negatives, true negatives and false positives. I want to interpret this result. For example, I have these values:

TP (true positive) = 2739
TN (true negative) = 103217 
FP (false positive) = 43423 
FN (false negative) = 5022

After calculating these values, I computed the accuracy $$accuracy = \frac{TP +TN}{TP+TN+FP+FN}$$ In this case the accuracy is 0.68. Can I say that I have low accuracy because the number of false positives is high? Is there any relation between the false positives and the true positives or true negatives?

Alex544

1 Answer


You might be interested in the false positive rate (FPR) and the false negative rate (FNR). These are

$FPR=\frac{FP}{TN+FP}=\frac{43423}{43423+103217}=.3$

$FNR=\frac{FN}{TP+FN}=\frac{5022}{2739+5022}=.65$

Your false positive rate is fairly low at .3: you are classifying most negatives correctly as negative. But your false negative rate is much higher at .65: you are missing about two thirds of the pixels that are actually positive. For a more detailed discussion, read on.
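The two rates above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `error_rates` is made up for illustration, and it assumes there is at least one actual positive and one actual negative, since otherwise a denominator is zero.

```python
def error_rates(tp, tn, fp, fn):
    """Return (false positive rate, false negative rate) from the four counts."""
    fpr = fp / (fp + tn)  # share of actual negatives predicted as positive
    fnr = fn / (fn + tp)  # share of actual positives predicted as negative
    return fpr, fnr


# Counts taken from the question.
fpr, fnr = error_rates(tp=2739, tn=103217, fp=43423, fn=5022)
print(f"FPR = {fpr:.2f}, FNR = {fnr:.2f}")  # FPR = 0.30, FNR = 0.65
```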

The accuracy is an overall summary of the success of your tests. It is sometimes helpful to look at a breakdown of it into sensitivity and specificity. Sensitivity is the true positive rate, i.e. the chance your program predicts a positive given that it's supposed to be positive. Specificity is the true negative rate, i.e. the proportion of negatives correctly predicted as negative.

$sensitivity=\frac{TP}{TP+FN}=\frac{2739}{2739+5022}=.35$

$specificity=\frac{TN}{TN+FP}=\frac{103217}{103217+ 43423}=.7$

Based on these results, it seems that your test is not as good at identifying positives as it is at correctly identifying negatives. One possible problem is that the two classes differ greatly in prevalence: in the ground truth there are far more negatives than positives. Here is a confusion matrix of the data you have collected:

$\begin{matrix}&actual&\\ predicted&positive&negative\\ positive&2739& 43423\\ negative & 5022& 103217\end{matrix}$

If you compute the accuracies separately for the actual positives and the actual negatives, you find that the accuracy for the positives is $\frac {2739}{2739+5022}=.35$ and the accuracy for the negatives is $\frac{103217}{103217+ 43423}=.7$. (Notice these are the definitions of sensitivity and specificity, the same as above.) Your overall accuracy of .68 is close to the specificity because actual negatives are far more prevalent: they make up about 95% of your data, so they dominate the overall figure and largely hide the poor performance on the positives.
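A short derivation, added here as a sketch, makes the link between accuracy, sensitivity and specificity explicit. Write $P = TP + FN$ for the number of actual positives and $N = TN + FP$ for the number of actual negatives (these two symbols are introduced here for convenience and do not appear in the question). Then

$$accuracy = \frac{TP+TN}{P+N} = \frac{P}{P+N}\cdot\frac{TP}{P} + \frac{N}{P+N}\cdot\frac{TN}{N} = \frac{P}{P+N}\cdot sensitivity + \frac{N}{P+N}\cdot specificity$$

With your numbers, $P = 7761$ and $N = 146640$, so $\frac{P}{P+N}\approx .05$ and the accuracy is roughly $.05\cdot .35 + .95\cdot .70 \approx .68$.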

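A useful one-number summary combining sensitivity and specificity is an $F_\beta$-style score, their weighted harmonic mean, defined as $$\frac {1}{\frac{\beta^2}{\beta^2+1}\frac{1}{sensitivity}+\frac{1}{1+\beta^2}\frac{1}{specificity}}$$ (The standard $F_\beta$ score has the same form but uses precision, $\frac{TP}{TP+FP}$, in place of specificity.)

When $\beta=1$, a common choice, this is the harmonic mean of sensitivity and specificity: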

$$\frac {1}{\frac{1}{2}\left(\frac{1}{sensitivity}+\frac{1}{specificity}\right)}$$

By this definition, the $F_1$ score of your test is .47. The closer to 1, the better. You might try to optimize your algorithm based on the $F_1$ score.
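As a rough check, the score defined above can be computed in Python. The helper name `weighted_harmonic_mean` is hypothetical, and the last value is an addition that is not part of the definition above: it is the conventional $F_1$ score, which pairs sensitivity (recall) with precision rather than with specificity, shown only for comparison.

```python
def weighted_harmonic_mean(a, b, beta=1.0):
    """Harmonic mean with weight beta^2/(1+beta^2) on 1/a and 1/(1+beta^2) on 1/b."""
    w = beta**2 / (1 + beta**2)
    return 1 / (w / a + (1 - w) / b)


# Counts taken from the question.
tp, tn, fp, fn = 2739, 103217, 43423, 5022

sensitivity = tp / (tp + fn)  # true positive rate (recall)
specificity = tn / (tn + fp)  # true negative rate
precision = tp / (tp + fp)    # share of predicted positives that are correct

# Score as defined in this answer: sensitivity combined with specificity.
score_spec = weighted_harmonic_mean(sensitivity, specificity)

# Conventional F1 (an addition for comparison): sensitivity combined with precision.
f1_precision = weighted_harmonic_mean(sensitivity, precision)

print(f"{score_spec:.2f} {f1_precision:.2f}")  # 0.47 0.10
```

The precision-based value is much lower because most predicted positives are false positives (43423 of 46162), which specificity barely registers when negatives are so common.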

There is some difference between your $F_1$ score and accuracy:

F_1: .47
accuracy: .68
Vons