\begin{equation} Accuracy = \frac{Tp + Tn}{P + N} \end{equation}
\begin{equation} Precision = \frac{Tp}{Tp + Fp} \end{equation}
\begin{equation} Recall = TruePositiveRate = Sensitivity = \frac{Tp}{Tp + Fn} \end{equation}
\begin{equation} Specificity = \frac{Tn}{Tn + Fp} \end{equation}
\begin{equation} FalsePositiveRate = 1 - Specificity = \frac{Fp}{Tn + Fp} \end{equation}
\begin{equation} F_1 =\frac{2}{precision^{-1} + recall^{-1}} = 2 \frac{precision \; recall}{precision + recall} \end{equation}
\begin{equation} F_\beta = (1 + \beta^2) \frac{precision \; recall}{\beta^2 precision + recall} \end{equation}
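Below is a minimal Python sketch (not part of the original notes) of how these metrics follow directly from the confusion-matrix counts; the counts in the example call are made-up numbers for illustration.
\begin{verbatim}
# Metrics computed directly from confusion-matrix counts.
def classification_metrics(tp, tn, fp, fn, beta=1.0):
    p = tp + fn                      # actual positives
    n = tn + fp                      # actual negatives
    accuracy = (tp + tn) / (p + n)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # = TPR = sensitivity
    specificity = tn / (tn + fp)
    fpr = 1 - specificity            # = fp / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    f_beta = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
    return dict(accuracy=accuracy, precision=precision, recall=recall,
                specificity=specificity, fpr=fpr, f1=f1, f_beta=f_beta)

# Example counts, chosen only for illustration.
print(classification_metrics(tp=80, tn=90, fp=10, fn=20))
\end{verbatim}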
Consider a logistic regression classifier as an example: changing the decision threshold leads to different confusion matrices and, consequently, to different FPR and TPR values (see the sketch below).
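The following sketch, assuming a synthetic dataset and scikit-learn, fits a logistic regression model and uses roc\_curve to read off the FPR/TPR pair induced by each threshold.
\begin{verbatim}
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve

# Synthetic binary classification data, for illustration only.
X, y = make_classification(n_samples=500, random_state=0)
model = LogisticRegression().fit(X, y)
scores = model.predict_proba(X)[:, 1]   # predicted probability of the positive class

# Each threshold yields a different confusion matrix, hence a different (FPR, TPR).
fpr, tpr, thresholds = roc_curve(y, scores)
for t, f, r in list(zip(thresholds, fpr, tpr))[::10]:
    print(f"threshold={t:.2f}  FPR={f:.2f}  TPR={r:.2f}")
\end{verbatim}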
mAP is a popular metric in object detection and instance segmentation problems.
Average Precision (AP) is the area under the precision-recall curve. In object detection and instance segmentation, predictions are matched to ground truth by IoU, and evaluating the matches at different thresholds yields multiple precision-recall pairs. mAP is the AP averaged over categories (see the sketch below).
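The sketch below is a simplified illustration: the per-class labels and scores are synthetic stand-ins for detections matched against ground truth at a fixed IoU threshold. It computes AP per class with scikit-learn's average\_precision\_score (area under the precision-recall curve) and averages them into mAP.
\begin{verbatim}
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
ap_per_class = []
for _ in range(3):                          # pretend we have 3 categories
    y_true = rng.integers(0, 2, size=200)   # 1 = prediction matched a ground-truth object
    y_score = rng.random(200)               # detector confidence scores
    ap_per_class.append(average_precision_score(y_true, y_score))

print("AP per class:", [f"{ap:.3f}" for ap in ap_per_class])
print("mAP:", np.mean(ap_per_class))
\end{verbatim}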
overfitting: the empirical risk is low, but the true risk is high. This can happen if the dataset is too small or if the model is too powerful.
underfitting: both the empirical risk and the true risk are high. This can happen if the model is too weak or, for example, if there are problems with the optimization parameters. Both failure modes are illustrated in the sketch after these definitions.
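As a rough illustration (all data here is synthetic), the sketch compares training error, a proxy for empirical risk, and error on a dense noise-free grid, a proxy for true risk, for a too-simple and a too-flexible polynomial model.
\begin{verbatim}
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Small noisy training set sampled from a sine curve.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 1, size=15).reshape(-1, 1)
y_train = np.sin(2 * np.pi * x_train).ravel() + rng.normal(0, 0.2, 15)
x_test = np.linspace(0, 1, 200).reshape(-1, 1)
y_test = np.sin(2 * np.pi * x_test).ravel()

for degree, label in [(1, "underfit"), (15, "overfit")]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(x_train))  # ~ empirical risk
    test_err = mean_squared_error(y_test, model.predict(x_test))     # ~ true risk
    print(f"{label:8s} degree={degree:2d}  "
          f"train MSE={train_err:.3f}  test MSE={test_err:.3f}")
\end{verbatim}
The degree-1 model typically shows high error on both sets (underfitting), while the degree-15 model typically fits the training points almost exactly but has much higher test error (overfitting).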