ArcFace Industrial Face Recognition
ResNet-50 backbone trained with ArcFace loss on VGGFace2: production-grade face embeddings for real-world identity verification.
This model is part of a broader comparative study of deep face recognition loss functions (ArcFace, SphereFace, Triplet Loss). After systematic evaluation across three experimental rounds, ArcFace was selected as the production model based on its training stability, numerical reliability at scale, and consistent generalization gains as data grows.
Model Performance
| Metric | Value |
|---|---|
| Training Accuracy | 99% |
| Validation Accuracy | 85% |
| Training Loss | 0.03 |
| Validation Loss | 4.00 |
| Training Identities | 3,000 |
| Images per Identity | ~200 |
| Epochs | 100 |
Validation loss reflects expected open-set generalization behavior: the model is trained on a closed identity set and evaluated against unseen faces. This gap narrows with more training data.
Architecture
Input Image (112×112×3)
        │
        ▼
ResNet-50 Backbone
        │
        ▼
512-D L2-Normalized Embedding
        │
        ▼
ArcFace Head
(Additive Angular Margin, m=0.5, s=64)
| Component | Details |
|---|---|
| Backbone | ResNet-50 |
| Embedding Dimension | 512 |
| Loss Function | ArcFace (m=0.5, s=64) |
| Input Resolution | 112 × 112 |
| Embedding Normalization | L2 |
| Optimizer | SGD with cosine decay |
| Learning Rate | 0.1 |
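
To make the role of the two ArcFace hyperparameters concrete, here is a minimal pure-Python sketch of how the head turns cosines into logits with m=0.5 and s=64. It is an illustration of the published formulation, not this repository's training code: the function names and the toy cosines are hypothetical, and the sketch omits the special handling that real implementations apply when θ + m exceeds π.

```python
import math

def arcface_logits(cosines, target, m=0.5, s=64.0):
    """Add the angular margin m to the target-class angle, then scale by s.

    cosines: cos(theta_j) between one L2-normalized embedding and each
             L2-normalized class weight vector.
    target:  index of the ground-truth class.
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))      # guard acos against rounding error
        if j == target:
            theta = math.acos(c)
            c = math.cos(theta + m)     # the margin lowers the target logit
        logits.append(s * c)
    return logits

def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single sample."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]

# Hypothetical cosines for a 3-class toy problem; class 0 is the true identity.
cosines = [0.9, 0.2, -0.1]
logits = arcface_logits(cosines, target=0)
loss = cross_entropy(logits, target=0)
```

The margin is applied only during training. At inference the head is discarded and the 512-D embedding is compared directly by cosine similarity.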
Usage
Load the Model
import torch
from src.config import load_config
from src.models.face_model import build_face_model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
cfg = load_config("configs/base.yaml")
model = build_face_model(cfg, num_classes=None)  # inference mode
# The checkpoint is either a raw state dict or a dict wrapping it under "model_state".
ckpt = torch.load("model.pt", map_location=device)
state_dict = ckpt.get("model_state", ckpt)
# Keep only the backbone weights and strip the "backbone." prefix from their keys.
backbone_state = {
    k.replace("backbone.", "", 1): v
    for k, v in state_dict.items()
    if k.startswith("backbone.")
}
# strict=False tolerates missing or unexpected keys, so a key mismatch will not raise.
model.backbone.load_state_dict(backbone_state, strict=False)
model.eval().to(device)
Extract Embeddings
import cv2
import numpy as np
from src.data.preprocessing import PreprocessingPipeline
pipeline = PreprocessingPipeline(
    preproc_cfg=cfg.preprocessing,
    image_size=cfg.data.image_size,
    apply_detection=False,
)
@torch.no_grad()
def get_embedding(img_bgr: np.ndarray) -> np.ndarray:
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR
    chw = pipeline(img_rgb)
    tensor = torch.from_numpy(chw).unsqueeze(0).to(device)  # add the batch dimension
    emb = model(tensor)
    return emb.squeeze().cpu().numpy().astype(np.float32)
img = cv2.imread("face.jpg")
if img is None:  # cv2.imread returns None instead of raising on an unreadable path
    raise FileNotFoundError("face.jpg")
embedding = get_embedding(img)  # shape: (512,), L2-normalized
Compute Similarity
from numpy.linalg import norm
def cosine_similarity(emb1: np.ndarray, emb2: np.ndarray) -> float:
    return float(np.dot(emb1, emb2) / (norm(emb1) * norm(emb2)))
# embedding_a and embedding_b are get_embedding() outputs for the two faces to compare
sim = cosine_similarity(embedding_a, embedding_b)
# sim ∈ [-1, 1]; higher = more similar
# Typical threshold for same-person: ≥ 0.5
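
Because the embeddings are L2-normalized, cosine similarity reduces to a plain dot product and is tied to squared Euclidean distance by d² = 2 − 2·cos. This matters when a vector store reports distances rather than similarities. The sketch below checks the identity on two made-up vectors; the helper names are illustrative and not part of this repository.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def similarity_from_squared_l2(d_sq):
    """Convert a squared L2 distance between unit vectors to cosine similarity."""
    return 1.0 - d_sq / 2.0

# Two made-up low-dimensional vectors standing in for 512-D embeddings.
a = l2_normalize([0.3, -1.2, 0.8])
b = l2_normalize([0.1, -0.9, 1.1])

cos_sim = dot(a, b)            # equals cosine similarity for unit vectors
dist_sq = squared_l2(a, b)     # equals 2 - 2 * cos_sim
recovered = similarity_from_squared_l2(dist_sq)
```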
Production Pipeline
This model powers a complete two-stage attendance system:
Stage 1: Database Population (build_database.py)
Registers known identities by computing gallery embeddings and storing them in ChromaDB (vector similarity search) linked to identity metadata in MongoDB. Runs a built-in Top-K evaluation on held-out probe images after registration.
Stage 2: Real-Time Inference (realtime_attendance.py)
Reads a live webcam feed, detects faces with MTCNN, embeds each crop through this model, and queries ChromaDB for the nearest registered identity. Recognized faces are labeled with name and similarity score; unknown faces are flagged. A cooldown timer prevents duplicate attendance logs.
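
The matching and cooldown logic described above can be sketched as follows. This is an assumption-laden illustration, not the code in realtime_attendance.py: a brute-force search over an in-memory dict stands in for the ChromaDB query, and `nearest_identity`, `AttendanceLog`, the 60-second cooldown, and the 2-D toy embeddings are all hypothetical.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def nearest_identity(query, gallery, threshold=0.5):
    """Return (name, score) of the best match, or (None, score) if unknown.

    gallery maps identity name -> embedding. A real deployment would ask the
    vector database for the nearest neighbour instead of scanning a dict.
    """
    best_name, best_score = None, -1.0
    for name, emb in gallery.items():
        score = cosine(query, emb)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score      # below threshold: flag as unknown
    return best_name, best_score

class AttendanceLog:
    """Records at most one entry per identity per cooldown window."""

    def __init__(self, cooldown_seconds=60.0):
        self.cooldown_seconds = cooldown_seconds
        self._last_logged = {}
        self.entries = []

    def mark(self, name, now):
        last = self._last_logged.get(name)
        if last is not None and now - last < self.cooldown_seconds:
            return False             # still cooling down: skip duplicate
        self._last_logged[name] = now
        self.entries.append((name, now))
        return True

gallery = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}   # toy 2-D embeddings
name, score = nearest_identity([0.9, 0.1], gallery)
unknown, _ = nearest_identity([-1.0, -1.0], gallery)

log = AttendanceLog(cooldown_seconds=60.0)
first = log.mark(name, now=0.0)      # logged
repeat = log.mark(name, now=30.0)    # suppressed by the cooldown
later = log.mark(name, now=61.0)     # logged again after the window
```

Passing the timestamp in explicitly, rather than reading the clock inside `mark`, keeps the cooldown logic easy to test.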
Gallery Evaluation Results
Evaluated on 50 registered identities (250 held-out probe images, never seen during training or registration):
| Metric | Result |
|---|---|
| Top-1 Accuracy | 92.00% (230 / 250) |
| Top-3 Accuracy | 96.00% (240 / 250) |
| Top-5 Accuracy | 96.80% (242 / 250) |
| Failed reads | 0 / 250 |
Probe images are drawn from the same VGGFace2 distribution as the gallery but form a completely separate split, never used during model training or gallery registration.
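
For reference, Top-K accuracy counts a probe as correct when its true identity appears among the K nearest gallery identities. The sketch below shows one way to compute it from ranked results. The function and the toy rankings are illustrative, not the evaluation code in build_database.py; the final lines only re-derive the percentages in the table from its reported counts.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of probes whose true label is within the first k ranked names."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)

# Toy example: four probes of identity "a", each with a ranked candidate list.
ranked = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["c", "a", "b"],
    ["b", "c", "a"],
]
truth = ["a", "a", "a", "a"]

top1 = top_k_accuracy(ranked, truth, k=1)
top2 = top_k_accuracy(ranked, truth, k=2)
top3 = top_k_accuracy(ranked, truth, k=3)

# Re-deriving the reported percentages from the hit counts in the table.
reported = {1: 230 / 250, 3: 240 / 250, 5: 242 / 250}
```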
Why ArcFace for Production
ArcFace was selected over SphereFace (the Round 3 LFW leader) based on engineering considerations critical for deployment:
- Additive angular margin has a direct geometric interpretation on the hypersphere: the decision boundary is fixed and predictable, making threshold calibration reliable across unseen identities.
- Numerical stability at scale: SphereFace's multiplicative margin becomes sensitive as class count increases. ArcFace's formulation remains stable regardless.
- Consistent data scaling: validation accuracy improved from 83% (1,000 identities) to 85% (3,000 identities), consistent with predictable generalization gains as the training set grows.
- Industry standard: ArcFace is the de facto choice in production face recognition systems, with extensive tooling for quantization, ONNX export, and edge deployment.
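
The geometric difference between the two margins is easiest to see by comparing the target-class logit in each loss, as defined in the respective papers. The notation is ours: θ_y is the angle between the embedding and the true-class weight, m_sphere is SphereFace's integer multiplier, and m_arc is ArcFace's margin in radians (0.5 for this model).

```latex
\begin{aligned}
\text{Softmax:}    &\quad \cos\theta_y \\
\text{SphereFace:} &\quad \psi(\theta_y) = (-1)^k \cos\!\left(m_{\text{sphere}}\,\theta_y\right) - 2k,
                    \quad \theta_y \in \left[\tfrac{k\pi}{m_{\text{sphere}}},\ \tfrac{(k+1)\pi}{m_{\text{sphere}}}\right],
                    \ k \in \{0,\dots,m_{\text{sphere}}-1\} \\
\text{ArcFace:}    &\quad \cos\!\left(\theta_y + m_{\text{arc}}\right)
\end{aligned}
```

SphereFace needs the piecewise function ψ to keep the logit monotonic in θ_y, whereas ArcFace shifts the angle by a constant, which is the fixed angular margin the first bullet refers to.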
Training Data
| Dataset | Identities | Images | Resolution |
|---|---|---|---|
| VGGFace2 (subset) | 3,000 | ~600,000 | 112 × 112 |
Full dataset: VGGFace2 112×112 on Kaggle
Related Resources
| Resource | Link |
|---|---|
| Training & Evaluation Notebook | Kaggle: ArcFace Training |
| ArcFace Paper | arXiv:1801.07698 |
| Triplet Loss Model | AbdoSaad24/TripletLossModels |
| SphereFace Model | AbdoSaad24/BestSphereFaceModel |
| ArcFace (Research, R3) | AbdoSaad24/BestArcFaceModel |
Citation
@inproceedings{deng2019arcface,
  title     = {ArcFace: Additive Angular Margin Loss for Deep Face Recognition},
  author    = {Deng, Jiankang and Guo, Jia and Xue, Niannan and Zafeiriou, Stefanos},
  booktitle = {CVPR},
  year      = {2019}
}
Limitations & Responsible Use
- This model was trained on a subset of VGGFace2. Performance may degrade on faces from demographics underrepresented in the training data.
- The model is intended for attendance and access control systems where subjects have consented to enrollment.
- Do not use for surveillance, tracking, or identification of individuals without explicit consent.
- Threshold selection (default: 0.5 cosine similarity) should be calibrated to your deployment environment: lower thresholds increase false acceptances, higher thresholds increase false rejections.
Part of the Face Recognition Comparative Study: ArcFace · SphereFace · Triplet Loss.
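
One way to calibrate the threshold is to collect similarity scores for known same-person (genuine) and different-person (impostor) pairs from the target environment, then measure the false accept rate (FAR) and false reject rate (FRR) at candidate thresholds. The sketch below assumes such score lists are available; the function name and the scores are made up for illustration and are not measurements from this model.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted as a match."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (
        false_accepts / len(impostor_scores),
        false_rejects / len(genuine_scores),
    )

# Hypothetical cosine similarities, NOT results from this model.
genuine = [0.82, 0.74, 0.61, 0.48, 0.91]    # same-person pairs
impostor = [0.12, 0.35, 0.52, 0.08, 0.27]   # different-person pairs

sweep = {t: far_frr(genuine, impostor, t) for t in (0.3, 0.5, 0.7)}
for t, (far, frr) in sweep.items():
    print(f"threshold={t:.1f}  FAR={far:.2f}  FRR={frr:.2f}")
```

On these toy scores, raising the threshold from 0.3 to 0.7 drives FAR down and FRR up, which is the trade-off described in the bullet above.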