# 🧠 ArcFace Industrial Face Recognition

ResNet-50 backbone trained with ArcFace loss on VGGFace2 — production-grade face embeddings for real-world identity verification.

This model is part of a broader comparative study of deep face recognition loss functions (ArcFace, SphereFace, Triplet Loss). After systematic evaluation across three experimental rounds, ArcFace was selected as the production model based on its training stability, numerical reliability at scale, and consistent generalization gains as data grows.


## 📊 Model Performance

| Metric | Value |
|---|---|
| Training Accuracy | 99% |
| Validation Accuracy | 85% |
| Training Loss | 0.03 |
| Validation Loss | 4.00 |
| Training Identities | 3,000 |
| Images per Identity | ~200 |
| Epochs | 100 |

Validation loss reflects expected open-set generalization behavior — the model is trained on a closed identity set and evaluated against unseen faces. This gap narrows with more training data.


πŸ—οΈ Architecture

```
Input Image (112×112×3)
        │
        ▼
  ResNet-50 Backbone
        │
        ▼
  512-D L2-Normalized Embedding
        │
        ▼
   ArcFace Head
   (Additive Angular Margin, m=0.5, s=64)
```
| Component | Details |
|---|---|
| Backbone | ResNet-50 |
| Embedding Dimension | 512 |
| Loss Function | ArcFace (m=0.5, s=64) |
| Input Resolution | 112 × 112 |
| Embedding Normalization | L2 |
| Optimizer | SGD with cosine decay |
| Learning Rate | 0.1 |
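During training, the ArcFace head adds the angular margin m to the target-class angle only, then scales all cosines by s before softmax. A dependency-free NumPy sketch of that logit transform (training-time only; at inference the head is discarded and the raw embedding is used):

```python
import numpy as np

def arcface_logits(embeddings: np.ndarray, weights: np.ndarray,
                   labels: np.ndarray, m: float = 0.5, s: float = 64.0) -> np.ndarray:
    """Additive angular margin logit transform.

    embeddings: (N, D) L2-normalized embeddings
    weights:    (C, D) L2-normalized class centers
    labels:     (N,)   integer class ids
    """
    cos = embeddings @ weights.T                      # cosine similarities, (N, C)
    cos = np.clip(cos, -1.0 + 1e-7, 1.0 - 1e-7)       # keep arccos well-defined
    rows = np.arange(len(labels))
    theta = np.arccos(cos[rows, labels])              # angle to the target center
    cos[rows, labels] = np.cos(theta + m)             # add margin m to target angle only
    return s * cos                                    # scale by s before softmax
```

The margin always lowers the target logit (cos(θ + m) < cos(θ) on the relevant range), which is what forces tighter intra-class clustering on the hypersphere.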

## 🚀 Usage

### Load the Model

```python
import torch
from src.config import load_config
from src.models.face_model import build_face_model

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
cfg = load_config("configs/base.yaml")

model = build_face_model(cfg, num_classes=None)  # inference mode
ckpt = torch.load("model.pt", map_location=device)
state_dict = ckpt.get("model_state", ckpt)

# Keep only backbone weights; the ArcFace head is not needed at inference
backbone_state = {
    k.replace("backbone.", "", 1): v
    for k, v in state_dict.items()
    if k.startswith("backbone.")
}
model.backbone.load_state_dict(backbone_state, strict=False)
model.eval().to(device)
```

### Extract Embeddings

```python
import cv2
import numpy as np
from src.data.preprocessing import PreprocessingPipeline

pipeline = PreprocessingPipeline(
    preproc_cfg=cfg.preprocessing,
    image_size=cfg.data.image_size,
    apply_detection=False,
)

@torch.no_grad()
def get_embedding(img_bgr: np.ndarray) -> np.ndarray:
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
    chw = pipeline(img_rgb)
    tensor = torch.from_numpy(chw).unsqueeze(0).to(device)
    emb = model(tensor)
    return emb.squeeze().cpu().numpy().astype(np.float32)

img = cv2.imread("face.jpg")
embedding = get_embedding(img)  # shape: (512,), L2-normalized
```

### Compute Similarity

```python
import numpy as np
from numpy.linalg import norm

def cosine_similarity(emb1: np.ndarray, emb2: np.ndarray) -> float:
    return float(np.dot(emb1, emb2) / (norm(emb1) * norm(emb2)))

sim = cosine_similarity(embedding_a, embedding_b)
# sim ∈ [-1, 1] — higher = more similar
# Typical threshold for same-person: ≥ 0.5
```

## 🗄️ Production Pipeline

This model powers a complete two-stage attendance system:

### Stage 1 — Database Population (`build_database.py`)

Registers known identities by computing gallery embeddings and storing them in ChromaDB (vector similarity search) linked to identity metadata in MongoDB. Runs a built-in Top-K evaluation on held-out probe images after registration.
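ChromaDB and MongoDB specifics aside, registration reduces to storing one L2-normalized gallery embedding per identity and answering nearest-identity queries by cosine similarity. A dependency-free, in-memory sketch of that index (class and method names are illustrative, not the repo's actual API):

```python
import numpy as np
from typing import List, Tuple

class GalleryIndex:
    """In-memory stand-in for the ChromaDB gallery: identity -> normalized embedding."""

    def __init__(self, dim: int = 512):
        self.ids: List[str] = []
        self.matrix = np.empty((0, dim), dtype=np.float32)

    def register(self, identity: str, embedding: np.ndarray) -> None:
        emb = (embedding / np.linalg.norm(embedding)).astype(np.float32)  # store L2-normalized
        self.ids.append(identity)
        self.matrix = np.vstack([self.matrix, emb[None, :]])

    def query(self, embedding: np.ndarray, top_k: int = 1) -> List[Tuple[str, float]]:
        emb = embedding / np.linalg.norm(embedding)
        sims = self.matrix @ emb                       # cosine similarity vs. all identities
        order = np.argsort(sims)[::-1][:top_k]         # best matches first
        return [(self.ids[i], float(sims[i])) for i in order]
```

Because both sides are normalized, the dot product is exactly the cosine similarity used at verification time.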

### Stage 2 — Real-Time Inference (`realtime_attendance.py`)

Reads a live webcam feed, detects faces with MTCNN, embeds each crop through this model, and queries ChromaDB for the nearest registered identity. Recognized faces are labeled with name and similarity score; unknown faces are flagged. A cooldown timer prevents duplicate attendance logs.
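The duplicate-log cooldown mentioned above can be as simple as a per-identity timestamp check; a minimal sketch, assuming a 30-second window (the actual window in the repo may differ):

```python
import time
from typing import Dict, Optional

class AttendanceLogger:
    """Logs each identity at most once per cooldown window."""

    def __init__(self, cooldown_s: float = 30.0):
        self.cooldown_s = cooldown_s
        self._last_seen: Dict[str, float] = {}

    def try_log(self, identity: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        last = self._last_seen.get(identity)
        if last is not None and now - last < self.cooldown_s:
            return False                      # still cooling down; skip duplicate log
        self._last_seen[identity] = now       # record attendance timestamp
        return True
```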


## 📈 Gallery Evaluation Results

Evaluated on 50 registered identities (250 held-out probe images, never seen during training or registration):

| Metric | Result |
|---|---|
| Top-1 Accuracy | 92.00% (230 / 250) |
| Top-3 Accuracy | 96.00% (240 / 250) |
| Top-5 Accuracy | 96.80% (242 / 250) |
| Failed reads | 0 / 250 |

Probe images are drawn from the same VGGFace2 distribution as the gallery but are a completely separate split — never used during model training or gallery registration.
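The Top-K numbers above follow the standard definition: a probe counts as correct if its true identity appears among the k most similar gallery entries. A NumPy sketch of that metric:

```python
import numpy as np

def top_k_accuracy(sims: np.ndarray, true_ids: np.ndarray,
                   gallery_ids: np.ndarray, k: int) -> float:
    """sims:        (P, G) probe-vs-gallery cosine similarities
    true_ids:    (P,)   ground-truth identity per probe
    gallery_ids: (G,)   identity of each gallery entry"""
    top = np.argsort(sims, axis=1)[:, ::-1][:, :k]            # best-k gallery indices per probe
    hits = (gallery_ids[top] == true_ids[:, None]).any(axis=1)  # true id among the k?
    return float(hits.mean())
```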


## 🔬 Why ArcFace for Production

ArcFace was selected over SphereFace (the Round 3 LFW leader) based on engineering considerations critical for deployment:

- Additive angular margin has a direct geometric interpretation on the hypersphere — the decision boundary is fixed and predictable, making threshold calibration reliable across unseen identities.
- Numerical stability at scale — SphereFace's multiplicative margin becomes sensitive as class count increases; ArcFace's formulation remains stable regardless.
- Consistent data scaling — validation accuracy improved monotonically from 83% (1,000 identities) to 85% (3,000 identities), confirming predictable generalization gains as the training set grows.
- Industry standard — ArcFace is the de facto choice in production face recognition systems, with extensive tooling for quantization, ONNX export, and edge deployment.

## 📦 Training Data

| Dataset | Identities | Images | Resolution |
|---|---|---|---|
| VGGFace2 (subset) | 3,000 | ~600,000 | 112 × 112 |

Full dataset: VGGFace2 112×112 on Kaggle


## 🔗 Related Resources

| Resource | Link |
|---|---|
| 📓 Training & Evaluation Notebook | Kaggle — ArcFace Training |
| 📄 ArcFace Paper | arXiv:1801.07698 |
| 🤗 Triplet Loss Model | AbdoSaad24/TripletLossModels |
| 🤗 SphereFace Model | AbdoSaad24/BestSphereFaceModel |
| 🤗 ArcFace (Research, R3) | AbdoSaad24/BestArcFaceModel |

## 📋 Citation

```bibtex
@inproceedings{deng2019arcface,
  title     = {ArcFace: Additive Angular Margin Loss for Deep Face Recognition},
  author    = {Deng, Jiankang and Guo, Jia and Xue, Niannan and Zafeiriou, Stefanos},
  booktitle = {CVPR},
  year      = {2019}
}
```

## ⚠️ Limitations & Responsible Use

- This model was trained on a subset of VGGFace2. Performance may degrade on faces from demographics underrepresented in the training data.
- The model is intended for attendance and access control systems where subjects have consented to enrollment.
- Do not use for surveillance, tracking, or identification of individuals without explicit consent.
- Threshold selection (default: 0.5 cosine similarity) should be calibrated to your deployment environment — lower thresholds increase false acceptances, higher thresholds increase false rejections.
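One simple calibration recipe is to collect genuine-pair and impostor-pair similarity scores from your own enrollment data and pick the threshold at the equal-error-rate point; a sketch (the score arrays in the test are illustrative, not measured values):

```python
import numpy as np

def calibrate_threshold(genuine: np.ndarray, impostor: np.ndarray) -> float:
    """Pick the threshold where false-accept and false-reject rates are closest (EER).

    genuine:  cosine similarities of same-person pairs
    impostor: cosine similarities of different-person pairs
    """
    candidates = np.linspace(-1.0, 1.0, 2001)                      # 0.001-step sweep
    far = np.array([(impostor >= t).mean() for t in candidates])   # false-accept rate
    frr = np.array([(genuine < t).mean() for t in candidates])     # false-reject rate
    return float(candidates[np.argmin(np.abs(far - frr))])
```

In deployments where a false accept is costlier than a false reject (e.g. access control), shift the operating point above the EER threshold instead.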

Part of the Face Recognition Comparative Study — ArcFace · SphereFace · Triplet Loss.
