# Sentin-Edge AI Models
Sentin-Edge AI is a suite of privacy-preserving, edge-optimized neural networks designed to detect human stress, emotional states, and gaze direction in real time using only a standard webcam.

This repository contains the INT8-quantized TensorFlow Lite models, optimized for deployment on CPU/NPU edge hardware (such as Android devices or the Raspberry Pi) via the XNNPACK delegate, and requiring less than 2 MB of total storage.
## Model Inventory
This repository hosts three interconnected models:
### 1. Affective CNN (`affective_cnn_int8.tflite`)
- Task: 7-Class Emotion Classification & Feature Extraction
- Architecture: 4-Layer Convolutional Neural Network (32→64→128→128)
- Input: `1x48x48x1` (Grayscale Face Crop)
- Output: `1x7` (Emotion Probabilities) + `1x256` (Deep Feature Embedding)
- Size: 1.4 MB
- Quantization: INT8 Full Integer
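
For reference, here is a minimal preprocessing sketch for the `1x48x48x1` input, assuming OpenCV and a face bounding box from any detector of your choice (the `[-1, 1]` normalization matches the usage example later in this card):

```python
import cv2
import numpy as np

def preprocess_face(frame_bgr, box):
    """Crop, resize, and normalize a face region to the 1x48x48x1 input layout.

    `box` is an (x, y, w, h) bounding box from any face detector (illustrative).
    """
    x, y, w, h = box
    face = frame_bgr[y:y + h, x:x + w]
    gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (48, 48), interpolation=cv2.INTER_AREA)
    norm = gray.astype(np.float32) / 127.5 - 1.0  # [0, 255] -> [-1, 1]
    return norm.reshape(1, 48, 48, 1)
```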
### 2. Stress TCN (`affective_tcn_int8.tflite`)
- Task: Temporal Stress Level Estimation
- Architecture: Multi-Modal Temporal Convolutional Network (TCN)
- Input: `1x15x268` Sequence (256-dim CNN Embedding + 12-dim temporal geometry metrics)
- Output: `1x1` (Stress Score [0.0 - 1.0])
- Size: 137 KB
- Quantization: INT8 Full Integer
### 3. Gaze Hybrid (`gaze_hybrid_int8.tflite`)
- Task: Screen Coordinate Gaze Regression
- Architecture: MLP Feature Extractor + TCN + Head Pose Fusion
- Input: `1x15x10` (Eye/Iris Geometry Sequence) + `1x3` (Head Pose Pitch/Yaw/Roll)
- Output: `1x2` (Normalized Screen X, Y)
- Size: 57 KB
- Quantization: INT8 Full Integer
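
Because this model takes two inputs, it is safest to bind each tensor by its reported shape rather than assuming an index order. A hedged sketch using the shapes listed above:

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="gaze_hybrid_int8.tflite")
interpreter.allocate_tensors()

geometry_seq = np.zeros((1, 15, 10), dtype=np.float32)  # eye/iris geometry window
head_pose = np.zeros((1, 3), dtype=np.float32)          # pitch, yaw, roll

# Match each input tensor by shape instead of relying on index order
for detail in interpreter.get_input_details():
    data = geometry_seq if list(detail['shape']) == [1, 15, 10] else head_pose
    scale, zero_point = detail['quantization']
    if scale > 0:  # full-integer model: quantize float input to INT8
        data = np.clip(np.round(data / scale) + zero_point, -128, 127).astype(np.int8)
    interpreter.set_tensor(detail['index'], data)

interpreter.invoke()

out = interpreter.get_output_details()[0]
xy = interpreter.get_tensor(out['index'])
scale, zero_point = out['quantization']
if scale > 0:  # dequantize back to normalized screen coordinates
    xy = (xy.astype(np.float32) - zero_point) * scale
print("Normalized screen (x, y):", xy)
```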
## System Architecture & Pipeline
Sentin-Edge AI uses a Dual-Head Multi-Task Learning approach. All processing runs entirely on-device without any network calls.
### High-Level Architecture

*(Architecture diagram.)*

### Frame Processing Sequence

*(Frame processing sequence diagram; a code sketch follows.)*
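
In lieu of the rendered diagrams, here is a minimal, self-contained sketch of the CNN → TCN cascade for one stress estimate. Dummy arrays stand in for real face crops and geometry metrics, and the `(probabilities, embedding)` output order of the CNN is an assumption to verify against `get_output_details()`:

```python
import numpy as np
import tensorflow as tf

def load(path):
    interp = tf.lite.Interpreter(model_path=path)
    interp.allocate_tensors()
    return interp

def invoke(interp, *inputs):
    """Quantize inputs, run one inference, and dequantize every output."""
    for detail, data in zip(interp.get_input_details(), inputs):
        scale, zp = detail['quantization']
        if scale > 0:
            data = np.clip(np.round(data / scale) + zp, -128, 127).astype(np.int8)
        interp.set_tensor(detail['index'], data)
    interp.invoke()
    outs = []
    for detail in interp.get_output_details():
        out = interp.get_tensor(detail['index'])
        scale, zp = detail['quantization']
        if scale > 0:
            out = (out.astype(np.float32) - zp) * scale
        outs.append(out)
    return outs

cnn = load("affective_cnn_int8.tflite")
tcn = load("affective_tcn_int8.tflite")

window = []
for _ in range(15):                                    # 15-frame rolling window
    face = np.zeros((1, 48, 48, 1), dtype=np.float32)  # stand-in for a real face crop
    probs, embedding = invoke(cnn, face)               # 1x7 + 1x256 heads (order assumed)
    geometry = np.zeros(12, dtype=np.float32)          # stand-in geometry metrics
    window.append(np.concatenate([embedding.ravel(), geometry]))

seq = np.stack(window)[np.newaxis, ...].astype(np.float32)  # shape (1, 15, 268)
stress = invoke(tcn, seq)[0]
print("Stress score:", stress.item())
```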
## Benchmarks & Quantization

To achieve real-time performance on edge devices, these models undergo strict INT8 quantization.
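
Full-integer quantization uses TFLite's standard per-tensor affine mapping, where each tensor's scale $s$ and zero point $z$ are stored in its `quantization` metadata:

$$q = \operatorname{clamp}\!\left(\operatorname{round}\!\left(\frac{x}{s}\right) + z,\ -128,\ 127\right), \qquad \hat{x} = s\,(q - z)$$

The manual quantize/dequantize steps in the usage example below apply exactly this mapping.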
### Performance Metrics

The table below illustrates the trade-off between model footprint and accuracy preservation compared to the original FP32 PyTorch models.
| Model | Original FP32 Size | Quantized INT8 Size | Storage Reduction | Accuracy Variance |
|---|---|---|---|---|
| Affective CNN | ~5.6 MB | 1.4 MB | ~75% | < 5% drop |
| Stress TCN | ~540 KB | 137 KB | ~75% | < 3% drop |
| Gaze Hybrid | ~220 KB | 57 KB | ~74% | < 4% drop |
## Intended Use

These models are intended for privacy-first edge computing research. Because they process frames entirely in-memory and return lightweight heuristic scores, they are ideal for:

- HCI (Human-Computer Interaction) accessibility research
- On-device cognitive load and stress monitoring
- Privacy-preserving biometric telemetry

Out-of-Scope Use Cases:

- Medical diagnosis
- Automated proctoring or disciplinary surveillance without consent
- High-stakes biometric security authentication
## Python Usage Example

You can download and run these models via the Hugging Face Hub using the TensorFlow Lite Interpreter.
```python
from huggingface_hub import hf_hub_download
import numpy as np
import tensorflow as tf

# 1. Download the Edge Model
model_path = hf_hub_download(
    repo_id="YourUsername/sentin-edge-ai",
    filename="affective_cnn_int8.tflite"
)

# 2. Load the TFLite Interpreter
interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# 3. Prepare Input (48x48 Grayscale normalized to [-1, 1])
dummy_face = np.zeros((1, 48, 48, 1), dtype=np.float32)

# Manually quantize the input if the model expects INT8
scale, zero_point = input_details[0]['quantization']
if scale > 0:
    dummy_face = np.clip(np.round(dummy_face / scale) + zero_point, -128, 127).astype(np.int8)

interpreter.set_tensor(input_details[0]['index'], dummy_face)
interpreter.invoke()

# 4. Extract Output
emotion_probs = interpreter.get_tensor(output_details[0]['index'])

# Dequantize the INT8 output back to float probabilities
scale, zero_point = output_details[0]['quantization']
if scale > 0:
    emotion_probs = (emotion_probs.astype(np.float32) - zero_point) * scale

print("Emotion Probabilities:", emotion_probs)
```
## Training Data & Limitations
Affective CNN was trained on the FER-2013 dataset. Stress TCN was trained on synthetic sequences correlating facial micro-tremors and AU4/AU12 activity with emotional arousal. Gaze Hybrid was pre-trained on MPIIGaze.
Limitations:

- Due to INT8 quantization, the models exhibit up to 5% accuracy variance compared to their FP32 PyTorch equivalents.
- Gaze tracking requires per-user calibration (a 5-point screen registration) for sub-degree accuracy.
- Accuracy degrades significantly in extreme low-light environments where MediaPipe cannot resolve the iris contours.
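
As an illustration of what the 5-point registration can look like, here is a hedged sketch of a least-squares affine correction from raw model output to true screen coordinates (the registration UI and point layout are left to the integrator):

```python
import numpy as np

def fit_calibration(raw_xy, true_xy):
    """Least-squares affine map from raw gaze output to screen coordinates.

    raw_xy, true_xy: (5, 2) arrays collected during 5-point registration.
    Returns a function that applies the fitted correction to new predictions.
    """
    A = np.hstack([raw_xy, np.ones((len(raw_xy), 1))])  # [x, y, 1] design matrix
    M, *_ = np.linalg.lstsq(A, true_xy, rcond=None)     # (3, 2) affine parameters
    return lambda xy: np.hstack([xy, np.ones((len(xy), 1))]) @ M
```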