TrOCR: Optimized for Qualcomm Devices
TrOCR is an end-to-end text recognition approach that uses pre-trained image transformer and text transformer models for image understanding and wordpiece-level text generation.
This model is based on the implementation of TrOCR found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.
Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.
Getting Started
There are two ways to deploy this model on your device:
Option 1: Download Pre-Exported Models
Below are pre-exported model assets ready for deployment.
| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| ONNX | float | Universal | QAIRT 2.45, ONNX Runtime 1.25.0 | Download |
| QNN_DLC | float | Universal | QAIRT 2.45 | Download |
| TFLITE | float | Universal | QAIRT 2.45 | Download |
For more device-specific assets and performance metrics, visit TrOCR on Qualcomm® AI Hub.
Option 2: Export with Custom Configurations
Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:
- Custom weights (e.g., fine-tuned checkpoints)
- Custom input shapes
- Target device and runtime configurations
This option is ideal if you need to customize the model beyond the default configuration provided here.
See our repository for TrOCR on GitHub for usage instructions.
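As a rough illustration of what a custom export might look like, the sketch below only assembles a command line and prints it; it does not run anything. The module path `qai_hub_models.models.trocr.export` and the `--target-runtime` and `--device` flags are assumptions based on the usual layout of the Qualcomm® AI Hub Models package, and the helper name `build_export_command` is hypothetical. Check the export script's `--help` output and the GitHub instructions for the options your installed version actually supports.

```python
import shlex


def build_export_command(target_runtime=None, device=None):
    """Assemble (but do not run) an export command for TrOCR.

    The module path and both flags are assumptions; verify them against
    the export script's --help output for your installed version.
    """
    cmd = ["python", "-m", "qai_hub_models.models.trocr.export"]
    if target_runtime is not None:
        cmd += ["--target-runtime", target_runtime]
    if device is not None:
        cmd += ["--device", device]
    return cmd


# Print a copy-pasteable command for a TFLite export.
print(shlex.join(build_export_command(target_runtime="tflite")))
```

Building the command as a list and joining it with `shlex.join` keeps any argument that contains spaces, such as a device name, correctly quoted when it is pasted into a shell.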
Model Details
Model Type: Image to text
Model Stats:
- Model checkpoint: trocr-small-stage1
- Input resolution: 320x320
- Number of parameters (decoder): 38.3M
- Model size (decoder) (float): 146 MB
- Number of parameters (encoder): 23.0M
- Model size (encoder) (float): 87.8 MB
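The float model sizes listed above are roughly what the parameter counts imply. The sketch below is a sanity check, assuming each float parameter is stored as 4 bytes (float32) and that "MB" here means 1024 × 1024 bytes; neither assumption is stated in the model card. The function name `float32_size_mib` is illustrative.

```python
BYTES_PER_FLOAT32 = 4  # assumption: "float" precision means 32-bit floats
MIB = 1024 ** 2        # assumption: sizes are reported in units of 2**20 bytes


def float32_size_mib(num_params):
    """Approximate storage for num_params float32 weights, in MiB."""
    return num_params * BYTES_PER_FLOAT32 / MIB


# Parameter counts taken from the Model Stats list.
for name, params in {"decoder": 38.3e6, "encoder": 23.0e6}.items():
    print(f"{name}: {float32_size_mib(params):.1f} MiB")
```

Under these assumptions the decoder comes out at about 146 MiB and the encoder slightly under the listed 87.8 MB, a gap consistent with the parameter counts being rounded to three significant figures.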
Performance Summary
| Model | Runtime | Precision | Chipset | Inference Time | Peak Memory Range | Primary Compute Unit |
|---|---|---|---|---|---|---|
| decoder | ONNX | float | Snapdragon® X2 Elite | 1.27 ms | 205 - 205 MB | NPU |
| decoder | ONNX | float | Snapdragon® X Elite | 2.106 ms | 174 - 174 MB | NPU |
| decoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 1.5 ms | 0 - 272 MB | NPU |
| decoder | ONNX | float | Snapdragon® 8 Gen 1 Mobile | 2.738 ms | 1 - 209 MB | NPU |
| decoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 2.124 ms | 1 - 4 MB | NPU |
| decoder | ONNX | float | Qualcomm® QCS8450 | 2.738 ms | 1 - 209 MB | NPU |
| decoder | ONNX | float | Snapdragon® 8 Elite Mobile | 1.302 ms | 0 - 192 MB | NPU |
| decoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.206 ms | 1 - 173 MB | NPU |
| decoder | ONNX | float | Qualcomm® QCS9075 | 2.684 ms | 7 - 51 MB | NPU |
| decoder | ONNX | float | Qualcomm® QCS8750 | 1.302 ms | 0 - 192 MB | NPU |
| decoder | ONNX | float | Qualcomm® QCS7181 | 2.106 ms | 174 - 174 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® X2 Elite | 1.636 ms | 7 - 7 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® X Elite | 2.229 ms | 7 - 7 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 1.367 ms | 0 - 274 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® 8 Gen 1 Mobile | 2.663 ms | 2 - 217 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS8275 | 4.237 ms | 5 - 105 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 2.025 ms | 3 - 174 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS8450 | 2.663 ms | 2 - 217 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® 8 Elite Mobile | 1.237 ms | 0 - 187 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® SA8295P | 2.606 ms | 7 - 49 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.164 ms | 1 - 173 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® SA7255P | 4.237 ms | 5 - 105 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS9075 | 2.573 ms | 7 - 16 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS8750 | 1.237 ms | 0 - 187 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS7181 | 2.229 ms | 7 - 7 MB | NPU |
| decoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 1.381 ms | 0 - 280 MB | NPU |
| decoder | TFLITE | float | Snapdragon® 8 Gen 1 Mobile | 2.439 ms | 0 - 217 MB | NPU |
| decoder | TFLITE | float | Qualcomm® QCS8275 | 4.249 ms | 0 - 104 MB | NPU |
| decoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 1.987 ms | 0 - 2 MB | NPU |
| decoder | TFLITE | float | Qualcomm® SA8775P | 8.85 ms | 7 - 25 MB | GPU |
| decoder | TFLITE | float | Qualcomm® SA8650P | 8.85 ms | 7 - 25 MB | GPU |
| decoder | TFLITE | float | Qualcomm® SA8255P | 8.85 ms | 7 - 25 MB | GPU |
| decoder | TFLITE | float | Qualcomm® QCS8450 | 2.439 ms | 0 - 217 MB | NPU |
| decoder | TFLITE | float | Snapdragon® 8 Elite Mobile | 1.237 ms | 0 - 189 MB | NPU |
| decoder | TFLITE | float | Qualcomm® SA8295P | 2.666 ms | 0 - 42 MB | NPU |
| decoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.145 ms | 0 - 175 MB | NPU |
| decoder | TFLITE | float | Qualcomm® SA7255P | 4.249 ms | 0 - 104 MB | NPU |
| decoder | TFLITE | float | Qualcomm® QCS9075 | 2.581 ms | 0 - 83 MB | NPU |
| decoder | TFLITE | float | Qualcomm® QCS8750 | 1.237 ms | 0 - 189 MB | NPU |
| encoder | ONNX | float | Snapdragon® X2 Elite | 5.484 ms | 211 - 211 MB | NPU |
| encoder | ONNX | float | Snapdragon® X Elite | 11.431 ms | 180 - 180 MB | NPU |
| encoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 7.839 ms | 16 - 205 MB | NPU |
| encoder | ONNX | float | Snapdragon® 8 Gen 1 Mobile | 20.027 ms | 16 - 322 MB | NPU |
| encoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 10.915 ms | 0 - 59 MB | NPU |
| encoder | ONNX | float | Qualcomm® QCS8450 | 20.027 ms | 16 - 322 MB | NPU |
| encoder | ONNX | float | Snapdragon® 8 Elite Mobile | 5.294 ms | 16 - 141 MB | NPU |
| encoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 4.13 ms | 2 - 124 MB | NPU |
| encoder | ONNX | float | Qualcomm® QCS9075 | 14.176 ms | 15 - 60 MB | NPU |
| encoder | ONNX | float | Qualcomm® QCS8750 | 5.294 ms | 16 - 141 MB | NPU |
| encoder | ONNX | float | Qualcomm® QCS7181 | 11.431 ms | 180 - 180 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® X2 Elite | 5.178 ms | 2 - 2 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® X Elite | 12.181 ms | 2 - 2 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 8.016 ms | 0 - 190 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® 8 Gen 1 Mobile | 19.436 ms | 0 - 306 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS8275 | 37.03 ms | 2 - 107 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 11.295 ms | 2 - 170 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS8450 | 19.436 ms | 0 - 306 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® 8 Elite Mobile | 5.448 ms | 2 - 113 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® SA8295P | 19.688 ms | 2 - 233 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 4.328 ms | 2 - 116 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® SA7255P | 37.03 ms | 2 - 107 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS9075 | 15.108 ms | 2 - 12 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS8750 | 5.448 ms | 2 - 113 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS7181 | 12.181 ms | 2 - 2 MB | NPU |
| encoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 7.705 ms | 3 - 195 MB | NPU |
| encoder | TFLITE | float | Snapdragon® 8 Gen 1 Mobile | 19.541 ms | 7 - 311 MB | NPU |
| encoder | TFLITE | float | Qualcomm® QCS8275 | 36.805 ms | 7 - 119 MB | NPU |
| encoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 10.741 ms | 7 - 9 MB | NPU |
| encoder | TFLITE | float | Qualcomm® SA8775P | 122.908 ms | 6 - 34 MB | GPU |
| encoder | TFLITE | float | Qualcomm® SA8650P | 122.908 ms | 6 - 34 MB | GPU |
| encoder | TFLITE | float | Qualcomm® SA8255P | 122.908 ms | 6 - 34 MB | GPU |
| encoder | TFLITE | float | Qualcomm® QCS8450 | 19.541 ms | 7 - 311 MB | NPU |
| encoder | TFLITE | float | Snapdragon® 8 Elite Mobile | 5.202 ms | 6 - 118 MB | NPU |
| encoder | TFLITE | float | Qualcomm® SA8295P | 19.12 ms | 7 - 236 MB | NPU |
| encoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 4.102 ms | 0 - 105 MB | NPU |
| encoder | TFLITE | float | Qualcomm® SA7255P | 36.805 ms | 7 - 119 MB | NPU |
| encoder | TFLITE | float | Qualcomm® QCS9075 | 13.921 ms | 5 - 65 MB | NPU |
| encoder | TFLITE | float | Qualcomm® QCS8750 | 5.202 ms | 6 - 118 MB | NPU |
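Because the table reports the encoder and decoder separately, a rough per-image latency has to combine the two. The sketch below assumes a simple autoregressive loop: one encoder pass per image, then one decoder call per generated token. It ignores pre- and post-processing, data transfer, and host-side overhead, so treat the result as a lower-bound estimate, not a measured figure. The function name and the 20-token output length are illustrative.

```python
def estimate_latency_ms(encoder_ms, decoder_ms, num_tokens):
    """Estimate per-image latency: one encoder pass plus one decoder
    call per generated token (assumed decoding scheme)."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return encoder_ms + num_tokens * decoder_ms


# TFLITE figures for Snapdragon® 8 Elite Gen 5 Mobile from the table above.
estimate = estimate_latency_ms(encoder_ms=4.102, decoder_ms=1.145, num_tokens=20)
print(f"{estimate:.3f} ms")  # 27.002 ms
```

For short outputs the encoder dominates the total, while for longer text the decoder term grows linearly with the number of tokens.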
License
- The license for the original implementation of TrOCR can be found here.
References
- TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
- Source Model Implementation
Community
- Join our AI Hub Slack community to collaborate, post questions and learn more about on-device AI.
- For questions or feedback, please reach out to us.
