TrOCR: Optimized for Qualcomm Devices

TrOCR is an end-to-end text recognition approach that uses pre-trained image transformer and text transformer models for image understanding and wordpiece-level text generation.

This model is based on the implementation of TrOCR found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.

Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.

Getting Started

There are two ways to deploy this model on your device:

Option 1: Download Pre-Exported Models

Below are pre-exported model assets ready for deployment.

| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| ONNX | float | Universal | QAIRT 2.45, ONNX Runtime 1.25.0 | Download |
| QNN_DLC | float | Universal | QAIRT 2.45 | Download |
| TFLITE | float | Universal | QAIRT 2.45 | Download |

For more device-specific assets and performance metrics, visit TrOCR on Qualcomm® AI Hub.

Option 2: Export with Custom Configurations

Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:

  • Custom weights (e.g., fine-tuned checkpoints)
  • Custom input shapes
  • Target device and runtime configurations

This option is ideal if you need to customize the model beyond the default configuration provided here.

See our repository for TrOCR on GitHub for usage instructions.
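
As a rough sketch of what a custom export run might look like, the snippet below only assembles a command line and prints it; it does not run anything. It assumes the `qai-hub-models` package is installed and that TrOCR follows the library's usual `qai_hub_models.models.<model_id>.export` module layout. The flag names (`--target-runtime`, `--device`, `--output-dir`) and the device name are assumptions for illustration; confirm them with the export script's `--help` output or the GitHub instructions before relying on them.

```python
import shlex
import sys

# Assumed module path, following the qai_hub_models/models/<model_id>/export.py layout.
EXPORT_MODULE = "qai_hub_models.models.trocr.export"


def build_export_command(target_runtime=None, device=None, output_dir=None):
    """Return the argv list for an export run; nothing is executed here."""
    cmd = [sys.executable, "-m", EXPORT_MODULE]
    # The flag names below are assumptions; check them with `--help`.
    if target_runtime is not None:
        cmd += ["--target-runtime", target_runtime]
    if device is not None:
        cmd += ["--device", device]
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    return cmd


if __name__ == "__main__":
    # "Samsung Galaxy S24" is a placeholder device name, not taken from this page.
    cmd = build_export_command(target_runtime="tflite", device="Samsung Galaxy S24")
    print(shlex.join(cmd))
    # To actually launch the export: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string keeps arguments with spaces (such as device names) intact when the list is later passed to `subprocess.run`.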

Model Details

Model Type: Image to text

Model Stats:

  • Model checkpoint: trocr-small-stage1
  • Input resolution: 320x320
  • Number of parameters (decoder): 38.3M
  • Model size (decoder) (float): 146 MB
  • Number of parameters (encoder): 23.0M
  • Model size (encoder) (float): 87.8 MB
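
The reported sizes line up with the parameter counts if each weight is stored as a 32-bit float and "MB" is read as binary megabytes (1024 × 1024 bytes). Both of those are assumptions, not statements from this page; the short check below just does the arithmetic.

```python
BYTES_PER_FLOAT32 = 4   # assumption: weights are stored as 32-bit floats
MIB = 1024 ** 2         # assumption: "MB" above means 1024 * 1024 bytes


def float32_size_mib(num_params: float) -> float:
    """Approximate on-disk size of num_params float32 weights, in MiB."""
    return num_params * BYTES_PER_FLOAT32 / MIB


if __name__ == "__main__":
    # Parameter counts are taken from the Model Stats list above.
    for name, params in [("decoder", 38.3e6), ("encoder", 23.0e6)]:
        print(f"{name}: about {float32_size_mib(params):.1f} MiB")
```

The small gap between the computed encoder figure and the listed 87.8 MB is within what rounding the parameter count to 23.0M would explain.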

Performance Summary

| Model | Runtime | Precision | Chipset | Inference Time | Peak Memory Range | Primary Compute Unit |
|---|---|---|---|---|---|---|
| decoder | ONNX | float | Snapdragon® X2 Elite | 1.27 ms | 205 - 205 MB | NPU |
| decoder | ONNX | float | Snapdragon® X Elite | 2.106 ms | 174 - 174 MB | NPU |
| decoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 1.5 ms | 0 - 272 MB | NPU |
| decoder | ONNX | float | Snapdragon® 8 Gen 1 Mobile | 2.738 ms | 1 - 209 MB | NPU |
| decoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 2.124 ms | 1 - 4 MB | NPU |
| decoder | ONNX | float | Qualcomm® QCS8450 | 2.738 ms | 1 - 209 MB | NPU |
| decoder | ONNX | float | Snapdragon® 8 Elite Mobile | 1.302 ms | 0 - 192 MB | NPU |
| decoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.206 ms | 1 - 173 MB | NPU |
| decoder | ONNX | float | Qualcomm® QCS9075 | 2.684 ms | 7 - 51 MB | NPU |
| decoder | ONNX | float | Qualcomm® QCS8750 | 1.302 ms | 0 - 192 MB | NPU |
| decoder | ONNX | float | Qualcomm® QCS7181 | 2.106 ms | 174 - 174 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® X2 Elite | 1.636 ms | 7 - 7 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® X Elite | 2.229 ms | 7 - 7 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 1.367 ms | 0 - 274 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® 8 Gen 1 Mobile | 2.663 ms | 2 - 217 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS8275 | 4.237 ms | 5 - 105 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 2.025 ms | 3 - 174 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS8450 | 2.663 ms | 2 - 217 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® 8 Elite Mobile | 1.237 ms | 0 - 187 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® SA8295P | 2.606 ms | 7 - 49 MB | NPU |
| decoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.164 ms | 1 - 173 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® SA7255P | 4.237 ms | 5 - 105 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS9075 | 2.573 ms | 7 - 16 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS8750 | 1.237 ms | 0 - 187 MB | NPU |
| decoder | QNN_DLC | float | Qualcomm® QCS7181 | 2.229 ms | 7 - 7 MB | NPU |
| decoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 1.381 ms | 0 - 280 MB | NPU |
| decoder | TFLITE | float | Snapdragon® 8 Gen 1 Mobile | 2.439 ms | 0 - 217 MB | NPU |
| decoder | TFLITE | float | Qualcomm® QCS8275 | 4.249 ms | 0 - 104 MB | NPU |
| decoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 1.987 ms | 0 - 2 MB | NPU |
| decoder | TFLITE | float | Qualcomm® SA8775P | 8.85 ms | 7 - 25 MB | GPU |
| decoder | TFLITE | float | Qualcomm® SA8650P | 8.85 ms | 7 - 25 MB | GPU |
| decoder | TFLITE | float | Qualcomm® SA8255P | 8.85 ms | 7 - 25 MB | GPU |
| decoder | TFLITE | float | Qualcomm® QCS8450 | 2.439 ms | 0 - 217 MB | NPU |
| decoder | TFLITE | float | Snapdragon® 8 Elite Mobile | 1.237 ms | 0 - 189 MB | NPU |
| decoder | TFLITE | float | Qualcomm® SA8295P | 2.666 ms | 0 - 42 MB | NPU |
| decoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.145 ms | 0 - 175 MB | NPU |
| decoder | TFLITE | float | Qualcomm® SA7255P | 4.249 ms | 0 - 104 MB | NPU |
| decoder | TFLITE | float | Qualcomm® QCS9075 | 2.581 ms | 0 - 83 MB | NPU |
| decoder | TFLITE | float | Qualcomm® QCS8750 | 1.237 ms | 0 - 189 MB | NPU |
| encoder | ONNX | float | Snapdragon® X2 Elite | 5.484 ms | 211 - 211 MB | NPU |
| encoder | ONNX | float | Snapdragon® X Elite | 11.431 ms | 180 - 180 MB | NPU |
| encoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 7.839 ms | 16 - 205 MB | NPU |
| encoder | ONNX | float | Snapdragon® 8 Gen 1 Mobile | 20.027 ms | 16 - 322 MB | NPU |
| encoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 10.915 ms | 0 - 59 MB | NPU |
| encoder | ONNX | float | Qualcomm® QCS8450 | 20.027 ms | 16 - 322 MB | NPU |
| encoder | ONNX | float | Snapdragon® 8 Elite Mobile | 5.294 ms | 16 - 141 MB | NPU |
| encoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 4.13 ms | 2 - 124 MB | NPU |
| encoder | ONNX | float | Qualcomm® QCS9075 | 14.176 ms | 15 - 60 MB | NPU |
| encoder | ONNX | float | Qualcomm® QCS8750 | 5.294 ms | 16 - 141 MB | NPU |
| encoder | ONNX | float | Qualcomm® QCS7181 | 11.431 ms | 180 - 180 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® X2 Elite | 5.178 ms | 2 - 2 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® X Elite | 12.181 ms | 2 - 2 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 8.016 ms | 0 - 190 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® 8 Gen 1 Mobile | 19.436 ms | 0 - 306 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS8275 | 37.03 ms | 2 - 107 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 11.295 ms | 2 - 170 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS8450 | 19.436 ms | 0 - 306 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® 8 Elite Mobile | 5.448 ms | 2 - 113 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® SA8295P | 19.688 ms | 2 - 233 MB | NPU |
| encoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 4.328 ms | 2 - 116 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® SA7255P | 37.03 ms | 2 - 107 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS9075 | 15.108 ms | 2 - 12 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS8750 | 5.448 ms | 2 - 113 MB | NPU |
| encoder | QNN_DLC | float | Qualcomm® QCS7181 | 12.181 ms | 2 - 2 MB | NPU |
| encoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 7.705 ms | 3 - 195 MB | NPU |
| encoder | TFLITE | float | Snapdragon® 8 Gen 1 Mobile | 19.541 ms | 7 - 311 MB | NPU |
| encoder | TFLITE | float | Qualcomm® QCS8275 | 36.805 ms | 7 - 119 MB | NPU |
| encoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 10.741 ms | 7 - 9 MB | NPU |
| encoder | TFLITE | float | Qualcomm® SA8775P | 122.908 ms | 6 - 34 MB | GPU |
| encoder | TFLITE | float | Qualcomm® SA8650P | 122.908 ms | 6 - 34 MB | GPU |
| encoder | TFLITE | float | Qualcomm® SA8255P | 122.908 ms | 6 - 34 MB | GPU |
| encoder | TFLITE | float | Qualcomm® QCS8450 | 19.541 ms | 7 - 311 MB | NPU |
| encoder | TFLITE | float | Snapdragon® 8 Elite Mobile | 5.202 ms | 6 - 118 MB | NPU |
| encoder | TFLITE | float | Qualcomm® SA8295P | 19.12 ms | 7 - 236 MB | NPU |
| encoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 4.102 ms | 0 - 105 MB | NPU |
| encoder | TFLITE | float | Qualcomm® SA7255P | 36.805 ms | 7 - 119 MB | NPU |
| encoder | TFLITE | float | Qualcomm® QCS9075 | 13.921 ms | 5 - 65 MB | NPU |
| encoder | TFLITE | float | Qualcomm® QCS8750 | 5.202 ms | 6 - 118 MB | NPU |
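
The encoder and decoder are profiled separately, so neither row alone gives the time to recognize one image. A back-of-the-envelope estimate, assuming the encoder runs once per image and the decoder runs once per generated token, is sketched below. That call pattern, the 20-token output length, and the omission of pre-processing, post-processing, and data-transfer overhead are all assumptions made for illustration, so treat the result as a lower bound rather than a measured figure.

```python
def estimated_latency_ms(encoder_ms: float, decoder_ms: float, num_tokens: int) -> float:
    """Estimate per-image latency: one encoder pass plus one decoder pass per token."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return encoder_ms + decoder_ms * num_tokens


if __name__ == "__main__":
    # TFLITE rows for Snapdragon 8 Elite Gen 5 Mobile from the table above.
    encoder_ms = 4.102
    decoder_ms = 1.145
    num_tokens = 20  # hypothetical output length, not from the source
    total = estimated_latency_ms(encoder_ms, decoder_ms, num_tokens)
    print(f"Estimated latency for {num_tokens} tokens: {total:.1f} ms")
```

Under these assumptions the decoder dominates for longer outputs, so the per-token decoder time matters more than the one-off encoder time when comparing devices.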

License

  • The license for the original implementation of TrOCR can be found here.
