Adapt-Ar

This model is adapted from the DeepAr model. The architecture is unchanged, but the model received additional training on augmented data to improve robustness against noisy audio, silence, and variations in speech speed.

The purpose of this model is to address the main limitations of the original Whisper-Large-v3-Turbo model:

  • Silent or low-volume segments → often caused the model to hallucinate text.

  • Noisy environments and variable speech speed → reduced transcription accuracy and stability.
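
The kinds of perturbation named above (silence, noise, and speech-speed changes) can be illustrated with a minimal sketch. This is not the augmentation pipeline used to train Adapt-Ar, which is not described here; the function names, parameter values, and the linear-interpolation resampler are all assumptions chosen for illustration, and the code uses only the Python standard library.

```python
import random


def pad_with_silence(samples, sample_rate, seconds):
    """Append `seconds` of digital silence (zeros) to the clip."""
    return list(samples) + [0.0] * int(sample_rate * seconds)


def add_noise(samples, noise_std, rng):
    """Add zero-mean Gaussian noise with standard deviation `noise_std`."""
    return [s + rng.gauss(0.0, noise_std) for s in samples]


def change_speed(samples, factor):
    """Resample by linear interpolation; factor > 1 gives a shorter, faster clip.

    This naive resampling also shifts pitch, which is acceptable for a sketch.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    last = len(samples) - 1
    n_out = max(1, int(len(samples) / factor))
    out = []
    for i in range(n_out):
        pos = i * factor
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def augment(samples, sample_rate, rng):
    """Apply one randomly chosen perturbation (hypothetical parameter ranges)."""
    choice = rng.choice(["silence", "noise", "speed"])
    if choice == "silence":
        return pad_with_silence(samples, sample_rate, rng.uniform(0.5, 2.0))
    if choice == "noise":
        return add_noise(samples, rng.uniform(0.001, 0.02), rng)
    return change_speed(samples, rng.uniform(0.9, 1.1))
```

Passing a seeded `random.Random` instance as `rng` keeps the augmentation reproducible across runs.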

For details on model usage and the dataset, please refer to DeepAr. The additional training on augmented data lasted half an epoch.

Both models share the same usage; the only difference lies in the training process and naming.
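
As a rough illustration of that shared usage, the sketch below loads the model through the Hugging Face `transformers` speech-recognition pipeline. It assumes the checkpoint loads like any other Whisper fine-tune and that the repository id is `CUAIStudents/Adapt-Ar`; the `transcribe` helper and the audio file name are hypothetical, and the DeepAr card remains the authoritative reference.

```python
# Assumed repository id for this model on the Hugging Face Hub.
MODEL_ID = "CUAIStudents/Adapt-Ar"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]


# Example call (hypothetical file name):
# text = transcribe("sample.wav")
```

Swapping `model_id` for the DeepAr repository id should be the only change needed to compare the two models on the same audio.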

Model size: 0.8B params (Safetensors, tensor type F32)