# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("feature-extraction", model="TuKoResearch/AuriStream7BDeep_40Pred_BigAudioDataset_500k", trust_remote_code=True)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("TuKoResearch/AuriStream7BDeep_40Pred_BigAudioDataset_500k", trust_remote_code=True, dtype="auto")
AuriStream7BDeep_40Pred_BigAudioDataset_500k
AuriStream is a speech language model by Greta Tuckute and Klemen Kotar.
This model predicts cochlear tokens produced by a tokenizer such as WavCochCausalV8192.
Model Details
| Parameter | Value |
|---|---|
| Parameters | ~8.39B |
| Layers | 96 |
| Hidden Size | 2560 |
| Attention Heads | 32 |
| Vocab Size | 8192 |
| Prediction Steps | 40 |
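As a rough sanity check, the table values can be combined into a back-of-envelope parameter estimate. This is only a sketch: it assumes a standard GPT-style block (about 12 × hidden² weights per layer, ignoring biases and normalization) and one vocab-by-hidden matrix per prediction step. Neither assumption is confirmed by this model card; the actual architecture is defined in the shared model code.

```python
# Values copied from the Model Details table.
LAYERS = 96
HIDDEN = 2560
HEADS = 32
VOCAB = 8192
PRED_STEPS = 40

# ASSUMPTION: each transformer block holds ~12 * hidden^2 weights
# (4 * hidden^2 for attention projections, 8 * hidden^2 for a 4x-wide MLP).
block_params = 12 * HIDDEN**2

# ASSUMPTION: one VOCAB x HIDDEN matrix per prediction step
# (for example, an input embedding plus per-step output heads).
vocab_params = PRED_STEPS * VOCAB * HIDDEN

total = LAYERS * block_params + vocab_params
head_dim = HIDDEN // HEADS  # per-head width implied by the table

print(f"{total / 1e9:.2f}B parameters, head dim {head_dim}")
```

Under these assumptions the estimate lands close to the ~8.39B figure in the table, which suggests the listed layer count, hidden size, and prediction steps are mutually consistent.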
Usage
from transformers import AutoModel, AutoConfig
# Load with trust_remote_code for custom model
model = AutoModel.from_pretrained(
"TuKoResearch/AuriStream7BDeep_40Pred_BigAudioDataset_500k",
trust_remote_code=True,
)
# Or load config first
config = AutoConfig.from_pretrained("TuKoResearch/AuriStream7BDeep_40Pred_BigAudioDataset_500k", trust_remote_code=True)
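The loading call above can be wrapped in a small helper that also reports the parameter count. This is a minimal sketch, not part of the original card: the helper names `load_auristream` and `count_parameters` are hypothetical, and the `token` argument is only needed if your environment is not already logged in.

```python
REPO_ID = "TuKoResearch/AuriStream7BDeep_40Pred_BigAudioDataset_500k"


def load_auristream(repo_id=REPO_ID, token=None):
    """Load the checkpoint in inference mode (hypothetical helper)."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModel

    model = AutoModel.from_pretrained(
        repo_id,
        trust_remote_code=True,  # the model class lives in custom repo code
        token=token,             # optional Hugging Face access token
    )
    model.eval()  # disable dropout and other training-only behavior
    return model


def count_parameters(model):
    """Sum the element counts of all parameter tensors."""
    return sum(p.numel() for p in model.parameters())
```

Calling `count_parameters(load_auristream())` downloads the full checkpoint, so it is best run once on a machine with enough disk space and memory for a model of this size.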
Base Model Code
This checkpoint uses shared model code from TuKoResearch/AuriStream-base.
Tokenizer
This model uses cochlear tokens from WavCochCausalV8192.
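Since the vocabulary size listed in the model details is 8192, token IDs passed to the model would be expected to fall in the range 0 to 8191. The check below is a hypothetical helper, assuming the tokenizer output has already been converted to a plain sequence of integers; it does not call the tokenizer itself, whose API is not described here.

```python
VOCAB_SIZE = 8192  # from the Model Details table


def check_cochlear_tokens(token_ids, vocab_size=VOCAB_SIZE):
    """Return token_ids as a list, or raise if any ID is out of range."""
    ids = list(token_ids)
    bad = [(i, t) for i, t in enumerate(ids) if not 0 <= t < vocab_size]
    if bad:
        position, value = bad[0]
        raise ValueError(
            f"token {value} at position {position} is outside [0, {vocab_size})"
        )
    return ids
```

Running a check like this before the forward pass turns an out-of-range index into a readable error instead of a failure inside the embedding lookup.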
Gated model: before downloading, log in with a Hugging Face token that has gated-access permission by running hf auth login.