Work in progress: GLM-ASR support is being added to Hugging Face `transformers` via an open PR, so running this checkpoint requires a `transformers` build that includes `GlmasrForConditionalGeneration`.

Example usage, transcribing a local audio file:

```python
from transformers import GlmasrForConditionalGeneration, AutoProcessor
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
repo_id = "ZHANGYUXUAN-zR/GLM-ASR-HF-Support"

# Load the processor (tokenizer + audio feature extractor) and the model in bfloat16.
processor = AutoProcessor.from_pretrained(repo_id)
model = GlmasrForConditionalGeneration.from_pretrained(repo_id, dtype=torch.bfloat16, device_map=device)

# Build a chat-style request: one audio file plus a transcription instruction.
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "audio",
                "url": "example_zh.wav",
            },
            {"type": "text", "text": "Please transcribe this audio into text"},
        ],
    }
]

# Render and tokenize the chat template, then move the tensors to the model's device;
# floating-point features are cast to bfloat16, integer token ids are left untouched.
inputs = processor.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_dict=True, return_tensors="pt"
)
inputs = inputs.to(device, dtype=torch.bfloat16)

# Greedy decoding; slice off the prompt tokens so only the transcription is decoded.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(processor.batch_decode(outputs[:, inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
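
If you want to transcribe several files, the same steps can be wrapped in a small helper that reuses the `processor`, `model`, and `device` loaded above. This is a minimal sketch; `transcribe` is a hypothetical name for illustration, not an API from this repo or the PR.

```python
def transcribe(audio_path: str) -> str:
    """Greedy one-shot transcription of a single local audio file (hypothetical helper)."""
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "audio", "url": audio_path},
                {"type": "text", "text": "Please transcribe this audio into text"},
            ],
        }
    ]
    inputs = processor.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_dict=True, return_tensors="pt"
    ).to(device, dtype=torch.bfloat16)
    outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    # Decode only the newly generated tokens, then return the first (only) item.
    new_tokens = outputs[:, inputs.input_ids.shape[1]:]
    return processor.batch_decode(new_tokens, skip_special_tokens=True)[0]

print(transcribe("example_zh.wav"))
```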