
DistilViT2 for Transformers.js

This model is compatible with the Transformers.js image-to-text pipeline.

Usage

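import { pipeline } from '@huggingface/transformers';

// Create the captioning pipeline; the model is downloaded on first use.
const captioner = await pipeline('image-to-text', 'tarekziade/distilvit2');

// Accepts an image URL or, in Node.js, a local file path.
const result = await captioner('path/to/image.jpg');
console.log(result); // e.g. [{ generated_text: '...' }]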
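
The logged value is a structure rather than a plain string, so a small helper can be handy for getting at the caption text. The sketch below assumes the pipeline returns an array of objects with a `generated_text` field (possibly nested when several images are passed); `extractCaptions` and the mock data are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: pull caption strings out of the pipeline output.
// Assumes each entry is an object with a `generated_text` string field.
function extractCaptions(output) {
  // Accept a single object, a flat array, or one nested array per image.
  const items = Array.isArray(output) ? output : [output];
  return items
    .flat()
    .filter((item) => item && typeof item.generated_text === 'string')
    .map((item) => item.generated_text.trim());
}

// Mock output standing in for `result`; the caption text is made up.
const mockResult = [{ generated_text: ' a dog running on the beach ' }];
console.log(extractCaptions(mockResult));
```

In real use, `extractCaptions(result)` would be called on the value returned by `captioner(...)`.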

Architecture

  • Vision: SigLIP-base-patch16-224 (frozen during training)
  • Projector: Trained linear/MLP projection (768 → 576), mapping SigLIP's 768-dimensional features to SmolLM's 576-dimensional embedding space
  • Text: SmolLM-135M with merged LoRA adapters
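
To make the shapes in the list above concrete, here is a minimal sketch of the projection step. It rests on several assumptions not stated in this card: the projector is treated as a single linear layer (it may be an MLP), every patch token is passed through, and the weights and input features are random or constant placeholders. `makeLinear` and `project` are hypothetical helpers, not part of Transformers.js.

```javascript
// Dimensions taken from the architecture list; everything else is illustrative.
const IMAGE_SIZE = 224;
const PATCH_SIZE = 16;
const VISION_DIM = 768; // width of the vision features
const TEXT_DIM = 576;   // width of the text model's embeddings

// 224 / 16 = 14 patches per side, so 14 * 14 = 196 patch tokens.
const numPatches = (IMAGE_SIZE / PATCH_SIZE) ** 2;

// Hypothetical linear layer y = W x + b with small random weights.
function makeLinear(inDim, outDim) {
  const weight = Array.from({ length: outDim }, () =>
    Float32Array.from({ length: inDim }, () => (Math.random() - 0.5) * 0.02));
  const bias = new Float32Array(outDim); // zero-initialized
  return { weight, bias };
}

// Apply the layer to each patch feature vector independently.
function project(features, { weight, bias }) {
  return features.map((x) => {
    const y = new Float32Array(bias.length);
    for (let o = 0; o < bias.length; o++) {
      let sum = bias[o];
      for (let i = 0; i < x.length; i++) sum += weight[o][i] * x[i];
      y[o] = sum;
    }
    return y;
  });
}

// Placeholder features standing in for the frozen vision encoder's output.
const patchFeatures = Array.from({ length: numPatches }, () =>
  new Float32Array(VISION_DIM).fill(0.1));

const projector = makeLinear(VISION_DIM, TEXT_DIM);
const visualTokens = project(patchFeatures, projector);
console.log(visualTokens.length, visualTokens[0].length); // prints 196 576
```

Under these assumptions, the text model would receive 196 vectors of width 576 as a visual prefix ahead of the caption tokens.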

Training

  • Datasets: Flickr30k and COCO
  • Task: Image captioning
  • Trainable parameters: 2.2M (about 1% of total)
Model size: 0.2B params (Safetensors, tensor type F32)