ChartQA Multimodal Fine-Tuning using Qwen2-VL

This repository contains LoRA adapters fine-tuned for visual question answering on chart images using the ChartQA dataset.

The adapters were trained on top of the base model Qwen/Qwen2-VL-2B-Instruct using parameter-efficient fine-tuning.

Model Details

Model Name: chartqa-qwen2vl-lora

Developed by: Vraj Detroja

Model Type: Vision-Language Model (Multimodal Transformer)

Base Model: Qwen/Qwen2-VL-2B-Instruct

Fine-Tuning Method: LoRA (Low-Rank Adaptation)

Training Platform: Kaggle

GPU: Tesla T4 (16 GB VRAM)

Frameworks Used

PyTorch
Hugging Face Transformers
PEFT
BitsAndBytes
Accelerate

Model Description

This model is a LoRA adapter for the Qwen2-VL vision-language model, fine-tuned on the ChartQA dataset.

The model learns to answer questions about chart images.

Example task:

Input:

Image + Question

Is the value of Favorable 38 in 2015?

Output:

Yes

The model processes both visual and textual information to generate answers.

Dataset

Training dataset:

ChartQA

Dataset link:
https://huggingface.co/datasets/HuggingFaceM4/ChartQA

ChartQA is a visual question answering dataset for chart understanding.

Dataset structure:

Field	Description
image	Chart image
query	Question about chart
label	Ground truth answer
human_or_machine	Annotation type

Dataset splits:

Split	Samples
Train	28,299
Validation	1,920
Test	2,500

For this project, 1000 samples from the training set were used for fine-tuning.
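
One way to turn a dataset record into a training conversation is sketched below. The card does not say how the samples were formatted, so the `to_conversation` helper, the choice of the first label as the target answer, and the stand-in record are all assumptions for illustration. The message layout mirrors the chat format used in the inference example in this card.

```python
def to_conversation(example):
    """Build a user/assistant chat turn from one ChartQA-style record.

    Hypothetical helper: expects the `query` and `label` fields listed above.
    The image itself is passed to the processor separately.
    """
    label = example["label"]
    # Accept a plain string or a list of strings; use the first answer
    answer = label[0] if isinstance(label, (list, tuple)) else label
    return [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text", "text": example["query"]},
            ],
        },
        {
            "role": "assistant",
            "content": [{"type": "text", "text": str(answer)}],
        },
    ]


# Stand-in record; a real one would also carry a PIL image under "image"
record = {"query": "Is the value of Favorable 38 in 2015?", "label": ["Yes"]}
conversation = to_conversation(record)
print(conversation[1]["content"][0]["text"])
```

Keeping the formatting in one small function makes it easy to reuse the same layout for training and for inference prompts.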

Training Details

Fine-Tuning Method

The model was fine-tuned using LoRA (Low-Rank Adaptation).

Instead of updating every weight, LoRA freezes the base model and trains small low-rank adapter matrices added to selected layers of the transformer.
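
To see why this is cheap, consider a single linear projection with a `d_out x d_in` weight matrix. LoRA leaves that matrix untouched and learns two factors, `A` (`rank x d_in`) and `B` (`d_out x rank`), whose product is added to it. The sketch below compares the parameter counts; the dimensions and rank are illustrative values only, since the card does not state the rank or which layers were adapted.

```python
def full_param_count(d_in, d_out):
    """Weights in a dense d_out x d_in projection (bias ignored)."""
    return d_in * d_out


def lora_param_count(d_in, d_out, rank):
    """Weights in the LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Illustrative values only; the card does not state the rank or target layers
d_in = d_out = 1536
rank = 8

full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

Under these assumed values the adapter adds roughly one percent of the parameters of the layer it wraps, and only a subset of layers is usually wrapped.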

Training statistics:

Metric	Value
Total parameters	~2.21B
Trainable parameters	~2.17M
Trainable percentage	~0.1%

Because only about 0.1% of the parameters are trained, far less GPU memory is needed for gradients and optimizer state.
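
Figures like these can be obtained by walking over a model's parameters. The sketch below is a minimal, framework-free version: `count_parameters` only needs objects that expose `numel()` and `requires_grad`, which PyTorch parameters do, and `FakeParam` is a hypothetical stand-in so the example runs without loading the model. The two sizes are the rounded values from the table above, not exact counts.

```python
from dataclasses import dataclass


@dataclass
class FakeParam:
    """Stand-in for a torch parameter, exposing only what the counter needs."""
    size: int
    requires_grad: bool

    def numel(self) -> int:
        return self.size


def count_parameters(params):
    """Return (trainable, total) over objects with numel() and requires_grad."""
    trainable = total = 0
    for p in params:
        n = p.numel()
        total += n
        if p.requires_grad:
            trainable += n
    return trainable, total


# Rounded figures from the table above, split into frozen and adapter weights
params = [
    FakeParam(2_210_000_000 - 2_170_000, requires_grad=False),
    FakeParam(2_170_000, requires_grad=True),
]
trainable, total = count_parameters(params)
print(f"trainable: {trainable:,} / {total:,} ({100 * trainable / total:.3f}%)")
```

With a real model, `count_parameters(model.parameters())` would be the equivalent call; PEFT models also provide `print_trainable_parameters()` for the same purpose.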

Quantization

The model was loaded using 4-bit quantization via BitsAndBytes.

Configuration:

load_in_4bit = True
bnb_4bit_quant_type = "nf4"
bnb_4bit_compute_dtype = torch.float16

Benefits:

Reduced VRAM usage
Faster training
Enables training on a T4 GPU
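
A back-of-the-envelope estimate shows where the VRAM saving comes from. The sketch below counts weight storage only and assumes every parameter is stored at the stated width. In practice some layers stay in higher precision and 4-bit formats carry extra quantization constants, so real usage will differ; activations and optimizer state are also ignored. `weight_memory_gb` is a hypothetical helper, and the parameter count is the rounded total from the training statistics.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


total_params = 2.21e9  # rounded total from the training statistics

fp16_gb = weight_memory_gb(total_params, 16)
nf4_gb = weight_memory_gb(total_params, 4)
print(f"16-bit weights: ~{fp16_gb:.2f} GB")
print(f"4-bit weights:  ~{nf4_gb:.2f} GB")
```

In Transformers, the three settings listed above correspond to arguments of the same names on `BitsAndBytesConfig`, which is passed to `from_pretrained` through the `quantization_config` argument.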

Training Configuration

Hyperparameters used:

Batch size: 1
Gradient accumulation: 4
Learning rate: 2e-4
Epochs: 1
Training samples: 1000
Precision: FP16

Gradient checkpointing was enabled to reduce memory consumption.

Training results:

Metric	Value
Training steps	250
Final training loss	~7.59
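
The step count follows directly from the hyperparameters: each optimizer step consumes batch size times gradient accumulation samples. A minimal check, assuming a single-device run (the `optimizer_steps` helper is illustrative, not part of any library):

```python
import math


def optimizer_steps(num_samples, batch_size, grad_accum, epochs):
    """Number of optimizer updates for a single-device run."""
    effective_batch = batch_size * grad_accum
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    return steps_per_epoch * epochs


steps = optimizer_steps(num_samples=1000, batch_size=1, grad_accum=4, epochs=1)
print(steps)  # 250
```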

Hardware Used

Component	Value
GPU	Tesla T4
VRAM	16 GB
Platform	Kaggle
Framework	PyTorch

Training time:

~9 minutes for 250 training steps.
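
These numbers give a rough per-step cost, which can be used to estimate a longer run. The extrapolation below to one epoch over the full 28,299-sample training split is hypothetical and assumes throughput stays the same as in the reported run.

```python
import math

reported_minutes = 9      # approximate, from the card
reported_steps = 250
samples_per_step = 1 * 4  # batch size x gradient accumulation

seconds_per_step = reported_minutes * 60 / reported_steps

# Hypothetical extrapolation to one epoch over the full training split
full_steps = math.ceil(28_299 / samples_per_step)
full_hours = full_steps * seconds_per_step / 3600

print(f"{seconds_per_step:.2f} s/step")
print(f"~{full_hours:.1f} h for {full_steps} steps")
```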

How to Use the Model

This repository contains LoRA adapters, not the full model.

You must load the base model first.

Example:

from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from peft import PeftModel

base_model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct",
    device_map="auto"
)

model = PeftModel.from_pretrained(
    base_model,
    "vrajdetrojapes/chartqa-qwen2vl-lora"
)

processor = AutoProcessor.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct"
)

Example Inference

from PIL import Image

image = Image.open("sample_chart.png")

question = "Is the value of Favorable 38 in 2015?"

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": question}
        ],
    }
]

text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = processor(
    text=[text],
    images=[image],
    return_tensors="pt"
).to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=50
)

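# generate() returns the prompt tokens followed by the answer; decode only the new tokens
answer_ids = output[0][inputs["input_ids"].shape[1]:]
print(processor.decode(answer_ids, skip_special_tokens=True))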

Intended Use

The model is intended for:

Chart question answering
Multimodal reasoning research
Vision-language experimentation
Educational purposes

Limitations

The model has several limitations:

Trained on only 1,000 of the 28,299 training samples, for a single epoch
May struggle with complex chart reasoning
Limited generalization beyond chart datasets
Not suitable for production systems

Ethical Considerations

Users should be aware that:

The model may generate incorrect answers.
Chart interpretation errors are possible.
Outputs should be validated for critical applications.

Citation

@misc{chartqa_qwen2vl_lora,
  author = {Detroja, Vraj},
  title = {ChartQA Multimodal Fine-Tuning using Qwen2-VL},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/vrajdetrojapes/chartqa-qwen2vl-lora}
}

Author

Vraj Detroja

Natural Language Processing with Deep Learning
Multimodal Fine-Tuning Project
