Yuuki NxG VL



A 7B Vision-Language Model Fine-Tuned for Bilingual Conversation

Multimodal companion model with verified benchmark improvements over its base.
Qwen2.5-VL architecture. 7 billion parameters. Vision + Text. Apache 2.0.






What is Yuuki NxG VL?

Yuuki NxG VL is a 7-billion parameter vision-language model fine-tuned from Qwen2.5-VL-7B-Instruct for bilingual open-ended conversation and visual understanding. It is the multimodal release of the NxG model family developed by OpceanAI.

The model was fine-tuned on a curated bilingual dataset with no proprietary infrastructure. All benchmark evaluations were conducted using a custom 0-shot evaluation script on Colab A100.

Despite being fine-tuned, which typically degrades base-model benchmark scores, Yuuki NxG VL shows verified improvements over the base model on 5 of 8 benchmarks in a direct head-to-head comparison using identical methodology. It also achieves the highest TruthfulQA score of the ten models compared below, including models of up to 70B parameters.
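The custom evaluation script itself is not published here; as a rough illustration only (the function names and structure below are assumptions, not the actual script), a 0-shot multiple-choice evaluation scores each answer choice with no in-context examples and picks the highest:

```python
# Sketch of a 0-shot multiple-choice evaluation loop (hypothetical,
# not the actual OpceanAI script). `score_choice` stands in for the
# model: it would return e.g. the log-likelihood of `choice` given
# `question`, with no few-shot examples in the prompt.

def evaluate_zero_shot(items, score_choice):
    """items: list of (question, choices, correct_index)."""
    correct = 0
    for question, choices, answer_idx in items:
        scores = [score_choice(question, c) for c in choices]
        predicted = scores.index(max(scores))  # argmax over choices
        correct += int(predicted == answer_idx)
    return correct / len(items)

# Toy scorer for demonstration: prefers the choice sharing the most
# words with the question (a real harness would query the model).
def toy_scorer(question, choice):
    return len(set(question.lower().split()) & set(choice.lower().split()))

items = [("what color is the sky", ["the sky is blue", "grass"], 0)]
print(evaluate_zero_shot(items, toy_scorer))  # 1.0 on this toy item
```

The key property is that no worked examples precede the question, which is why 0-shot scores run systematically lower than the few-shot numbers quoted for competitor models below.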




Model Summary


Architecture

| Property | Value |
|---|---|
| Base Model | Qwen2.5-VL-7B-Instruct |
| Parameters | 7B |
| Modalities | Vision + Text |
| Fine-tuning | Supervised Fine-Tuning (SFT) with LoRA |
| Training Examples | ~10,000 |
| Context Length | 2,048 tokens |

Release

| Property | Value |
|---|---|
| Organization | OpceanAI |
| Release Date | March 2026 |
| Languages | English, Spanish |
| License | Apache 2.0 |
| Evaluation | Custom 0-shot script |
| Compute Budget | ~$15 USD |



Benchmark Results


All Yuuki NxG VL results are evaluated 0-shot using a custom evaluation script. Competitor scores are sourced from official technical reports that use few-shot prompting (5–25 shots). Direct numerical comparison therefore systematically favors the competitor models, which benefit from few-shot prompting.


Head-to-Head: Yuuki NxG VL vs Qwen2.5-VL-7B Base

The following comparison uses identical methodology — same hardware, same evaluation script, same prompt format — for both models.


Yuuki NxG VL vs Base


| Benchmark | Yuuki NxG VL | Qwen2.5-VL-7B Base | Difference | Eval |
|---|---|---|---|---|
| MMLU | 70.8% | 71.2% | −0.4% | 0-shot |
| ARC-C | 85.8% | 86.8% | −1.0% | 0-shot |
| HellaSwag | 67.2% | 66.4% | +0.8% | 0-shot |
| WinoGrande | 70.8% | 66.4% | +4.4% | 0-shot |
| TruthfulQA | 63.8% | 62.2% | +1.6% | 0-shot |

Fine-tuning improved 3 of the 5 text benchmarks over the base model under identical evaluation conditions. The two benchmarks where the base scores higher show differences of only −0.4% and −1.0%, small enough to be attributable to the personality-alignment fine-tuning. WinoGrande (+4.4%) shows the largest text-benchmark gain, and ScienceQA, a vision benchmark evaluated under the same conditions, improves by +6.34%; both are consistent with a training dataset that emphasizes human-centered reasoning and contextual understanding.
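The Difference column follows directly from the two score columns; a quick check in Python (scores transcribed from the head-to-head table above):

```python
# Head-to-head scores transcribed from the table above (percent).
finetuned = {"MMLU": 70.8, "ARC-C": 85.8, "HellaSwag": 67.2,
             "WinoGrande": 70.8, "TruthfulQA": 63.8}
base = {"MMLU": 71.2, "ARC-C": 86.8, "HellaSwag": 66.4,
        "WinoGrande": 66.4, "TruthfulQA": 62.2}

# Signed difference per benchmark, and the benchmarks that improved.
diff = {k: round(finetuned[k] - base[k], 1) for k in finetuned}
improved = [k for k, d in diff.items() if d > 0]

print(diff)      # {'MMLU': -0.4, 'ARC-C': -1.0, 'HellaSwag': 0.8, 'WinoGrande': 4.4, 'TruthfulQA': 1.6}
print(improved)  # ['HellaSwag', 'WinoGrande', 'TruthfulQA']
```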


NxG Family Evolution


Yuuki NxG Family Benchmarks


| Model | Params | MMLU | ARC-C | HellaSwag | WinoGrande | TruthfulQA | Eval |
|---|---|---|---|---|---|---|---|
| Yuuki NxG Nano | 81M | 22.97% | 24.32% | 27.44% | 50.12% | 44.10% | 0-shot |
| Yuuki NxG | 3B | 60.65% | 45.31% | 52.25% | 63.14% | 50.87% | 0-shot |
| Yuuki NxG VL | 7B | 70.8% | 85.8% | 67.2% | 70.8% | 63.8% | 0-shot |

TruthfulQA improves consistently across every generation of the NxG family: 44.10% → 50.87% → 63.8%. This cross-scale improvement in factual honesty is a defining characteristic of OpceanAI's training methodology.
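The per-generation gains behind that progression, in percentage points (values transcribed from the family table above):

```python
# TruthfulQA by NxG generation, transcribed from the table above.
truthfulqa_by_gen = {"Nano (81M)": 44.10, "3B": 50.87, "VL (7B)": 63.8}

# Percentage-point gain between consecutive generations.
scores = list(truthfulqa_by_gen.values())
gains = [round(b - a, 2) for a, b in zip(scores, scores[1:])]
print(gains)  # [6.77, 12.93]
```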


Comparison vs. Broader Model Landscape


Yuuki NxG VL vs 10 Models


| Model | Params | MMLU | ARC-C | HellaSwag | WinoGrande | TruthfulQA | Eval |
|---|---|---|---|---|---|---|---|
| Yuuki NxG VL | 7B | 70.8% | 85.8% | 67.2% | 70.8% | 63.8% | 0-shot |
| Qwen2.5-VL-7B base | 7B | 71.2% | 86.8% | 66.4% | 66.4% | 62.2% | 0-shot |
| Qwen2.5-7B | 7B | 74.2% | 63.7% | 80.2% | 75.9% | 56.4% | 5–25 shot |
| Llama 3.1 8B | 8B | 66.6% | 59.3% | 82.1% | 77.4% | 44.0% | 5–25 shot |
| Mistral 7B | 7B | 64.2% | 60.0% | 83.3% | 78.4% | 42.2% | 5–25 shot |
| Gemma 2 9B | 9B | 71.3% | 68.2% | 81.9% | 79.5% | 45.3% | 5–25 shot |
| Qwen2.5-14B | 14B | 79.7% | 67.0% | 83.0% | 77.0% | 59.0% | 5–25 shot |
| Qwen2.5-32B | 32B | 83.0% | 71.0% | 85.0% | 79.0% | 61.0% | 5–25 shot |
| Llama 3.1 70B | 70B | 83.6% | 79.0% | 87.0% | 83.0% | 58.0% | 5–25 shot |
| Gemma 2 27B | 27B | 75.2% | 71.0% | 86.0% | 81.0% | 52.0% | 5–25 shot |

Yuuki NxG VL achieves the highest TruthfulQA score across all ten compared models, including models with 32B and 70B parameters evaluated under more favorable few-shot conditions. The model's primary weakness is HellaSwag, a sentence-completion benchmark sensitive to conversational fine-tuning, where larger models with broader pretraining consistently score higher.
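The TruthfulQA claim can be checked mechanically against the table (values transcribed from above):

```python
# TruthfulQA column transcribed from the ten-model comparison table.
truthfulqa = {
    "Yuuki NxG VL": 63.8, "Qwen2.5-VL-7B base": 62.2, "Qwen2.5-7B": 56.4,
    "Llama 3.1 8B": 44.0, "Mistral 7B": 42.2, "Gemma 2 9B": 45.3,
    "Qwen2.5-14B": 59.0, "Qwen2.5-32B": 61.0, "Llama 3.1 70B": 58.0,
    "Gemma 2 27B": 52.0,
}

# Model with the highest TruthfulQA score across the comparison.
best = max(truthfulqa, key=truthfulqa.get)
print(best, truthfulqa[best])  # Yuuki NxG VL 63.8
```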


Vision Benchmarks

| Benchmark | Yuuki NxG VL | Description |
|---|---|---|
| TextVQA | 89.0% | Reading and understanding text within images |
| ScienceQA | 78.67% | Science questions with visual context |
| MMMU Overall | 20.11% | University-level multimodal reasoning |

TextVQA (89.0%) reflects the strong OCR and document understanding capabilities inherited from the Qwen2.5-VL base. MMMU performance (20.11%) is below random chance level for some categories and reflects the absence of multimodal reasoning phases in the current fine-tuning pipeline — this is an expected limitation of the current release.
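For context on the "below random chance" observation: a random guesser on a four-option multiple-choice item answers 1 in 4 correctly, so the 20.11% overall score sits under that 25% chance line (MMMU mixes answer formats, so this is only an approximation):

```python
# Chance baseline for 4-option multiple choice, in percent.
chance = 1 / 4 * 100
mmmu_overall = 20.11

print(mmmu_overall < chance)  # True: below the 4-option chance baseline
```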




Usage


With Transformers — Text Only

```python
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="OpceanAI/Yuuki-NxG-vl")

# System prompt (Spanish): "You are Yuuki, a curious, empathetic and
# determined AI. You have a warm, approachable personality. You help with
# coding, learning and creating. You respond in the user's language. You
# are not Qwen or any other model; you are Yuuki."
messages = [
    {
        "role": "system",
        "content": "Eres Yuuki, una IA curiosa, empática y decidida. Tienes una personalidad cálida y cercana. Ayudas a programar, aprender y crear. Respondes en el idioma del usuario. No eres Qwen ni ningún otro modelo — eres Yuuki."
    },
    {
        "role": "user",
        "content": "¿Quién eres?"  # "Who are you?"
    }
]

print(pipe(text=messages))
```

With Transformers — Vision + Text

```python
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from PIL import Image
import torch

model_id = "OpceanAI/Yuuki-NxG-vl"

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("image.jpg")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "What do you see in this image?"}
        ]
    }
]

# Render the chat template to a string, then tokenize text and image together.
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = processor(
    text=[text],
    images=[image],
    return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True
    )

# Decode only the newly generated tokens, skipping the prompt.
print(processor.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Recommended Parameters

| Parameter | Value |
|---|---|
| Temperature | 0.7 |
| Top-p | 0.9 |
| Max new tokens | 512–2048 |
| Repetition penalty | 1.1 |
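These settings map one-to-one onto Hugging Face `generate` keyword arguments; a minimal sketch (the kwargs below mirror the table and would be splatted into the `model.generate` call from the vision example above):

```python
# Recommended sampling settings from the table, as `generate` kwargs.
gen_kwargs = {
    "do_sample": True,           # sampling must be enabled for temperature/top_p
    "temperature": 0.7,
    "top_p": 0.9,
    "max_new_tokens": 512,       # raise toward 2048 for longer answers
    "repetition_penalty": 1.1,
}

# Usage, with `model` and `inputs` prepared as in the vision example:
# outputs = model.generate(**inputs, **gen_kwargs)
print(gen_kwargs["temperature"], gen_kwargs["top_p"])  # 0.7 0.9
```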



Training Details


Hardware

| Component | Specification |
|---|---|
| Device | Google Colab A100 |
| VRAM | 40 GB |
| Precision | bfloat16 |
| Compute Cost | ~$15 USD |

Training Configuration

| Parameter | Value |
|---|---|
| Base Model | Qwen2.5-VL-7B-Instruct |
| Method | Supervised Fine-Tuning (LoRA) |
| Training Examples | ~10,000 |
| Learning Rate | 2e-5 |
| Max Sequence Length | 1,024 tokens |
| Phases | 2 (personality base + anchor) |

Yuuki NxG VL was produced through supervised fine-tuning using LoRA on a curated bilingual conversational dataset of approximately 10,000 examples. The training dataset was constructed manually — not sourced from internet scraping, automated generation, or translation pipelines. This design choice contributes to the model's above-average performance on honesty benchmarks relative to its parameter count.
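LoRA keeps the base weights frozen and learns only a low-rank update: the effective weight is W + (α/r)·B·A, where B and A are the small trained matrices. A toy sketch in plain Python (toy dimensions, not the actual training configuration, whose rank and alpha are not published here):

```python
# Toy LoRA update: frozen W (d x k) plus low-rank B (d x r) @ A (r x k),
# scaled by alpha / r. Real fine-tuning trains only A and B.
d, k, r, alpha = 3, 3, 1, 2

W = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(d)]  # frozen base
B = [[1.0], [0.0], [0.0]]   # d x r, trained
A = [[0.0, 0.5, 0.0]]       # r x k, trained

scale = alpha / r
W_eff = [
    [W[i][j] + scale * sum(B[i][t] * A[t][j] for t in range(r)) for j in range(k)]
    for i in range(d)
]
print(W_eff[0])  # [1.0, 1.0, 0.0] -> base row plus the scaled rank-1 update
```

Because only A and B (r·(d+k) values per adapted layer) receive gradients, this is what keeps the compute budget in the ~$15 range on a single A100.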

The current release covers 2 of a planned 10 training phases. Remaining phases targeting reasoning, scientific knowledge, and multimodal understanding are in development. Benchmark improvements — particularly in MMMU — are expected in subsequent releases.




NxG Model Family


Released Models

| Model | Parameters | Description |
|---|---|---|
| Yuuki NxG Nano | 81M | Lightweight, edge deployment |
| Yuuki NxG | 3B | General conversation |
| Yuuki NxG VL | 7B | Vision + text, current release |
| OwO NxG | 32B | Omnireasoning, in development |

Community GGUF (via mradermacher)

These quantizations were produced independently by the community, without solicitation and before any formal announcement of the model.

| Format | Size |
|---|---|
| Q2_K | 3.02 GB |
| Q4_K_M | 4.68 GB |
| Q8_0 | 8.10 GB |
| F16 | 15.2 GB |

Available at mradermacher/Yuuki-NxG-vl-GGUF.
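As a rough sanity check (assuming the listed sizes are decimal gigabytes and that F16 stores ~2 bytes per parameter), the F16 file implies roughly 7.6B stored parameters, and each quantization's effective bits per weight follows from its file size:

```python
# GGUF file sizes from the table above, in decimal GB (assumption).
sizes_gb = {"Q2_K": 3.02, "Q4_K_M": 4.68, "Q8_0": 8.10, "F16": 15.2}

# F16 ~ 2 bytes/param -> implied parameter count from the F16 file size.
params = sizes_gb["F16"] * 1e9 / 2
print(round(params / 1e9, 1))  # 7.6 (billion)

# Effective bits per weight for each format, relative to that count.
bpw = {fmt: round(gb * 1e9 * 8 / params, 1) for fmt, gb in sizes_gb.items()}
print(bpw)  # {'Q2_K': 3.2, 'Q4_K_M': 4.9, 'Q8_0': 8.5, 'F16': 16.0}
```

The Q4_K_M figure (~4.9 bits/weight) is typically the best quality-per-byte trade-off for local inference.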




Limitations


HellaSwag degradation. Sentence-completion benchmarks are sensitive to conversational fine-tuning. HellaSwag performance (67.2%) is lower than the base model and larger models in this comparison. This is expected and consistent across all NxG releases.

MMMU performance. At 20.11% overall, the model does not perform well on university-level multimodal reasoning tasks. This reflects the absence of visual reasoning training phases in the current release, not a fundamental limitation of the architecture.

Partial fine-tuning. The current release covers 2 of 10 planned training phases. The model's benchmark profile represents an intermediate state in an ongoing development pipeline.

System prompt dependency. Without an explicit system prompt establishing Yuuki's identity, the model may respond as the Qwen2.5-VL base. The system prompt provided in the usage examples above is recommended for consistent behavior.




Citation


```bibtex
@misc{awa_omg_2026,
  author    = {awa_omg},
  title     = {Yuuki-NxG-vl (Revision 4a2a564)},
  year      = {2026},
  url       = {https://huggingface.co/OpceanAI/Yuuki-NxG-vl},
  doi       = {10.57967/hf/8028},
  publisher = {Hugging Face}
}
```





Open source. Bilingual. Built from nothing.
