Libraries

How to use barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4")
model = AutoModelForCausalLM.from_pretrained("barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

vLLM

How to use barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
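
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on localhost:8000. `build_chat_payload` and `post_chat` are hypothetical helper names, not part of vLLM. The system message follows the "reasoning language" format described in this card; whether the server maps a system message the same way as the Transformers chat template is an assumption worth checking against the vLLM gpt-oss documentation.

```python
import json
import urllib.request

MODEL_ID = "barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4"


def build_chat_payload(messages, model=MODEL_ID):
    """Build an OpenAI-style chat completion payload (hypothetical helper)."""
    return {"model": model, "messages": list(messages)}


def post_chat(payload, base_url="http://localhost:8000"):
    """POST the payload to /v1/chat/completions and return the parsed JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    payload = build_chat_payload([
        {
            "role": "system",
            "content": "reasoning language: Turkish\n\n"
                       "You are a helpful assistant. Respond in Turkish to the user.",
        },
        {"role": "user", "content": "Fransa'nın başkenti neresidir?"},
    ])
    print(json.dumps(payload, ensure_ascii=False, indent=2))
    # Uncomment once the server is running:
    # print(post_chat(payload)["choices"][0]["message"]["content"])
```

For the SGLang server described later in this card, the same helper should work with `base_url="http://localhost:30000"`.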

Use Docker

docker model run hf.co/barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4

SGLang

How to use barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Unsloth Studio

How to use barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4 with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4 to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4 to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4 to start chatting

Load model with FastModel

# Install Unsloth first (run in a shell, not in Python): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4",
    max_seq_length=2048,
)

Docker Model Runner
How to use barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4 with Docker Model Runner:
```
docker model run hf.co/barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4
```

GPT-OSS-120B Multilingual Reasoner (with Turkish)

This model is a fine-tuned version of openai/gpt-oss-120b that can generate chain-of-thought reasoning in multiple languages, including Turkish.

Model Description

Large reasoning models like OpenAI o3 generate a chain-of-thought to improve the accuracy and quality of their responses. However, most of these models reason in English, even when a question is asked in another language.

This fine-tuned model addresses this limitation by adding a "reasoning language" option to the system prompt, enabling the model to think step-by-step in the user's preferred language. This improves interpretability for non-English speakers who want to understand the model's reasoning process.

Supported Reasoning Languages

🇺🇸 English
🇪🇸 Spanish
🇫🇷 French
🇮🇹 Italian
🇩🇪 German
🇹🇷 Turkish

Example Usage

Below is an example using Unsloth; the model also works with Hugging Face Transformers, vLLM, SGLang, and llama.cpp.
Refer to your inference engine's gpt-oss documentation for further information.

from unsloth import FastLanguageModel
from transformers import TextStreamer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "barandinho/gpt-oss-120b-multilingual-reasoner-MXFP4", 
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)

REASONING_LANGUAGE = "Turkish"  # Any language name works; quality is best for the six trained languages.
DEVELOPER_PROMPT = "You are a helpful assistant. Respond in Turkish to the user."  # Writing this prompt in English is recommended.
USER_PROMPT = "mükemmeliyetçilik kelimesinde kaç tane m harfi vardır ?"

messages = [
    # The system prompt must follow this format: "reasoning language: <language>", a blank line, then the developer prompt.
    {"role": "system", "content": f"reasoning language: {REASONING_LANGUAGE}\n\n{DEVELOPER_PROMPT}"},
    {"role": "user", "content": USER_PROMPT},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt = True,
    return_tensors = "pt",
    return_dict = True,
    reasoning_effort = "medium",
).to("cuda")

_ = model.generate(
    **inputs,
    max_new_tokens = 2048,
    temperature = 1.0,
    top_p = 1.0,
    top_k = 0,
    do_sample = True,
    streamer = TextStreamer(tokenizer),
)

Example output (raw stream, including special tokens):

<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Current date: 2025-12-28

Reasoning: medium

# Valid channels: analysis, commentary, final. Channel must be included for every message.
Calls to these tools must go to the commentary channel: 'functions'.<|end|><|start|>developer<|message|># Instructions

reasoning language: Turkish

You are a helpful assistant. Respond in Turkish to the user.<|end|><|start|>user<|message|>mükemmeliyetçilik kelimesinde kaç tane m harfi vardır ?<|end|><|start|>assistant<|channel|>analysis<|message|>Tamam, kullanıcı "mükemmeliyetçilik" kelimesinde kaç tane "m" harfi olduğunu soruyor. Adım adım düşünelim.

Öncelikle kelimeyi ayıralım ve her bir harfe bakalım. Şunu yazarak başlayabilirim: "mükemmeliyetçilik". Şimdi harfleri parçalara ayıralım:
- m
- ü
- k
- e
- m
- m
- e
- l
- i
- y
- e
- t
- ç
- i
- l
- i
- k

Verilen listede görünen "m" harfleri:
- Başlangıçta bir tane "m".
- Daha sonra "?m" olarak işaretlenmiş bir öğe var.
- Ayrıca bir tane daha "m" var.

Öyleyse toplam üç tane "m" harfi bulunuyor.

Dolayısıyla, kullanıcı için cevap basit ve açıklansız olabilir. Sonuçlanan söyle? Hayır, "mükemmeliyetçilik" kelimesinde 3 "m" harfi vardır.<|end|><|start|>assistant<|channel|>final<|message|>"Mükemmeliyetçilik" kelimesinde 3 tane **'m'** harfi bulunur. <|return|>
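
The raw stream above interleaves the prompt, the reasoning (the `analysis` channel), and the answer (the `final` channel). As a rough sketch, the two assistant channels can be separated with a regular expression over the special tokens visible in that output. `split_channels` is a hypothetical helper; it only handles plain assistant messages like those shown here and is not a full parser for the gpt-oss response format (tool calls, for example, are not covered).

```python
import re

# One assistant message: channel name, then text up to <|end|>, <|return|>, or end of input.
_CHANNEL_RE = re.compile(
    r"<\|start\|>assistant<\|channel\|>(\w+)<\|message\|>(.*?)(?:<\|end\|>|<\|return\|>|$)",
    re.DOTALL,
)


def split_channels(raw_text):
    """Return {channel: text}; a later message on the same channel overwrites an earlier one."""
    return {name: text.strip() for name, text in _CHANNEL_RE.findall(raw_text)}


# Shortened, made-up sample in the same token layout as the output above.
sample = (
    "<|start|>user<|message|>hi<|end|>"
    "<|start|>assistant<|channel|>analysis<|message|>Adım adım düşünelim.<|end|>"
    "<|start|>assistant<|channel|>final<|message|>Merhaba! <|return|>"
)
channels = split_channels(sample)
print("reasoning:", channels["analysis"])
print("answer:", channels["final"])
```

Decoding with `skip_special_tokens=True` would remove the markers this sketch relies on, so it is meant for raw output such as the stream shown above.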

Training Details

Dataset

This model was fine-tuned on barandinho/Multilingual-Thinking-with-Turkish-1185, an extended version of HuggingFaceH4/Multilingual-Thinking that includes Turkish reasoning examples.

Total examples: 1,185
Languages: English, Spanish, French, Italian, German, Turkish

Training Configuration

| Parameter | Value |
| --- | --- |
| Base Model | gpt-oss-120b |
| Fine-tuning Method | LoRA (via Unsloth) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Trainable Parameters | 11,943,936 (0.01% of 116B) |
| Max Sequence Length | 3,072 |
| Quantization (Training) | 4-bit |
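
For orientation, the rank and alpha values above determine how strongly the adapter update is weighted. In standard LoRA the low-rank update is scaled by alpha / r, and each adapted weight matrix of shape (d_out, d_in) gains r * (d_in + d_out) trainable parameters. The sketch below illustrates both quantities; the 1024 x 1024 layer is a hypothetical size chosen for the example, not a dimension of gpt-oss-120b, and the function names are illustrative only.

```python
LORA_RANK = 16   # from the table above
LORA_ALPHA = 32  # from the table above


def lora_scaling(alpha, rank):
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_params_per_matrix(d_in, d_out, rank):
    """Parameters LoRA adds to one (d_out x d_in) weight: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


scale = lora_scaling(LORA_ALPHA, LORA_RANK)
extra = lora_params_per_matrix(1024, 1024, LORA_RANK)  # hypothetical layer size
print(scale)  # 2.0
print(extra)  # 32768
```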

Training Hyperparameters

| Parameter | Value |
| --- | --- |
| Learning Rate | 2e-4 |
| Batch Size (per device) | 1 |
| Gradient Accumulation Steps | 4 |
| Effective Batch Size | 4 |
| Epochs | 1 |
| Warmup Ratio | 0.05 |
| LR Scheduler | Linear |
| Optimizer | AdamW 8-bit |
| Weight Decay | 0.001 |
| Total Steps | 297 |
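
The derived figures in these tables follow from the other values. The short check below reproduces the effective batch size, the total step count, and the trainable-parameter percentage. It assumes a single GPU (the card lists one H100 but does not state the device count explicitly) and that the final partial batch counts as a step.

```python
import math

TOTAL_EXAMPLES = 1_185
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1               # assumption: single-GPU training
EPOCHS = 1
TRAINABLE_PARAMS = 11_943_936
TOTAL_PARAMS = 116e9          # "116B" as stated in the card

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES
# The last, partial batch still counts as an optimizer step, hence the ceiling.
total_steps = math.ceil(TOTAL_EXAMPLES / effective_batch) * EPOCHS
trainable_pct = 100 * TRAINABLE_PARAMS / TOTAL_PARAMS

print(effective_batch)          # 4
print(total_steps)              # 297
print(f"{trainable_pct:.2f}%")  # 0.01%
```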

Training Infrastructure

Hardware: NVIDIA H100 80GB
Training Time: ~3 hours
Framework: Unsloth + TRL

Related Models

LoRA adapter version: barandinho/gpt-oss-120b-multilingual-reasoner, the unmerged LoRA adapter for this model.

Acknowledgments

This work is inspired by the OpenAI Cookbook tutorial on fine-tuning gpt-oss-20b.

Special thanks to:

OpenAI for releasing the gpt-oss model family
HuggingFace for the TRL library and Multilingual-Thinking dataset
Unsloth for efficient fine-tuning optimizations
TRUBA for providing H100 GPU infrastructure

Limitations

The model has been fine-tuned on a relatively small dataset (1,185 examples), so performance may vary across different reasoning tasks.
While the model can generalize to languages outside the training set, quality may be lower than for the six supported languages.
Complex mathematical or scientific reasoning performance may be degraded.

Citation

If you use this model, please cite:

@misc{gpt-oss-120b-multilingual-reasoner,
  author = {Baran Bingöl},
  title = {GPT-OSS-120B Multilingual Reasoner with Turkish},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/barandinho/gpt-oss-120b-multilingual-reasoner}
}

License

This model inherits the license from the base model openai/gpt-oss-120b. Please refer to the base model's license for usage terms.
