docs: add Korean model card + contact info
README.md
CHANGED
@@ -82,6 +82,305 @@ model-index:

 # FRANKENSTALLM 3B

 > **A Korean 3B LLM built entirely from scratch (tokenizer, pretraining, SFT, and ORPO) on 8× NVIDIA B200 GPUs.**

 | | |
@@ -366,7 +665,8 @@ ollama run frankenstallm

 ---

-## Links

 - **GitHub**: [pathcosmos/FRANKENSTALLM](https://github.com/pathcosmos/FRANKENSTALLM): full source code, training scripts, and builder's log
 - **HuggingFace**: [pathcosmos/frankenstallm](https://huggingface.co/pathcosmos/frankenstallm)

 # FRANKENSTALLM 3B

+> **We built a Korean 3B LLM from scratch, from tokenizer training through pretraining, SFT, and ORPO, on 8× NVIDIA B200 GPUs.**
+
+| | |
+|---|---|
+| **Developer** | [pathcosmos](https://huggingface.co/pathcosmos) |
+| **Parameters** | ~2.4B (with weight tying, 3B class) |
+| **Languages** | Korean (primary), English (secondary) |
+| **License** | Apache 2.0 |
+| **Training** | 3 phases: pretraining → SFT → ORPO |
+| **Hardware** | 8× NVIDIA B200 (FP8), ~86 hours total |
+
+---
+
+## Quick Start
+
+### Transformers
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+
+model_id = "pathcosmos/frankenstallm"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id, torch_dtype=torch.bfloat16, device_map="auto"
+)
+
+inputs = tokenizer(
+    "한국의 전통 음식 중 김치에 대해 설명해주세요.",  # "Please explain kimchi, one of Korea's traditional foods."
+    return_tensors="pt"
+).to(model.device)
+
+with torch.no_grad():
+    outputs = model.generate(
+        **inputs,
+        do_sample=True,
+        temperature=0.7,
+        repetition_penalty=1.2,  # recommended
+        top_p=0.9,
+        max_new_tokens=512,
+        pad_token_id=tokenizer.eos_token_id,
+    )
+
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+
+### Ollama (GGUF)
+
+```bash
+# Download the GGUF file and the Modelfile
+huggingface-cli download pathcosmos/frankenstallm \
+    gguf/frankenstallm-3b-v2-Q4_K_M.gguf \
+    gguf/Modelfile.3b-v2-Q4_K_M \
+    --local-dir ./frankenstallm
+
+# Edit the FROM path in the Modelfile, then create the model
+ollama create frankenstallm -f ./frankenstallm/gguf/Modelfile.3b-v2-Q4_K_M
+
+# Run
+ollama run frankenstallm
+```
+
+---
+
+## Key Features
+
+- **Korean tokenizer built from scratch**: SentencePiece Unigram, 64K vocabulary, 99.95% Korean character coverage
+- **3-phase training pipeline**: pretraining (57K steps, ~60B tokens) → SFT (25.5K steps, 2.4M samples) → ORPO (10K steps, 630K preference pairs)
+- **Native FP8 training on B200**: TransformerEngine MXFP8, theoretically 2× the throughput of BF16
+- **GGUF deployment support**: Q4_K_M (757MB), Q8_0 (1.2GB), F16 (2.3GB), plus an Ollama Modelfile
+
+---
+
+## Architecture
+
+| Component | Value |
+|-----------|-----|
+| Structure | Decoder-only Transformer (LLaMA-style) |
+| Hidden size | 3,072 |
+| Layers | 28 |
+| Attention heads | 24 |
+| KV heads | 8 (GQA 3:1) |
+| FFN dimension | 8,192 (SwiGLU) |
+| Vocabulary size | 64,000 |
+| Context length | 4,096 (2,048 during training) |
+| Positional encoding | RoPE (θ=500,000) |
+| Normalization | Pre-norm RMSNorm |
+| Attention implementation | FlashAttention-2 |
+| Precision | FP8 (TransformerEngine MXFP8) |
+| Weight tying | Applied (embedding ↔ lm_head) |
+
+---
+
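The `GQA 3:1` entry in the architecture table means 24 query heads share 8 key/value heads, so the KV cache is one third the size of a full multi-head layout. The sketch below works through that arithmetic. It assumes a head dimension of hidden size / attention heads (3,072 / 24 = 128) and a 2-byte (BF16/FP16) cache, neither of which the card states; `kv_cache_bytes` is a hypothetical helper, not part of the repository.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # The factor 2 covers the separate key and value tensors in every layer.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


HIDDEN, LAYERS, HEADS, KV_HEADS, CONTEXT = 3072, 28, 24, 8, 4096
HEAD_DIM = HIDDEN // HEADS  # assumption: head_dim = hidden_size / attention heads

gqa = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT)  # 8 KV heads (as in the table)
mha = kv_cache_bytes(LAYERS, HEADS, HEAD_DIM, CONTEXT)     # hypothetical 24 KV heads

print(f"GQA: {gqa / 2**20:.0f} MiB, MHA: {mha / 2**20:.0f} MiB, ratio {mha // gqa}:1")
# prints "GQA: 448 MiB, MHA: 1344 MiB, ratio 3:1"
```

Under these assumptions, a full 4,096-token context costs about 448 MiB of cache per sequence instead of roughly 1.3 GiB.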
+## Training Pipeline
+
+### Phase 1: Pretraining
+
+| Item | Value |
+|------|-----|
+| Steps | 57,000 |
+| Final loss | 1.466 |
+| Training tokens | ~60B (38.5B unique × ~1.5 epochs) |
+| Elapsed time | ~63 hours |
+| Data | CC-100 KO, HPLT KO, C4 KO, Namuwiki, Wikipedia KO, Cosmopedia (EN) |
+| Batch size | 5 × 8 GPU × 8 accum × 2,048 seq = ~650K tokens/step |
+
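The batch-size row multiplies four factors into a tokens-per-step figure. The sketch below reproduces that product and the "unique tokens × epochs" estimate from the same table. Reading the leading `5` as the per-GPU micro-batch size is an assumption, and the constant names are illustrative.

```python
MICRO_BATCH = 5    # assumption: sequences per GPU per micro-step
NUM_GPUS = 8
GRAD_ACCUM = 8     # gradient accumulation steps
SEQ_LEN = 2048     # training sequence length

sequences_per_step = MICRO_BATCH * NUM_GPUS * GRAD_ACCUM
tokens_per_step = sequences_per_step * SEQ_LEN

# "38.5B unique x ~1.5 epochs" from the table, in integer arithmetic.
UNIQUE_TOKENS = 38_500_000_000
tokens_seen_estimate = UNIQUE_TOKENS * 3 // 2

print(sequences_per_step)    # 320
print(tokens_per_step)       # 655360
print(tokens_seen_estimate)  # 57750000000
```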
+### Phase 2: SFT (Supervised Fine-Tuning)
+
+| Item | Value |
+|------|-----|
+| Steps | 25,500 (stopped early at the 77.3% mark) |
+| Best val_loss | 1.8851 (step 23,000) |
+| Elapsed time | ~15.5 hours |
+| Data | 24 sources, 2,439,397 samples (7.48 GB) |
+| Mix | 70% SFT + 30% pretraining replay (to prevent catastrophic forgetting) |
+| Knowledge forgetting rate | 0.9% (across 19 datasets) |
+
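The 70% SFT + 30% pretraining-replay mix can be pictured as a weighted draw between two data streams. The card does not say how the mix was implemented (per-example sampling, pre-shuffled shards, or something else), so this sketch only illustrates per-example sampling; `mix_streams` and the toy streams are hypothetical.

```python
import random
from itertools import islice


def mix_streams(sft_examples, replay_examples, sft_ratio=0.7, seed=0):
    """Yield examples, drawing from the SFT stream with probability sft_ratio."""
    rng = random.Random(seed)
    sft_iter, replay_iter = iter(sft_examples), iter(replay_examples)
    while True:
        source = sft_iter if rng.random() < sft_ratio else replay_iter
        try:
            yield next(source)
        except StopIteration:
            return  # stop as soon as either stream runs out


# Toy streams: each example is tagged with the stream it came from.
sft = (("sft", i) for i in range(100_000))
replay = (("replay", i) for i in range(100_000))

mixed = list(islice(mix_streams(sft, replay), 10_000))
sft_share = sum(1 for tag, _ in mixed if tag == "sft") / len(mixed)
print(f"SFT share: {sft_share:.2f}")
```

Replaying pretraining text during fine-tuning keeps the original distribution in view. The table gives forgetting prevention as the reason for the mix.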
+### Phase 3: ORPO (Preference Optimization)
+
+| Item | Value |
+|------|-----|
+| Steps | 9,997 (converged early) |
+| Best eval_loss | 1.625 |
+| Preference accuracy | 76.02% |
+| Reward margin | 0.6100 |
+| Elapsed time | ~7 hours |
+| Data | 7 Korean HF datasets, ~630K preference pairs |
+| Hyperparameters | beta=0.25, lr=1.2e-5, eff_batch=128 |
+
+**Total training time: about 86 hours on 8× B200**
+
+---
+
+## Benchmarks
+
+### Performance by Training Stage (Base → SFT → ORPO)
+
+| Benchmark | Base | SFT | ORPO | Change (Base→ORPO) |
+|-----------|:----:|:---:|:----:|:---:|
+| **KoBEST average (0-shot)** | 43.7% | 43.3% | **52.8%** | **+9.1pp** |
+| KoBEST COPA | 49.3% | 48.6% | **63.9%** | +14.6pp |
+| KoBEST HellaSwag-KO | 21.6% | 19.8% | **38.0%** | +16.4pp |
+| KoBEST SentiNeg | 48.6% | 49.1% | **62.5%** | +13.9pp |
+| KoBEST BoolQ | 50.3% | 50.1% | 50.6% | +0.3pp |
+| PIQA | 52.5% | 52.6% | **59.9%** | +7.3pp |
+| ARC-Easy | 25.6% | 25.9% | **36.0%** | +10.4pp |
+| HAE-RAE | 19.7% | 19.9% | 21.8% | +2.1pp |
+| HellaSwag EN | 26.2% | 26.1% | 29.2% | +3.0pp |
+| Greedy 3-gram repetition rate | 61.0% | 73.0% | **30.9%** | -30.1pp |
+| EOS termination rate | 0% | 60% | **67%** | +67pp |
+| PPL forgetting rate | - | 0.9% | 4.1% | within 15% ✅ |
+
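The "Greedy 3-gram repetition rate" row is easier to interpret with a concrete metric in hand. The card does not define its formula, so the sketch below uses one common definition, the share of 3-grams that duplicate an earlier 3-gram in the same output. It is not necessarily the evaluation script's definition, and `ngram_repetition_rate` is a hypothetical helper.

```python
def ngram_repetition_rate(tokens, n=3):
    """Fraction of n-grams that duplicate another n-gram in the same sequence."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # sequence shorter than n tokens
    return 1.0 - len(set(ngrams)) / len(ngrams)


# A looping output: the second half repeats the first half verbatim.
sample = "the cat sat on the mat the cat sat on the mat".split()
rate = ngram_repetition_rate(sample)
print(f"{rate:.1%}")  # prints "40.0%" (6 unique 3-grams out of 10)
```

A whitespace split stands in for real tokenization here; a faithful measurement would run over the model's own token IDs.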
+### Comparison with 3B-Class Models (Ollama, 35 tests)
+
+| Model | Parameters | Korean NLU | Knowledge | Instruction following | Reasoning | Average score |
+|-------|:------:|:----------:|:----:|:---------:|:----:|:---------:|
+| Qwen 2.5 3B | 3B | 100.0 | 20.8 | 55.6 | 62.5 | **63.4** |
+| Phi-4 Mini | 3.8B | 66.7 | 29.2 | 33.3 | **87.5** | 60.6 |
+| **FRANKENSTALLM 3B** | **3B** | **100.0** | **75.0** | **66.7** | 50.0 | 46.7 |
+
+> FRANKENSTALLM leads in **Korean NLU** (tied with Qwen), **Korean knowledge** (75.0 vs 20.8/29.2), and **instruction following** (66.7 vs 55.6/33.3).
+
+### Inference Speed (Ollama, Q4_K_M)
+
+| Model | Avg. TTFT | TPS | Notes |
+|-------|:--------:|:---:|------|
+| **FRANKENSTALLM 3B** | **16.7ms** | **142.5** | Fastest |
+| Phi-4 Mini 3.8B | 25.6ms | 100.4 | |
+| Qwen 2.5 3B | 28.2ms | 93.8 | |
+
+### Perplexity Retention (Knowledge Preserved After ORPO)
+
+| Dataset | Base PPL | ORPO PPL | Forgetting rate |
+|---------|:--------:|:--------:|:------:|
+| Korean C4 | 5.72 | 5.87 | +2.7% |
+| Korean Wiki | 11.84 | 12.21 | +3.2% |
+| Max forgetting rate | - | - | 4.1% ✅ |
+
+---
+
+## Training Data
+
+### Pretraining (~38.5B tokens)
+
+| Category | Sources | Estimated tokens |
+|------|------|:-----------:|
+| Korean web crawl | C4 KO, CC-100 KO, HPLT KO | ~17.2B |
+| Korean encyclopedia | Wikipedia KO, Namuwiki (2 versions) | ~2.8B |
+| English educational | Cosmopedia (Stories, Web, Stanford, WikiHow, OpenStax, Khan) | ~5.7B |
+| English math and science | AutoMathText, OpenWebMath, Proof-Pile-2 | ~8.5B |
+| Code | StarCoder (filtered) | ~4.3B |
+
+### SFT (2.4M samples, 24 sources)
+
+| Domain | Share | Key datasets |
+|------|:----:|-------------|
+| Reasoning/CoT | 38% | reasoning_r1_1.4m, magpie_reasoning |
+| Korean instructions | 23% | korean_instruction_mix, open_korean_instructions, kullm_v2 |
+| English general | 16% | openhermes_2.5, ultrachat_200k |
+| Math | 12% | NuminaMath-CoT, orca-math-ko |
+| Dialogue/code/other | 11% | smol-koreantalk, Evol-Instruct-Code-80k-ko |
+
+### ORPO (~630K preference pairs, 7 sources)
+
+| Dataset | Size | Domain |
+|---------|:----:|------|
+| nayohan/preference-collection-ko-full | 4.9GB | General preference |
+| heegyu/orca-math-korean-preference-cleaned | 1.6GB | Math reasoning |
+| kuotient/orca-math-korean-dpo-pairs | 750MB | Math DPO |
+| maywell/ko_Ultrafeedback_binarized | 394MB | Feedback alignment |
+| tellang/yeji-preference-ko-v1 | 171MB | General preference |
+| jojo0217/korean_rlhf_dataset | 137MB | RLHF pairs |
+| lemon-mint/korean-realqa-reasoning-v01-preference | 58MB | QA reasoning |
+
+---
+
+## GGUF & Ollama
+
+### Available Quantized Files
+
+| File | Size | Description |
+|------|:----:|------|
+| `gguf/frankenstallm-3b-v2-Q4_K_M.gguf` | 757MB | **Recommended**: best quality for its size |
+| `gguf/frankenstallm-3b-v2-Q8_0.gguf` | 1.2GB | High quality |
+| `gguf/frankenstallm-3b-v2-f16.gguf` | 2.3GB | Full precision |
+| `model.safetensors` | 4.76GB | Transformers native (ORPO best, byte-fallback fix applied) |
+
+### Recommended Sampling Parameters
+
+| Parameter | Value | Notes |
+|---------|:---:|------|
+| `temperature` | 0.7 | Best quality for Korean generation |
+| `repeat_penalty` | 1.2 | **Required**: without it, the greedy repetition rate is 30.9% |
+| `top_p` | 0.9 | Nucleus sampling |
+| `top_k` | 50 | Number of top-k candidates |
+| `max_tokens` | 512 | Maximum generation length |
+| `num_ctx` | 4096 | Context window (do not exceed) |
+
+> ⚠️ Always use `repeat_penalty >= 1.2`. With it, the repetition rate drops to **0%**. Without it, greedy decoding produces ~31% 3-gram repetition.
+
+---
+
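One way to apply the recommended sampling parameters outside a Modelfile is to pass them as per-request `options` to a local Ollama server. The sketch below only builds the JSON body for Ollama's `/api/generate` endpoint. It assumes the model was registered as `frankenstallm` (as in the `ollama create` command in this card) and maps `max_tokens` to `num_predict`, Ollama's option for the generation limit; `build_generate_request` is a hypothetical helper.

```python
import json


def build_generate_request(prompt, model="frankenstallm"):
    """Build a request body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON response instead of a token stream
        "options": {
            "temperature": 0.7,
            "repeat_penalty": 1.2,  # the card marks this setting as required
            "top_p": 0.9,
            "top_k": 50,
            "num_predict": 512,     # Ollama's name for the max_tokens limit
            "num_ctx": 4096,
        },
    }


# Prompt: "Please explain kimchi."
body = json.dumps(build_generate_request("김치에 대해 설명해주세요."), ensure_ascii=False)
print(body)
```

The resulting string can be POSTed to `http://localhost:11434/api/generate` (Ollama's default address) with any HTTP client.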
+## Limitations
+
+- **Limited English performance**: MMLU-EN ~23%, HellaSwag-EN ~29%; this is a Korean-focused model
+- **Code generation**: nearly impossible (code makes up a small share of the training data)
+- **Greedy repetition**: 30.9% 3-gram repetition without `repeat_penalty`; always use `repeat_penalty >= 1.2`
+- **Safety**: no safety alignment data was included in training, so use the model with appropriate guardrails
+- **Scale gap**: trained on ~60B tokens, versus the trillions of tokens used for commercial 3B models, so overall benchmark scores may be lower
+
+---
+
+## Hardware and Training Environment
+
+| Component | Specification |
+|-----------|------|
+| GPU | 8× NVIDIA B200 (183GB HBM3e × 8, ~1.47TB total) |
+| FP8 compute | 2,250 TFLOPS/GPU (18,000 TFLOPS total) |
+| Interconnect | NVLink 5.0, NVSwitch all-to-all mesh |
+| CPU | 2× AMD EPYC 9365 (72 cores, Zen 5) |
+| RAM | 2.21 TB DDR5 |
+| PyTorch | 2.10.0a0+b4e4ee81d3.nv25.12 (NVIDIA custom build) |
+| TransformerEngine | 2.10.0 |
+| FlashAttention | 2.7.4 |
+| NCCL | 2.28.9 |
+| CUDA | 13.1 |
+| Total training time | ~86 hours (pretraining 63h + SFT 15.5h + ORPO 7h) |
+
+---
+
+## Citation
+
+```bibtex
+@misc{frankenstallm2026,
+  title={FRANKENSTALLM: A Korean 3B LLM Built From Scratch on B200 GPUs},
+  author={pathcosmos},
+  year={2026},
+  url={https://huggingface.co/pathcosmos/frankenstallm},
+  note={3-phase training (Pretrain, SFT, ORPO) with FP8 on 8x NVIDIA B200}
+}
+```
+
+---
+
+## Links and Contact
+
+- **GitHub**: [pathcosmos/FRANKENSTALLM](https://github.com/pathcosmos/FRANKENSTALLM): full source code, training scripts, and builder's log
+- **HuggingFace**: [pathcosmos/frankenstallm](https://huggingface.co/pathcosmos/frankenstallm)
+- **Contact**: pathcosmos@gmail.com
+
+---
+---
+
+> 🇺🇸 **English version below**
+
+---
+
+# FRANKENSTALLM 3B
+
 > **A Korean 3B LLM built entirely from scratch (tokenizer, pretraining, SFT, and ORPO) on 8× NVIDIA B200 GPUs.**

 | | |

 ---

+## Links & Contact

 - **GitHub**: [pathcosmos/FRANKENSTALLM](https://github.com/pathcosmos/FRANKENSTALLM): full source code, training scripts, and builder's log
 - **HuggingFace**: [pathcosmos/frankenstallm](https://huggingface.co/pathcosmos/frankenstallm)
+- **Contact**: pathcosmos@gmail.com