Gianloko/apex-coder-1.5b

@@ -9,7 +9,7 @@ pipeline_tag: text-generation
 # ApexCoder-1.5B · Merged 16-bit Model
-*Last updated: 2026-03-20 — Cycle 1*
 Production-ready merged model (base + LoRA fused into 16-bit weights).
 Trained on a single NVIDIA A40 (44 GB) using Unsloth QLoRA + TRL SFTTrainer.
@@ -18,16 +18,16 @@ Trained on a single NVIDIA A40 (44 GB) using Unsloth QLoRA + TRL SFTTrainer.
 > Use the [LoRA adapter](Gianloko/apex-coder-1.5b-lora) (~150 MB) or the
 > [GGUF Q4_K_M](Gianloko/apex-coder-1.5b-GGUF) (~986 MB) for Ollama.
-## 📊 Evaluation — Cycle 1
 | Metric | Value |
 |---|---|
-| **LLM-as-judge (avg)** | **12.9/15** |
-| **Perplexity** | **1.17** |
-| **Δ vs previous cycle** | **+12.9** |
-| Training loss | 0.2447 |
-| Training samples | 5,832 |
-| Training steps | 1056 |
 ### By reasoning type
@@ -40,6 +40,7 @@ Trained on a single NVIDIA A40 (44 GB) using Unsloth QLoRA + TRL SFTTrainer.
 | Cycle | Date | Score | PPL | Δ | vs Published |
 |---|---|---|---|---|---|
 | 1 | 2026-03-20 | 12.9/15 | 1.17 | +12.9 | 12.9 |
 ## 🚀 Quick start
@@ -50,7 +51,7 @@ import torch
 model = AutoModelForCausalLM.from_pretrained(
     "Gianloko/apex-coder-1.5b",
-    dtype=torch.bfloat16,       # FIX 1: torch_dtype is deprecated, use dtype
     device_map="auto",
 )
 tokenizer = AutoTokenizer.from_pretrained("Gianloko/apex-coder-1.5b")
@@ -59,21 +60,12 @@ messages = [
     {"role": "system", "content": "You are ApexCoder, a world-class Salesforce expert."},
     {"role": "user",   "content": "Write a bulkified Apex trigger on Opportunity that prevents status changes to Closed Won if no related Products exist."},
 ]
 inputs = tokenizer.apply_chat_template(
-    messages,
-    return_tensors="pt",
-    add_generation_prompt=True,
-    return_dict=True,           # FIX 2: returns BatchEncoding with input_ids + attention_mask
 ).to(model.device)
-output = model.generate(
-    **inputs,                   # FIX 2 (cont): unpack dict, don't pass as positional arg
-    max_new_tokens=512,
-    do_sample=False,            # FIX 3: removed temperature — invalid when do_sample=False
-)
-print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
 ```
 ## 🦙 Ollama (GGUF — recommended for local use)

 # ApexCoder-1.5B · Merged 16-bit Model
+*Last updated: 2026-03-20 — Cycle 2*
 Production-ready merged model (base + LoRA fused into 16-bit weights).
 Trained on a single NVIDIA A40 (44 GB) using Unsloth QLoRA + TRL SFTTrainer.
 > Use the [LoRA adapter](Gianloko/apex-coder-1.5b-lora) (~150 MB) or the
 > [GGUF Q4_K_M](Gianloko/apex-coder-1.5b-GGUF) (~986 MB) for Ollama.
+## 📊 Evaluation — Cycle 2
 | Metric | Value |
 |---|---|
+| **LLM-as-judge (avg)** | **12.6/15** |
+| **Perplexity** | **1.14** |
+| **Δ vs previous cycle** | **+12.6** |
+| Training loss | 0.2274 |
+| Training samples | 8,990 |
+| Training steps | 1100 |
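For readers comparing the loss and perplexity rows: perplexity is conventionally the exponential of the mean per-token negative log-likelihood. The sketch below only illustrates that relationship. The function name and the sample losses are hypothetical, and nothing here is taken from the evaluation harness that produced the table.

```python
import math

def perplexity(token_nlls):
    """Return exp(mean negative log-likelihood per token), natural log assumed."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))

# Illustrative per-token losses, made up for this sketch.
print(round(perplexity([0.10, 0.15, 0.14]), 3))
```

A mean loss of 0 gives a perplexity of exactly 1, so values close to 1 indicate the model assigns high probability to the reference tokens on whatever set was scored.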
 ### By reasoning type
 | Cycle | Date | Score | PPL | Δ | vs Published |
 |---|---|---|---|---|---|
 | 1 | 2026-03-20 | 12.9/15 | 1.17 | +12.9 | 12.9 |
+| 2 | 2026-03-20 | 12.6/15 | 1.14 | +12.6 | 13.2 |
 ## 🚀 Quick start
 model = AutoModelForCausalLM.from_pretrained(
     "Gianloko/apex-coder-1.5b",
+    torch_dtype=torch.bfloat16,
     device_map="auto",
 )
 tokenizer = AutoTokenizer.from_pretrained("Gianloko/apex-coder-1.5b")
     {"role": "system", "content": "You are ApexCoder, a world-class Salesforce expert."},
     {"role": "user",   "content": "Write a bulkified Apex trigger on Opportunity that prevents status changes to Closed Won if no related Products exist."},
 ]
 inputs = tokenizer.apply_chat_template(
+    messages, return_tensors="pt", add_generation_prompt=True
 ).to(model.device)
+output = model.generate(inputs, max_new_tokens=512, temperature=0.1, do_sample=False)
+print(tokenizer.decode(output[0][inputs.shape[1]:], skip_special_tokens=True))
 ```
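The removed side of this hunk requested a dict from `apply_chat_template`, unpacked it into `generate`, and dropped `temperature` under greedy decoding. A minimal sketch of that call pattern as a reusable helper follows. The name `generate_reply` is hypothetical, and the helper assumes `model` and `tokenizer` were already loaded as in the Quick start.

```python
def generate_reply(model, tokenizer, messages, max_new_tokens=512):
    """Greedy-decode a reply and return only the newly generated text."""
    # return_dict=True yields a mapping with input_ids and attention_mask.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    ).to(model.device)
    # Unpack the mapping so generate() also receives the attention mask.
    # No temperature is passed: it has no effect when do_sample=False.
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    # Slice off the prompt tokens so only the completion is decoded.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)
```

With the objects from the Quick start in scope, `print(generate_reply(model, tokenizer, messages))` should print just the completion without echoing the prompt.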
 ## 🦙 Ollama (GGUF — recommended for local use)