Gianloko committed on
Commit
e27b2ec
·
verified ·
1 Parent(s): 4dbaa4f

Update README — cycle 2

Files changed (1)
  1. README.md +13 -21
README.md CHANGED
@@ -9,7 +9,7 @@ pipeline_tag: text-generation
9
 
10
  # ApexCoder-1.5B · Merged 16-bit Model
11
 
12
- *Last updated: 2026-03-20 — Cycle 1*
13
 
14
  Production-ready merged model (base + LoRA fused into 16-bit weights).
15
  Trained on a single NVIDIA A40 (44 GB) using Unsloth QLoRA + TRL SFTTrainer.
@@ -18,16 +18,16 @@ Trained on a single NVIDIA A40 (44 GB) using Unsloth QLoRA + TRL SFTTrainer.
18
  > Use the [LoRA adapter](Gianloko/apex-coder-1.5b-lora) (~150 MB) or the
19
  > [GGUF Q4_K_M](Gianloko/apex-coder-1.5b-GGUF) (~986 MB) for Ollama.
20
 
21
- ## 📊 Evaluation — Cycle 1
22
 
23
  | Metric | Value |
24
  |---|---|
25
- | **LLM-as-judge (avg)** | **12.9/15** |
26
- | **Perplexity** | **1.17** |
27
- | **Δ vs previous cycle** | **+12.9** |
28
- | Training loss | 0.2447 |
29
- | Training samples | 5,832 |
30
- | Training steps | 1056 |
31
 
32
  ### By reasoning type
33
 
@@ -40,6 +40,7 @@ Trained on a single NVIDIA A40 (44 GB) using Unsloth QLoRA + TRL SFTTrainer.
40
  | Cycle | Date | Score | PPL | Δ | vs Published |
41
  |---|---|---|---|---|---|
42
  | 1 | 2026-03-20 | 12.9/15 | 1.17 | +12.9 | 12.9 |
 
43
 
44
 
45
  ## 🚀 Quick start
@@ -50,7 +51,7 @@ import torch
50
 
51
  model = AutoModelForCausalLM.from_pretrained(
52
  "Gianloko/apex-coder-1.5b",
53
- dtype=torch.bfloat16, # FIX 1: torch_dtype is deprecated, use dtype
54
  device_map="auto",
55
  )
56
  tokenizer = AutoTokenizer.from_pretrained("Gianloko/apex-coder-1.5b")
@@ -59,21 +60,12 @@ messages = [
59
  {"role": "system", "content": "You are ApexCoder, a world-class Salesforce expert."},
60
  {"role": "user", "content": "Write a bulkified Apex trigger on Opportunity that prevents status changes to Closed Won if no related Products exist."},
61
  ]
62
-
63
  inputs = tokenizer.apply_chat_template(
64
- messages,
65
- return_tensors="pt",
66
- add_generation_prompt=True,
67
- return_dict=True, # FIX 2: returns BatchEncoding with input_ids + attention_mask
68
  ).to(model.device)
69
 
70
- output = model.generate(
71
- **inputs, # FIX 2 (cont): unpack dict, don't pass as positional arg
72
- max_new_tokens=512,
73
- do_sample=False, # FIX 3: removed temperature — invalid when do_sample=False
74
- )
75
-
76
- print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
77
  ```
78
 
79
  ## 🦙 Ollama (GGUF — recommended for local use)
 
9
 
10
  # ApexCoder-1.5B · Merged 16-bit Model
11
 
12
+ *Last updated: 2026-03-20 — Cycle 2*
13
 
14
  Production-ready merged model (base + LoRA fused into 16-bit weights).
15
  Trained on a single NVIDIA A40 (44 GB) using Unsloth QLoRA + TRL SFTTrainer.
 
18
  > Use the [LoRA adapter](Gianloko/apex-coder-1.5b-lora) (~150 MB) or the
19
  > [GGUF Q4_K_M](Gianloko/apex-coder-1.5b-GGUF) (~986 MB) for Ollama.
20
 
21
+ ## 📊 Evaluation — Cycle 2
22
 
23
  | Metric | Value |
24
  |---|---|
25
+ | **LLM-as-judge (avg)** | **12.6/15** |
26
+ | **Perplexity** | **1.14** |
27
+ | **Δ vs previous cycle** | **+12.6** |
28
+ | Training loss | 0.2274 |
29
+ | Training samples | 8,990 |
30
+ | Training steps | 1100 |
31
 
32
  ### By reasoning type
33
 
 
40
  | Cycle | Date | Score | PPL | Δ | vs Published |
41
  |---|---|---|---|---|---|
42
  | 1 | 2026-03-20 | 12.9/15 | 1.17 | +12.9 | 12.9 |
43
+ | 2 | 2026-03-20 | 12.6/15 | 1.14 | +12.6 | 13.2 |
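
Note on the Δ column: the cycle 2 row reports +12.6, which equals the raw score rather than a change from cycle 1. A minimal sketch of the arithmetic, assuming Δ is meant to be the current judge score minus the previous cycle's score, with the first cycle compared against 0 (which matches the +12.9 shown for cycle 1). The `history` list and `cycle_deltas` helper are illustrative names, not part of the repository.

```python
# Judge scores copied from the cycle history table.
history = [
    {"cycle": 1, "score": 12.9},
    {"cycle": 2, "score": 12.6},
]


def cycle_deltas(rows, baseline=0.0):
    """Return the score change of each cycle relative to the one before it.

    Assumption: the first cycle is compared against `baseline`.
    """
    deltas = []
    previous = baseline
    for row in rows:
        # Round to one decimal to avoid floating-point noise such as -0.3000000000000007.
        deltas.append(round(row["score"] - previous, 1))
        previous = row["score"]
    return deltas


deltas = cycle_deltas(history)
for row, delta in zip(history, deltas):
    print(f"cycle {row['cycle']}: {delta:+.1f}")
```

Under that assumption the cycle 2 delta would be negative, so the +12.6 in the table may be worth double-checking against how the evaluation script defines Δ.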
44
 
45
 
46
  ## 🚀 Quick start
 
51
 
52
  model = AutoModelForCausalLM.from_pretrained(
53
  "Gianloko/apex-coder-1.5b",
54
+ torch_dtype=torch.bfloat16,
55
  device_map="auto",
56
  )
57
  tokenizer = AutoTokenizer.from_pretrained("Gianloko/apex-coder-1.5b")
 
60
  {"role": "system", "content": "You are ApexCoder, a world-class Salesforce expert."},
61
  {"role": "user", "content": "Write a bulkified Apex trigger on Opportunity that prevents status changes to Closed Won if no related Products exist."},
62
  ]
 
63
  inputs = tokenizer.apply_chat_template(
64
+ messages, return_tensors="pt", add_generation_prompt=True
 
 
 
65
  ).to(model.device)
66
 
67
+ output = model.generate(inputs, max_new_tokens=512, temperature=0.1, do_sample=False)
68
+ print(tokenizer.decode(output[0][inputs.shape[1]:], skip_special_tokens=True))
 
 
 
 
 
69
  ```
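
Review note: this revision passes `temperature=0.1` together with `do_sample=False` and hands the tokenized prompt to `generate` without an attention mask, both of which the previous revision had addressed. The sketch below shows one way to keep those fixes, assuming a recent `transformers` release. `build_generation_kwargs` and `generate_reply` are hypothetical helper names introduced here for illustration; the model ID is the one used in the README.

```python
def build_generation_kwargs(max_new_tokens=512, do_sample=False, temperature=None):
    """Assemble keyword arguments for `model.generate`.

    Hypothetical helper: sampling-only options such as `temperature` are
    dropped for greedy decoding, where they have no effect.
    """
    kwargs = {"max_new_tokens": max_new_tokens, "do_sample": do_sample}
    if do_sample and temperature is not None:
        kwargs["temperature"] = temperature
    return kwargs


def generate_reply(messages, model_id="Gianloko/apex-coder-1.5b"):
    """Load the merged model and return the assistant reply for `messages`."""
    # Imported lazily so the helper above works without loading any weights.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        # The previous README revision used `dtype=` and noted `torch_dtype`
        # as deprecated; pick whichever your installed version accepts.
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # return_dict=True yields both input_ids and attention_mask.
    inputs = tokenizer.apply_chat_template(
        messages,
        return_tensors="pt",
        add_generation_prompt=True,
        return_dict=True,
    ).to(model.device)

    output = model.generate(**inputs, **build_generation_kwargs())
    # Slice off the prompt tokens so only the newly generated text is decoded.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate_reply(messages)` with the `messages` list from the Quick start downloads the model on first use, so it is left uncalled here.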
70
 
71
  ## 🦙 Ollama (GGUF — recommended for local use)