typomonster committed · Commit 3cb37a0 · verified · 1 Parent(s): 4b24114

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -27,9 +27,12 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.tar.* filter=lfs diff=lfs merge=lfs -text
  *.tar filter=lfs diff=lfs merge=lfs -text
  *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.litertlm filter=lfs diff=lfs merge=lfs -text
+ *.task filter=lfs diff=lfs merge=lfs -text
  *.tgz filter=lfs diff=lfs merge=lfs -text
  *.wasm filter=lfs diff=lfs merge=lfs -text
  *.xz filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ supergemma4-e4b-abliterated.litertlm filter=lfs diff=lfs merge=lfs -text
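To see which files the newly added patterns would route through Git LFS, the matching can be approximated in Python. This is only a rough sketch: `fnmatchcase` approximates gitattributes glob semantics and is adequate only for slash-free basename patterns like these, and the names `LFS_PATTERNS` and `is_lfs_tracked` are hypothetical helpers introduced for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# The three patterns this commit adds to .gitattributes.
LFS_PATTERNS = [
    "*.litertlm",
    "*.task",
    "supergemma4-e4b-abliterated.litertlm",
]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Return True if the file's basename matches any slash-free pattern.

    Assumption: patterns without a slash are compared against the basename,
    which is an approximation of how gitattributes treats them.
    """
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ["supergemma4-e4b-abliterated.litertlm", "README.md"]:
        print(candidate, is_lfs_tracked(candidate))
```

Under this approximation the explicit `supergemma4-e4b-abliterated.litertlm` entry is already covered by `*.litertlm`, so the last added line is redundant but harmless.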
README.md ADDED
@@ -0,0 +1,104 @@
+ ---
+ license: gemma
+ library_name: litert-lm
+ base_model:
+ - google/gemma-4-E4B-it
+ - Jiunsong/supergemma4-e4b-abliterated
+ tags:
+ - litert-lm
+ - gemma
+ - gemma4
+ - apple-silicon
+ - text-generation
+ - on-device
+ - abliterated
+ pipeline_tag: text-generation
+ ---
+
+ # SuperGemma 4 E4B Abliterated — LiteRT-LM
+
+ > ⚠️ **Unofficial build.** This LiteRT-LM package is **not** published by
+ > Google, the LiteRT team, or `Jiunsong`; it is a community conversion.
+ > Because of int4/int8 re-quantisation during packaging, outputs may
+ > differ from the source
+ > [`Jiunsong/supergemma4-e4b-abliterated`](https://huggingface.co/Jiunsong/supergemma4-e4b-abliterated)
+ > checkpoint and, in particular, the abliterated behaviour may be
+ > attenuated on some prompts. For reference behaviour, run the source
+ > checkpoint directly (e.g. via `transformers`) or use the
+ > [MLX 4-bit companion build](https://huggingface.co/Jiunsong/supergemma4-e4b-abliterated-mlx)
+ > on Apple Silicon.
+
+ On-device build of
+ [`Jiunsong/supergemma4-e4b-abliterated`](https://huggingface.co/Jiunsong/supergemma4-e4b-abliterated)
+ in the `.litertlm` format for the
+ [LiteRT-LM](https://github.com/google-ai-edge/LiteRT-LM) runtime
+ (Android, iOS, macOS, Linux, Windows, Web).
+
+ Apple-Silicon-targeted builds of this fine-tune:
+
+ | Build | Format and notes |
+ |---|---|
+ | **LiteRT-LM** (this repo): CPU + Metal GPU via LiteRT-LM CLI / Android / iOS / Desktop | `.litertlm` format, 3.65 GB |
+ | [MLX 4-bit](https://huggingface.co/Jiunsong/supergemma4-e4b-abliterated-mlx) | MLX safetensors, Mac Studio-class local serving |
+
+ The LiteRT-LM bundle is the path for cross-platform on-device deployment
+ (Android, iOS, Desktop, and Web, all from the same file). The MLX build
+ above is the fastest option for local serving on macOS specifically.
+
+ ## Base models
+
+ - [`google/gemma-4-E4B-it`](https://huggingface.co/google/gemma-4-E4B-it):
+ the original Gemma 4 E4B instruct checkpoint.
+ - [`Jiunsong/supergemma4-e4b-abliterated`](https://huggingface.co/Jiunsong/supergemma4-e4b-abliterated):
+ the abliterated fine-tune. Abliteration removes the refusal direction
+ from the residual stream; see the source card for release behaviour notes.
+
+ ## Run it
+
+ ### CLI (fastest way to try it)
+
+ ```bash
+ uv tool install litert-lm  # one-time
+
+ # Run directly from this HF repo, on Apple Silicon Metal GPU:
+ litert-lm run --from-huggingface-repo typomonster/supergemma4-e4b-abliterated-litert-lm \
+   supergemma4-e4b-abliterated.litertlm \
+   --prompt "Write a three-sentence story about a robot who discovers music." \
+   --backend gpu \
+   --enable-speculative-decoding=false
+ ```
+
+ `--backend gpu` routes through `libLiteRtMetalAccelerator` on macOS ARM64.
+ Decode was measured at roughly 40 tok/s on Metal versus roughly 13 tok/s
+ on CPU (M-series; your mileage will vary).
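To put the two throughput figures in perspective, the short sketch below converts them into wall-clock decode time. It uses only the approximate numbers quoted above; the 500-token response length is a hypothetical example, and prefill time is deliberately ignored.

```python
METAL_TOK_S = 40.0  # approximate Metal decode rate quoted in the README
CPU_TOK_S = 13.0    # approximate CPU decode rate quoted in the README


def decode_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time spent decoding num_tokens at a constant rate (prefill ignored)."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    response_tokens = 500  # hypothetical response length
    print(f"Metal: {decode_seconds(response_tokens, METAL_TOK_S):.1f} s")
    print(f"CPU:   {decode_seconds(response_tokens, CPU_TOK_S):.1f} s")
    print(f"Ratio: {METAL_TOK_S / CPU_TOK_S:.2f}x")
```

At these rates the Metal backend is about three times faster at decode.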
+
+ `--enable-speculative-decoding=false` is recommended; see the Caveats section.
+
+ ### Platform SDKs
+
+ - **Android / iOS / Desktop**: see the
+ [LiteRT-LM documentation](https://ai.google.dev/edge/litert-lm/overview)
+ for platform integration guides.
+ - **Python**: see `python/litert_lm/examples/simple_main.py` in the
+ [LiteRT-LM repo](https://github.com/google-ai-edge/LiteRT-LM).
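Before wiring the bundle into any SDK, the `.litertlm` file has to be available locally. One way to fetch it is a minimal sketch with `huggingface_hub`, the library this commit was uploaded with. The `download_model` helper is a hypothetical name for illustration; the repo id and filename are taken from the CLI example above. This is not part of the LiteRT-LM API.

```python
REPO_ID = "typomonster/supergemma4-e4b-abliterated-litert-lm"
FILENAME = "supergemma4-e4b-abliterated.litertlm"


def download_model(repo_id: str = REPO_ID, filename: str = FILENAME) -> str:
    """Download the bundle into the local Hugging Face cache and return its path."""
    # Imported lazily so this module loads even when huggingface_hub is absent.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)
```

Calling `download_model()` transfers about 3.65 GB on first use; later calls return the cached path.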
+
+ ## Caveats
+
+ 1. **Abliterated behaviour may be attenuated** (soft refusals may reappear)
+ vs. the source HF fine-tune. If you need the strongest abliterated
+ behaviour on Apple Silicon, use the MLX build linked above.
+
+ 2. **Run with `--enable-speculative-decoding=false`**. The MTP drafter in
+ this bundle may have a reduced acceptance rate against the modified main
+ model. Speculative decoding remains correct (the main model verifies
+ each proposed token) but is not guaranteed to be a speedup here.
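Why a lower acceptance rate can erase the speedup is easiest to see with the standard idealised model of speculative decoding. The sketch below assumes each drafted token is accepted independently with probability `alpha`, that `gamma` tokens are drafted per verification pass, and that the drafter costs `draft_cost` times one main-model step. All three values are hypothetical inputs, not measurements of this bundle, and the MTP drafter here may not follow this model exactly.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per main-model verification pass.

    Idealised result: (1 - alpha**(gamma + 1)) / (1 - alpha).
    """
    if alpha >= 1.0:
        return float(gamma + 1)  # every draft accepted, plus one bonus token
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def expected_speedup(alpha: float, gamma: int, draft_cost: float) -> float:
    """Speedup over plain decoding under the same idealised assumptions."""
    return expected_tokens_per_step(alpha, gamma) / (gamma * draft_cost + 1.0)


if __name__ == "__main__":
    for alpha in (0.8, 0.5, 0.3):
        speedup = expected_speedup(alpha, gamma=4, draft_cost=0.2)
        print(f"alpha={alpha}: {speedup:.2f}x")
```

With these illustrative settings the model predicts a gain at `alpha = 0.8` but a slowdown at `alpha = 0.3`, which is the situation the second caveat guards against.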
+
+ ## License
+
+ This build inherits the [Gemma license](https://ai.google.dev/gemma/terms)
+ from its upstream bases. Review the terms there before redistribution.
+
+ ## Acknowledgements
+
+ - Upstream model: `google/gemma-4-E4B-it` (Google DeepMind).
+ - Abliteration fine-tune: `Jiunsong/supergemma4-e4b-abliterated`.
supergemma4-e4b-abliterated.litertlm ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:83399794f0ad10166c0034451e06b9cdb120590eab3665863ff20e443b3f9750
+ size 3654467584
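The pointer above records the size and SHA-256 digest of the real model file, so a downloaded copy can be checked against it. The following is a minimal sketch using only the Python standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the pointer text is copied from this commit.

```python
import hashlib
from pathlib import Path

POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:83399794f0ad10166c0034451e06b9cdb120590eab3665863ff20e443b3f9750
size 3654467584
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Check a local file's size and digest against a parsed pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated download
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Stream in chunks so a multi-gigabyte file never sits in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    print(pointer["algo"], pointer["size"])
```

A call such as `matches_pointer("supergemma4-e4b-abliterated.litertlm", parse_lfs_pointer(POINTER_TEXT))` would then confirm that a local copy is complete and uncorrupted.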