---
library_name: transformers
tags:
- generated_from_trainer
- text-generation
- transformers
- meta-math
- qwen2
- symbolic-ai
- symbioticlm
- convergentintel
model-index:
- name: SymLM
  results: []
license: afl-3.0
datasets:
- meta-math/MetaMathQA
- open-thoughts/OpenThoughts2-1M
language:
- en
base_model:
- Qwen/Qwen2.5-0.5B
pipeline_tag: text-generation
metrics:
- accuracy
---

# 🧠 SymLM

**SymbioticLM** is a hybrid symbolic–neural language model that integrates a frozen transformer backbone (`Qwen2ForCausalLM`) with a suite of symbolic cognitive modules for adaptive, interpretable reasoning.

---

## 📝 Model Description

The architecture fuses neural token-level generation with symbolic introspection and reasoning:

- **Dynamic Thought Evolution with Helical Encoding and DNA-Inspired Memory (DTE-HDM)**
  Enables structured long-term memory and spiral-context encoding across tokens.
- **Multi-Agent Symbiotic Response Mechanisms (M.A.S.R.M)**
  Coordinates symbolic-neural agents via gated attention and adaptive response layers.
- **QwenExoCortex**
  Projects contextual hidden states from the Qwen model into a symbolic fusion space for reasoning and memory replay.
- **Symbolic processors**
  Includes:
  - `ThoughtDynamicsLNN`
  - `Liquid / Crystalline Processors`
  - `Graph Reasoning with DNAConv`
  - A rolling `ThoughtMemory`

Together, these modules enable real-time fusion of symbolic thinking, token generation, and reasoning-aware language modeling.
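A minimal usage sketch follows. The repository id is an assumption (adjust it to this card's actual repo path), and because the symbolic modules described above are custom components rather than a stock `transformers` class, loading will likely require `trust_remote_code=True`:

```python
# Hedged usage sketch: the repo id below is assumed, not confirmed by this
# card. trust_remote_code=True is likely needed because the symbolic
# modules are custom modeling code, not a built-in transformers class.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "reaperdoesntknow/SymLM"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Prove that the sum of two even integers is even."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```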
---

## 🎯 Intended Uses & Limitations

### ✅ Intended Uses

- **Mathematical reasoning and proof generation**
  Fine-tuned on *MetaMathQA*; optimized for symbolic Q&A, equation logic, and structured inference.
- **Symbolic-cognitive AI research**
  Useful for studying attention modulation, memory replay, and neural-symbolic interface dynamics.
- **Low-resource adaptation**
  The modular memory and projection design enables meaningful performance even with smaller datasets.
- **Building adaptive cognition systems**
  Can serve as a symbolic kernel for reflective AI agents and knowledge-evolution pipelines.

### ⚠️ Limitations

- **Limited training scale**
  Trained on 25,000 MetaMathQA examples; effective for symbolic form, but not yet for broad generalization.
- **No RLHF or alignment**
  Outputs are not tuned for safety or instruction alignment and may hallucinate.
- **Fluency ≠ correctness**
  Symbolic fluency does not imply mathematically valid proofs; verification is recommended.
- **Not optimized for open-domain generation**
  This model prioritizes logic and structure over conversational depth.

---

## ⚙️ Training Procedure

This checkpoint is currently in an experimental phase.

### 🧪 Training Hyperparameters

- **learning_rate**: `3e-5`
- **train_batch_size**: `16`
- **eval_batch_size**: `16`
- **gradient_accumulation_steps**: `64`
- **total_train_batch_size**: `1024`
- **optimizer**: `AdamW`, betas=(0.9, 0.999), epsilon=1e-08
- **lr_scheduler_type**: `cosine`
- **warmup_steps**: `500`
- **num_epochs**: `3`
- **mixed_precision_training**: `Native AMP`
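Assuming the standard 🤗 `Trainer` API was used (an assumption; the actual training script is not published on this card), the hyperparameters above translate roughly as follows:

```python
# Hedged sketch mapping the listed hyperparameters onto TrainingArguments.
# The output path is a placeholder; this is not the original script.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="symlm-metamathqa",     # placeholder path
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=64,    # 16 x 64 = 1024 effective batch size
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=3,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                         # native AMP mixed precision
)
```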
---

## 🧱 Framework Versions

- 🤗 Transformers: `4.51.3`
- 🧠 PyTorch: `2.7.0+cu126`
- 📚 Datasets: `3.5.0`
- 🔤 Tokenizers: `0.21.1`

---

## 📚 Research Foundations

SymbioticLM builds upon a cohesive theoretical framework for dynamic reasoning and neuro-symbolic learning:

### 🔁 Multi-Agent Symbiosis and Dynamic Thought

**Rapid Adaptation via Multi-Agent Symbiotic Response Mechanisms (M.A.S.R.M)**

> A framework in which symbolic and neural agents dynamically adapt via gated feedback, memory modulation, and agent-based specialization.

**Focus**: Multi-agent control, reflective learning, contextual responsiveness

---

### 🧬 Dynamic Thought Evolution with Helical Encoding and DNA-Inspired Memory (DTE-HDM)

> A memory structure inspired by biological helices, enabling thought persistence through spiral-layered contextual encodings across time.

**Focus**: Long-term token evolution, normalized replay, thought continuity

---

### 🧠 Integrating DTE-HDM + M.A.S.R.M for Adaptive AI

> Combines symbolic evolution and multi-agent adaptation to construct an LLM that reflects, adapts, and deepens reasoning through internal dynamics.

**Result**: A system that *learns faster*, *adapts deeper*, and *thinks symbolically*

---

### 📐 Theoretical Underpinning

**The Analytic Foundations Theorem (AFT)**

> A rigorous, measure-theoretic replacement for classical calculus that replaces pointwise derivatives with discrepancy-driven integral convergence across vanishing sets.

**Applies to**:

- Symbolic gradients
- Gradient-free optimization
- Discrete logic approximation in function spaces

---

Together, these form the **mathematical and architectural core** of SymbioticLM, enabling:

- 🧠 *Neuro-symbolic cognitive evolution*
- 🔁 *Multi-agent dynamic feedback coordination*
- 📐 *Formal memory through discrepancy-based logic*

---

## Convergent Intelligence Portfolio

*Part of the [Symbiotic AI Series](https://huggingface.co/reaperdoesntknow) by [Convergent Intelligence LLC: Research Division](https://huggingface.co/reaperdoesntknow)*

### Related Models

| Model | Downloads | Format |
|-------|-----------|--------|
| [Symbiotic-1B](https://huggingface.co/reaperdoesntknow/Symbiotic-1B) | 4 | HF |
| [Symbiotic-8B](https://huggingface.co/reaperdoesntknow/Symbiotic-8B) | 4 | HF |
| [Symiotic-14B](https://huggingface.co/reaperdoesntknow/Symiotic-14B) | 3 | HF |

### Top Models from Our Lab

| Model | Downloads |
|-------|-----------|
| [Qwen3-1.7B-Thinking-Distil](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Thinking-Distil) | 501 |
| [LFM2.5-1.2B-Distilled-SFT](https://huggingface.co/reaperdoesntknow/LFM2.5-1.2B-Distilled-SFT) | 342 |
| [Qwen3-1.7B-Coder-Distilled-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT) | 302 |
| [Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF) | 203 |
| [Qwen3-1.7B-Coder-Distilled-SFT-GGUF](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT-GGUF) | 194 |

**Total Portfolio: 41 models | 2,781 total downloads**

*Last updated: 2026-03-28 12:57 UTC*

---

## From the Convergent Intelligence Portfolio

**[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)**: our only BF16 series. Proof-weighted distillation from Qwen3-30B-A3B → 1.7B and 0.6B on H100, with three teacher variants (Instruct, Thinking, Coder), nine models, and 2,788 combined downloads.

The rest of the portfolio demonstrates that structure beats scale on CPU; this collection shows what happens when the same methodology meets real hardware.

Top model: [Qwen3-1.7B-Coder-Distilled-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT) (508 downloads)

Full methodology: [Structure Over Scale (DOI: 10.57967/hf/8165)](https://doi.org/10.57967/hf/8165)

*Convergent Intelligence LLC: Research Division*

---

## Discrepancy Calculus Foundation

This model is part of the [Convergent Intelligence LLC: Research Division](https://huggingface.co/reaperdoesntknow) portfolio. All models in this portfolio are developed under the Discrepancy Calculus (DISC) framework: a measure-theoretic approach to understanding and controlling the gap between what a model *should* produce and what it *actually* produces. DISC treats training singularities (loss plateaus, mode collapse, catastrophic forgetting) not as failures to be smoothed over, but as **structural signals** that reveal the geometry of the learning problem.

Key concepts:

- **Discrepancy Operator (D):** Measures the gap between expected and observed behavior at each training step (a schematic rendering appears at the end of this card)
- **Jump Sets:** Boundaries where model behavior changes discontinuously; these are *features*, not bugs
- **Ghost Imprinting:** Teacher knowledge that transfers to student models through weight-space topology rather than an explicit distillation signal

For the full mathematical treatment, see [Discrepancy Calculus: Foundations and Core Theory](https://huggingface.co/reaperdoesntknow/Discrepancy_Calculus) (DOI: 10.57967/hf/8194).

**Citation chain:** [Structure Over Scale](https://huggingface.co/reaperdoesntknow/Structure-Over-Scale) (DOI: 10.57967/hf/8165) → [Three Teachers to Dual Cognition](https://huggingface.co/reaperdoesntknow/DualMind_Methodolgy) (DOI: 10.57967/hf/8184) → [Discrepancy Calculus](https://huggingface.co/reaperdoesntknow/Discrepancy_Calculus) (DOI: 10.57967/hf/8194)
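For intuition, a schematic rendering of the discrepancy operator and jump sets follows; the notation is illustrative only and is not drawn verbatim from the cited papers.

```latex
% Schematic only: \mathcal{D}_t measures the gap between the expected
% behavior y^{*}(x) and the observed behavior f_{\theta_t}(x) at step t;
% the jump set J collects the steps where that gap moves discontinuously.
\[
  \mathcal{D}_t(x) \;=\; \bigl\lVert\, y^{*}(x) - f_{\theta_t}(x) \,\bigr\rVert ,
  \qquad
  J \;=\; \bigl\{\, t \;:\; \lim_{s \to t^{-}} \mathcal{D}_s(x) \neq \mathcal{D}_t(x) \,\bigr\} .
\]
```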