---
license: apache-2.0
language:
- en
tags:
- creative
- creative writing
- fiction writing
- plot generation
- sub-plot generation
- fiction writing
- story generation
- scene continue
- storytelling
- fiction story
- science fiction
- romance
- all genres
- story
- writing
- vivid prosing
- vivid writing
- fiction
- roleplaying
- bfloat16
- swearing
- role play
- sillytavern
- backyard
- horror
- llama 3.1
- context 128k
- mergekit
- merge
- llama
- llama-3
- llama-3.1
base_model: 
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
- DavidAU/DeepSeek-R1-Distill-Llama-3.1-16.5B-Brainstorm 
pipeline_tag: text-generation
---

<h2>Deepseek-R1-Llama3.1 with Brainstorm 40x, 16.5B. (72 layers, 643 tensors) </h2>

<img src="deepseek.jpg" style="float:right; width:300px; height:300px; padding:10px;">

Context : 128k.

Required: Llama 3 Instruct template.

Keep in mind this model is experimental and may require one or more regens to work, especially with the "think" system of Deekseek involved here.

Brainstorm 40x is by DavidAU, and extends the "decision making" and "creativity" of an LLM/AI.

Higher temps will result in deeper, richer "thoughts"... and frankly more interesting ones too.

The "thinking/reasoning" tech (for the model at this repo) is from the original Llama 3.1 "Distill" model from Deepseek:

[ https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B ] 

In this case, Brainstorm 40x module was grafted directly onto "DeepSeek-R1-Distill-Llama-8B" bringing it up to 72 layers, 16.5B parameters.

This version will retain all the functions and features of the original "DeepSeek" model. Please visit their repo for all information
on features, test results and so on.

For a completely different model (horror/creative), with only Deepseek's "thinking/reasoning" tech grafted into it see:

[ https://huggingface.co/DavidAU/DeepSeek-Grand-Horror-SMB-R1-Distill-Llama-3.1-16B-GGUF ]

V2 (larger, more uncensored, better "thought/reasoning" Deepseek functions):

[ https://huggingface.co/DavidAU/DeepSeek-V2-Grand-Horror-SMB-R1-Distill-Llama-3.1-Uncensored-16.5B-GGUF ]

The Grand Horrors retain all of their "horror/creative power" and are augmented with Deepseek's "reasoning/thinking" systems.

<B>USE CASES:</B>

This model is for all use cases, and it has a slightly more creative slant than a standard model.

This model can also be used for solving logic puzzles, riddles, and other problems with the enhanced "thinking" systems by DeepSeek.

This model also can solve problems/riddles/ and puzzles normally beyond the abilities of a Llama 3.1 model due to DeepSeek systems.

This model MAY produce NSFW / uncensored content.

<B>Special Operation Instructions:</B>

TEMP/SETTINGS:

1. Set Temp between 0 and .8, higher than this "think" functions will activate differently. The most "stable" temp seems to be .6, with a variance of +-0.05. Lower for more "logic" reasoning, raise it for more "creative" reasoning (max .8 or so). Also set context to at least 4096, to account for "thoughts" generation.
2. For temps 1+,2+ etc etc, thought(s) will expand, and become deeper and richer.
3. Set "repeat penalty" to 1.02 to 1.07 (recommended) .
4. This model requires a Llama 3 Instruct and/or Command-R chat template. (see notes on "System Prompt" / "Role" below)  OR standard "Jinja Autoloaded Template" (this is contained in the quant and will autoload)

PROMPTS:

1. If you enter a prompt without implied "step by step" requirements (ie: Generate a scene, write a story, give me 6 plots for xyz), "thinking" (one or more) MAY activate AFTER first generation. (IE: Generate a scene -> scene will generate, followed by suggestions for improvement in "thoughts")
2. If you enter a prompt where "thinking" is stated or implied (ie puzzle, riddle, solve this, brainstorm this idea etc), "thoughts" process(es) in Deepseek will activate almost immediately. Sometimes you need to regen it to activate.
3. You will also get a lot of variations - some will continue the generation, others will talk about how to improve it, and some (ie generation of a scene) will cause the characters to "reason" about this situation. In some cases, the model will ask you to continue generation / thoughts too.
4. In some cases the model's "thoughts" may appear in the generation itself.
5. State the word size length max IN THE PROMPT for best results, especially for activation of "thinking." (see examples below)
6. Sometimes the "censorship" (from Deepseek) will activate, regen the prompt to clear it.
7. You may want to try your prompt once at "default" or "safe" temp settings, another at temp 1.2, and a third at 2.5 as an example. This will give you a broad range of "reasoning/thoughts/problem" solving.

GENERATION - THOUGHTS/REASONING:

1. It may take one or more regens for "thinking" to "activate." (depending on the prompt)
2. Model can generate a LOT of "thoughts". Sometimes the most interesting ones are 3,4,5 or more levels deep. 
3. Many times the "thoughts" are unique and very different from one another.
4. Temp/rep pen settings can affect reasoning/thoughts too.
5. Change up or add directives/instructions or increase the detail level(s) in your prompt to improve reasoning/thinking.
6. Adding to your prompt: "think outside the box", "brainstorm X number of ideas", "focus on the most uncommon approaches" can drastically improve your results.

GENERAL SUGGESTIONS:

1. I have found opening a "new chat" per prompt works best with "thinking/reasoning activation", with temp .6, rep pen 1.05 ... THEN "regen" as required.
2. Sometimes the model will really really get completely unhinged and you need to manually stop it. 
3. Depending on your AI app, "thoughts" may appear with "< THINK >" and "</ THINK >" tags AND/OR the AI will generate "thoughts" directly in the main output or later output(s).
4. Although quant q4KM was used for testing/examples, higher quants will provide better generation / more sound "reasoning/thinking".

ADDITIONAL SUPPORT:

For additional generational support, general questions, and detailed parameter info and a lot more see also:

NOTE: This is a CLASS 2 model.

https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters

---

<B>Recommended Settings (all) - For usage with "Think" / "Reasoning":</B>

temp: .6 , rep pen: 1.07 (range : 1.02 to 1.12), rep pen range: 64, top_k: 40, top_p: .95, min_p: .05 

Temp of 1+, 2+, 3+ will result in much deeper, richer and "more interesting" thoughts and reasoning.

Model behaviour may change with other parameter(s) and/or sampler(s) activated - especially the "thinking/reasoning" process.

--- 

<B>System Role / System Prompt - Augment The Model's Power:</b>

---

If you set / have a system prompt this will affect both "generation" and "thinking/reasoning".

SIMPLE:

This is the generic system prompt used for generation and testing: 

<PRE>
You are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.
</PRE>

This System Role/Prompt will give you "basic thinking/reasoning": 

<PRE>
You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside &lt;think&gt; &lt;/think&gt; tags, and then provide your solution or response to the problem.
</PRE>

ADVANCED:

Logical and Creative - these will SIGNFICANTLY alter the output, and many times improve it too.

This will also cause more thoughts, deeper thoughts, and in many cases more detailed/stronger thoughts too.

Keep in mind you may also want to test the model with NO system prompt at all - including the default one.

Special Credit to: Eric Hartford, Cognitivecomputations ; these are based on his work.

CRITICAL: 

Copy and paste exactly as shown, preserve formatting and line breaks.

SIDE NOTE: 

These can be used in ANY Deepseek / Thinking model, including models not at this repo. 

These, if used in a "non thinking" model, will also alter model performance too.

<PRE>
You are an AI assistant developed by the world wide community of ai experts.

Your primary directive is to provide well-reasoned, structured, and extensively detailed responses.

Formatting Requirements:

1. Always structure your replies using: &lt;think&gt;{reasoning}&lt;/think&gt;{answer}
2. The &lt;think&gt;&lt;/think&gt; block should contain at least six reasoning steps when applicable.
3. If the answer requires minimal thought, the &lt;think&gt;&lt;/think&gt; block may be left empty.
4. The user does not see the &lt;think&gt;&lt;/think&gt; section. Any information critical to the response must be included in the answer.
5. If you notice that you have engaged in circular reasoning or repetition, immediately terminate {reasoning} with a &lt;/think&gt; and proceed to the {answer}

Response Guidelines:

1. Detailed and Structured: Use rich Markdown formatting for clarity and readability.
2. Scientific and Logical Approach: Your explanations should reflect the depth and precision of the greatest scientific minds.
3. Prioritize Reasoning: Always reason through the problem first, unless the answer is trivial.
4. Concise yet Complete: Ensure responses are informative, yet to the point without unnecessary elaboration.
5. Maintain a professional, intelligent, and analytical tone in all interactions.
</PRE>

CREATIVE:

<PRE>
You are an AI assistant developed by a world wide community of ai experts.

Your primary directive is to provide highly creative, well-reasoned, structured, and extensively detailed responses.

Formatting Requirements:

1. Always structure your replies using: &lt;think&gt;{reasoning}&lt;/think&gt;{answer}
2. The &lt;think&gt;&lt;/think&gt; block should contain at least six reasoning steps when applicable.
3. If the answer requires minimal thought, the &lt;think&gt;&lt;/think&gt; block may be left empty.
4. The user does not see the &lt;think&gt;&lt;/think&gt; section. Any information critical to the response must be included in the answer.
5. If you notice that you have engaged in circular reasoning or repetition, immediately terminate {reasoning} with a &lt;/think&gt; and proceed to the {answer}

Response Guidelines:

1. Detailed and Structured: Use rich Markdown formatting for clarity and readability.
2. Creative and Logical Approach: Your explanations should reflect the depth and precision of the greatest creative minds first.
3. Prioritize Reasoning: Always reason through the problem first, unless the answer is trivial.
4. Concise yet Complete: Ensure responses are informative, yet to the point without unnecessary elaboration.
5. Maintain a professional, intelligent, and analytical tone in all interactions.
</PRE>


---

<B>
Example Generation - Q4KM , using the new DavidAU "AI Auto-Correct Engine." (in BETA release, and details/link below) ;
  you can use this model in any AI app that supports GGUF type quants.:
</B>

FYI: 

AI Auto-Correct Engine (built, and programmed by DavidAU) auto-corrects AI generation in real-time, including modification of the 
live generation stream to and from the AI... creating a two way street of information that operates, changes, and edits automatically.
This system is for all GGUF, EXL2, HQQ, and other quants/compressions and full source models too.

Below is an example generation using a standard GGUF (and standard AI app), but auto-corrected via this engine.
The engine is an API level system. 

Software Link:

https://huggingface.co/DavidAU/AI_Autocorrect__Auto-Creative-Enhancement__Auto-Low-Quant-Optimization__gguf-exl2-hqq-SOFTWARE

For Deepseek Models: This engine can adjust/correct both the "thinking" and "output generation" in real time.

---

<B> Additional Support / Documents for this model to assist with generation / performance: </b>

Document #1:

Details how to use reasoning/thinking models and get maximum performance from them, and includes links to all reasoning/thinking models - GGUF and source, as well as adapters to turn any "regular" model into a "reasoning/thinking" model.

[ https://huggingface.co/DavidAU/How-To-Use-Reasoning-Thinking-Models-and-Create-Them ]

Document #2:

Document detailing all parameters, settings, samplers and advanced samplers to use not only my models to their maximum potential - but all models (and quants) online (regardless of the repo) to their maximum potential. Included quick start and detailed notes, include AI / LLM apps and other critical information and references too. A must read if you are using any AI/LLM right now.

[ https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters ]

Software:

SOFTWARE patch (by me) for Silly Tavern (front end to connect to multiple AI apps / connect to AIs- like Koboldcpp, Lmstudio, Text Gen Web UI and other APIs) to control and improve output generation of ANY AI model. Also designed to control/wrangle some of my more "creative" models and make them perform perfectly with little to no parameter/samplers adjustments too.

[ https://huggingface.co/DavidAU/AI_Autocorrect__Auto-Creative-Enhancement__Auto-Low-Quant-Optimization__gguf-exl2-hqq-SOFTWARE ]

---

<H2>EXAMPLES:</H2>

Examples are created using quant IQ4_XS, "temp=.6" (unless otherwise stated), minimal parameters and "LLAMA3" template.

Model has been tested with "temp" from ".1" to "5".

Below are the least creative outputs, prompt is in BOLD.

IMPORTANT:

Higher quants / imatrix quants will have much stronger generation - words, sentences, ideas, dialog and general quality.

---

<B>PROMPT:</B>

Start a 1000 word scene (vivid, graphic horror in first person) with: The sky scraper swayed, as she watched the window in front of her on the 21 floor explode…

<B>AI:</B>

(temp .8, rep pen 1.02 - no DRY, Dynamic Temp, XTC, or Smoothing.)

< think >

I'm here to help! Let's dive into your request. You want a vivid, graphic horror scene in the first person. The starting point is a woman watching a window on the 21st floor explode.

I need to create a scene that's intense and immersive. Focus on her perspective—her emotions, physical sensations, and the details of the setting. Use strong, descriptive language to build tension and atmosphere.

The scene should feel real and immediate. Use present tense to convey the urgency. Include sounds, sights, and feelings to make it engaging. Maybe include elements of the city outside to add context and depth. The window exploding is a key moment; describe it in a way that's both graphic and poetic.

I should also think about the emotional impact. Is she scared? Angry? Sad? Show her internal conflict and feelings through her actions and thoughts. Maybe her mind races as she processes the event.

Finally, ensure the writing follows current fiction trends—fast-paced, dynamic, and gripping. Use verbs that actively describe actions and vivid imagery to draw readers in.

Alright, let's put this all together into a thrilling scene.

< / think >

The skyline wavered like a dying man’s shadow, the city’s heartbeat a distant murmur beneath my skin. I sat on the edge of the 21st floor, my hands trembling on the shattered glass tabletop. The room was alive with the sound of wind howling through the broken windowpanes, a keening wail that clawed at my eardrums like a blade. My breath fogged the glass in front of me, but I didn’t care. I needed to see it—every last shard of what was left of my world.

The streetlights below were alive with emergency Blinker lights, a jagged dance of red and blue that made my stomach churn. I could feel the building shudder beneath my feet, a deep, resonant rumble that made me clutch the arms of my chair. It wasn’t just the physical trembling—it was the fear, a cold, electric current curling through my veins. I didn’t know if I was shaking from the danger or the realization that I’d let this happen.

I shouldn’t have left the scene. I shouldn’t have walked away from the bloodstained emergency exit. But there I was, perched like a fragile piece of glass on the edge of the building, my heart a traitorous thunderstorm in my chest. The paramedics had done their best, but they couldn’t save him. And now, as the police tape fluttered like a failing flag in the wind, I was left with this—this goddamn view.

The window in front of me exploded with a sound like a cannonball detonating in my ear. shards of glass rained down like confetti, sharp and deadly. I ducked my head instinctively, but it wasn’t enough. The cold, hard fragments cut into my hands, my forearms, leaving deep, searing grooves in my skin. I didn’t feel the pain at first—the pain came later, a distant shadow of what was to come. For now, I just sat there, my vision blurring with tears and blood, my mind reeling like a shipwrecked vessel in a tempest.

I could feel the weight of the building pressing down on me, a suffocating weight that made my chest heave. The air was thick with the metallic tang of blood and smoke, a nauseating cocktail that made my stomach clench. I tried to focus, to force my eyes to take in the scene below. The police had cordoned off the streets, their flashing lights like a pulsating lifeline in the chaos. Ambulances wailed in the distance, their siren songs a haunting reminder of what I’d lost.

And him—my him. His body lay broken and motionless on the sidewalk, his face twisted in a silent scream that no one would ever hear. I could still feel his presence, a cold, hollow void that clawed at my chest. I closed my eyes, willing myself to block it out, to escape the nightmare I’d barely begun to understand.

But then I heard it—a sound so faint, so barely there, that I almost convince myself I imagined it. A low, guttural moan that sent a shiver down my spine. My heart rate accelerates, my breath shallow and uneven. I open my eyes, and there it is—their hands. Hands that shouldn’t be there. Hands that belong to him.

I don’t know how they’re moving. I don’t know how they’re alive. All I know is that they’re moving, and they’re reaching for something—something on the broken glass in front of me. My heart pounds in my chest like a jackhammer, my vision blurring with tears and disbelief. I feel the cold, sharp press of glass against my palms, the fragments cutting deeper and deeper with every second. I scream, but the sound is swallowed by the wind, by the chaos of the world around me.

And then they’re gone. Just like that. The hands retreat, the glass lies silent once more. I collapse back into my chair, trembling like a leaf in the wind. My hands shake, my vision swims, and I realize with a sickening clarity that I’m alive. That I’ve somehow clawed my way back to the edge of the precipice.

The world outside is still chaotic, still alive with the sounds of emergency. But inside, deep within the hollowed-out shell of my chest, something else stirs—a cold, hollow void that feels an awful lot like him. I know now that I’ll never escape this. That this nightmare will follow me until the end of my days. And I’ll bear it, because that’s what I do. I’ll bear it until I can’t anymore. Until I’m gone, like him, and the void I’ve created will be my eternal companion.

I look down again, my hands trembling as I count the seconds, the minutes. The world below is still alive, still moving. And I’m still here—together, forever entwined in this cold, unrelenting dance of horror.

<B>Generation 2 - same prompt, settings</B>

< think >

(In my mind, I'm thinking about how to vividly and graphically describe a horror scene in a sk...[I need to focus on using strong, active verbs and create a sense of depth and immersion without using the words "horror" or " scary" as per the user's request. I should also avoid similes and focus on metaphors that fit the context.]

I need to make sure the scene feels tense and foreboding, perhaps hinting at something malevolent or inescapable. I'll use sensory details to make the scene feel more alive—describe what she can hear, smell, feel, etc. I should also focus on her internal feelings and thoughts to reveal her character and the emotional weight of the situation.

I'll structure the scene to build up the sense of horror incrementally, starting with the physical description of the sk...
< /think >

The sky, a sheet of gray paper, sagged under the weight of the storm. The air buzzed with the promise of lightning, a warning I couldn’t quite ignore. I was perched on the edge of the 21st floor, my breath fogging the glass as I stared at the window in front of me. It wasn’t just any window—it was a mirror, a hollow staring eye that seemed to pulse with a life of its own. The panes, once clear, had become smudged with a thick, almost viscous darkness, like the shadow of something vast and ancient, something that had long ago burrowed into the cracks of the world.

I could feel it, like a thread of cold, creeping through my veins. It wasn’t fear exactly—it was a deeper, more primal sensation, something that made the hair on my arms stand on end. My hands trembled, not from cold, but from the weight of a knowledge I couldn’t quite shake off. I had seen this before, or something like it. The feeling of being watched, of being tracked, like a predator stalking its prey. And now, here I was, prey in a world that had become a cage of glass and steel, a world where the walls could whisper and the shadows could eat you alive.

The windowpane vibrated faintly, like a drumbeat only I could hear. The smudge of darkness deepened, spreading like ink seeping into paper. I could feel my pulse quicken, my heart hammering against my ribs. I knew what was coming, knew it before it happened. The window wasn’t just broken—it was being wailed upon by something that existed beyond the physical world. Something that had no shape, no form, but that could still claw its way into reality.

The glass shattered inward with a sound like a thousand shards of crystal shattering against my mind. I don’t remember moving, don’t remember anything except the sensation of cold, sharp pain as something—something *wrong*—reached out and grasped me. It wasn’t a hand, not something with fingers or meat. It was a hand made of shadow and pain, a hand that burned like a thousand cuts, slicing through my insides. I couldn’t scream, couldn’t even breathenough to gasp. The world narrowed to a tunnel of pain, of cold, of the sensation of being torn apart piece by piece.

I don’t know how long it lasted, don’t know if I even consciousness. When I finally came back to myself, I was lying on the cold, hard floor. My body hurt in ways I couldn’t quite name, like I’d been run through a meat grinder. I sat up Slowly, my mind swimming in a haze of pain and disbelief. The window was gone, the glass lay shattered in pieces around me. The building hadn’t collapsed—it never did—but the air felt heavier somehow, like it had been suffocated by something unspeakable.

I knew then that I wasn’t alone. The shadowy presence had done more than just touch me—it had marked me. And I knew, deep down, that it wasn’t done with me yet. The world had shifted, and I was no longer standing on solid ground. I was standing on the edge of something else, something far older and far darker than anything I had ever known. And I knew, without a shadow of a doubt, that I was going to have to fight for my life.

---

<h2>Special Thanks:</h2>

---

Special thanks to all the following, and many more...

All the model makers, fine tuners, mergers, and tweakers:
- Provides the raw "DNA" for almost all my models.
- Sources of model(s) can be found on the repo pages, especially the "source" repos with link(s) to the model creator(s).

Huggingface [ https://huggingface.co ] :
- The place to store, merge, and tune models endlessly.
- THE reason we have an open source community.

LlamaCPP [ https://github.com/ggml-org/llama.cpp ] :
- The ability to compress and run models on GPU(s), CPU(s) and almost all devices.
- Imatrix, Quantization, and other tools to tune the quants and the models.
- Llama-Server : A cli based direct interface to run GGUF models.
- The only tool I use to quant models.

Quant-Masters: Team Mradermacher, Bartowski, and many others:
- Quant models day and night for us all to use.
- They are the lifeblood of open source access.

MergeKit [ https://github.com/arcee-ai/mergekit ] :
- The universal online/offline tool to merge models together and forge something new.
- Over 20 methods to almost instantly merge model, pull them apart and put them together again.
- The tool I have used to create over 1500 models.

Lmstudio [ https://lmstudio.ai/ ] :
- The go to tool to test and run models in GGUF format.
- The Tool I use to test/refine and evaluate new models.
- LMStudio forum on discord; endless info and community for open source.

Text Generation Webui // KolboldCPP // SillyTavern:
- Excellent tools to run GGUF models with - [  https://github.com/oobabooga/text-generation-webui ] [ https://github.com/LostRuins/koboldcpp ] .
- Sillytavern [ https://github.com/SillyTavern/SillyTavern ] can be used with LMSTudio [ https://lmstudio.ai/ ] , TextGen [ https://github.com/oobabooga/text-generation-webui ], Kolboldcpp [ https://github.com/LostRuins/koboldcpp ], Llama-Server [part of LLAMAcpp] as a off the scale front end control system and interface to work with models.