# CARDS-Qwen3.5-27B

Fine-tuned Qwen3.5-27B for classification of climate-contrarian claims using the CARDS taxonomy from Coan et al. (2025).
This is a merged checkpoint: a LoRA adapter (rank 16) trained on the CARDS SFT dataset has been merged back into the base weights for direct loading with transformers, vLLM, or any standard inference engine.
## Results
Evaluated on the held-out CARDS test set (1,436 samples, Level 1, min_support ≥ 3):
| Metric | Qwen3.5-27B | Qwen3.5-27B FT | Qwen3.6-27B FT | Claude Opus 4.6 | Claude Opus 4.7 |
|---|---|---|---|---|---|
| Samples F1 | 0.844 | 0.884 | 0.893 | 0.893 | 0.882 |
| Macro F1 | 0.710 | 0.766 | 0.748 | 0.751 | 0.771 |
| Micro F1 | 0.854 | 0.877 | 0.885 | 0.881 | 0.874 |
| Precision | 0.870 | 0.879 | 0.893 | 0.863 | 0.868 |
| Recall | 0.838 | 0.874 | 0.876 | 0.900 | 0.880 |
| Parse failures | 86 / 1436 | 0 / 1436 | 2 / 1436 | 0 / 1436 | 0 / 1436 |
- Improves on the base Qwen3.5-27B across every metric; parse failures drop from 6% to 0.
- Best Macro F1 at L1 among the five compared models — strongest rare-label coverage in this group.
- Competitive with Claude Opus 4.6/4.7 on Micro F1 and Samples F1; trails on Recall.
## Usage

### With vLLM

```bash
vllm serve C3DS/CARDS-Qwen3.5-27B \
  --port 8000 \
  --max-model-len 4096 \
  --dtype bfloat16 \
  --enable-prefix-caching \
  --served-model-name CARDS-Qwen3.5-27B
```
Then query with any OpenAI-compatible client. The system prompt (`slim_system_instruction`) and the user-message suffix (`cot_trigger`) the model was trained with are bundled in this repo as `cards_prompts.json` — self-contained, with the CARDS taxonomy already inlined.
```python
import json

from huggingface_hub import hf_hub_download
from openai import OpenAI

prompts = json.load(open(hf_hub_download("C3DS/CARDS-Qwen3.5-27B", "cards_prompts.json")))
slim_system_instruction = prompts["slim_system_instruction"]
cot_trigger = prompts["cot_trigger"]

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

def classify(text):
    resp = client.chat.completions.create(
        model="CARDS-Qwen3.5-27B",
        messages=[
            {"role": "system", "content": slim_system_instruction},
            {"role": "user", "content": f"### Text:\n{text}\n\n{cot_trigger}"},
        ],
        temperature=0,
        max_tokens=4000,
    )
    return resp.choices[0].message.content

print(classify("These are only a few renewable energy technologies at work"))
```
The model produces a reasoning trace inside `<think>…</think>` followed by a YAML `categories:` block listing the predicted CARDS codes. To parse, take the content after `</think>` and read the `categories:` list.
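A minimal parsing sketch (the helper name and the sample output below are illustrative; real model outputs follow the same `<think>…</think>` + `categories:` shape):

```python
import re

def parse_categories(raw: str) -> list[str]:
    """Strip the <think>…</think> trace, then read the YAML categories: list."""
    # Keep only the text after the closing think tag, if present.
    answer = raw.split("</think>", 1)[-1]
    # Grab the "- code" items under the categories: key.
    match = re.search(r"categories:\s*((?:\n\s*-\s*\S+)+)", answer)
    if not match:
        return []
    return [line.strip().lstrip("-").strip() for line in match.group(1).strip().splitlines()]

sample = "<think>The text promotes renewables...</think>\ncategories:\n  - 4_1\n  - 4_4\n"
print(parse_categories(sample))  # → ['4_1', '4_4']
```

For production use, a full YAML parser (e.g. PyYAML) on the post-`</think>` text is more robust than a regex, at the cost of an extra dependency.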
See the project repository for training scripts, evaluation code, and dataset preparation.
## Training

- Base model: Qwen/Qwen3.5-27B
- Method: LoRA (rank 16, α 16, dropout 0) on `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`, then merged into the base weights
- Dataset: C3DS/cards_sft_dataset
- Framework: Unsloth + TRL `SFTTrainer`
- Hyperparameters: 3 epochs, `per_device_train_batch_size=1`, `gradient_accumulation_steps=8`, `lr=2e-4`, cosine schedule, 10 warmup steps, `max_seq_length=4096`, `adamw_8bit`, bf16
- Checkpoint selection: best checkpoint via `load_best_model_at_end=True`
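Collected as one mapping, the adapter settings above look as follows (a sketch mirroring the bullet list; the key names follow common PEFT-style conventions and are illustrative, not the exact training script):

```python
# LoRA adapter configuration as stated in the Training section.
# With dropout 0 and alpha equal to rank, the adapter update is applied
# at full scale (alpha / r = 1.0) with no extra regularization.
lora_config = {
    "r": 16,              # adapter rank
    "lora_alpha": 16,     # scaling factor; effective scale = alpha / r = 1.0
    "lora_dropout": 0.0,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
    ],
}
```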
## Limitations

- English only.
- Thinking tokens. Training used `enable_thinking=True`. Either parse output after `</think>`, or disable thinking at inference via `chat_template_kwargs={"enable_thinking": false}`. Reserve token budget for the reasoning trace before the final YAML block.
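When serving with vLLM, `chat_template_kwargs` can be passed per request through the OpenAI client's `extra_body`. A sketch of the request parameters (assumes the `client` and server from the Usage section; the smaller `max_tokens` reflects that no `<think>` trace is emitted):

```python
# Request parameters that disable the reasoning trace for one call.
# vLLM's OpenAI-compatible server forwards chat_template_kwargs (sent via
# extra_body) into the chat template, switching off thinking for this request.
request_kwargs = {
    "model": "CARDS-Qwen3.5-27B",
    "temperature": 0,
    "max_tokens": 500,  # no reasoning trace, so a small budget suffices
    "extra_body": {"chat_template_kwargs": {"enable_thinking": False}},
}

# Usage: client.chat.completions.create(messages=..., **request_kwargs)
```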
## Citation

```bibtex
@article{cards2025,
  title={Large language model reveals an increase in climate contrarian speech in the United States Congress},
  author={Travis G. Coan and Ranadheer Malla and Mirjam O. Nanko and William Kattrup and J. Timmons Roberts and John Cook and Constantine Boussalis},
  journal={Communications Sustainability},
  year={2025}
}
```
## License

Apache 2.0, inherited from Qwen3.5-27B.