
Training New Adapters

A generic workflow for creating LocoLLM adapters. The math, code, and analysis adapters all follow this pattern.

Every adapter goes through five stages:

  1. Prepare data — download a dataset and format it for the Qwen3 chat template
  2. Train — QLoRA adapter training on Qwen3-4B and export as merged GGUF
  3. Register — add the adapter to adapters/registry.yaml
  4. Deploy — load the merged GGUF into Ollama via loco setup
  5. Evaluate — run loco eval <adapter> to compare against the base model

Each adapter needs a training_data.jsonl in Qwen3 chat format:

{"conversations": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}

Create a script at scripts/prepare_<name>_data.py following the existing patterns:

  • Math (scripts/prepare_gsm8k.py): Downloads from GSM8K, formats step-by-step reasoning
  • Code (scripts/prepare_code_data.py): Downloads Python instruction→code pairs, filters short outputs
  • Analysis (scripts/prepare_analysis_data.py): Downloads science passages, builds passage+question→answer format

Data prep scripts use the Hugging Face datasets API and output to adapters/<name>/training_data.jsonl.

  • Filter out very short or empty outputs
  • 200-300 examples are enough for a PoC adapter
  • The assistant response format matters: consistent formatting helps the model learn the pattern
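The guidelines above can be sketched as a minimal prepare script. This is a sketch, not the repo's actual code: the helper names (`format_example`, `keep`), the filter threshold, and the inline sample records are illustrative, and a real script would load records via the Hugging Face datasets API as described above.

```python
import json
from pathlib import Path

def format_example(instruction: str, output: str) -> dict:
    """Wrap an instruction/output pair in a Qwen3 chat-format record."""
    return {
        "conversations": [
            {"role": "user", "content": instruction},
            {"role": "assistant", "content": output},
        ]
    }

def keep(output: str, min_chars: int = 20) -> bool:
    """Filter out very short or empty assistant outputs."""
    return len(output.strip()) >= min_chars

# Illustrative records; a real script would pull these from a dataset.
raw = [
    {"instruction": "Write a function that doubles x.",
     "output": "def double(x):\n    return 2 * x"},
    {"instruction": "Say hi.", "output": "hi"},  # too short, dropped
]

records = [format_example(r["instruction"], r["output"])
           for r in raw if keep(r["output"])]

out_path = Path("adapters/code/training_data.jsonl")
out_path.parent.mkdir(parents=True, exist_ok=True)
with out_path.open("w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
```

Writing one JSON object per line (rather than a single JSON array) is what makes the output valid JSONL for the training script.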

Use the generic training script:

source .venv-train/bin/activate
python scripts/train_adapter.py --adapter-name <name>

This handles LoRA setup, adapter training, and GGUF export. Override defaults if needed:

python scripts/train_adapter.py --adapter-name code --epochs 5 --lr 1e-4

The adapter-specific train_math_adapter.py still works for math — the generic script is equivalent with --adapter-name math.

See Training the Math Adapter for a detailed walkthrough of the training process.

Parameter        Value
LoRA rank        16
LoRA alpha       32
Learning rate    2e-4
Epochs           3
Batch size       2 (gradient accum 4, effective 8)
Max seq length   1024
Quantization     Q4_K_M
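The effective batch size in the table comes from gradient accumulation: gradients from several small forward passes are summed before each optimizer step.

```python
batch_size = 2        # sequences per forward pass
grad_accum_steps = 4  # forward passes accumulated per optimizer step

# Each weight update sees batch_size * grad_accum_steps sequences.
effective_batch = batch_size * grad_accum_steps
```

This is why `--batch-size 2` with accumulation 4 behaves like a batch of 8, while fitting in far less VRAM.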

Add the adapter to adapters/registry.yaml:

adapters:
  your_adapter:
    version: "0.1.0"
    type: "merged-gguf"
    gguf_path: "your_adapter/gguf/unsloth.Q4_K_M.gguf"
    description: "Short description"
    authors: ["Your Name"]
    tags: ["relevant", "tags"]
    eval_dataset: "eval_dataset.jsonl"
    eval_type: "numeric" # or "code" or "analysis"
    router_keywords: ["keyword1", "keyword2"]
    training:
      base_model: "unsloth/Qwen3-4B-unsloth-bnb-4bit"
      method: "qlora"
      dataset: "Dataset name (N examples)"
      lora_r: 16
      lora_alpha: 32
      epochs: 3
      quantization: "q4_k_m"
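A quick sanity check before committing a registry entry can catch missing fields. The function name `validate_entry` and the required-field list below are assumptions inferred from the example entry above, not an official schema; the entry would normally come from parsing registry.yaml rather than a dict literal.

```python
# Fields inferred from the example registry entry; adjust if the schema differs.
REQUIRED = {"version", "type", "gguf_path", "description",
            "eval_dataset", "eval_type", "router_keywords"}
VALID_EVAL_TYPES = {"numeric", "code", "analysis"}

def validate_entry(name: str, entry: dict) -> list[str]:
    """Return a list of problems; empty means the entry looks complete."""
    problems = []
    missing = REQUIRED - entry.keys()
    if missing:
        problems.append(f"{name}: missing fields {sorted(missing)}")
    if entry.get("eval_type") not in VALID_EVAL_TYPES:
        problems.append(f"{name}: eval_type must be one of {sorted(VALID_EVAL_TYPES)}")
    return problems

entry = {
    "version": "0.1.0",
    "type": "merged-gguf",
    "gguf_path": "your_adapter/gguf/unsloth.Q4_K_M.gguf",
    "description": "Short description",
    "eval_dataset": "eval_dataset.jsonl",
    "eval_type": "numeric",
    "router_keywords": ["keyword1", "keyword2"],
}
print(validate_entry("your_adapter", entry))  # [] when the entry is complete
```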

Key fields:

  • eval_type: Controls how loco eval scores responses — "numeric" (exact number match), "code" (syntax + keywords), or "analysis" (substring match)
  • router_keywords: Words that trigger this adapter when using automatic routing
Deploy with:
loco setup

This creates the Ollama model from the merged GGUF.
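Under the hood, registering a merged GGUF with Ollama amounts to a Modelfile that points at the file. A hand-rolled equivalent (the path and model name here are illustrative; loco setup automates this) would look like:

# Modelfile — illustrative
FROM ./adapters/your_adapter/gguf/unsloth.Q4_K_M.gguf

followed by ollama create <model_name> -f Modelfile.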

Create adapters/<name>/eval_dataset.jsonl with 20 hand-crafted problems. Format depends on eval_type:

Numeric (math):

{"question": "What is 2+2?", "answer": 4}

Code:

{"question": "Write a function...", "answer_keywords": ["def ", "return"], "eval_type": "code"}

Analysis:

{"question": "Read the passage... What is X?", "answer": "expected answer", "eval_type": "analysis"}
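The three eval types map to simple scoring rules: exact number match, syntax plus keyword checks, and substring match. A sketch of what that scoring could look like — the function names and exact matching rules here are assumptions for illustration, and loco eval's real implementation may differ:

```python
import ast
import re

def score_numeric(response: str, answer: float) -> bool:
    """Exact number match: compare the last number found in the response."""
    nums = re.findall(r"-?\d+(?:\.\d+)?", response)
    return bool(nums) and float(nums[-1]) == float(answer)

def score_code(response: str, keywords: list[str]) -> bool:
    """Syntax + keywords: response must parse as Python and contain every keyword."""
    try:
        ast.parse(response)
    except SyntaxError:
        return False
    return all(kw in response for kw in keywords)

def score_analysis(response: str, answer: str) -> bool:
    """Substring match, case-insensitive."""
    return answer.lower() in response.lower()
```

Syntax-only checking is the weakness called out under improvements below: code that parses and mentions the right keywords can still be wrong, which is why execution-based testing is the natural upgrade.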

Run the benchmark:

loco eval <adapter_name>

Adapter    Dataset                                              Data prep                           Eval type
math       GSM8K (200 examples)                                 scripts/prepare_gsm8k.py            numeric
code       python_code_instructions_18k_alpaca (300 examples)   scripts/prepare_code_data.py        code
analysis   allenai/sciq (300 examples)                          scripts/prepare_analysis_data.py    analysis

The current adapters are deliberately basic (PoC quality). Here are concrete ways to improve them:

  • Better data prep: Clean up noisy examples, add more training data, improve prompt templates
  • Analysis adapter: The templated reasoning is crude — replace with actual model-generated explanations
  • Code eval: Add execution-based testing (run the code, check outputs) instead of syntax-only checks
  • Hyperparameter tuning: Experiment with LoRA rank, learning rate, number of epochs
  • Router upgrade: Replace keyword matching with an ML classifier (see ADR-0003)
  • New adapters: Creative writing, translation, summarization — follow the same workflow
  • Prompting strategies: Add RE2 prompting or chain-of-thought to the eval harness