Yes, we're a little loco · Open Source · MIT License

LocoLLM

Frontier AI on a budget. Crazy, right? We're building a routed swarm of tiny specialist models and testing whether they can outperform generalists on real tasks. No cloud. No API keys. Just your hardware doing more than you'd expect.

Get Started View on GitHub
terminal
$ loco setup
✓ Pulled qwen3:4b via Ollama (2.5 GB)
$ loco query "solve 2x + 5 = 13"
[router → math]
x = 4
$ loco eval math # how much does the adapter help?
How It Works

One base model. Many specialists.
Sounds loco. Works great.

The idea is simple: instead of one giant model that's okay at everything, route each query to a lightweight specialist fine-tuned for that task. Research suggests this approach has real potential. We're building the tools to find out.

1. Single Base Model

Qwen3-4B quantized to Q4_K_M. Fits in 2.5 GB of VRAM. Tested on GPUs from a GTX 1050 Ti to an RTX 2060 Super. Runs on CPU too, just slower.

2. LoRA Adapters

Tiny specialist layers fine-tuned for specific domains, merged into standalone GGUFs. We're starting with math, code, and analysis.

3. Smart Router

Classifies your query and picks the best adapter automatically. No manual switching. Just ask your question. Keyword-based now, ML classifier next.
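The current keyword router fits in a few lines of Python. A minimal sketch, assuming a simple bag-of-words match; the keyword lists, adapter names, and fallback below are illustrative, not the project's actual routing table:

```python
# Illustrative routing table: adapter name -> trigger keywords.
# The real table lives in the LocoLLM repo and will differ.
ROUTES = {
    "math": ["solve", "equation", "integral", "derivative", "calculate"],
    "code": ["function", "bug", "python", "compile", "refactor"],
    "analysis": ["summarise", "compare", "analyse", "evaluate"],
}
DEFAULT_ADAPTER = "base"  # fall back to the unmodified base model

def route(query: str) -> str:
    """Pick the adapter whose keywords best match the query."""
    words = query.lower().split()
    # Count keyword hits per adapter.
    scores = {
        adapter: sum(word in keywords for word in words)
        for adapter, keywords in ROUTES.items()
    }
    best = max(scores, key=scores.get)
    # No keyword matched anything: don't guess, use the base model.
    return best if scores[best] > 0 else DEFAULT_ADAPTER

print(route("solve 2x + 5 = 13"))  # math
```

Swapping this out for an ML classifier means replacing only the body of `route`; the rest of the pipeline never sees the difference.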

We're all a little loco here.

The sensible approach is to pay for API access and let someone else handle it. We'd rather find out what's possible on a GPU we bought secondhand for the price of a nice dinner. Maybe that's loco. We think it's worth finding out.

Skills Over Gear

You don't need a $10k GPU rig. A secondhand graphics card and good training data will get you most of the way there. That's not a limitation. That's the point.

🎓 Built Together

Every adapter, benchmark, and routing improvement makes the whole system smarter. Contribute a specialist and everyone benefits. That's the theory. We're building the evidence.

🔒 Runs Offline

No API keys. No cloud bills. No data leaving your machine. Everything runs locally through Ollama. Your queries, your hardware, your business.

📈 AI Last Resort

LocoLLM is a thinking partner, not an answer machine. Do the work first, then use AI to check, challenge, and sharpen. That's not old-fashioned. That's how you actually learn.

Who It's For

Are you loco enough?

If any of these sound like you, welcome to the club.

💰 The Budget Rebel

You refuse to pay per-token for something your own hardware can do. You've done the math on API costs and it offends you. Good. Channel that energy.

🤖 The Tinkerer

You want to understand how LLMs actually work by cracking them open and rewiring the internals. Fine-tuning a real adapter teaches more than any tutorial ever will.

🔬 The Researcher

You need reproducible local inference for experiments. You want to test whether a team of specialists really can beat a generalist. That's an open question. Help us answer it.

🏫 The Educator

You teach AI or computing and want a real project your classes can contribute to. Not a toy demo. Real infrastructure that grows with every cohort.

🔐 The Vault

Your data doesn't leave your machine. Period. Medical notes, legal research, personal journals, proprietary code. Local means local.

The Scrapper

You know the best gear doesn't make the best work. A $150 secondhand GPU and sharp training data might just surprise you. That's what we're betting on.

Architecture

Deceptively simple.

A query comes in, the router picks an adapter, Ollama loads the matching model, and the response goes out. No orchestration frameworks. No agent graphs. No PhD required. Whether this simplicity is a strength or a limitation is what we're here to find out.

Your Query → Router (classifier) → math | code | analysis | yours? → LoRA Adapters (merged GGUF, 3 adapters now, more to come) → Qwen3-4B · Q4_K_M · 2.5 GB

Each adapter is a merged GGUF · Ollama handles model switching
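In code, that whole flow is one lookup plus one HTTP call. A sketch assuming Ollama's standard local endpoint (`POST /api/generate` on port 11434); the adapter-to-model mapping and the `loco-*` model tags are hypothetical, not the project's real names:

```python
import json
import urllib.request

# Hypothetical registry: router label -> merged-GGUF model tag in Ollama.
ADAPTER_MODELS = {
    "math": "loco-math:4b",
    "code": "loco-code:4b",
    "analysis": "loco-analysis:4b",
    "base": "qwen3:4b",
}

def pick_model(label: str) -> str:
    """Map a router label to an Ollama model tag, falling back to the base model."""
    return ADAPTER_MODELS.get(label, ADAPTER_MODELS["base"])

def generate(label: str, prompt: str, host: str = "http://localhost:11434") -> str:
    """Send the routed query to Ollama's local /api/generate endpoint."""
    payload = json.dumps({
        "model": pick_model(label),
        "prompt": prompt,
        "stream": False,  # one JSON response instead of a token stream
    }).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Ollama keeps only the requested model loaded, which is what makes "one base model, many specialists" fit on a 2.5 GB card: switching adapters is a model swap, not extra VRAM.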

Early days. Eyes wide open.

LocoLLM is a young project. We're not pretending otherwise. Here's what exists today and where we're headed.

Proof of Concept · Now

Three adapters (math, code, analysis) on Qwen3-4B, CLI with query and chat, keyword router, evaluation harness. It works. We can measure the difference.

MVP: Student Contributions · Building

First student cohort adds new adapters, improves the router (keyword → ML classifier), and publishes benchmark results. The first real test of whether routing beats a generalist.

Validation · Next

Rigorous benchmarks comparing specialist routing vs. base model across domains. Honest results, published openly, whatever they show.

The Vision

A growing ecosystem of community-trained specialist adapters, smart routing, and one-command setup. Competitive AI that runs on hardware you already own.
Get Involved

Join the loco ones.

LocoLLM is a collaborative project. Every adapter, benchmark, and improvement makes the whole system better for everyone. The barrier to entry is low. The ceiling is high.

🧪 Train an Adapter

Pick a domain you care about — math, code, business writing, security, legal. Curate a dataset, fine-tune, and contribute it back. Your specialisation becomes everyone's tool.

📊 Benchmark & Evaluate

Run standardised evaluations. Compare adapters against base models. Publish reproducible results. Help us prove what works.
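The core of such an evaluation is small: run the same labelled prompts through two model callables and report accuracy for each. A bare-bones sketch with toy stand-in models; the exact-match scoring and the example case are simplifications for illustration, and real harnesses need more careful grading:

```python
def accuracy(ask, cases):
    """Fraction of (prompt, expected) pairs where the answer contains the expected string."""
    hits = sum(expected in ask(prompt) for prompt, expected in cases)
    return hits / len(cases)

# Toy stand-ins for the base model and the math adapter; in practice
# `ask` would wrap a call to Ollama.
base = lambda q: "x = 3"
math_adapter = lambda q: "x = 4"

cases = [("solve 2x + 5 = 13", "x = 4")]
print(f"base: {accuracy(base, cases):.0%}  adapter: {accuracy(math_adapter, cases):.0%}")
# prints: base: 0%  adapter: 100%
```

Because both models go through the same `accuracy` function on the same cases, results are directly comparable and trivially reproducible.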

💻 Improve the Router

The router is what makes the system feel smart. Better classification means better adapter selection. Bring your NLP skills.

👥 User Experience Study

How do students actually use local AI? Design a study, observe real usage, and turn findings into actionable improvements. Research methods meet real users.

💰 Cost-Benefit Analysis

Is local AI actually cheaper than cloud? Build a cost model, run the numbers, and find out. Financial modelling meets open-source AI.

📝 Write Documentation

Guides, tutorials, video walkthroughs, onboarding redesign. Good docs lower the barrier for the next contributor.

If a project interests you but you're not sure you have the skills, that's probably the right project. The one that stretches you is the one you'll learn the most from.

Part of LocoLab

Six projects. Three layers. One lab.

⛩️ LocoPuente
Contact

Say hello.

LocoLLM is a School of Management and Marketing initiative at Curtin University. Whether you're a student looking for a capstone project, a researcher interested in collaboration, or just curious — we'd love to hear from you.

Project Lead: Michael Borck

Get in Touch View on GitHub