Models

Frontier-class AI, running on your desk.

Open-weight models matched to your Apple Silicon — owned outright, fully offline, and kept current as better models ship. No API keys, no usage meters, no one else reading the traffic.

Matched to your hardware

The right model for the machine you own.

Hardware	RAM	Model range	Capability	Typical installs
Mac mini M4 Pro	24 GB	Up to ~32B	Fast daily assistant, coding copilot, summarization	Qwen 3 32B, Phi-4 14B, Gemma 3 27B
Mac Studio M4 Max	48–128 GB	Up to ~70–80B	Frontier-class chat, reasoning, document analysis	Llama 3.3 70B, Qwen 2.5 72B, DeepSeek R1 70B
Mac Studio cluster	192 GB+	100B+ / multi-model	Run several models simultaneously or the largest MoE	Llama 4 Scout 109B, parallel agents

Curated lineup

Models we install and keep current.

Model	Best at	Size	Min hardware
Llama 3.3 70B	All-round workhorse — chat, reasoning, writing	~70B	48 GB+
Qwen 2.5 72B	Structured tasks, multilingual, instruction following	~72B	48 GB+
DeepSeek R1 70B	Math, logic, step-by-step reasoning	~70B	48 GB+
Llama 4 Scout 109B	Frontier reasoning (MoE — fast for its size)	~109B MoE	64 GB+
Qwen 3 32B	Best quality at 24 GB — daily driver	~32B	24 GB+
Qwen 2.5 Coder 32B	Code generation, refactoring, debugging	~32B	24 GB+
DeepSeek R1 32B	Reasoning specialist, lighter hardware	~32B	24 GB+
Gemma 3 27B	Balanced chat + vision, strong on-device	~27B	24 GB+
Phi-4 14B	Snappy assistant — great speed-to-quality ratio	~14B	16 GB+
Gemma 3 4B	Ultra-light, instant responses, 8 GB machines	~4B	8 GB+

Sizes are approximate (quantized weights). Strengths are qualitative — we test on your hardware before install. This lineup changes as better open models ship.

Philosophy

Curated for your work, kept current.

01.

Matched to your hardware

We benchmark every model on the actual Mac you own — not synthetic scores. You get the largest, highest-quality model that runs comfortably on your silicon.

02.

Swapped as better models ship

Open-weight AI moves fast. When a new release outperforms what you're running, we update your stack — same workflow, better results.

03.

Always offline

Every model runs entirely on your hardware. No API calls, no telemetry, no cloud fallback. Your prompts and your data never leave the box.

Not sure which model fits your hardware?

We assess your machine, match the right stack, and install it — ready the same day.

Book a consultation