Six months ago I was paying $198/month for Claude Pro. Last month my AI bill was $11.23 — and that was electricity.
The Mac Mini has been running continuously since November. I haven't thought about hitting a rate limit in four months. I haven't worried about sensitive client details going anywhere I can't control. And I haven't opened a subscription page wondering whether the price was going to change next quarter.
The setup
Mac Mini M4 Pro, 48GB RAM. Sits on a shelf in the office, runs continuously, connected to the router via ethernet. From any device on my network — laptop, iPad, phone — I open a browser tab to Open WebUI and have a full AI interface. Looks like ChatGPT. Works better for my use cases.
In Telegram, I have OpenClaw configured as my assistant. I send it a message from anywhere — from my phone when I'm away from the office, from my computer without switching windows — and it responds from the Mac Mini at home. The message goes to my hardware. Not to Anthropic, not to OpenAI, to my hardware.
The models currently loaded: Qwen 3 32B as my primary for general work, DeepSeek R1 32B for reasoning-heavy tasks, Qwen 2.5-Coder 32B when I'm in a coding session, and Llama 3.1 8B for quick queries where I just need a fast response. See the full model guide for what each does well.
What got better
No rate limits, for real. I used to hit Claude Pro's limits on heavy days. Not constantly, but enough that I had to think about whether I was "using up" my allocation on something worth it. That friction is gone. I run long tasks, let them run in the background, chain multiple queries together. The hardware doesn't care.
Privacy I actually believe. With cloud AI, "private" means "we promise not to use this for training." With local AI, private means the text physically never leaves the machine. I use AI for sensitive business correspondence, client analysis, financial planning. I don't have to evaluate whether Anthropic's privacy policy covers what I'm doing today. The question doesn't arise.
Speed on simple tasks. For quick lookups, short summaries, and back-and-forth conversation, local models at 22 tok/s are faster in practice than cloud models that route across the internet. Complex reasoning still takes longer than frontier models — but 80% of what I actually use AI for falls in the fast category.
The cost math after break-even. I paid for hardware and setup once. Since then, the only running cost of my AI usage is electricity. Cloud AI is $198/month indefinitely. Once the upfront cost is paid back, the advantage is significant and it only grows.
What's different (the honest account)
The initial setup wasn't plug-and-play. I didn't configure this myself — I used the Maai Machines setup service. That's not a complaint, it's a description of the current state of things. Local AI on Apple Silicon is accessible but not yet one-click. If you're comfortable with technical setup, guides exist; if not, you need help.
The models are 6–12 weeks behind the frontier. Qwen 3 32B is excellent — meaningfully better than what was frontier-level a year ago. But it's not GPT-5 or Claude Opus. For most of my work, the quality difference doesn't matter. For the occasional task where I need the best available reasoning, I still use a cloud model. The honest breakdown: 85–90% of my AI usage is better or equivalent locally; 10–15% still goes to cloud for tasks where frontier reasoning matters.
Requires a dedicated machine. The Mac Mini runs 24/7. That's the point — it's always available. But it means I have a Mac Mini running 24/7. It draws about 20–25W at idle, fits anywhere, and is quiet. Not nothing, but not much either.
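To put the idle draw in monthly terms, here is a minimal sketch of the arithmetic. The 20–25W range is the figure above; the 30-day month and the electricity rate are assumptions for illustration only, so substitute your own tariff. It covers idle draw only; time spent under load adds to it.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, machine on 24/7


def monthly_kwh(watts, hours=HOURS_PER_MONTH):
    """Energy used in a month at a constant power draw, in kWh."""
    return watts * hours / 1000


def monthly_cost(watts, usd_per_kwh):
    """Monthly electricity cost in USD at a constant power draw."""
    return monthly_kwh(watts) * usd_per_kwh


if __name__ == "__main__":
    ASSUMED_RATE = 0.17  # hypothetical USD per kWh, not from the post
    for watts in (20, 25):
        kwh = monthly_kwh(watts)
        cost = monthly_cost(watts, ASSUMED_RATE)
        print(f"{watts} W idle: {kwh:.1f} kWh/month, about ${cost:.2f}")
```

At idle that works out to roughly 14–18 kWh a month.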
The math for someone considering it
If you're paying $100/month or less for AI and using it casually, local AI probably doesn't pay for itself in 24 months. Wait for hardware to get cheaper or your usage to increase.
If you're at $150–$200/month, you're looking at roughly a 20–27 month break-even on a setup that costs what mine did. The economics are reasonable; the privacy and rate-limit arguments close the case.
If you're at $200+/month, the math is clear. My total setup cost: Mac Mini M4 Pro 48GB at $1,999 plus the Maai setup at $1,999, so $3,998 in total. One time. At $198/month, I break even in about 20 months, or a little over 21 once electricity is counted. After that, every month of cloud AI I would have paid for is money I'm not spending. Two years in, I'm roughly $750 ahead before electricity (about $480 after), and still counting.
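If you want to run the numbers for your own bill, here is a minimal sketch of the break-even arithmetic. The hardware, setup, and cloud prices are the ones quoted above. Treating last month's $11.23 electricity bill as typical for every month is my assumption, and the function names are just for illustration.

```python
HARDWARE_USD = 1999.00    # Mac Mini M4 Pro 48GB, price from the post
SETUP_USD = 1999.00       # setup service, price from the post
ELECTRICITY_USD = 11.23   # assumption: last month's bill is typical
UPFRONT_USD = HARDWARE_USD + SETUP_USD


def break_even_months(cloud_monthly, upfront=UPFRONT_USD, running=ELECTRICITY_USD):
    """Months until cloud spend avoided covers the upfront cost.

    Returns None if the monthly saving is zero or negative,
    because the setup then never pays for itself.
    """
    net_saving = cloud_monthly - running
    if net_saving <= 0:
        return None
    return upfront / net_saving


def net_position(cloud_monthly, months, upfront=UPFRONT_USD, running=ELECTRICITY_USD):
    """USD ahead (positive) or behind (negative) after a number of months."""
    return months * (cloud_monthly - running) - upfront


if __name__ == "__main__":
    for bill in (100, 150, 198, 250):
        months = break_even_months(bill)
        after_two_years = net_position(bill, 24)
        print(f"${bill}/month: break-even in {months:.1f} months, "
              f"${after_two_years:,.2f} after 24 months")
```

Pass `running=0` to see the figures before electricity. A negative result from `net_position` means the setup has not paid for itself yet at that point.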
See the full cost breakdown at /pricing, which shows the 24-month math across different hardware tiers and different API services you might be replacing.
What I'd do differently
Nothing about the hardware choice. The 48GB M4 Pro is right if you want to run 32B models comfortably with room to spare.
I'd set up Tailscale earlier. The ability to reach the Mac Mini from anywhere — from my phone while traveling, with traffic encrypted end-to-end — is genuinely valuable. I waited two months to configure it. Shouldn't have.
I'd also have a clearer list upfront of which use cases I was replacing. I went in planning to replace everything and ended up replacing most things. The 10–15% that still goes to cloud isn't a failure — it's an accurate description of where frontier models still have an edge. Know what those are for your workflow before you start.