DeepSeek R1 is a reasoning model — a category that changed what people expected from local AI when it emerged in early 2025. It shows its thinking. It works through problems step by step before arriving at an answer. And it runs locally on Apple Silicon.
For anyone doing serious analytical work without wanting to send that work to a cloud service, this is worth understanding in detail.
What DeepSeek R1 actually is
Most language models generate text token by token, predicting what comes next based on patterns in training data. DeepSeek R1 was trained with reinforcement learning specifically to reason. When you give it a problem, it generates a "thinking" section — a chain of reasoning that works through the problem before producing its answer. You can see this reasoning. You can evaluate where it went wrong. You can follow the logic.
This approach was pioneered by OpenAI's o1 model. DeepSeek R1 matches or exceeds o1 on many of those benchmarks, is fully open source, and runs on hardware you own.
DeepSeek R1 matches or exceeds o1 on many reasoning benchmarks. It's open source, runs locally, and costs nothing per query after the hardware investment.
The two sizes
DeepSeek R1 comes in multiple distilled sizes. The two we install and recommend:
DeepSeek R1 32B — runs on 24GB RAM (Mac Mini M4 Pro). The sweet spot: large enough to handle genuinely complex reasoning, small enough to run on accessible hardware at reasonable speeds. We see 15–25 tokens per second on M4 Pro with the MLX backend.
DeepSeek R1 70B — runs on 48GB RAM (Mac Studio M4 Max). DeepSeek's full reasoning capability. Significantly more capable on complex multi-step problems. Slower — typically 8–12 tokens per second — but for work where quality matters more than speed, it's remarkable. A frontier-level reasoning model running on hardware you own.
Performance on Apple Silicon: the MLX advantage
MLX is Apple's machine learning framework, optimized specifically for Apple Silicon's unified memory architecture. Unlike discrete GPU inference on other platforms, MLX uses the full memory bandwidth of the M-series chip — the primary bottleneck for inference at these model sizes.
The practical result: models run significantly faster on Apple Silicon than equivalent hardware from other vendors. A Mac Mini M4 Pro at $1,399 runs DeepSeek R1 32B faster than a comparable NVIDIA-based machine at the same price point. This is a property of the unified memory architecture, not marketing.
What R1 is excellent at
The training methodology produces specific strengths. R1 is notably good at:
- Mathematical reasoning — multi-step problems, proof construction, quantitative analysis
- Logical analysis — if/then reasoning, identifying inconsistencies, formal argument evaluation
- Structured tasks — anything that benefits from working through steps before committing to an answer
- Document analysis — particularly when connecting information across a long document
- Code review — identifying bugs through logical reasoning about program state
What R1 is not ideal for
The reasoning capability adds latency. For a complex question, R1 may generate hundreds of tokens of thinking before producing its answer. This is not a bug — it's the mechanism that produces better answers. But it makes R1 slower and more verbose than other models for casual conversation or quick responses.
For everyday chat, recipe suggestions, or quick lookups, a faster model like Llama 3.2 8B or Qwen 3 8B is the better choice. We configure multiple models in Open WebUI and help you understand which to reach for.
How to use it: Open WebUI interface
On every setup we build, DeepSeek R1 is accessible through Open WebUI — a browser-based interface that works like ChatGPT. You open a tab on any device on your network, select DeepSeek R1 from the model dropdown, and start a conversation.
The reasoning output appears as a collapsible "thinking" section before the answer. You can expand it to see R1's chain of reasoning — genuinely useful when you want to understand not just what the model concluded but why.
R1 responds especially well to problems explicitly framed as reasoning tasks: "Walk through the logic here," "Analyze this step by step," "Work through this calculation and show your reasoning." You don't need to ask it to show its work — it does by default — but framing your prompt as a reasoning task tends to engage the full capability.
The bigger picture
DeepSeek R1 represents something meaningful: a frontier-class reasoning model that runs on hardware that fits in your bag. Six months ago, this level of reasoning capability existed only on cloud servers behind API pricing.
For anyone doing analytical work with data they can't send to a cloud service — lawyers analyzing case files, financial analysts reviewing proprietary models, healthcare professionals with patient data, developers working on proprietary codebases — this changes the calculation.