DeepSeek R1 has taken the AI world by storm, offering advanced reasoning capabilities that rival closed proprietary models. Running it locally keeps your prompts and data on your own machine. In this guide, we will set up a distilled version of the reasoning model locally using Ollama.
Selecting the Right Parameter Size for Your RAM
DeepSeek R1 has been distilled into several smaller model sizes. If a model does not fit in your GPU memory, the runtime offloads layers to the CPU and system RAM; if it does not fit in RAM either, the system starts swapping to disk and output slows to a crawl. Use this table to select your model:
| System RAM / VRAM | Recommended Model | File Size | Speed (approx.) |
|---|---|---|---|
| 8 GB RAM | DeepSeek-R1-Distill-Qwen-1.5B | ~1.1 GB | ~45 tokens/sec |
| 16 GB RAM | DeepSeek-R1-0528-Qwen3-8B | ~5.2 GB | ~30 tokens/sec |
| 32 GB RAM | DeepSeek-R1-Distill-Qwen-14B | ~9.0 GB | ~22 tokens/sec |
| 64 GB+ RAM | DeepSeek-R1-Distill-Qwen-32B / DeepSeek-R1-Distill-Llama-70B | ~20 GB / ~43 GB | ~15 tokens/sec |
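The table can also be expressed as a small lookup, which is handy if you script your setup. This is only a sketch: the thresholds are copied from the table, and the tags other than `deepseek-r1:8b` are assumed to follow the same `deepseek-r1:<size>` naming pattern in the Ollama library. The `pick_model` helper is a hypothetical name, not part of Ollama.

```python
from typing import Optional

# (minimum RAM in GB, Ollama tag), largest first. Thresholds mirror the table;
# tags other than deepseek-r1:8b are assumed to follow the same naming pattern.
MODEL_BY_MIN_RAM_GB = [
    (64, "deepseek-r1:32b"),
    (32, "deepseek-r1:14b"),
    (16, "deepseek-r1:8b"),
    (8, "deepseek-r1:1.5b"),
]


def pick_model(ram_gb: float) -> Optional[str]:
    """Return the largest tag whose RAM threshold is met, or None below 8 GB."""
    for min_ram, tag in MODEL_BY_MIN_RAM_GB:
        if ram_gb >= min_ram:
            return tag
    return None


if __name__ == "__main__":
    for ram in (8, 16, 24, 64):
        print(f"{ram} GB -> {pick_model(ram)}")
```

Machines that fall between two rows (24 GB, for example) get the smaller model, which is the conservative choice when the operating system and other applications also need memory.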
Installation Steps (Mac & Windows)
Step 1: Install Ollama
Go to the official Ollama website and download the package for your operating system:
- Mac: Download the zip file, drag the Ollama application into your Applications folder, and launch it.
- Windows: Run the installer and let the background service initialize.
Step 2: Download and Run the Model
Open your terminal (on Mac) or Command Prompt (on Windows) and run the command for the model size you selected. For a typical laptop with 16 GB of RAM, we recommend the 8B model:
```bash
# Pull and run the Qwen 8B distilled model
ollama run deepseek-r1:8b
```
Ollama will download the model weights (this may take a few minutes depending on your internet speed). Once the download completes, you will see an interactive >>> prompt where you can type your questions. Type /bye to exit.
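Besides the interactive prompt, Ollama also serves a local REST API, by default on port 11434, so you can call the model from a script. The sketch below uses only the Python standard library and the `/api/generate` endpoint with streaming turned off. The helper names `build_request` and `ask` are illustrative, and the code assumes the Ollama service is running with `deepseek-r1:8b` already pulled.

```python
import json
import urllib.request

# Default address of the local Ollama server (assumes you have not changed it).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "deepseek-r1:8b") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = "deepseek-r1:8b", timeout: float = 300.0) -> str:
    """Send the prompt and return the model's complete text reply."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With Ollama running, `print(ask("Why is the sky blue? Answer in two sentences."))` should print the reply. The generous timeout is deliberate, since a reasoning model can take a while before it finishes on slower hardware.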
Setting Up Open WebUI (Local Chat Interface)
To escape the terminal interface and enjoy a sleek, ChatGPT-like browser interface, you can run Open WebUI via Docker.
Make sure Docker is installed, then run:
```bash
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```
Now, open your browser and navigate to http://localhost:3000. Register a local account, select deepseek-r1:8b from the top dropdown, and start chatting.
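If the page does not load, or the model dropdown is empty, it helps to check that both services are actually accepting connections. Here is a minimal sketch, assuming Ollama is on its default port 11434 and Open WebUI on host port 3000 as mapped in the docker command. The `is_listening` helper is a hypothetical name and only tests whether a TCP connection succeeds, not whether the service is healthy.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 11434: Ollama's default port; 3000: host port from the docker command.
    for name, port in (("Ollama", 11434), ("Open WebUI", 3000)):
        status = "up" if is_listening("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {status}")
```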
Prompting DeepSeek R1: The Thinking Output
DeepSeek R1 is a reasoning model that outputs its chain of thought. You will see its intermediate logical steps enclosed inside <think> and </think> tags.
Avoid prompts that force the model straight into a rigid output format. Let it work through its reasoning first; that step is where it resolves the logical dependencies of the problem, and it generally leads to more accurate code and better-grounded advice.
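When you consume the output in a script, you will often want the final answer without the reasoning. The sketch below splits a raw reply into the two parts, assuming the text contains a single <think>...</think> block as described above. The `split_reasoning` helper is an illustrative name, and some front ends may already hide or separate the reasoning for you.

```python
import re
from typing import Tuple

# Non-greedy match across newlines so the whole reasoning block is captured.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no think block exists."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first think block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    sample = "<think>\n2 + 2 is 4.\n</think>\n\nThe answer is 4."
    reasoning, answer = split_reasoning(sample)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```

Keeping the reasoning as a separate value, rather than discarding it, makes it easy to log it for debugging while showing users only the answer.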