Running DeepSeek R1 Locally: The Complete Step-by-Step Guide

DeepSeek R1 has drawn wide attention for reasoning capabilities that rival closed, proprietary models. Running it locally keeps your prompts and outputs on your own machine. In this guide, we will set up a distilled version of the reasoning model locally using Ollama.

Selecting the Right Parameter Size for Your RAM

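DeepSeek R1 has been distilled into several smaller model sizes. If a model does not fit in your GPU memory (VRAM), the runtime offloads layers to the CPU and system RAM, which slows generation; if it does not fit in system RAM either, the system starts swapping to disk and output slows to a crawl. Use this table to select your model (the speeds are rough figures and depend heavily on your hardware):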

System RAM / VRAM	Recommended Model	File Size	Speed (approx.)
8 GB RAM	DeepSeek-R1-Distill-Qwen-1.5B	~1.1 GB	~45 tokens/sec
16 GB RAM	DeepSeek-R1-Distill-Llama-8B	~4.7 GB	~30 tokens/sec
32 GB RAM	DeepSeek-R1-Distill-Qwen-14B	~9.0 GB	~22 tokens/sec
64 GB+ RAM	DeepSeek-R1-Distill-Qwen-32B / DeepSeek-R1-Distill-Llama-70B	~20 GB (32B) / ~43 GB (70B)	~15 tokens/sec
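The table can be turned into a small lookup, sketched here for a POSIX shell (macOS Terminal, or WSL / Git Bash on Windows). The function name pick_model is hypothetical, and the tags other than deepseek-r1:8b are assumed to follow the same naming pattern in the Ollama library; the thresholds simply mirror the rows above.

```shell
# Map available memory (whole GB) to an Ollama model tag, following the table.
# pick_model is a hypothetical helper name, not part of Ollama.
pick_model() {
  if [ "$1" -ge 64 ]; then
    echo "deepseek-r1:32b"
  elif [ "$1" -ge 32 ]; then
    echo "deepseek-r1:14b"
  elif [ "$1" -ge 16 ]; then
    echo "deepseek-r1:8b"
  else
    echo "deepseek-r1:1.5b"
  fi
}

pick_model 16   # prints deepseek-r1:8b
```

Treat the result as a starting point: if the chosen model runs too slowly on your machine, step down one size.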

Installation Steps (Mac & Windows)

Step 1: Install Ollama

Go to the official Ollama website and download the package for your operating system:

Mac: Download the zip file, drag the Ollama application into your Applications folder, and launch it.
Windows: Run the installer and let the background service initialize.
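Before moving on, it can help to confirm that the install worked. The sketch below assumes a POSIX shell with curl available and Ollama listening on its default address, http://localhost:11434; check_ollama is a hypothetical helper name.

```shell
# Report whether the Ollama CLI is on PATH and whether the local server answers.
# check_ollama is a hypothetical helper; the address is Ollama's default.
check_ollama() {
  if ! command -v ollama >/dev/null 2>&1; then
    echo "ollama: not found on PATH"
    return 1
  fi
  ollama --version
  # The server root answers with a short plain-text status when it is up.
  if curl -fsS --max-time 5 http://localhost:11434/ >/dev/null 2>&1; then
    echo "server: reachable"
  else
    echo "server: not reachable (is the Ollama app running?)"
    return 1
  fi
}
```

Run check_ollama after launching the app. If the CLI is found but the server is not reachable, starting the Ollama application again is usually the first thing to try.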

Step 2: Download and Run the Model

Open Terminal (on Mac) or Command Prompt (on Windows) and run the command for the model size you selected. For a typical laptop with 16 GB of RAM, we recommend the 8B model:

# Pull and run the 8B distilled model
ollama run deepseek-r1:8b

Ollama will download the model weights (this may take a few minutes depending on your internet speed). Once complete, you will see a prompt interface.
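The local Ollama server that backs the prompt interface also accepts HTTP requests, which is handy for scripting. The sketch below assumes a POSIX shell with curl, the default address http://localhost:11434, and the deepseek-r1:8b tag used above; build_payload and ask_r1 are hypothetical helper names, and no JSON escaping is done, so the prompt must not contain double quotes or backslashes.

```shell
# Build the JSON body for Ollama's /api/generate endpoint.
# "stream": false asks for one complete JSON object instead of a token stream.
build_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Send one prompt to the local server and print the raw JSON reply.
# The generated text is in the "response" field of that reply.
ask_r1() {
  build_payload "deepseek-r1:8b" "$1" |
    curl -sS http://localhost:11434/api/generate \
      -H "Content-Type: application/json" -d @-
}
```

For example, ask_r1 "Explain recursion in one sentence." prints a single JSON object once the model has finished generating.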

Setting Up Open WebUI (Local Chat Interface)

To escape the terminal interface and enjoy a sleek, ChatGPT-like browser interface, you can run Open WebUI via Docker.

Make sure Docker is installed, then run:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Now, open your browser and navigate to http://localhost:3000. Register a local account, select deepseek-r1:8b from the top dropdown, and start chatting.
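If the page does not load, a quick look at the container usually shows why. This is a minimal sketch assuming a POSIX shell and the container name open-webui from the command above; check_webui is a hypothetical helper name.

```shell
# Show the status and recent logs of the Open WebUI container.
# check_webui is a hypothetical helper; "open-webui" is the --name used earlier.
check_webui() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker: not found on PATH"
    return 1
  fi
  # Print the container's name and status, then its last 20 log lines.
  docker ps --filter "name=open-webui" --format '{{.Names}}: {{.Status}}'
  docker logs --tail 20 open-webui
}
```

An empty status line means the container is not running; the log lines typically indicate whether the web server inside it has finished starting.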

Prompting DeepSeek R1: The Thinking Output

DeepSeek R1 is a reasoning model that outputs its chain of thought. You will see its intermediate logical steps enclosed inside <think> and </think> tags.

Avoid prompts that impose strict formatting constraints up front. Let the model write out its reasoning first: working through the intermediate steps before answering tends to produce more accurate code and strategic advice.
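When only the final answer is wanted, for example in a script, the reasoning block can be filtered out after the model has finished. The sketch below assumes a POSIX shell and that the output contains literal <think> and </think> tags, each on its own line; strip_think is a hypothetical helper name, and any answer text sharing a line with a tag would be dropped as well.

```shell
# Delete every line from one containing <think> through one containing </think>.
# Assumes each tag sits on its own line, apart from the final answer.
strip_think() {
  sed '/<think>/,/<\/think>/d'
}

# Demo with canned text standing in for model output.
printf '<think>\nThe user wants a sum.\n</think>\n4\n' | strip_think   # prints 4
```

With a running model this could be used as ollama run deepseek-r1:8b "What is 2 + 2?" | strip_think. Whether the tags appear literally in the output may depend on your Ollama version, so check the raw output first.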

Written by Mehmet Demir
