Connect Ollama (Local Models)
Ollama lets you run local models on your own computer: all data stays on your machine, ideal for sensitive information and confidential projects. No API Key needed, works offline.
What You'll Be Able to Do
No hype, just things you can do right away
- Use OpenCode to handle sensitive data in a completely offline environment where data never leaves your machine
- Install and verify that Ollama is working properly
- Choose and download a suitable local model based on your hardware
- Configure and use Ollama local models in OpenCode, completing your first conversation
Your Current Dilemma
If you're experiencing these, this lesson is for you
- You have many sensitive files on your desktop (financial data, customer information, personal privacy, etc.) and don't want to upload them to third-party AI models
- You're developing highly confidential projects where code and documents absolutely cannot leave company computers
- You have strict data privacy requirements—even if AI providers promise not to save data, you don't want any data leaving your machine
- Your network environment is restricted and you need to use AI assistants offline
- You entered
/connectlooking for local Ollama but only see Ollama Cloud and don't know what to do next
When to Use This
Core value of local models: Privacy First
When you need to:
- Handle sensitive data (financial reports, customer data, personal privacy)
- Develop confidential projects (trade secrets, patent code, security systems)
- Ensure data never leaves your machine (zero-trust principle, compliance requirements)
And you don't want to:
- Upload desktop files or Excel content to third parties
- Let project code leave your computer
- Depend on network connection to use AI
💡 Local Models vs Cloud Models
| Scenario | Local Model (Ollama) | Cloud Model |
|---|---|---|
| Processing sensitive Excel data | ✅ Data stays local | ❌ Needs upload |
| Developing confidential projects | ✅ Code processed locally | ❌ Needs upload |
| Daily development tasks | ⚠️ Weaker capability | ✅ Stronger capability |
| Fully offline use | ✅ Supported | ❌ Requires network |
🎒 Before You Start
Make sure you've completed the following, otherwise stop here
- [ ] Completed 1.2 Installation, can run
opencode --version - [ ] Your hard drive has at least 10GB free space (model files get big quickly)
- [ ] You roughly know your machine's RAM/VRAM capacity (not critical if you don't, recommendations below)
Core Concept
First explain "how to think", not commands
- Core advantage of local models: Privacy and Confidentiality
- All data is processed locally, nothing is sent to third-party servers
- Suitable for: financial data, customer information, personal privacy, trade secrets, patent code, etc.
/connectis only for saving API keys (that's why you see Ollama Cloud); local Ollama doesn't need a key, you need to configure the provider inopencode.jsonto make it appear in/models- Running local models mainly depends on RAM/VRAM: larger models need more resources
- Don't chase 70B right away as a beginner: start with 7B/8B to get the workflow working, then upgrade
- When choosing models, consider both:
- Parameter size (B): Generally larger means stronger
- Use case: For coding, prefer
*-coder; for general chat, choosellama/qwen/gemma
Follow Along
Step by step, assuming you might make mistakes
Step 1: Install Ollama
Why
You need a local model runtime that will start a service on your machine, so OpenCode can connect to it.
brew install ollamacurl -fsSL https://ollama.com/install.sh | sh# Download installer from official website
# https://ollama.com/downloadVerify installation:
ollama --versionYou should see: Version number output (e.g., ollama version x.y.z).
Step 2: Choose a Model Suitable for Your Hardware and Download
Why
If you choose a model that's too large, common results are: long download time, slow execution, or even insufficient memory/VRAM.
First, remember this selection rule (experience-based, good enough for beginners):
- 3B~8B: Entry-level/lightweight, can run on 16GB RAM computers
- 14B: Noticeably stronger, usually needs 16GB VRAM or 32GB+ RAM for comfortable use
- 27B~32B: High-quality local assistant, recommends 24GB~32GB VRAM or 48GB+ RAM
- 70B/72B: Local "ceiling" level, usually needs 48GB+ VRAM (or multi-card/CPU hybrid offload, speed will drop noticeably)
Download Size ≠ Runtime Usage
The GB shown on Ollama model pages is "download file size". Runtime also needs additional KV cache (longer context uses more RAM/VRAM).
Recommend reserving at least 1.2~1.5x "download size" as available RAM/VRAM headroom.
Popular Model Recommendations (by Use Case):
- General chat (lightweight):
llama3.2:3b(~2.0GB download) - General chat (stronger):
llama3.1:8b(~4.9GB download) - Chinese/General capability:
qwen2.5:7b(~4.7GB download),qwen2.5:14b(~9.0GB download) - Coding priority:
qwen2.5-coder:7b(~4.7GB download),qwen2.5-coder:14b(~9.0GB download),qwen2.5-coder:32b(~20GB download) - Other common:
gemma2:9b(~5.4GB download),mistral:7b(~4.4GB download)
Choose by Hardware (Experience-Based):
| Your Hardware | Recommended Parameter Size | Recommended Models (Examples) |
|---|---|---|
| MacBook (Apple Silicon) 16GB Unified Memory (e.g., M4 16GB) | 3B~8B | llama3.2:3b / llama3.1:8b / qwen2.5-coder:7b |
| Discrete GPU 16GB VRAM (e.g., 16GB VRAM NVIDIA/AMD) | 7B~14B | qwen2.5-coder:14b / qwen2.5:14b / gemma2:9b |
| Discrete GPU 24GB VRAM | 14B~32B | qwen2.5-coder:32b / qwen2.5:32b / gemma2:27b |
| Discrete GPU 32GB VRAM (e.g., RTX 5090 32GB) | 32B | qwen2.5-coder:32b / qwen2.5:32b |
| Discrete GPU 48GB+ VRAM | 70B/72B | llama3.1:70b / qwen2.5:72b |
About "What Can a 5090 Run"
The key isn't the model number, but your VRAM size.
If your 5090 has 24GB~32GB VRAM, it's generally better suited for 32B class; 70B/72B often needs 48GB+ VRAM, or CPU hybrid offload (will run but slower).
Start downloading (choose one, recommend getting the workflow working first):
# Coding priority (recommended)
ollama pull qwen2.5-coder:7b
# General chat
ollama pull llama3.1:8b
# Lighter weight
ollama pull llama3.2:3bYou should see: Download progress and success message like pulling manifest ... success.
Step 3: Start Ollama Service
Why
OpenCode needs to communicate with Ollama through a local HTTP service.
Run in terminal:
ollama serveYou should see: Listening on 127.0.0.1:11434 (keep this terminal window open).
Step 4: Configure "Local Ollama Provider" in OpenCode (Copy-Paste Ready)
Why
Only seeing Ollama Cloud in /connect is normal: /connect only handles saving API keys (to ~/.local/share/opencode/auth.json).
Local Ollama doesn't need a key, so the correct approach is: configure a provider in opencode.json, putting Ollama's OpenAI-compatible address in it.
Where to Put the Config? (Choose One)
- Global:
~/.config/opencode/opencode.json(works for all projects) - Project-level: put
opencode.jsonin the current project root (only affects this project)
If you're unsure which path OpenCode is reading, you can run:
opencode debug pathsCopy-Paste Ready Config (Local Ollama):
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama (local)",
"options": {
"baseURL": "http://localhost:11434/v1"
},
"models": {
"qwen2.5-coder:7b": { "name": "Qwen2.5 Coder 7B (local)" },
"llama3.1:8b": { "name": "Llama 3.1 8B (local)" },
"llama3.2:3b": { "name": "Llama 3.2 3B (local)" }
}
}
}
}Two Easily Overlooked Points
- The key in
models(e.g.,qwen2.5-coder:7b) should match the model name youollama pulled. - This config only "declares the list of available models" in OpenCode; it doesn't download models for you. Downloading still requires
ollama pull ....
You should see: After saving the config and restarting OpenCode, models under Ollama (local) will appear in /models (verify in next step).
Step 5: Restart OpenCode, Select Local Model, and Send First Message
Why
OpenCode reads opencode.json at startup; you usually need to restart once after changing config.
Open a new terminal:
opencodeEnter:
/modelsSelect the model you just downloaded and configured, then send:
Hello, please introduce yourselfYou should see: The model starts outputting a response (local models might be a bit slower than cloud).
Why Use Local Models?
Local models vs Cloud models: Choosing the right scenario matters
Two Core Values of Local Models
1. Privacy Protection: Data Never Leaves Your Machine
When using local models, all your data (including file contents, code, conversation history) is processed on your own computer, not sent to any third-party servers.
Typical Use Cases:
- Processing sensitive Excel data (financial reports, customer data, payroll, etc.)
- Analyzing desktop files and folders (containing personal privacy or trade secrets)
- Organizing personal documents (ID numbers, bank information, health records, etc.)
⚠️ Why This Matters
Even if AI service providers promise "not to save your data", the data does get transmitted to their servers. For strict compliance requirements (like finance, healthcare, government), this may violate regulations.
2. Project Confidentiality: Code Never Leaves Your Machine
When developing highly confidential projects, using local models ensures code and documents are always processed on your computer.
Typical Use Cases:
- Developing trade secret projects (unreleased products, patent technology)
- Security-related development (security tools, encryption systems, protection systems)
- Client-commissioned projects (signed NDAs, code cannot be shared)
- Enterprise internal projects (source code contains internal secrets)
💡 When You Must Use Local Models
If any of these apply, strongly recommend using local models:
- ✅ Project has NDA or compliance requirements
- ✅ Code involves patent technology or trade secrets
- ✅ Data involves personal privacy (ID, bank cards, medical records)
- ✅ Data belongs to clients or partners (needs confidentiality)
- ✅ Company security policy prohibits uploading data to external services
Limitations of Local Models
While local models are great for privacy, they have limitations:
| Aspect | Local Models | Cloud Models |
|---|---|---|
| Privacy & Security | ⭐⭐⭐⭐⭐ Data stays local | ⭐⭐ Data needs upload |
| Model Capability | ⭐⭐⭐ Relatively weaker | ⭐⭐⭐⭐⭐ Stronger |
| Response Speed | ⭐⭐ Depends on hardware | ⭐⭐⭐⭐⭐ Faster |
| Cost | ⭐⭐⭐⭐⭐ Free (needs hardware) | ⭐⭐ Requires payment |
| Network Requirement | ⭐⭐⭐⭐⭐ Works offline | ⭐ Needs network |
💡 Recommended Usage Strategy
For most users:
- Sensitive data, confidential projects → Local models
- Daily development, non-sensitive tasks → Cloud models (stronger capability)
Hybrid strategy:
- First use local models for sensitive parts
- Then use cloud models for non-sensitive parts (if needed)
Checklist ✅
Must pass all items to continue; if any fails, go back to the corresponding step
- [ ]
ollama --versionoutputs a version number - [ ]
ollama servelistens on127.0.0.1:11434 - [ ]
opencode.jsonhasbaseURL: http://localhost:11434/v1configured - [ ] In OpenCode,
/modelsshowsOllama (local)and your configured models - [ ] Sending a message receives a response
Common Pitfalls
80% of people get stuck here
| Symptom | Cause | Solution |
|---|---|---|
Only Ollama Cloud in /connect | /connect only saves API keys | Configure opencode.json per Step 4, then use /models to select local model |
Can't see Ollama (local) in /models | Config file location wrong / JSON syntax error / didn't restart OpenCode | Run opencode debug paths to find config directory; check JSON; restart OpenCode |
connection refused | Ollama service not started | Run ollama serve first, then go back to OpenCode /models |
| Download very slow | Network issues or slow mirror | Try again at different time; download smaller model first (3B/7B) |
| Freezes / out of memory error | Model too large | Go down one tier (e.g., 14B → 7B/8B; 32B → 14B) |
| Response very slow | Running on CPU / insufficient resources / first load | Close resource-heavy apps first; try smaller model; first load being slow is normal |
| Port already in use | ollama serve already running | Don't start multiple times; keep one service process |
Ollama Notes
- Before using OpenCode each time, start
ollama servefirst - Local models aren't as capable as Claude, DeepSeek; may struggle with complex tasks
- If your computer gets hot and fans spin loudly, that's normal
Lesson Summary
You learned:
- Understand the core value of local models: Privacy protection and project confidentiality
- Install and verify Ollama
- Choose and download suitable local models based on hardware
- Configure and use Ollama local models in
opencode.jsonto complete first conversation
Next Lesson Preview
You've got the local model workflow working. Next is the final lesson of Stage 1: Auto-update.
Quick Preview
OpenCode auto-updates by default, usually nothing to worry about. Want to know more? → 1.5 Auto-update
Having Issues?
Local model loading slow? Join the community and connect with 2000+ fellow learners for real-time Q&A.

