Local AI Coding Setup: Ollama + OpenCode on macOS Terminal
Step-by-step guide to setting up a fully local AI coding agent with Ollama and OpenCode on macOS. Agentic tool use, multi-model support, zero cloud dependencies — all running on Apple Silicon.
✍️ Gianluca
Want a fully local AI coding agent running in your terminal with zero cloud dependencies? This guide walks you through setting up Ollama (local LLM runner) and OpenCode (open-source AI coding agent) on macOS with Apple Silicon. The result: agentic coding capabilities, tool use, multi-model support, all running on your machine.
Previously on CodeHelper: We covered MLX-CODE, a Python-based local AI coding assistant using Apple's MLX framework directly. This guide takes a different approach, using Ollama as the model server and OpenCode as the agentic coding interface. Both are 100% local and free, but they differ in architecture, model management, and capabilities.
What You Get:
- ✅ Fully local AI: No data sent to external servers
- ✅ Agentic coding: Tool use, file editing, plan/build modes
- ✅ Multi-model support: Switch between models instantly
- ✅ Terminal-only workflow: No GUI, no Electron, no bloat
- ✅ Open source: Both Ollama and OpenCode are free and open
MLX-CODE vs Ollama + OpenCode
If you read our MLX-CODE article, you might wonder how this setup compares. Here's a quick breakdown:
| Feature | MLX-CODE | Ollama + OpenCode |
|---|---|---|
| Runtime | Python + MLX Framework | Ollama server + Node.js CLI |
| GPU Acceleration | Apple Metal (MLX native) | Apple Metal (via llama.cpp) |
| Agentic Features | Templates, file context | Tool use, plan/build modes, undo/redo |
| Model Management | Manual HuggingFace download | One-command pull via Ollama |
| Context Window | Model-dependent | Configurable per model (up to 32K+) |
| IDE Integration | Terminal only | Terminal + IDE extensions |
| Best For | Quick local inference, MLX experimentation | Agentic workflows, multi-model setups |
Both approaches are valid. MLX-CODE is lighter and more self-contained; Ollama + OpenCode is more feature-rich for agentic coding workflows.
System Requirements
- ✅ macOS (tested on Ventura+)
- ✅ Apple Silicon (M1, M2, M3, M4)
- ✅ Homebrew installed
- ✅ Node.js / npm installed
- ✅ 8GB RAM minimum (16GB+ recommended for 7B models)
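These prerequisites are easy to sanity-check from the terminal. A minimal sketch (version output will vary by machine):

```bash
# Confirm Apple Silicon (should print "arm64")
uname -m

# Total installed RAM, in GB
echo "$(($(sysctl -n hw.memsize) / 1073741824)) GB"

# Confirm Homebrew and Node.js are available
brew --version
node --version
npm --version
```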
Step 1. Install Ollama
Ollama is a local LLM runner that manages model downloads, serves an OpenAI-compatible API, and handles GPU acceleration automatically.
```bash
# Install via Homebrew
brew install ollama

# Verify installation
ollama --version

# Start the server (keep running in a terminal tab)
ollama serve
```
Keep `ollama serve` running; OpenCode connects to it via the local API.
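If you'd rather not keep a terminal tab occupied, the Homebrew formula can also run Ollama as a background service. This assumes you installed via brew as above:

```bash
# Run Ollama as a background service (restarts at login)
brew services start ollama

# Confirm it's running
brew services list

# Stop it later if you prefer the foreground server
brew services stop ollama
```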
Step 2. Download Coding Models
Pull one or more models optimized for code generation:
```bash
# Recommended: best quality/speed balance
ollama pull qwen2.5-coder:7b

# Alternatives
ollama pull deepseek-coder:6.7b
ollama pull codellama:7b

# List installed models
ollama list

# Remove a model
ollama rm model-name
```
Step 3. Verify the Ollama API
Ollama exposes two endpoints. OpenCode uses the OpenAI-compatible one:
```bash
# Check the Ollama native API
curl http://localhost:11434/api/tags

# Check the OpenAI-compatible endpoint (used by OpenCode)
curl http://localhost:11434/v1/models
```
Important: OpenCode requires the /v1 endpoint (http://localhost:11434/v1), not the native Ollama API. It uses the @ai-sdk/openai-compatible package internally.
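If the endpoint is up, /v1/models returns an OpenAI-style model list. The response should look roughly like this (field values here are illustrative; the IDs will match whatever you've pulled):

```json
{
  "object": "list",
  "data": [
    { "id": "qwen2.5-coder:7b", "object": "model", "owned_by": "library" }
  ]
}
```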
Step 4. Configure Model Context Window
Ollama defaults to a 4K context window, which is too small for agentic coding. You need at least 16K:
```bash
# Create a 16K context variant
ollama run qwen2.5-coder:7b
/set parameter num_ctx 16384
/save qwen2.5-coder:7b-16k
/bye

# For even better results (if you have 16GB+ RAM)
ollama run qwen2.5-coder:7b
/set parameter num_ctx 32768
/save qwen2.5-coder:7b-32k
/bye
```
This creates new model variants with the increased context. Use these names in your OpenCode config.
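If you prefer a scriptable, non-interactive route, you can build the same variant from a Modelfile instead (standard Ollama functionality; the file name here is arbitrary):

```bash
# Define the variant in a Modelfile
cat > Modelfile.16k <<'EOF'
FROM qwen2.5-coder:7b
PARAMETER num_ctx 16384
EOF

# Build it
ollama create qwen2.5-coder:7b-16k -f Modelfile.16k
```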
Step 5. Test the Model
```bash
# Interactive chat to verify it works
ollama run qwen2.5-coder:7b-16k

# Try a coding prompt
> write a debounce function in typescript

# Exit
/bye
```
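You can also exercise the exact path OpenCode will use by sending a chat completion to the OpenAI-compatible endpoint. A quick smoke test (the prompt is just an example):

```bash
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b-16k",
    "messages": [
      {"role": "user", "content": "Write a debounce function in TypeScript."}
    ]
  }'
```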
Step 6. Install OpenCode
OpenCode is an open-source AI coding agent with a terminal TUI, plan/build modes, tool use, and undo/redo. Install globally via npm:
```bash
# Install globally
npm install -g opencode-ai

# Verify
opencode --help
```
Other install methods: Homebrew (brew install opencode), Bun, pnpm, Yarn, or Docker.
Step 7. Connect OpenCode to Ollama
Create the configuration file to connect OpenCode to your local Ollama instance:
```bash
# Create the config directory
mkdir -p ~/.config/opencode

# Create the config file
nano ~/.config/opencode/opencode.json
```
Paste this configuration:
```json
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama",
"options": {
"baseURL": "http://localhost:11434/v1"
},
"models": {
"qwen2.5-coder:7b-16k": {
"tools": true
}
}
}
},
"model": "ollama/qwen2.5-coder:7b-16k"
}
```
Key configuration notes:
- Config path: `~/.config/opencode/opencode.json` (not `~/.opencode/`)
- npm package: uses `@ai-sdk/openai-compatible` to talk to Ollama
- baseURL: must include `/v1`; this is required
- `"tools": true` enables function calling for agentic features (file editing, commands)
Step 8. Use OpenCode
Navigate to any project and launch:
```bash
cd /path/to/your/project
opencode .
```
OpenCode Key Features
Plan Mode
Press `Tab` to toggle. Generates implementation strategies without modifying code. Great for reasoning through complex tasks first.
Build Mode
The default mode. OpenCode reads, writes, and modifies files in your project with tool use capabilities.
Undo / Redo
Use `/undo` and `/redo` to revert or restore changes. Safe experimentation.
File References
Press `@` to fuzzy-search and attach project files to your prompt. Context-aware conversations.
AGENTS.md
Run `/init` to generate an AGENTS.md file. OpenCode learns your project patterns and conventions.
Share Conversations
Use `/share` to create a shareable link of your conversation for team collaboration.
Multi-Model Configuration
Add multiple models and switch between them with Ctrl+A:
```json
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama",
"options": {
"baseURL": "http://localhost:11434/v1"
},
"models": {
"qwen2.5-coder:7b-16k": {
"tools": true
},
"deepseek-coder:6.7b": {
"tools": true
},
"codellama:7b": {
"tools": true
}
}
}
},
"model": "ollama/qwen2.5-coder:7b-16k",
"small_model": "ollama/codellama:7b"
}
```
The `small_model` is used for lightweight tasks like generating titles or summaries.
Recommended Models by Task
| Task | Model | Notes |
|---|---|---|
| General coding | qwen2.5-coder:7b | Best quality/speed balance |
| WordPress / PHP | deepseek-coder:6.7b | Strong PHP performance |
| Low RAM / Fast | codellama:7b | Lighter model, 8GB OK |
| Heavy quality | qwen2.5-coder:32b | Needs 32GB+ RAM |
Remember to create 16K+ context variants for each model you use with agentic workflows.
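Creating those variants by hand gets tedious with several models. A small sketch that builds a 16K variant of each, using the Modelfile approach from Step 4 (assumes the base models are already pulled):

```bash
#!/bin/bash
# Build a 16K-context variant of each base model
for model in qwen2.5-coder:7b deepseek-coder:6.7b codellama:7b; do
  printf 'FROM %s\nPARAMETER num_ctx 16384\n' "$model" > Modelfile.tmp
  ollama create "${model}-16k" -f Modelfile.tmp
done
rm Modelfile.tmp
```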
Troubleshooting
OpenCode doesn't find models
- Make sure `ollama serve` is running
- Check that the baseURL is `http://localhost:11434/v1`
- Verify the model name matches exactly: `ollama list`
Tool calls not working
- Context window too small: increase `num_ctx` to 16K+
- Model doesn't support tools well: try `qwen2.5-coder` (you can also test tool support directly, as shown below)
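To check whether a model emits tool calls at all, independent of OpenCode, pass a tools array straight to the OpenAI-compatible endpoint. The `get_weather` function here is hypothetical, purely for testing; a tool-capable model should answer with a `tool_calls` entry rather than plain text:

```bash
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b-16k",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```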
Config errors
- Check JSON syntax (no trailing commas)
- Use the `$schema` for validation
- Test Ollama: `curl http://localhost:11434/api/tags`
Quick Reference Commands
```bash
# Ollama
ollama serve                           # Start server
ollama list                            # List models
ollama pull qwen2.5-coder:7b           # Download model
ollama rm model-name                   # Remove model
curl http://localhost:11434/api/tags   # Check status
curl http://localhost:11434/v1/models  # OpenAI-compatible check

# OpenCode
opencode .  # Launch in current project
/init       # Generate AGENTS.md
/undo       # Undo last change
/redo       # Redo last change
/share      # Share conversation
/help       # List all commands
Tab         # Toggle plan/build mode
Ctrl+A      # Switch models
@           # Fuzzy-search project files
```
Conclusion
With Ollama and OpenCode you get a fully local, privacy-first AI coding agent with agentic capabilities that rival cloud-based tools. No API keys, no subscriptions, no data leaving your machine.
If you're already using MLX-CODE for quick local inference, consider adding Ollama + OpenCode to your toolkit for more complex agentic workflows. Both tools complement each other: MLX-CODE for lightweight, GPU-native inference and OpenCode for full-featured coding agent capabilities.
Resources
- OpenCode documentation: official docs for installation, configuration, providers, and commands.
- Ollama documentation: model library, API reference, and advanced configuration.
- Community guide for connecting Ollama with OpenCode.
- Companion repository: full setup instructions and configuration files for the Ollama + OpenCode workflow covered in this article.
- MLX-CODE repository: source code and setup for MLX-CODE, the local AI coding assistant covered in our previous article.
- Our previous article on local AI coding with Apple's MLX framework.