Local AI Coding Setup: Ollama + OpenCode on macOS Terminal
Step-by-step guide to setting up a fully local AI coding agent with Ollama and OpenCode on macOS. Agentic tool use, multi-model support, zero cloud dependencies — all running on Apple Silicon.
✍️ Gianluca
Want a fully local AI coding agent running in your terminal with zero cloud dependencies? This guide walks you through setting up Ollama (local LLM runner) and OpenCode (open-source AI coding agent) on macOS with Apple Silicon. The result: agentic coding capabilities, tool use, multi-model support, all running on your machine.
Previously on CodeHelper: We covered MLX-CODE, a Python-based local AI coding assistant using Apple's MLX framework directly. This guide takes a different approach, using Ollama as the model server and OpenCode as the agentic coding interface. Both are 100% local and free, but they differ in architecture, model management, and capabilities.
What You Get:
- ✅ Fully local AI: No data sent to external servers
- ✅ Agentic coding: Tool use, file editing, plan/build modes
- ✅ Multi-model support: Switch between models instantly
- ✅ Terminal-only workflow: No GUI, no Electron, no bloat
- ✅ Open source: Both Ollama and OpenCode are free and open
MLX-CODE vs Ollama + OpenCode
If you read our MLX-CODE article, you might wonder how this setup compares. Here's a quick breakdown:
| Feature | MLX-CODE | Ollama + OpenCode |
|---|---|---|
| Runtime | Python + MLX Framework | Ollama server + Node.js CLI |
| GPU Acceleration | Apple Metal (MLX native) | Apple Metal (via llama.cpp) |
| Agentic Features | Templates, file context | Tool use, plan/build modes, undo/redo |
| Model Management | Manual HuggingFace download | One-command pull via Ollama |
| Context Window | Model-dependent | Configurable per model (up to 32K+) |
| IDE Integration | Terminal only | Terminal + IDE extensions |
| Best For | Quick local inference, MLX experimentation | Agentic workflows, multi-model setups |
Both approaches are valid. MLX-CODE is lighter and more self-contained; Ollama + OpenCode is more feature-rich for agentic coding workflows.
System Requirements
- ✅ macOS (tested on Ventura+)
- ✅ Apple Silicon (M1, M2, M3, M4)
- ✅ Homebrew installed
- ✅ Node.js / npm installed
- ✅ 8GB RAM minimum (16GB+ recommended for 7B models)
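These prerequisites are easy to sanity-check from the terminal. A minimal sketch (version output will vary by machine):

```bash
# Confirm Apple Silicon (should print "arm64")
uname -m

# Total installed RAM, in GB
echo "$(($(sysctl -n hw.memsize) / 1073741824)) GB"

# Confirm Homebrew and Node.js are available
brew --version
node --version
npm --version
```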
Step 1. Install Ollama
Ollama is a local LLM runner that manages model downloads, serves an OpenAI-compatible API, and handles GPU acceleration automatically.
```bash
# Install via Homebrew
brew install ollama

# Verify installation
ollama --version

# Start the server (keep running in a terminal tab)
ollama serve
```
Keep `ollama serve` running; OpenCode connects to it via the local API.
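If you'd rather not keep a terminal tab occupied, the Homebrew formula can also run Ollama as a background service. This assumes you installed via brew as above:

```bash
# Run Ollama as a background service (restarts at login)
brew services start ollama

# Confirm it's running
brew services list

# Stop it later if you prefer the foreground server
brew services stop ollama
```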
Step 2. Download Coding Models
Pull one or more models optimized for code generation:
```bash
# Recommended: best quality/speed balance
ollama pull qwen2.5-coder:7b

# Alternatives
ollama pull deepseek-coder:6.7b
ollama pull codellama:7b

# List installed models
ollama list

# Remove a model
ollama rm model-name
```
Step 3. Verify the Ollama API
Ollama exposes two endpoints. OpenCode uses the OpenAI-compatible one:
```bash
# Check the Ollama native API
curl http://localhost:11434/api/tags

# Check the OpenAI-compatible endpoint (used by OpenCode)
curl http://localhost:11434/v1/models
```
Important: OpenCode requires the /v1 endpoint (http://localhost:11434/v1), not the native Ollama API. It uses the @ai-sdk/openai-compatible package internally.
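If the endpoint is up, /v1/models returns an OpenAI-style model list. The response should look roughly like this (field values here are illustrative; the IDs will match whatever you've pulled):

```json
{
  "object": "list",
  "data": [
    { "id": "qwen2.5-coder:7b", "object": "model", "owned_by": "library" }
  ]
}
```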
Step 4. Configure Model Context Window
Ollama defaults to a 4K context window, which is too small for agentic coding. You need at least 16K:
```bash
# Create a 16K context variant
ollama run qwen2.5-coder:7b
/set parameter num_ctx 16384
/save qwen2.5-coder:7b-16k
/bye

# For even better results (if you have 16GB+ RAM)
ollama run qwen2.5-coder:7b
/set parameter num_ctx 32768
/save qwen2.5-coder:7b-32k
/bye
```
This creates new model variants with the increased context. Use these names in your OpenCode config.
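If you prefer a scriptable, non-interactive route, you can build the same variant from a Modelfile instead (standard Ollama functionality; the file name here is arbitrary):

```bash
# Define the variant in a Modelfile
cat > Modelfile.16k <<'EOF'
FROM qwen2.5-coder:7b
PARAMETER num_ctx 16384
EOF

# Build it
ollama create qwen2.5-coder:7b-16k -f Modelfile.16k
```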
Step 5. Test the Model
```bash
# Interactive chat to verify it works
ollama run qwen2.5-coder:7b-16k

# Try a coding prompt
> write a debounce function in typescript

# Exit
/bye
```
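You can also exercise the exact path OpenCode will use by sending a chat completion to the OpenAI-compatible endpoint. A quick smoke test (the prompt is just an example):

```bash
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b-16k",
    "messages": [
      {"role": "user", "content": "Write a debounce function in TypeScript."}
    ]
  }'
```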
Step 6. Install OpenCode
OpenCode is an open-source AI coding agent with a terminal TUI, plan/build modes, tool use, and undo/redo. Install globally via npm:
```bash
# Install globally
npm install -g opencode-ai

# Verify
opencode --help
```
Other install methods: Homebrew (brew install opencode), Bun, pnpm, Yarn, or Docker.
Step 7. Connect OpenCode to Ollama
Create the configuration file to connect OpenCode to your local Ollama instance:
```bash
# Create the config directory
mkdir -p ~/.config/opencode

# Create the config file
nano ~/.config/opencode/opencode.json
```
Paste this configuration:
```json
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama",
"options": {
"baseURL": "http://localhost:11434/v1"
},
"models": {
"qwen2.5-coder:7b-16k": {
"tools": true
}
}
}
},
"model": "ollama/qwen2.5-coder:7b-16k"
}
```
Key configuration notes:
- Config path: `~/.config/opencode/opencode.json` (not `~/.opencode/`)
- npm package: uses `@ai-sdk/openai-compatible` to talk to Ollama
- baseURL: must include `/v1`; this is required
- `"tools": true` enables function calling for agentic features (file editing, commands)
Step 8. Use OpenCode
Navigate to any project and launch:
```bash
cd /path/to/your/project
opencode .
```
OpenCode Key Features
Plan Mode
Press `Tab` to toggle. Generates implementation strategies without modifying code. Great for reasoning through complex tasks first.
Build Mode
The default mode. OpenCode reads, writes, and modifies files in your project with tool use capabilities.
Undo / Redo
Use `/undo` and `/redo` to revert or restore changes. Safe experimentation.
File References
Press `@` to fuzzy-search and attach project files to your prompt. Context-aware conversations.
AGENTS.md
Run `/init` to generate an AGENTS.md file. OpenCode learns your project patterns and conventions.
Share Conversations
Use `/share` to create a shareable link of your conversation for team collaboration.
Multi-Model Configuration
Add multiple models and switch between them with Ctrl+A:
```json
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama",
"options": {
"baseURL": "http://localhost:11434/v1"
},
"models": {
"qwen2.5-coder:7b-16k": {
"tools": true
},
"deepseek-coder:6.7b": {
"tools": true
},
"codellama:7b": {
"tools": true
}
}
}
},
"model": "ollama/qwen2.5-coder:7b-16k",
"small_model": "ollama/codellama:7b"
}
```
The `small_model` is used for lightweight tasks like generating titles or summaries.
Recommended Models by Task
| Task | Model | Notes |
|---|---|---|
| General coding | qwen2.5-coder:7b | Best quality/speed balance |
| WordPress / PHP | deepseek-coder:6.7b | Strong PHP performance |
| Low RAM / Fast | codellama:7b | Lighter model, 8GB OK |
| Heavy quality | qwen2.5-coder:32b | Needs 32GB+ RAM |
Remember to create 16K+ context variants for each model you use with agentic workflows.
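Creating those variants by hand gets tedious with several models. A small sketch that builds a 16K variant of each, using the Modelfile approach from Step 4 (assumes the base models are already pulled):

```bash
#!/bin/bash
# Build a 16K-context variant of each base model
for model in qwen2.5-coder:7b deepseek-coder:6.7b codellama:7b; do
  printf 'FROM %s\nPARAMETER num_ctx 16384\n' "$model" > Modelfile.tmp
  ollama create "${model}-16k" -f Modelfile.tmp
done
rm Modelfile.tmp
```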
Troubleshooting
OpenCode doesn't find models
- Make sure `ollama serve` is running
- Check that the baseURL is `http://localhost:11434/v1`
- Verify the model name matches exactly: `ollama list`
Tool calls not working
- Context window too small: increase `num_ctx` to 16K+
- Model doesn't support tools well: try `qwen2.5-coder` (you can also test tool support directly, as shown below)
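To check whether a model emits tool calls at all, independent of OpenCode, pass a tools array straight to the OpenAI-compatible endpoint. The `get_weather` function here is hypothetical, purely for testing; a tool-capable model should answer with a `tool_calls` entry rather than plain text:

```bash
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b-16k",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```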
Config errors
- Check JSON syntax (no trailing commas)
- Use the `$schema` for validation
- Test Ollama: `curl http://localhost:11434/api/tags`
Quick Reference Commands
```bash
# Ollama
ollama serve                           # Start server
ollama list                            # List models
ollama pull qwen2.5-coder:7b           # Download model
ollama rm model-name                   # Remove model
curl http://localhost:11434/api/tags   # Check status
curl http://localhost:11434/v1/models  # OpenAI-compatible check

# OpenCode
opencode .  # Launch in current project
/init       # Generate AGENTS.md
/undo       # Undo last change
/redo       # Redo last change
/share      # Share conversation
/help       # List all commands
Tab         # Toggle plan/build mode
Ctrl+A      # Switch models
@           # Fuzzy-search project files
```
Conclusion
With Ollama and OpenCode you get a fully local, privacy-first AI coding agent with agentic capabilities that rival cloud-based tools. No API keys, no subscriptions, no data leaving your machine.
If you're already using MLX-CODE for quick local inference, consider adding Ollama + OpenCode to your toolkit for more complex agentic workflows. Both tools complement each other: MLX-CODE for lightweight, GPU-native inference and OpenCode for full-featured coding agent capabilities.
Resources
- OpenCode documentation: official docs for installation, configuration, providers, and commands.
- Ollama documentation: model library, API reference, and advanced configuration.
- Community guide for connecting Ollama with OpenCode.
- Companion repository: full setup instructions and configuration files for the Ollama + OpenCode workflow covered in this article.
- MLX-CODE repository: source code and setup for MLX-CODE, the local AI coding assistant covered in our previous article.
- Our previous article on local AI coding with Apple's MLX framework.