Local AI Coding Setup: Ollama + OpenCode on macOS Terminal

Step-by-step guide to setting up a fully local AI coding agent with Ollama and OpenCode on macOS. Agentic tool use, multi-model support, zero cloud dependencies — all running on Apple Silicon.

✍️ Gianluca

Want a fully local AI coding agent running in your terminal with zero cloud dependencies? This guide walks you through setting up Ollama (a local LLM runner) and OpenCode (an open-source AI coding agent) on macOS with Apple Silicon. The result: agentic coding with tool use and multi-model support, all running on your machine.

Previously on CodeHelper: We covered MLX-CODE, a Python-based local AI coding assistant using Apple's MLX framework directly. This guide takes a different approach, using Ollama as the model server and OpenCode as the agentic coding interface. Both are 100% local and free, but they differ in architecture, model management, and capabilities.

What You Get:

  • ✅ Fully local AI: No data sent to external servers
  • ✅ Agentic coding: Tool use, file editing, plan/build modes
  • ✅ Multi-model support: Switch between models instantly
  • ✅ Terminal-only workflow: No GUI, no Electron, no bloat
  • ✅ Open source: Both Ollama and OpenCode are free and open

MLX-CODE vs Ollama + OpenCode

If you read our MLX-CODE article, you might wonder how this setup compares. Here's a quick breakdown:

Feature          | MLX-CODE                                   | Ollama + OpenCode
Runtime          | Python + MLX Framework                     | Ollama server + Node.js CLI
GPU Acceleration | Apple Metal (MLX native)                   | Apple Metal (via llama.cpp)
Agentic Features | Templates, file context                    | Tool use, plan/build modes, undo/redo
Model Management | Manual HuggingFace download                | One-command pull via Ollama
Context Window   | Model-dependent                            | Configurable per model (up to 32K+)
IDE Integration  | Terminal only                              | Terminal + IDE extensions
Best For         | Quick local inference, MLX experimentation | Agentic workflows, multi-model setups

Both approaches are valid. MLX-CODE is lighter and more self-contained; Ollama + OpenCode is more feature-rich for agentic coding workflows.

System Requirements

  • macOS (tested on Ventura+)
  • Apple Silicon (M1, M2, M3, M4)
  • Homebrew installed
  • Node.js / npm installed
  • 8GB RAM minimum (16GB+ recommended for 7B models)
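
A quick way to check these prerequisites from the terminal (a minimal sketch; version output will vary):

# Should print "arm64" on Apple Silicon
uname -m

# Confirm Homebrew and Node.js / npm are installed
brew --version
node --version
npm --version

# Installed RAM in GB
echo "$(($(sysctl -n hw.memsize) / 1073741824)) GB"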

Step 1. Install Ollama

Ollama is a local LLM runner that manages model downloads, serves an OpenAI-compatible API, and handles GPU acceleration automatically.

# Install via Homebrew
brew install ollama

# Verify installation
ollama --version

# Start the server (keep running in a terminal tab)
ollama serve

Keep ollama serve running; OpenCode connects to it via the local API.
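
If you'd rather not dedicate a terminal tab to the server, Homebrew can run Ollama as a background service instead (same local API, managed by launchd):

# Optional: run Ollama in the background via Homebrew services
brew services start ollama

# Stop it later
brew services stop ollama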

Step 2. Download Coding Models

Pull one or more models optimized for code generation:

# Recommended: best quality/speed balance
ollama pull qwen2.5-coder:7b

# Alternatives
ollama pull deepseek-coder:6.7b
ollama pull codellama:7b

# List installed models
ollama list

# Remove a model
ollama rm model-name

Step 3. Verify the Ollama API

Ollama exposes two endpoints. OpenCode uses the OpenAI-compatible one:

# Check Ollama native API
curl http://localhost:11434/api/tags

# Check OpenAI-compatible endpoint (used by OpenCode)
curl http://localhost:11434/v1/models

Important: OpenCode requires the /v1 endpoint (http://localhost:11434/v1), not the native Ollama API. It uses the @ai-sdk/openai-compatible package internally.
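
To confirm the /v1 endpoint actually serves completions, you can send a minimal chat request (a sketch that assumes qwen2.5-coder:7b is already pulled; substitute any model name from ollama list):

# Minimal OpenAI-compatible chat completion against Ollama
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b",
    "messages": [{"role": "user", "content": "Say hello in one word."}]
  }'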

Step 4. Configure Model Context Window

Ollama defaults to a 4K context window, which is too small for agentic coding. You need at least 16K:

# Create a 16K context variant
ollama run qwen2.5-coder:7b
/set parameter num_ctx 16384
/save qwen2.5-coder:7b-16k
/bye

# For even better results (if you have 16GB+ RAM)
ollama run qwen2.5-coder:7b
/set parameter num_ctx 32768
/save qwen2.5-coder:7b-32k
/bye

This creates new model variants with the increased context. Use these names in your OpenCode config.
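
If you prefer something scriptable over the interactive /set and /save commands, the same variant can be built from a Modelfile (a sketch; the FROM model must already be pulled):

# Non-interactive alternative: define the variant in a Modelfile
cat > Modelfile.16k <<'EOF'
FROM qwen2.5-coder:7b
PARAMETER num_ctx 16384
EOF

# Build the variant and clean up
ollama create qwen2.5-coder:7b-16k -f Modelfile.16k
rm Modelfile.16k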

Step 5. Test the Model

# Interactive chat to verify it works
ollama run qwen2.5-coder:7b-16k

# Try a coding prompt
> write a debounce function in typescript

# Exit
/bye
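
For a non-interactive smoke test (handy in scripts), ollama run also accepts the prompt as an argument; output quality will vary by model:

# One-shot prompt without entering the chat REPL
ollama run qwen2.5-coder:7b-16k "write a debounce function in typescript"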

Step 6. Install OpenCode

OpenCode is an open-source AI coding agent with a terminal TUI, plan/build modes, tool use, and undo/redo. Install globally via npm:

# Install globally
npm install -g opencode-ai

# Verify
opencode --help

Other install methods: Homebrew (brew install opencode), Bun, pnpm, Yarn, or Docker.

Step 7. Connect OpenCode to Ollama

Create the configuration file to connect OpenCode to your local Ollama instance:

# Create config directory
mkdir -p ~/.config/opencode

# Create config file
nano ~/.config/opencode/opencode.json

Paste this configuration:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder:7b-16k": {
          "tools": true
        }
      }
    }
  },
  "model": "ollama/qwen2.5-coder:7b-16k"
}

Key configuration notes:

  • Config path: ~/.config/opencode/opencode.json (not ~/.opencode/)
  • npm package: Uses @ai-sdk/openai-compatible to talk to Ollama
  • baseURL: Must include /v1; this is required
  • "tools": true enables function calling for agentic features (file editing, commands)

Step 8. Use OpenCode

Navigate to any project and launch:

cd /path/to/your/project
opencode .

OpenCode Key Features

  • Plan Mode

    Press Tab to toggle. Generates implementation strategies without modifying code. Great for reasoning through complex tasks first.

  • Build Mode

    The default mode. OpenCode reads, writes, and modifies files in your project with tool use capabilities.

  • Undo / Redo

    Use /undo and /redo to revert or restore changes. Safe experimentation.

  • File References

    Press @ to fuzzy-search and attach project files to your prompt. Context-aware conversations.

  • AGENTS.md

    Run /init to generate an AGENTS.md file. OpenCode learns your project patterns and conventions.

  • Share Conversations

    Use /share to create a shareable link of your conversation for team collaboration.

Multi-Model Configuration

Add multiple models and switch between them with Ctrl+A:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder:7b-16k": {
          "tools": true
        },
        "deepseek-coder:6.7b": {
          "tools": true
        },
        "codellama:7b": {
          "tools": true
        }
      }
    }
  },
  "model": "ollama/qwen2.5-coder:7b-16k",
  "small_model": "ollama/codellama:7b"
}

The small_model is used for lightweight tasks like generating titles or summaries.

Recommended Models by Task

Task            | Model               | Notes
General coding  | qwen2.5-coder:7b    | Best quality/speed balance
WordPress / PHP | deepseek-coder:6.7b | Strong PHP performance
Low RAM / Fast  | codellama:7b        | Lighter model, 8GB OK
Heavy quality   | qwen2.5-coder:32b   | Needs 32GB+ RAM

Remember to create 16K+ context variants for each model you use with agentic workflows.
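
A small loop can script this using the Modelfile approach from Step 4 (a sketch; assumes each base model has already been pulled):

# Build a 16K-context variant for every base model in the list
for model in qwen2.5-coder:7b deepseek-coder:6.7b codellama:7b; do
  printf 'FROM %s\nPARAMETER num_ctx 16384\n' "$model" > Modelfile.tmp
  ollama create "${model}-16k" -f Modelfile.tmp
done
rm Modelfile.tmp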

Troubleshooting

OpenCode doesn't find models

  • Make sure ollama serve is running
  • Check baseURL is http://localhost:11434/v1
  • Verify the model name matches exactly: ollama list

Tool calls not working

  • Context window too small: increase num_ctx to 16K+
  • Model doesn't support tools well: try qwen2.5-coder

Config errors

  • Check JSON syntax (no trailing commas)
  • Use the $schema for validation
  • Test Ollama: curl http://localhost:11434/api/tags

Quick Reference Commands

# Ollama
ollama serve                          # Start server
ollama list                           # List models
ollama pull qwen2.5-coder:7b          # Download model
ollama rm model-name                  # Remove model
curl http://localhost:11434/api/tags   # Check status
curl http://localhost:11434/v1/models  # OpenAI-compatible check

# OpenCode
opencode .                            # Launch in current project
/init                                 # Generate AGENTS.md
/undo                                 # Undo last change
/redo                                 # Redo last change
/share                                # Share conversation
/help                                 # List all commands
Tab                                   # Toggle plan/build mode
Ctrl+A                                # Switch models
@                                     # Fuzzy-search project files

Conclusion

With Ollama and OpenCode you get a fully local, privacy-first AI coding agent with agentic capabilities that rival cloud-based tools. No API keys, no subscriptions, no data leaving your machine.

If you're already using MLX-CODE for quick local inference, consider adding Ollama + OpenCode to your toolkit for more complex agentic workflows. Both tools complement each other: MLX-CODE for lightweight, GPU-native inference and OpenCode for full-featured coding agent capabilities.

Resources