Revolutionize Productivity: Building a Universal AI Coding Workflow with CC Switch + Ollama on Ubuntu
💻 Compatible Versions: Ubuntu 22.04 LTS / 24.04 LTS / 26.04 (Preview) #
In 2026, developers face a paradox: AI coding tools are better than ever, yet increasingly expensive and cloud-dependent.
Claude Code shines at architecture, Codex-style tools excel at code completion, and Gemini dominates long-context reasoning—but all come with latency, privacy, and subscription costs.
This guide shows how to build a fully local, private, and free AI coding workflow on Ubuntu by combining:
- Ollama → local LLM runtime
- CC Switch → protocol router and API interceptor
- Premium CLIs → Claude Code, Gemini CLI, and others
The result: cloud-grade developer UX powered entirely by local models like Qwen 2.5 Coder or DeepSeek-R1.
🏗️ Step 1: Install Ollama and Prepare Local Models #
Ollama acts as the inference engine—the “muscle” behind your AI tools.
Install Ollama #
Run the official installer:
curl -fsSL https://ollama.com/install.sh | sh
Enable Network Access (Critical) #
CC Switch communicates with Ollama over HTTP. Configure Ollama to listen on all interfaces.
sudo systemctl edit ollama.service
Add:
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"
Reload and restart:
sudo systemctl daemon-reload
sudo systemctl restart ollama
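Before wiring up CC Switch, it's worth confirming that Ollama is actually reachable over HTTP. Both endpoints below are part of Ollama's standard API; this assumes the default port 11434 from the override above:

```shell
# Prints the running Ollama version as JSON if the service is up
curl -s http://localhost:11434/api/version

# Lists installed models via the OpenAI-compatible endpoint
# (this is the same URL CC Switch will talk to later)
curl -s http://localhost:11434/v1/models
```

If either command hangs or refuses the connection, re-check the `OLLAMA_HOST` override and restart the service again.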
Download Recommended Coding Models #
# Best balance of speed and code quality (24GB+ VRAM recommended)
ollama run qwen2.5-coder:32b
# Heavy reasoning and refactoring
ollama run deepseek-r1:14b
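Once a model is pulled, you can sanity-check inference directly against the OpenAI-compatible endpoint, independent of any CLI client (assumes Ollama is running locally and `qwen2.5-coder:32b` has finished downloading):

```shell
# Send a minimal chat-completion request to the local Ollama server.
# The request/response shape follows the OpenAI Chat Completions format.
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:32b",
    "messages": [
      {"role": "user", "content": "Write a Python one-liner that reverses a string."}
    ]
  }'
```

A JSON response with a `choices` array confirms the full local inference path works end to end.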
🛠️ Step 2: Install AI Coding CLI Clients #
These tools provide the polished interface—CC Switch will later replace their cloud backends.
Requirement: Node.js v18+
# Claude Code (Anthropic CLI)
npm install -g @anthropic-ai/claude-code
# Gemini CLI
npm install -g @google/gemini-cli
⚠️ Do not log in yet—authentication will be bypassed locally.
🎛️ Step 3: Configure CC Switch (The Routing Hub) #
CC Switch reroutes API traffic from cloud services to your local Ollama instance.
Install CC Switch #
npm install -g @songhe/cc-switch
Register Ollama as a Provider #
ccs new local-ollama
Interactive configuration:
- Provider Type: OpenAI Compatible
- Base URL: http://localhost:11434/v1
- API Key: ollama (placeholder; Ollama accepts any value)
- Model: qwen2.5-coder:32b
Activate the Proxy #
ccs switch local-ollama
ccs proxy start
This automatically sets environment variables so Claude Code and Gemini CLI redirect to Ollama.
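Under the hood, this amounts to redirecting the client's standard environment variables. A rough sketch for Claude Code is shown below; `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` are documented Claude Code variables, but the proxy port and the exact values CC Switch sets are assumptions here:

```shell
# What the proxy roughly does for Claude Code — CC Switch sets these
# automatically; shown only for debugging a misbehaving setup.
export ANTHROPIC_BASE_URL="http://localhost:3000"   # cc-switch proxy address (assumed port)
export ANTHROPIC_AUTH_TOKEN="ollama"                # placeholder; never sent to Anthropic
```

If Claude Code still hits the cloud, checking these two variables in your shell is the fastest way to see whether the proxy activation took effect.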
🚀 Step 4: Test the Local AI Workflow #
Launch Claude Code:
claude
Try a prompt:
Write a snake game in Python using Pygame.
Instead of calling Anthropic’s servers, inference now runs entirely on your local GPU: no network latency, no usage fees, and no data leaving your machine.
💡 Advanced Workflow Tips #
- Hybrid Profiles: local-ollama for internal or sensitive code; cloud-claude for complex architecture or design reviews.
- Instant Model Switching: Change the model in CC Switch to turn Claude Code into a DeepSeek, Qwen, or Llama-powered assistant, with no client restart required.
- Offline-First Development: Ideal for air-gapped environments, enterprise codebases, or privacy-critical projects.
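In practice, the hybrid workflow is one command per context switch. A sketch using the profile names above (cloud-claude is a hypothetical second profile you would register with ccs new, pointing at your real Anthropic account):

```shell
# Keep sensitive or internal work on-device
ccs switch local-ollama

# Escalate to the cloud-backed profile for architecture and design reviews
ccs switch cloud-claude
```

Because the proxy stays running, the active CLI session picks up the new backend without a restart.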
✅ Final Takeaway #
By combining Ubuntu + Ollama + CC Switch, you unlock a best-of-both-worlds setup:
- 🧠 Local, open-source intelligence
- 🧑‍💻 Elite developer tooling UX
- 🔒 Full privacy and zero subscription fees
This workflow represents the future of AI-assisted programming: powerful, portable, and under your control.