Local Models (Ollama & LM Studio)
Run models locally - zero API costs, complete privacy.
Ollama (Recommended)
Installation
# macOS
brew install --cask ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh
# Windows
# Download from https://ollama.com/download/windows
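After installing, a quick check confirms the CLI is on your PATH and the local server is running (default port 11434 assumed):

# Verify the install and the local server
ollama --version
curl http://localhost:11434   # should respond with "Ollama is running"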
Setup with agentful

# 1. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# 2. Pull a model
ollama pull qwen2.5-coder:7b
# 3. Configure Claude Code
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_BASE_URL=http://localhost:11434
# 4. Run
claude
/agentful-start
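If the model fails to load, verify Ollama actually has it and that the API is reachable (defaults assumed):

ollama list                            # pulled models
curl http://localhost:11434/api/tags   # same list via the HTTP API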
Recommended Models

# Best for coding (start here)
ollama pull qwen2.5-coder:7b # 6GB VRAM
ollama pull qwen2.5-coder:14b # 12GB VRAM
ollama pull qwen2.5-coder:32b # 24GB VRAM
# Best for tool calling
ollama pull llama3.1:8b # 6GB VRAM
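A quick way to sanity-check a pulled model before wiring it into Claude Code is a one-off prompt from the CLI (the prompt is arbitrary):

# One-off prompt to confirm the model loads and responds
ollama run qwen2.5-coder:7b "Write a hello world function in Python"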
Advanced Config

Increase context window:

# Create Modelfile
FROM qwen2.5-coder:7b
PARAMETER num_ctx 32768
PARAMETER temperature 0.7
# Apply
ollama create qwen-32k -f ./Modelfile
claude --model qwen-32k
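To confirm the custom model picked up the new parameters, you can inspect what Ollama stored for it (the qwen-32k name comes from the Modelfile step above; the --modelfile flag is available in recent Ollama releases):

# Show the Modelfile Ollama stored for the custom model
ollama show qwen-32k --modelfile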
To make the configuration persistent:

# Add to ~/.bashrc or ~/.zshrc
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_BASE_URL=http://localhost:11434

LM Studio
Installation
Download from https://lmstudio.ai (macOS, Windows, Linux)
Setup with agentful
LM Studio requires a LiteLLM proxy (Ollama doesn't):
# 1. Install LiteLLM
pip install 'litellm[proxy]'
# 2. Create config
cat > litellm-config.yaml <<EOF
model_list:
  - model_name: claude-sonnet-4-5
    litellm_params:
      model: openai/qwen2.5-coder-7b
      api_base: http://localhost:1234/v1
      api_key: dummy
EOF
# 3. Start LM Studio server (port 1234)
# 4. Start LiteLLM proxy
litellm --config litellm-config.yaml --port 4000
# 5. Configure Claude Code
export ANTHROPIC_BASE_URL=http://localhost:4000/anthropic
export ANTHROPIC_AUTH_TOKEN=sk-1234
# 6. Run
claude
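To confirm the proxy is wired up, list the models it exposes; the alias should match the config above (with no master_key set, LiteLLM may accept any bearer value):

# Should list claude-sonnet-4-5 from litellm-config.yaml
curl http://localhost:4000/v1/models -H "Authorization: Bearer sk-1234"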
Why the proxy is needed:

- LM Studio uses OpenAI API format
- Claude Code uses Anthropic API format
- LiteLLM translates between them
- Ollama has native Anthropic support (easier)
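For reference, the two request shapes LiteLLM translates between look roughly like this; the payloads are illustrative only, and the model names, key, and URLs simply follow the example config above:

# OpenAI-style request (what LM Studio's server expects)
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen2.5-coder-7b", "messages": [{"role": "user", "content": "hi"}]}'

# Anthropic-style request (what Claude Code sends to the configured base URL)
curl http://localhost:4000/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: sk-1234" \
  -H "anthropic-version: 2023-06-01" \
  -d '{"model": "claude-sonnet-4-5", "max_tokens": 256, "messages": [{"role": "user", "content": "hi"}]}'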
Troubleshooting
Ollama
Model not found:

ollama pull model-name
ollama list

Out of memory:

# Use smaller model
ollama pull qwen2.5-coder:7b-q4_K_M

Slow responses:

# Check GPU usage
nvidia-smi
ollama ps

Context window too small (see Advanced Config above):

FROM qwen2.5-coder:7b
PARAMETER num_ctx 32768
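If Claude Code reports connection errors rather than model errors, confirm the Ollama server itself is reachable (default port assumed; the service commands apply to Linux installs that use the bundled systemd unit):

# Server reachable?
curl http://localhost:11434
# Service status and recent logs (Linux)
systemctl status ollama
journalctl -u ollama --no-pager | tail -n 50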
LM Studio

- Verify server is running (Local Server tab)
- Check port 1234 is accessible
- Ensure the LiteLLM proxy is running (see the checks below)
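A minimal set of checks for those three points (ports as configured earlier):

# Is LM Studio's server up on 1234?
nc -zv localhost 1234
curl http://localhost:1234/v1/models

# Is the LiteLLM proxy listening on 4000?
nc -zv localhost 4000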