csimw.com

Cool Shi.. umm, I mean Stuff and Interests of Mitch Wilson

Ollama: Running AI Models Locally

2025-12-15 │ Tech Tools Programming │ ~ min read

AI

Ollama makes it easy to run large language models locally. No API keys, no rate limits, no data leaving your machine.

What is Ollama?

A tool for running LLMs locally with a simple interface:

Easy setup: One command to download and run models
Model library: Access to popular models like Llama, Mistral, and more
API compatible: Drop-in replacement for OpenAI API
Resource management: Efficient GPU and CPU usage

Why Run Models Locally?

Privacy: Your data never leaves your machine
Cost: No per-token charges
Speed: No network latency
Reliability: Works offline
Customization: Fine-tune models on your data

Getting Started

# Install Ollama

curl -fsSL https://ollama.com/install.sh | sh

# Run a model

ollama run llama3.2

That's it. You now have a local LLM running.

Integration with Development

I use Ollama with:

OpenCode: As the backend AI model
IDE extensions: Copilot-like features with local models
CLI tools: Custom scripts for code review, summarization
Automation: Batch processing of documentation, code analysis

Hardware Requirements

Minimum: 8GB RAM for smaller models (7B parameters)
Recommended: 16GB+ RAM, GPU with 6GB+ VRAM
Ideal: 32GB RAM, modern GPU with 12GB+ VRAM

Even without a GPU, CPU inference works well for smaller models.

← back to posts