Ollama: Local Model Deployment Guide



Ollama

Ollama is an open-source platform for running and managing large language models (LLMs) locally. It provides a unified interface to download, install, and serve a variety of models without requiring cloud access. Under the hood, Ollama handles model caching, quantization formats, and hardware optimizations so you can focus on building applications rather than wrestling with deployment details.

It’s commonly used to power offline or on-premises AI services where data privacy, latency, or cost constraints make cloud APIs impractical. Developers interact with Ollama through a simple HTTP endpoint, sending prompt payloads and receiving streaming or batch completions. By abstracting away the complexities of model hosting, Ollama lets you seamlessly swap between different model families, experiment with compression techniques, and integrate LLM inference into Python scripts, microservices, or containerized pipelines.
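For example, once the server is running (see the steps below), a single HTTP request against the default port 11434 is enough to get a completion. A minimal sketch, assuming you have already pulled the qwen2.5:3b model:

```
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:3b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

Setting "stream": false returns one complete JSON response instead of a stream of partial tokens.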



Step 1: Install Ollama

Windows

  1. Go to the Ollama downloads page at https://ollama.com/download

  2. Download the Windows installer (.exe)

  3. Run the installer and follow the prompts

  4. Open PowerShell and verify the installation:

```
ollama --version
```

Linux

Run this in your terminal:

```
curl -fsSL https://ollama.com/install.sh | sh
```

Then verify:

```
ollama --version
```
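On most Linux distributions, the install script also registers Ollama as a systemd service, so the server may already be running in the background. A quick way to check:

```
systemctl status ollama
```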

Step 2: Download Models Using Ollama

Download qwen2.5:3b (Chat Model)

```
ollama pull qwen2.5:3b
```

This fetches the model weights and configuration into your local Ollama cache.
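You can confirm the download with ollama list, which prints every model in the local cache along with its size:

```
ollama list
```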

Download mxbai-embed-large:335m (Embedding Model)

```
ollama pull mxbai-embed-large:335m
```
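This places the embedding model in the same local cache. To inspect a downloaded model's details (architecture, parameter count, quantization), you can use ollama show:

```
ollama show mxbai-embed-large:335m
```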

Step 3: Start the Ollama Server Locally

Once installed, start the server:

```
ollama serve
```

This launches a local API (listening on http://localhost:11434 by default) that handles model loading, chat, and embedding requests.

You can keep this terminal open or run Ollama as a background service.
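With the server running, you can exercise both models over HTTP. A minimal sketch using the /api/chat and /api/embed endpoints (the prompts here are placeholders):

```
# Chat completion with the qwen2.5:3b model
curl http://localhost:11434/api/chat -d '{
  "model": "qwen2.5:3b",
  "messages": [{"role": "user", "content": "Hello!"}],
  "stream": false
}'

# Embedding vector from mxbai-embed-large
curl http://localhost:11434/api/embed -d '{
  "model": "mxbai-embed-large:335m",
  "input": "Ollama makes local inference simple."
}'
```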



Model Storage Locations

  • Windows: C:\Users\<YourUser>\.ollama\models\
  • Linux: ~/.ollama/models/
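Models can be several gigabytes each, so it is worth checking how much space the cache is using. On Linux, for example:

```
du -sh ~/.ollama/models
```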
