
Securing local AI by running Ollama in a Docker container

An advanced technique that uses Docker to isolate AI processes from your OS file system, keeping your data more secure.

Key Insights

Limitations for Mac M-Series Users

Docker Desktop on macOS currently cannot pass the M-series GPU through to containers, so containerized Ollama falls back to the CPU. Mac users who need maximum GPU performance are encouraged to run Ollama natively instead.
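For native use on macOS, one common route is the Homebrew package (the installer from ollama.com works just as well); these commands are illustrative and may differ on your setup:

    brew install ollama      # install the Ollama CLI and server natively
    ollama serve             # start the local server (uses the M-series GPU via Metal)
    ollama run deepseek-r1   # example: pull a model and chat with it in the terminal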

Benefits of Isolation Techniques

Running AI in Docker is not only more secure (the containerized process cannot reach your personal files unless you explicitly mount them), it also simplifies model management, since models live inside the container's volume instead of cluttering your main system libraries.
Step by Step

Preparing the Docker Environment for Ollama

  1. Install Docker Desktop on your operating system (Windows, Mac, or Linux).
  2. Open your terminal or the WSL (Ubuntu) application if you are using Windows.
  3. For Nvidia GPU users, install the Nvidia Container Toolkit from the terminal (via APT, or by following Nvidia's documentation) so Docker containers can access the GPU; a command sketch follows this list.
  4. Confirm that the Docker service is running before proceeding to the next step.
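As a rough sketch of step 3 on Ubuntu or WSL (the repository-setup step is abbreviated here; follow Nvidia's official Container Toolkit guide for the current commands):

    # First add Nvidia's apt repository as described in the Nvidia Container Toolkit docs, then:
    sudo apt-get update
    sudo apt-get install -y nvidia-container-toolkit
    sudo nvidia-ctk runtime configure --runtime=docker   # register the Nvidia runtime with Docker
    sudo systemctl restart docker                         # restart Docker so the change takes effect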

Running Ollama in a Secure Container

  1. Open your terminal/WSL.
  2. Enter the 'docker run' command with the '--gpus all' flag and a volume mount so downloaded models persist across container restarts (a full example follows this list).
  3. Publish port 11434:11434 so the Ollama API is reachable from the host.
  4. Use the '--security-opt' flag to restrict the container's privileges, further limiting what the AI process can do on the host OS.
  5. Press 'Enter' to begin the image download process and launch the container.
  6. Type the 'docker ps' command in the terminal to verify that the Ollama container status is 'Up' or currently running.
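Putting steps 2 through 4 together, a minimal command looks roughly like this; the volume name 'ollama' and the 'no-new-privileges' option are illustrative choices rather than requirements:

    # -d: run in the background; --gpus all: expose Nvidia GPUs to the container;
    # -v ollama:/root/.ollama: named volume so downloaded models persist;
    # -p 11434:11434: publish the Ollama API port; --security-opt: block privilege escalation.
    docker run -d --name ollama --gpus all \
      -v ollama:/root/.ollama \
      -p 11434:11434 \
      --security-opt no-new-privileges:true \
      ollama/ollama

    # Verify the container is running (the STATUS column should read 'Up ...'):
    docker ps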

Running AI Models (DeepSeek) in Docker

  1. Identify the name of the running Ollama container (usually named 'ollama').
  2. Type the 'docker exec -it' command followed by the container name and the command to run the model (e.g., deepseek-r1); see the example after this list.
  3. Wait for the model download process to complete inside the container.
  4. Start an AI chat session directly within the isolated terminal.
  5. Open 'Task Manager' or a GPU monitoring tool to observe performance spikes, confirming that the AI is utilizing your GPU hardware rather than the CPU.
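A concrete version of steps 2 through 5 might look like this, assuming the container is named 'ollama' and the 'deepseek-r1' tag exists in the Ollama model library:

    # Pull the model (on first run) and start an interactive chat inside the container:
    docker exec -it ollama ollama run deepseek-r1

    # Optional: from another terminal on the host, watch GPU utilization while chatting (Nvidia GPUs):
    nvidia-smi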
