Ollama & AI Setup

CICADA IR uses large language models to summarise investigation findings, identify attack patterns, and draft reports. The platform is designed to be lightweight and flexible — point it at a local Ollama instance or a cloud LLM and you are ready to go. No AI model ships with the VM, so you choose what fits your environment: install Ollama directly on the VM, run it on a separate host, or connect to Anthropic Claude.

Overview

Option | Privacy | Speed | Quality | Cost
------ | ------- | ----- | ------- | ----
Ollama (on the VM) | All data stays on the VM | Depends on VM resources / GPU | Good (model-dependent) | Free
Ollama (separate host) | All data stays on your network | Depends on host resources | Good (model-dependent) | Free
Anthropic Claude (cloud) | Data sent to Anthropic API | Fast | Excellent | Pay-per-use

Option 1: Install Ollama on the CICADA IR VM

If your hypervisor host has a GPU, this is the best option — you can pass the GPU through to the VM for fast local inference with all data staying on a single appliance.

Step 1: Open a console to the VM

Access the CICADA IR VM console through your hypervisor:

  • VMware ESXi: vSphere Client > select the VM > Console tab
  • VMware Workstation/Fusion: Double-click the VM in the library
  • Proxmox: Select the VM > Console in the web UI
  • VirtualBox: Double-click the VM or right-click > Show
  • Hyper-V: Right-click the VM > Connect

Log in with the cicada-admin user and the password set during initial setup.

Step 2: Install Ollama

curl -fsSL https://ollama.com/install.sh | sh

Verify the installation:

ollama --version

Step 3: Pull a model

Model | Size | RAM needed | Best for
----- | ---- | ---------- | --------
llama3.1:8b | 4.7 GB | 8 GB | Good balance of speed and quality (recommended default)
mistral:7b | 4.1 GB | 8 GB | Fast, good for lighter hardware
gemma2:9b | 5.4 GB | 10 GB | Strong reasoning for its size
qwen2.5:7b | 4.4 GB | 8 GB | Good all-rounder, strong at structured output
llama3.1:70b | 40 GB | 48 GB | Best local quality (needs a GPU with sufficient VRAM)

# Pull the recommended default
ollama pull llama3.1:8b

# Verify
ollama list

Step 4: Configure in CICADA IR

  1. Open the CICADA IR web interface in your browser
  2. Navigate to Settings > AI Configuration
  3. Set Provider to Ollama
  4. Set Ollama URL to http://localhost:11434
  5. Set Model to the model you pulled (e.g., llama3.1:8b)
  6. Click Test Connection to verify
  7. Click Save
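
The Test Connection check can also be reproduced manually from the VM console. A minimal Python sketch, assuming the standard Ollama REST API (`/api/tags`); the URL and model name shown in the comments are illustrative:

```python
import json
import urllib.request

def fetch_tags(base_url: str) -> dict:
    """Fetch the installed-model list from Ollama's /api/tags endpoint."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)

def model_installed(tags_response: dict, model_name: str) -> bool:
    """Return True if model_name appears in an /api/tags response."""
    return any(m.get("name") == model_name
               for m in tags_response.get("models", []))

# Example (run on the VM; names are illustrative):
#   tags = fetch_tags("http://localhost:11434")
#   model_installed(tags, "llama3.1:8b")
```

If `model_installed` returns False for the name you entered in the settings, compare it against the output of `ollama list` for an exact match.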

GPU passthrough for faster inference

If your hypervisor host has an NVIDIA GPU, passing it through to the CICADA IR VM will dramatically speed up AI analysis. The steps depend on your hypervisor:

Proxmox

See the Proxmox VE Guide for detailed GPU passthrough instructions (IOMMU, VFIO, PCI passthrough).

VMware ESXi

  1. In vSphere, edit the VM settings
  2. Click Add New Device > PCI Device
  3. Select the NVIDIA GPU from the list
  4. Save and boot the VM

After passthrough — install NVIDIA drivers on the VM

Via the hypervisor console:

sudo apt update
sudo apt install -y nvidia-driver-550
sudo reboot

# After reboot, verify the GPU is detected
nvidia-smi

# Ollama will automatically use the GPU
ollama run llama3.1:8b "test"

Note: VMware Workstation, Fusion, VirtualBox, and Hyper-V do not support GPU passthrough. On these platforms, use Option 2 (Ollama on a separate host with a GPU) for GPU-accelerated inference.


Option 2: Ollama on a separate host

If you cannot install Ollama on the VM or want to run it on dedicated hardware (e.g., a workstation with a GPU), install Ollama on a separate machine and point CICADA IR to it over the network.

Install Ollama on the host

# Linux / macOS
curl -fsSL https://ollama.com/install.sh | sh

# Windows — download from https://ollama.com/download

Pull a model

ollama pull llama3.1:8b

Expose Ollama to the network

By default, Ollama only listens on localhost. To allow the CICADA IR VM to connect:

Linux (systemd)

sudo systemctl edit ollama

# Add these lines in the editor that opens:
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"

# Save, then restart
sudo systemctl restart ollama

macOS

launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
# Quit and reopen Ollama from the menu bar

Windows

Set a system environment variable OLLAMA_HOST to 0.0.0.0:11434 via System Properties > Environment Variables, then restart Ollama.

Verify it's accessible:

# From any machine on the same network
curl http://<ollama-host-ip>:11434/api/tags

Configure in CICADA IR

  1. Open the CICADA IR web interface
  2. Navigate to Settings > AI Configuration
  3. Set Provider to Ollama
  4. Set Ollama URL to http://<ollama-host-ip>:11434
  5. Set Model to the model you pulled (e.g., llama3.1:8b)
  6. Click Test Connection to verify
  7. Click Save

Ensure the CICADA IR VM can reach the Ollama host on port 11434. If the machines are on different networks, open port 11434 on any firewalls between them (see Network Requirements).
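
If curl is unavailable on the VM, reachability can be checked with a short Python sketch using only the standard library (substitute your Ollama host's IP, as for `<ollama-host-ip>` above):

```python
import socket

def ollama_reachable(host: str, port: int = 11434, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: ollama_reachable("192.0.2.10")  # illustrative IP
```

A False result points to a firewall rule or to Ollama still listening only on localhost, rather than to a CICADA IR configuration problem.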


Option 3: Cloud LLM — Anthropic Claude

For the best analysis quality without managing local infrastructure, CICADA IR supports Anthropic Claude as a cloud AI provider (Professional and Enterprise tiers).

  1. Obtain an API key from console.anthropic.com
  2. In CICADA IR, navigate to Settings > AI Configuration
  3. Set Provider to Anthropic Claude
  4. Paste your API key
  5. Select the model (recommended: claude-sonnet-4-6)
  6. Click Test Connection and then Save

The VM must have outbound HTTPS access to api.anthropic.com on port 443. See Network Requirements.
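
Outbound HTTPS access can be verified from the VM console before saving the settings. A minimal sketch, again standard library only; a successful TLS handshake also confirms that any intercepting proxy presents a certificate the VM trusts:

```python
import socket
import ssl

def https_reachable(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection and TLS handshake to host:port succeed."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ssl.create_default_context().wrap_socket(
                    sock, server_hostname=host):
                return True
    except OSError:
        return False

# Example: https_reachable("api.anthropic.com")
```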


Choosing the right setup

  • Hypervisor with GPU (ESXi, Proxmox): Install Ollama on the VM with GPU passthrough for the best combination of privacy and performance.
  • Air-gapped / high-security environments: Run Ollama on the VM or a separate host inside your secure network. No data leaves your environment.
  • Desktop hypervisor (Fusion, VirtualBox, Workstation): Run Ollama on the host machine and connect over the network, since GPU passthrough is not available.
  • No spare hardware for Ollama: Use Anthropic Claude — no local infrastructure needed, just an API key.
  • Best quality analysis: Use Anthropic Claude for the most thorough investigation summaries and attack pattern identification.

Model management

Run these on whichever machine has Ollama installed (the VM console or your separate host):

# List installed models
ollama list

# Remove a model you no longer need
ollama rm mistral:7b

# Update a model to the latest version
ollama pull llama3.1:8b

Troubleshooting

Issue | Solution
----- | --------
Test Connection fails in CICADA IR settings | Verify the Ollama URL is correct. If running on the VM, use http://localhost:11434. If on a separate host, use http://<host-ip>:11434 and confirm the host is listening: curl http://<host-ip>:11434/api/tags.
AI analysis is very slow | The model may be too large for the available RAM, causing it to swap. Try a smaller model (e.g., mistral:7b) or add a GPU via passthrough.
"Model not found" error | The model name in CICADA IR settings must exactly match what's installed. Run ollama list and copy the exact name into the settings.
GPU not detected after passthrough | Ensure NVIDIA drivers are installed on the VM (sudo apt install -y nvidia-driver-550) and reboot. Verify with nvidia-smi.
Cloud LLM returns an error | Verify the API key in Settings > AI Configuration. Check that the VM can reach api.anthropic.com on port 443, and that your API key has sufficient credits at console.anthropic.com.

Next steps