Ollama & AI Setup
CICADA IR uses large language models to summarise investigation findings, identify attack patterns, and draft reports. The platform is designed to be lightweight and flexible — point it at a local Ollama instance or a cloud LLM and you are ready to go. No AI model ships with the VM, so you choose what fits your environment: install Ollama directly on the VM, run it on a separate host, or connect to Anthropic Claude.
Overview
| Option | Privacy | Speed | Quality | Cost |
|---|---|---|---|---|
| Ollama (on the VM) | All data stays on the VM | Depends on VM resources / GPU | Good (model-dependent) | Free |
| Ollama (separate host) | All data stays on your network | Depends on host resources | Good (model-dependent) | Free |
| Anthropic Claude (cloud) | Data sent to Anthropic API | Fast | Excellent | Pay-per-use |
Option 1: Install Ollama on the CICADA IR VM
If your hypervisor host has a GPU, this is the best option — you can pass the GPU through to the VM for fast local inference with all data staying on a single appliance.
Step 1: Open a console to the VM
Access the CICADA IR VM console through your hypervisor:
- VMware ESXi: vSphere Client > select the VM > Console tab
- VMware Workstation/Fusion: Double-click the VM in the library
- Proxmox: Select the VM > Console in the web UI
- VirtualBox: Double-click the VM or right-click > Show
- Hyper-V: Right-click the VM > Connect
Log in with the cicada-admin user and the password set during initial setup.
Step 2: Install Ollama
```
curl -fsSL https://ollama.com/install.sh | sh
```

Verify the installation:

```
ollama --version
```

Step 3: Pull a model
| Model | Size | RAM needed | Best for |
|---|---|---|---|
| llama3.1:8b | 4.7 GB | 8 GB | Good balance of speed and quality (recommended default) |
| mistral:7b | 4.1 GB | 8 GB | Fast, good for lighter hardware |
| gemma2:9b | 5.4 GB | 10 GB | Strong reasoning for its size |
| qwen2.5:7b | 4.4 GB | 8 GB | Good all-rounder, strong at structured output |
| llama3.1:70b | 40 GB | 48 GB | Best local quality (needs a GPU with sufficient VRAM) |
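Before pulling a model, it can help to confirm the VM actually has the headroom listed in the "RAM needed" column. A quick check on a Linux guest (reads /proc/meminfo, so Linux-only):

```shell
# Report available memory so you can compare it against the
# "RAM needed" column above (Linux-only: reads /proc/meminfo).
AVAIL_GB=$(awk '/MemAvailable/ {printf "%.1f", $2/1024/1024}' /proc/meminfo)
echo "Available RAM: ${AVAIL_GB} GB"
```

If the figure is below the model's requirement, pick a smaller model or increase the VM's memory allocation first.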
```
# Pull the recommended default
ollama pull llama3.1:8b

# Verify
ollama list
```

Step 4: Configure in CICADA IR
- Open the CICADA IR web interface in your browser
- Navigate to Settings > AI Configuration
- Set Provider to Ollama
- Set Ollama URL to http://localhost:11434
- Set Model to the model you pulled (e.g., llama3.1:8b)
- Click Test Connection to verify
- Click Save
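If Test Connection fails, you can sanity-check the same endpoint from the VM's shell. A minimal sketch — it probes Ollama's standard /api/tags endpoint and assumes the default URL from the settings above:

```shell
# Probe the Ollama API endpoint that CICADA IR will use.
# OLLAMA_URL matches the value entered in the settings above.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

if curl -sf --max-time 5 "$OLLAMA_URL/api/tags" > /dev/null 2>&1; then
  STATUS="ok"
else
  STATUS="unreachable"
fi
echo "Ollama at $OLLAMA_URL: $STATUS"
```

"unreachable" here usually means the Ollama service is not running (check with systemctl status ollama) or the URL is wrong.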
GPU passthrough for faster inference
If your hypervisor host has an NVIDIA GPU, passing it through to the CICADA IR VM will dramatically speed up AI analysis. The steps depend on your hypervisor:
Proxmox
See the Proxmox VE Guide for detailed GPU passthrough instructions (IOMMU, VFIO, PCI passthrough).
VMware ESXi
- In vSphere, edit the VM settings
- Click Add New Device > PCI Device
- Select the NVIDIA GPU from the list
- Save and boot the VM
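Once the VM boots with the PCI device attached, you can confirm the guest actually sees it before installing drivers (assumes a Linux guest; uses lspci from pciutils):

```shell
# Check whether the passed-through NVIDIA GPU is visible on the PCI bus.
if command -v lspci > /dev/null 2>&1; then
  RESULT=$(lspci | grep -i nvidia || echo "no NVIDIA device visible")
else
  RESULT="pciutils not installed (sudo apt install -y pciutils)"
fi
echo "$RESULT"
```

If no NVIDIA device is visible, revisit the passthrough configuration in the hypervisor before troubleshooting drivers.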
After passthrough — install NVIDIA drivers on the VM
Via the hypervisor console:
```
sudo apt update
sudo apt install -y nvidia-driver-550
sudo reboot

# After reboot, verify the GPU is detected
nvidia-smi

# Ollama will automatically use the GPU
ollama run llama3.1:8b "test"
```

Note: VMware Workstation, Fusion, VirtualBox, and Hyper-V do not support GPU passthrough. On these platforms, use Option 2 (Ollama on a separate host with a GPU) for GPU-accelerated inference.
Option 2: Ollama on a separate host
If you cannot install Ollama on the VM or want to run it on dedicated hardware (e.g., a workstation with a GPU), install Ollama on a separate machine and point CICADA IR to it over the network.
Install Ollama on the host
```
# Linux / macOS
curl -fsSL https://ollama.com/install.sh | sh

# Windows — download from https://ollama.com/download
```

Pull a model

```
ollama pull llama3.1:8b
```

Expose Ollama to the network
By default, Ollama only listens on localhost. To allow the CICADA IR VM to connect:
Linux (systemd)
```
sudo systemctl edit ollama
```

Add these lines in the editor that opens:

```
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
```

Save, then restart:

```
sudo systemctl restart ollama
```

macOS

```
launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
```

Then quit and reopen Ollama from the menu bar.

Windows

Set a system environment variable OLLAMA_HOST to 0.0.0.0:11434 via System Properties > Environment Variables, then restart Ollama.
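To confirm the OLLAMA_HOST change took effect, check which address the service is bound to on the Ollama host. A Linux sketch using ss (on Windows, netstat -an serves the same purpose):

```shell
# A bind address of 0.0.0.0:11434 (or *:11434) means Ollama accepts
# remote connections; 127.0.0.1:11434 means it is still loopback-only.
LISTEN=$(ss -tln 2>/dev/null | grep ':11434' || true)
MSG="${LISTEN:-nothing is listening on port 11434}"
echo "$MSG"
```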
Verify it's accessible:
```
# From any machine on the same network
curl http://<ollama-host-ip>:11434/api/tags
```

Configure in CICADA IR
- Open the CICADA IR web interface
- Navigate to Settings > AI Configuration
- Set Provider to Ollama
- Set Ollama URL to http://<ollama-host-ip>:11434
- Set Model to the model you pulled (e.g., llama3.1:8b)
- Click Test Connection to verify
- Click Save
Ensure the CICADA IR VM can reach the Ollama host on port 11434. If the machines are on different networks, open port 11434 on any firewalls between them (see Network Requirements).
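A quick way to test raw TCP reachability from the CICADA IR VM, without involving Ollama itself, is bash's /dev/tcp pseudo-device (HOST is a placeholder — substitute your Ollama host's IP):

```shell
# Test TCP reachability of the Ollama host from the CICADA IR VM.
# HOST is a placeholder IP — substitute your Ollama host's address.
HOST="${HOST:-192.168.1.50}"
PORT=11434
if timeout 3 bash -c "exec 3<>/dev/tcp/$HOST/$PORT" 2>/dev/null; then
  REACH="open"
else
  REACH="closed or filtered"
fi
echo "$HOST:$PORT is $REACH"
```

"closed or filtered" points to a firewall rule or a wrong IP rather than an Ollama configuration problem.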
Option 3: Cloud LLM — Anthropic Claude
For the best analysis quality without managing local infrastructure, CICADA IR supports Anthropic Claude as a cloud AI provider (Professional and Enterprise tiers).
- Obtain an API key from console.anthropic.com
- In CICADA IR, navigate to Settings > AI Configuration
- Set Provider to Anthropic Claude
- Paste your API key
- Select the model (recommended: claude-sonnet-4-6)
- Click Test Connection and then Save
The VM must have outbound HTTPS access to api.anthropic.com on port 443. See Network Requirements.
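You can verify outbound HTTPS to the API endpoint from the VM's shell before configuring the key. A minimal sketch — it only checks connectivity and TLS, not whether the key is valid:

```shell
# Check that the VM can reach the Anthropic API over HTTPS.
# Any 4xx response here still proves connectivity (the request
# arrived and was rejected, e.g. for a missing API key).
HTTP_CODE=$(curl -s --max-time 10 -o /dev/null -w "%{http_code}" \
  https://api.anthropic.com/v1/messages) || true
echo "HTTP status: ${HTTP_CODE:-000} (000 means no connection)"
```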
Choosing the right setup
- Hypervisor with GPU (ESXi, Proxmox): Install Ollama on the VM with GPU passthrough for the best combination of privacy and performance.
- Air-gapped / high-security environments: Run Ollama on the VM or a separate host inside your secure network. No data leaves your environment.
- Desktop hypervisor (Fusion, VirtualBox, Workstation): Run Ollama on the host machine and connect over the network, since GPU passthrough is not available.
- No spare hardware for Ollama: Use Anthropic Claude — no local infrastructure needed, just an API key.
- Best quality analysis: Use Anthropic Claude for the most thorough investigation summaries and attack pattern identification.
Model management
Run these on whichever machine has Ollama installed (the VM console or your separate host):
```
# List installed models
ollama list

# Remove a model you no longer need
ollama rm mistral:7b

# Update a model to the latest version
ollama pull llama3.1:8b
```

Troubleshooting
| Issue | Solution |
|---|---|
| Test Connection fails in CICADA IR settings | Verify the Ollama URL is correct. If running on the VM, use http://localhost:11434. If on a separate host, use http://<host-ip>:11434 and confirm the host is listening: curl http://<host-ip>:11434/api/tags. |
| AI analysis is very slow | The model may be too large for the available RAM, causing it to swap. Try a smaller model (e.g., mistral:7b) or add a GPU via passthrough. |
| "Model not found" error | The model name in CICADA IR settings must exactly match what's installed. Run ollama list and copy the exact name into the settings. |
| GPU not detected after passthrough | Ensure NVIDIA drivers are installed on the VM (sudo apt install -y nvidia-driver-550) and reboot. Verify with nvidia-smi. |
| Cloud LLM returns an error | Verify the API key in Settings > AI Configuration. Check that the VM can reach api.anthropic.com on port 443, and that your API key has sufficient credits at console.anthropic.com. |
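For the "Model not found" case, this sketch compares a configured name against what is actually installed. Run it on the machine hosting Ollama; MODEL is a placeholder for the name in your CICADA IR settings:

```shell
# Check whether the model name configured in CICADA IR is installed.
MODEL="${MODEL:-llama3.1:8b}"
if command -v ollama > /dev/null 2>&1; then
  # ollama list prints a header row, then one model name per line.
  if ollama list | awk 'NR > 1 {print $1}' | grep -Fxq "$MODEL"; then
    RESULT="installed"
  else
    RESULT="missing - run: ollama pull $MODEL"
  fi
else
  RESULT="ollama is not installed on this machine"
fi
echo "$MODEL: $RESULT"
```

Note that the match must be exact, including the tag (llama3.1:8b and llama3.1:latest are different entries).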
Next steps
- Getting Started — Create your first investigation
- Network Requirements — Firewall rules for AI providers
- Troubleshooting — General troubleshooting