Get Started with Axon
From zero to running in under five minutes. Everything you need to deploy your self-hosted AI command center.
Quick Start
$ git clone https://github.com/brandonkorous/axon.git
$ cd axon
$ cp .env.example .env
$ docker compose up
Clone the repository and copy the example environment file. Add your API keys to .env, then start everything with Docker Compose.
The frontend will be available at localhost:3000 and the backend API at localhost:8000.
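To confirm both services came up, here is a minimal standard-library check. The ports are the defaults above; the `port_open` helper is our own, not part of Axon:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# After `docker compose up`, both should report True:
print("frontend:", port_open("localhost", 3000))
print("backend: ", port_open("localhost", 8000))
```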
Prerequisites
- Docker and Docker Compose installed
- At least 8 GB RAM (16 GB+ recommended for local LLMs)
- An API key for your preferred LLM provider (Anthropic, OpenAI) — or Ollama for fully local inference
- CUDA-compatible GPU with 8 GB+ VRAM (optional, for local models)
First Steps
1. Open the dashboard: navigate to `localhost:3000` in your browser.
2. Create an organization: choose from built-in templates — Startup, Family, Job Hunt, Creator, or Student — each with curated specialist advisors.
3. Talk to Axon: start a conversation. Axon routes your request to the right advisor and builds memory over time.
4. Enable local LLMs (optional): run `docker compose --profile local-llm up` to start Ollama for fully local inference.
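If you go fully local, point the default model at Ollama in your `.env`. The exact model tag below is illustrative, not a project default; use whatever model you have pulled:

```
# .env — fully local setup (model tag is an example, not a default)
DEFAULT_MODEL=ollama/llama3.1
OLLAMA_BASE_URL=http://ollama:11434
```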
Configuration
Key environment variables in your .env file:
| Variable | Default / Example | Description |
|---|---|---|
| DEFAULT_MODEL | anthropic/claude-sonnet-4-20250514 | LLM provider and model |
| ANTHROPIC_API_KEY | sk-ant-... | Required for Claude models |
| OLLAMA_BASE_URL | http://ollama:11434 | Local LLM endpoint |
| DB_ENCRYPTION_KEY | (generate with Fernet) | Encrypts OAuth tokens |
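A Fernet key is 32 random bytes, URL-safe base64-encoded (44 characters). If you don't have the `cryptography` package installed, the standard library produces an equivalent key:

```python
import base64
import os

# Equivalent to cryptography's Fernet.generate_key():
# 32 random bytes, URL-safe base64-encoded.
key = base64.urlsafe_b64encode(os.urandom(32)).decode()
print(key)  # paste this into DB_ENCRYPTION_KEY
```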
Explore the Docs
API Reference
REST and WebSocket endpoints for the Axon backend.
Advisor SDK
Build custom specialist advisors with their own personas and vaults.
Memory Trees
How neural memory works: vaults, recall, learning, and consolidation.
Configuration
Model providers, org templates, voice settings, and environment variables.
Architecture
Axon Agent
The central orchestrator. Routes requests to specialist advisors based on organization type and query context.
Memory Layer
A local LLM (Ollama) handles memory operations — recall, learning, and consolidation — while a reasoning model (Claude, GPT, or a local model) handles conversations.
Infrastructure
Docker Compose orchestrates the backend (FastAPI), frontend (React), and optional services (Ollama, SearXNG).
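A sketch of how these services might be wired in `docker-compose.yml`. Service names, images, and build paths are assumptions based on the defaults in this guide, not the project's actual file:

```yaml
services:
  backend:            # FastAPI API on :8000
    build: ./backend
    ports: ["8000:8000"]
    env_file: .env
  frontend:           # React UI on :3000
    build: ./frontend
    ports: ["3000:3000"]
    depends_on: [backend]
  ollama:             # optional local LLM, enabled with --profile local-llm
    image: ollama/ollama
    profiles: [local-llm]
    ports: ["11434:11434"]
```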