mirror of
https://github.com/infinition/Bjorn.git
synced 2026-03-17 09:31:04 +00:00
# BJORN — LLM Bridge, MCP Server & LLM Orchestrator

## Complete architecture, operation, commands, fallbacks

---

## Table of contents

1. [Overview](#1-overview)
2. [Created / modified files](#2-created--modified-files)
3. [LLM Bridge (`llm_bridge.py`)](#3-llm-bridge-llm_bridgepy)
4. [MCP Server (`mcp_server.py`)](#4-mcp-server-mcp_serverpy)
5. [LLM Orchestrator (`llm_orchestrator.py`)](#5-llm-orchestrator-llm_orchestratorpy)
6. [Orchestrator & Scheduler integration](#6-orchestrator--scheduler-integration)
7. [Web Utils LLM (`web_utils/llm_utils.py`)](#7-web-utils-llm-web_utilsllm_utilspy)
8. [EPD comment integration (`comment.py`)](#8-epd-comment-integration-commentpy)
9. [Configuration (`shared.py`)](#9-configuration-sharedpy)
10. [HTTP Routes (`webapp.py`)](#10-http-routes-webapppy)
11. [Web interfaces](#11-web-interfaces)
12. [Startup (`Bjorn.py`)](#12-startup-bjornpy)
13. [LaRuche / LAND Protocol compatibility](#13-laruche--land-protocol-compatibility)
14. [Optional dependencies](#14-optional-dependencies)
15. [Quick activation & configuration](#15-quick-activation--configuration)
16. [Complete API endpoint reference](#16-complete-api-endpoint-reference)
17. [Queue priority system](#17-queue-priority-system)
18. [Fallbacks & graceful degradation](#18-fallbacks--graceful-degradation)
19. [Call sequences](#19-call-sequences)

---

## 1. Overview

```
┌─────────────────────────────────────────────────────────────────────┐
│                             BJORN (RPi)                             │
│                                                                     │
│  ┌─────────────┐   ┌──────────────────┐   ┌─────────────────────┐   │
│  │ Core BJORN  │   │    MCP Server    │   │        Web UI       │   │
│  │ (unchanged) │   │ (mcp_server.py)  │   │  /chat.html         │   │
│  │             │   │ 7 exposed tools  │   │  /mcp-config.html   │   │
│  │ comment.py  │   │ HTTP SSE / stdio │   │  ↳ Orch Log button  │   │
│  │ ↕ LLM hook  │   │                  │   │                     │   │
│  └──────┬──────┘   └────────┬─────────┘   └──────────┬──────────┘   │
│         └───────────────────┴────────────────────────┘              │
│                             │                                       │
│  ┌──────────────────────────▼─────────────────────────────────┐     │
│  │                 LLM Bridge (llm_bridge.py)                 │     │
│  │                  Singleton · Thread-safe                   │     │
│  │                                                            │     │
│  │  Automatic cascade:                                        │     │
│  │    1. LaRuche node (LAND/mDNS → HTTP POST /infer)          │     │
│  │    2. Local Ollama (HTTP POST /api/chat)                   │     │
│  │    3. External API (Anthropic / OpenAI / OpenRouter)       │     │
│  │    4. None (→ fallback templates in comment.py)            │     │
│  │                                                            │     │
│  │ Agentic tool-calling loop (stop_reason=tool_use, ≤6 turns) │     │
│  │ _BJORN_TOOLS: 7 tools in Anthropic format                  │     │
│  └──────────────────────────┬─────────────────────────────────┘     │
│                             │                                       │
│  ┌──────────────────────────▼─────────────────────────────────┐     │
│  │             LLM Orchestrator (llm_orchestrator.py)         │     │
│  │                                                            │     │
│  │ mode = none       → LLM has no role in scheduling          │     │
│  │ mode = advisor    → LLM suggests 1 action/cycle (prio 85)  │     │
│  │ mode = autonomous → own thread, loop + tools (prio 82)     │     │
│  │                                                            │     │
│  │ Fingerprint (hosts↑, vulns↑, creds↑, queue_id↑)            │     │
│  │   → skip LLM if nothing new (token savings)                │     │
│  └──────────────────────────┬─────────────────────────────────┘     │
│                             │                                       │
│  ┌──────────────────────────▼─────────────────────────────────┐     │
│  │                    Action Queue (SQLite)                   │     │
│  │ scheduler=40  normal=50  MCP=80  autonomous=82  advisor=85 │     │
│  └────────────────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────────────────┘
              ↕ mDNS _ai-inference._tcp.local. (zeroconf)
              ┌──────────────────────────────────────────┐
              │           LaRuche Swarm (LAN)            │
              │  Node A → Mistral 7B         :8419       │
              │  Node B → DeepSeek Coder     :8419       │
              │  Node C → Phi-3 Mini         :8419       │
              └──────────────────────────────────────────┘
```

|

**Design principles:**

- Everything is **disabled by default** — zero impact if not configured
- All dependencies are **optional** — silent import if missing
- **Systematic fallback** at every level — Bjorn never crashes because of the LLM
- The bridge is a **singleton** — one instance per process, thread-safe
- EPD comments preserve their **exact original behaviour** if LLM is disabled
- The LLM is the **brain** (decides what to do), the orchestrator is the **arms** (executes)
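
The one-instance-per-process guarantee can be pictured with a double-checked-locking `__new__` — an illustrative sketch only, the real `llm_bridge.py` may implement its singleton differently:

```python
import threading

class BridgeSketch:
    """Process-wide singleton sketch: every caller shares one instance."""
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        if cls._instance is None:           # fast path, no lock taken
            with cls._lock:                 # slow path, double-checked
                if cls._instance is None:
                    inst = super().__new__(cls)
                    inst._chat_histories = {}   # shared session store
                    cls._instance = inst
        return cls._instance

# Every call site gets the same object:
assert BridgeSketch() is BridgeSketch()
```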

---

## 2. Created / modified files

### Created files

| File | Approx. size | Role |
|------|-------------|------|
| `llm_bridge.py` | ~450 lines | LLM Singleton — backend cascade + agentic tool-calling loop |
| `mcp_server.py` | ~280 lines | FastMCP MCP Server — 7 Bjorn tools |
| `web_utils/llm_utils.py` | ~220 lines | LLM/MCP HTTP endpoints (web_utils pattern) |
| `llm_orchestrator.py` | ~410 lines | LLM Orchestrator — advisor & autonomous modes |
| `web/chat.html` | ~300 lines | Chat interface + Orch Log button |
| `web/mcp-config.html` | ~400 lines | LLM & MCP configuration page |

### Modified files

| File | What changed |
|------|-------------|
| `shared.py` | +45 config keys (LLM bridge, MCP, orchestrator) |
| `comment.py` | LLM hook in `get_comment()` — 12 lines added |
| `utils.py` | +1 entry in lazy WebUtils registry: `"llm_utils"` |
| `webapp.py` | +9 GET/POST routes in `_register_routes_once()` |
| `Bjorn.py` | LLM Bridge warm-up + conditional MCP server start |
| `orchestrator.py` | +`LLMOrchestrator` lifecycle + advisor call in background tasks |
| `action_scheduler.py` | +skip scheduler if LLM autonomous only (`llm_orchestrator_skip_scheduler`) |
| `requirements.txt` | +3 comment lines (optional dependencies documented) |

---

## 3. LLM Bridge (`llm_bridge.py`)

### Internal architecture

```
LLMBridge (Singleton)
├── __init__()                   Initialises singleton, launches LaRuche discovery
├── complete()                   Main API — cascades all backends
│   └── tools=None/[...]         Optional param to enable tool-calling
├── generate_comment()           Generates a short EPD comment (≤80 tokens)
├── chat()                       Stateful chat with per-session history
│   └── tools=_BJORN_TOOLS       if llm_chat_tools_enabled=True
├── clear_history()              Clears a session's history
├── status()                     Returns bridge state (for the UI)
│
├── _start_laruche_discovery()   Starts mDNS thread in background
├── _discover_laruche_mdns()     Listens to _ai-inference._tcp.local. continuously
│
├── _call_laruche()              Backend 1  — POST http://[node]:8419/infer
├── _call_ollama()               Backend 2  — POST http://localhost:11434/api/chat
├── _call_anthropic()            Backend 3a — POST api.anthropic.com + AGENTIC LOOP
│   └── loop ≤6 turns: send → tool_use → execute → feed result → repeat
├── _call_openai_compat()        Backend 3b — POST [base_url]/v1/chat/completions
│
├── _execute_tool(name, inputs)  Dispatches to mcp_server._impl_*
│   └── gate: checks mcp_allowed_tools before executing
│
└── _build_system_prompt()       Builds system prompt with live Bjorn context

_BJORN_TOOLS : List[Dict]        Anthropic-format definitions for the 7 MCP tools
```


### _BJORN_TOOLS — full list

```python
_BJORN_TOOLS = [
    {"name": "get_hosts",           "description": "...", "input_schema": {...}},
    {"name": "get_vulnerabilities", ...},
    {"name": "get_credentials",     ...},
    {"name": "get_action_history",  ...},
    {"name": "get_status",          ...},
    {"name": "run_action",          ...},  # gated by mcp_allowed_tools
    {"name": "query_db",            ...},  # SELECT only
]
```


### Backend cascade

```
llm_backend = "auto"     → LaRuche → Ollama → API → None
llm_backend = "laruche"  → LaRuche only
llm_backend = "ollama"   → Ollama only
llm_backend = "api"      → External API only
```

At each step, if a backend fails (timeout, network error, missing model), the next one is tried **silently**. If all fail, `complete()` returns `None`.

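In `auto` mode the cascade reduces to a simple loop over backends. A minimal sketch with stub backends — `CascadeSketch` and the three stub functions are hypothetical; the real bridge dispatches to `_call_laruche`, `_call_ollama` and the API callers:

```python
class CascadeSketch:
    """Sketch of the backend cascade: try each backend in order, swallow
    errors silently, return None when everything fails."""

    def __init__(self, backends):
        self.backends = list(backends)   # callables: messages -> str | None

    def complete(self, messages):
        for call in self.backends:
            try:
                text = call(messages)
                if text is not None:
                    return text          # first backend that answers wins
            except Exception:
                continue                 # silent fallthrough to next backend
        return None                      # all failed → caller falls back

def laruche(msgs):  raise TimeoutError("node unreachable")   # backend 1 down
def ollama(msgs):   return None                              # backend 2: no model
def api(msgs):      return "From the API backend"            # backend 3 answers

bridge = CascadeSketch([laruche, ollama, api])
print(bridge.complete([{"role": "user", "content": "hi"}]))  # → From the API backend
```

With only failing backends configured, `complete()` returns `None` and the caller (e.g. `comment.py`) takes its template fallback.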

### Agentic tool-calling loop (`_call_anthropic`)

When `tools` is passed to `complete()`, the Anthropic backend enters agentic mode:

```
_call_anthropic(messages, system, tools, max_tokens, timeout)
│
├─ POST /v1/messages {tools: [...]}
│
├─ [stop_reason = "tool_use"]
│     for each tool_use block:
│         result = _execute_tool(name, inputs)
│         append {role: "tool", tool_use_id: ..., content: result}
│     POST /v1/messages [messages + tool results]   ← next turn
│
└─ [stop_reason = "end_turn"] → returns final text
   [≥6 turns]                 → returns partial text + warning
```

`_execute_tool()` dispatches directly to `mcp_server._impl_*` (no network), checking `mcp_allowed_tools` for `run_action`.

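Stripped of HTTP details, the loop looks like this — a sketch with a faked backend, where `post` stands in for one `/v1/messages` round-trip and the message shapes are simplified rather than the exact Anthropic wire format:

```python
def agentic_loop(post, execute_tool, messages, max_turns=6):
    """Sketch of the tool-use loop: keep calling the model, executing any
    requested tools and feeding results back, until it stops asking."""
    for _ in range(max_turns):
        reply = post(messages)                        # one model turn
        if reply["stop_reason"] != "tool_use":
            return reply["text"]                      # end_turn → final answer
        for call in reply["tool_calls"]:
            result = execute_tool(call["name"], call["inputs"])
            messages.append({"role": "tool",
                             "tool_use_id": call["id"],
                             "content": result})      # feed result to next turn
    return reply.get("text", "") + " [truncated: tool-turn limit reached]"

# Fake backend: first turn requests a tool, second turn answers.
turns = iter([
    {"stop_reason": "tool_use", "tool_calls": [
        {"id": "t1", "name": "get_hosts", "inputs": {"alive_only": True}}]},
    {"stop_reason": "end_turn", "text": "2 hosts are alive."},
])
out = agentic_loop(lambda msgs: next(turns),
                   lambda name, inputs: '[{"ip": "10.0.0.5"}]',
                   [{"role": "user", "content": "which hosts are up?"}])
print(out)  # → 2 hosts are alive.
```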
### Tool-calling in chat (`chat()`)

If `llm_chat_tools_enabled = True`, the chat passes `tools=_BJORN_TOOLS` to the backend, letting the LLM answer with real-time data (hosts, vulns, creds…) rather than relying only on its training knowledge.

### Chat history

- Each session has its own history (key = `session_id`)
- Special session `"llm_orchestrator"`: contains the autonomous orchestrator's reasoning
- Max size configurable: `llm_chat_history_size` (default: 20 messages)
- History is **in-memory only** — not persisted across restarts
- Thread-safe via `_hist_lock`
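
The documented behaviour (bounded, in-memory, lock-guarded) maps naturally onto a per-session `deque` — an assumed structure, not the actual implementation:

```python
import threading
from collections import defaultdict, deque

class HistorySketch:
    """Sketch of per-session chat history: bounded, in-memory, guarded
    by a lock (mirroring the documented _hist_lock behaviour)."""

    def __init__(self, max_messages=20):                 # llm_chat_history_size
        self._hist_lock = threading.Lock()
        self._chat_histories = defaultdict(
            lambda: deque(maxlen=max_messages))          # oldest dropped first

    def append(self, session_id, role, content):
        with self._hist_lock:
            self._chat_histories[session_id].append(
                {"role": role, "content": content})

    def clear_history(self, session_id):
        with self._hist_lock:
            self._chat_histories.pop(session_id, None)   # session starts empty

h = HistorySketch(max_messages=2)
for i in range(3):
    h.append("tab-1", "user", f"msg {i}")
print(len(h._chat_histories["tab-1"]))  # → 2 (oldest message evicted)
```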

---

## 4. MCP Server (`mcp_server.py`)

### What is MCP?

The **Model Context Protocol** (Anthropic) is an open protocol that lets AI agents (Claude Desktop, custom agents, etc.) use external tools via a standardised interface.

By enabling Bjorn's MCP server, **any MCP client can query and control Bjorn** — without knowing the internal DB structure.

### Exposed tools

| Tool | Arguments | Description |
|------|-----------|-------------|
| `get_hosts` | `alive_only: bool = True` | Returns discovered hosts (IP, MAC, hostname, OS, ports) |
| `get_vulnerabilities` | `host_ip: str = ""`, `limit: int = 100` | Returns discovered CVE vulnerabilities |
| `get_credentials` | `service: str = ""`, `limit: int = 100` | Returns captured credentials (SSH, FTP, SMB…) |
| `get_action_history` | `limit: int = 50`, `action_name: str = ""` | History of executed actions |
| `get_status` | *(none)* | Real-time state: mode, active action, counters |
| `run_action` | `action_name: str`, `target_ip: str`, `target_mac: str = ""` | Queues a Bjorn action (MCP priority = 80) |
| `query_db` | `sql: str`, `params: str = "[]"` | Free SELECT against the SQLite DB (read-only) |

**Security:** each tool checks `mcp_allowed_tools` — unlisted tools return a clean error. `query_db` rejects anything that is not a `SELECT`.

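A SELECT-only gate of the kind described can be sketched as follows — hypothetical, the real `_impl_query_db` check may differ:

```python
import re

def check_select_only(sql: str) -> str:
    """Return the SQL if it is a single plain SELECT, else raise."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:
        # A ";" left after trimming the trailing one means stacked statements.
        raise ValueError("multiple statements are not allowed")
    if not re.match(r"(?is)^\s*select\b", stripped):
        raise ValueError("only SELECT statements are allowed")
    return stripped

print(check_select_only("SELECT ip, hostname FROM hosts WHERE alive = 1"))
# A write attempt is refused:
try:
    check_select_only("DELETE FROM hosts")
except ValueError as e:
    print(e)  # → only SELECT statements are allowed
```

Executing the query on a connection opened in SQLite read-only mode (`file:...?mode=ro`) gives a second layer of protection regardless of what the string check misses.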

### `_impl_run_action` — priority detail

```python
_MCP_PRIORITY = 80  # > normal(50) > scheduler(40)

sd.db.queue_action(
    action_name=action_name,
    mac=mac,                 # resolved from hosts WHERE ip=? if not supplied
    ip=target_ip,
    priority=_MCP_PRIORITY,
    trigger="mcp",
    metadata={"decision_method": "mcp", "decision_origin": "mcp"},
)
sd.queue_event.set()         # wakes the orchestrator immediately
```

### Available transports

| Transport | Config | Usage |
|-----------|--------|-------|
| `http` (default) | `mcp_transport: "http"`, `mcp_port: 8765` | Accessible from any MCP client on LAN via SSE |
| `stdio` | `mcp_transport: "stdio"` | Claude Desktop, CLI agents |

---

## 5. LLM Orchestrator (`llm_orchestrator.py`)

The LLM Orchestrator turns Bjorn from a scriptable tool into an autonomous agent. It is entirely optional and can be disabled via `llm_orchestrator_mode = "none"`.

### Operating modes

| Mode | Config value | Operation |
|------|-------------|-----------|
| Disabled | `"none"` (default) | LLM plays no role in planning |
| Advisor | `"advisor"` | LLM consulted periodically, suggests 1 action |
| Autonomous | `"autonomous"` | Own thread, LLM observes + plans with tools |

### Internal architecture

```
LLMOrchestrator
├── start()                      Starts autonomous thread if mode=autonomous
├── stop()                       Stops thread (join 15s max)
├── restart_if_mode_changed()    Called from orchestrator.run() each iteration
├── is_active()                  True if autonomous thread is alive
│
├── [ADVISOR MODE]
│   advise()  → called from orchestrator._process_background_tasks()
│   ├── _build_snapshot()        → compact dict (hosts, vulns, creds, queue)
│   ├── LLMBridge().complete(prompt, system)
│   └── _apply_advisor_response(raw, allowed)
│       ├── parse JSON {"action": str, "target_ip": str, "reason": str}
│       ├── validate action ∈ allowed
│       └── db.queue_action(priority=85, trigger="llm_advisor")
│
└── [AUTONOMOUS MODE]
    _autonomous_loop()           Thread "LLMOrchestrator" (daemon)
    └── loop:
        _compute_fingerprint()   → (hosts, vulns, creds, max_queue_id)
        _has_actionable_change() → skip if nothing increased
        _run_autonomous_cycle()
        ├── filter tools: read-only always + run_action if in allowed
        ├── LLMBridge().complete(prompt, system, tools=[...])
        │   └── _call_anthropic() agentic loop
        │       → LLM calls run_action via tools
        │       → _execute_tool → _impl_run_action → queue
        └── if llm_orchestrator_log_reasoning=True:
            logger.info("[LLM_ORCH_REASONING]...")
            _push_to_chat() → "llm_orchestrator" session in LLMBridge
        sleep(llm_orchestrator_interval_s)
```


### Fingerprint and smart skip

```python
def _compute_fingerprint(self) -> tuple:
    # (host_count, vuln_count, cred_count, max_completed_queue_id)
    # — counts queried from the SQLite DB (abridged here)
    return (hosts, vulns, creds, last_id)

def _has_actionable_change(self, fp: tuple) -> bool:
    if self._last_fingerprint is None:
        return True  # first cycle always runs
    # Triggers ONLY if something INCREASED
    # hosts going offline → not actionable
    return any(fp[i] > self._last_fingerprint[i] for i in range(len(fp)))
```

**Token savings:** if `llm_orchestrator_skip_if_no_change = True` (default), the LLM cycle is skipped when no new hosts/vulns/creds have appeared and no action has completed since the last cycle.

### LLM priorities vs queue

```python
_ADVISOR_PRIORITY = 85     # advisor > MCP(80) > normal(50) > scheduler(40)
_AUTONOMOUS_PRIORITY = 82  # autonomous slightly below advisor
```

### Autonomous system prompt — example

```
"You are Bjorn's autonomous orchestrator, running on a Raspberry Pi network security tool.
Current state: 12 hosts discovered, 3 vulnerabilities, 1 credential.
Operation mode: ATTACK. Hard limit: at most 3 run_action calls per cycle.
Only these action names may be queued: NmapScan, SSHBruteforce, SMBScan.
Strategy: prioritise unexplored services, hosts with high port counts, and hosts with no recent scans.
Do not queue duplicate actions already pending or recently successful.
Use Norse references occasionally. Be terse and tactical."
```

### Advisor response format

```json
// Action recommended:
{"action": "NmapScan", "target_ip": "192.168.1.42", "reason": "unexplored host, 0 open ports known"}

// Nothing to do:
{"action": null}
```

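The parse-validate-queue path can be sketched as follows — `apply_advisor_response` and its arguments are illustrative stand-ins for `_apply_advisor_response` and `db.queue_action`:

```python
import json

ADVISOR_PRIORITY = 85

def apply_advisor_response(raw, allowed, queue_action):
    """Sketch: parse the advisor JSON, validate the action, queue it at
    priority 85. Any malformed or disallowed answer is silently dropped."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        return None                      # invalid JSON → skip this cycle
    action = data.get("action")
    if not action:
        return None                      # {"action": null} → nothing to do
    if action not in allowed:
        return None                      # disallowed action → ignored
    queue_action(action_name=action,
                 ip=data.get("target_ip", ""),
                 priority=ADVISOR_PRIORITY,
                 trigger="llm_advisor")
    return action

queued = []
apply_advisor_response(
    '{"action": "NmapScan", "target_ip": "192.168.1.42", "reason": "unexplored"}',
    allowed={"NmapScan", "SSHBruteforce"},
    queue_action=lambda **kw: queued.append(kw))
print(queued[0]["action_name"], queued[0]["priority"])  # → NmapScan 85
```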

### Reasoning log

When `llm_orchestrator_log_reasoning = True`:

- Full reasoning is logged via `logger.info("[LLM_ORCH_REASONING]...")`
- It is also injected into the `"llm_orchestrator"` session in `LLMBridge._chat_histories`
- Viewable in real time in `chat.html` via the **Orch Log** button

---

## 6. Orchestrator & Scheduler integration

### `orchestrator.py`

```python
# __init__
self.llm_orchestrator = None
self._init_llm_orchestrator()

# _init_llm_orchestrator()
if shared_data.config.get("llm_enabled") and shared_data.config.get("llm_orchestrator_mode") != "none":
    from llm_orchestrator import LLMOrchestrator
    self.llm_orchestrator = LLMOrchestrator(shared_data)
    self.llm_orchestrator.start()

# run() — each iteration
self._sync_llm_orchestrator()  # starts/stops thread according to runtime config

# _process_background_tasks()
if self.llm_orchestrator and mode == "advisor":
    self.llm_orchestrator.advise()
```

### `action_scheduler.py` — skip option

```python
# In run(), each iteration:
_llm_skip = bool(
    shared_data.config.get("llm_orchestrator_skip_scheduler", False)
    and shared_data.config.get("llm_orchestrator_mode") == "autonomous"
    and shared_data.config.get("llm_enabled", False)
)

if not _llm_skip:
    self._publish_all_upcoming()      # step 2: publish due actions
    self._evaluate_global_actions()   # step 3: global evaluation
    self.evaluate_all_triggers()      # step 4: per-host triggers
# Steps 1 (promote due) and 5 (cleanup/priorities) always run
```


When `llm_orchestrator_skip_scheduler = True` + `mode = autonomous` + `llm_enabled = True`:

- The scheduler no longer publishes automatic actions (no more `B_require`, `B_trigger`, etc.)
- The autonomous LLM becomes **sole master of the queue**
- Queue hygiene (promotions, cleanup) remains active

---

## 7. Web Utils LLM (`web_utils/llm_utils.py`)

Follows the exact **same pattern** as all other `web_utils` (constructor `__init__(self, shared_data)`, methods called by `webapp.py`).

### Methods

| Method | Type | Description |
|--------|------|-------------|
| `get_llm_status(handler)` | GET | LLM bridge state (active backend, LaRuche URL…) |
| `get_llm_config(handler)` | GET | Current LLM config (api_key masked) |
| `get_llm_reasoning(handler)` | GET | `llm_orchestrator` session history (reasoning log) |
| `handle_chat(data)` | POST | Sends a message, returns LLM response |
| `clear_chat_history(data)` | POST | Clears a session's history |
| `get_mcp_status(handler)` | GET | MCP server state (running, port, transport) |
| `toggle_mcp(data)` | POST | Enables/disables MCP server + saves config |
| `save_mcp_config(data)` | POST | Saves MCP config (tools, port, transport) |
| `save_llm_config(data)` | POST | Saves LLM config (all parameters) |
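
The pattern, including the api_key masking mentioned above, can be sketched as — a toy `shared_data`; the real object exposes its config differently:

```python
class LLMUtilsSketch:
    """Sketch of the web_utils handler pattern (assumed shape; the real
    web_utils/llm_utils.py wires these methods into webapp.py routes)."""

    def __init__(self, shared_data):
        self.shared_data = shared_data       # same ctor as every web_utils

    def get_llm_config(self, handler=None):
        cfg = {k: v for k, v in self.shared_data["config"].items()
               if k.startswith("llm_") and k != "llm_api_key"}
        # Never ship the key itself to the browser — only whether one is set.
        cfg["llm_api_key_set"] = bool(self.shared_data["config"].get("llm_api_key"))
        return cfg

shared = {"config": {"llm_enabled": True, "llm_api_key": "sk-ant-secret"}}
out = LLMUtilsSketch(shared).get_llm_config()
print(out)  # → {'llm_enabled': True, 'llm_api_key_set': True}
```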

---

## 8. EPD comment integration (`comment.py`)

### Behaviour before modification

```
get_comment(status, lang, params)
└── if delay elapsed OR status changed
    └── _pick_text(status, lang, params)   ← SQLite DB
        └── returns weighted text
```

### Behaviour after modification

```
get_comment(status, lang, params)
└── if delay elapsed OR status changed
    │
    ├── [if llm_comments_enabled = True]
    │   └── LLMBridge().generate_comment(status, params)
    │       ├── success          → LLM text (≤12 words, ~8s max)
    │       └── failure/timeout  → text = None
    │
    └── [if text = None]   ← SYSTEMATIC FALLBACK
        └── _pick_text(status, lang, params)   ← original behaviour
            └── returns weighted DB text
```

**The original behaviour is preserved 100% if the LLM is disabled or failing.**

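The hook plus fallback fits in a few lines — an assumed shape of the 12-line hook; `bridge` and `pick_text` stand in for `LLMBridge()` and `_pick_text`:

```python
def get_comment_text(status, lang, params, config, bridge, pick_text):
    """Sketch of the comment.py LLM hook: try the LLM when enabled,
    fall back to the original weighted-DB lookup on any failure."""
    text = None
    if config.get("llm_comments_enabled"):
        try:
            text = bridge.generate_comment(status, params)   # ~8s budget
        except Exception:
            text = None                  # any LLM failure → fall back
    if text is None:
        text = pick_text(status, lang, params)   # original behaviour
    return text

# With the LLM disabled, the DB template comes back unchanged:
out = get_comment_text(
    "IDLE", "en", {}, {"llm_comments_enabled": False},
    bridge=None, pick_text=lambda s, l, p: "Odin watches. I wait.")
print(out)  # → Odin watches. I wait.
```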

---

## 9. Configuration (`shared.py`)

### LLM Bridge section (`__title_llm__`)

| Key | Default | Type | Description |
|-----|---------|------|-------------|
| `llm_enabled` | `False` | bool | **Master toggle** — activates the entire bridge |
| `llm_comments_enabled` | `False` | bool | Use LLM for EPD comments |
| `llm_chat_enabled` | `True` | bool | Enable /chat.html interface |
| `llm_chat_tools_enabled` | `False` | bool | Enable tool-calling in web chat |
| `llm_backend` | `"auto"` | str | `auto` \| `laruche` \| `ollama` \| `api` |
| `llm_laruche_discovery` | `True` | bool | Auto-discover LaRuche nodes via mDNS |
| `llm_laruche_url` | `""` | str | Manual LaRuche URL (overrides discovery) |
| `llm_ollama_url` | `"http://127.0.0.1:11434"` | str | Local Ollama URL |
| `llm_ollama_model` | `"phi3:mini"` | str | Ollama model to use |
| `llm_api_provider` | `"anthropic"` | str | `anthropic` \| `openai` \| `openrouter` |
| `llm_api_key` | `""` | str | API key (masked in UI) |
| `llm_api_model` | `"claude-haiku-4-5-20251001"` | str | External API model |
| `llm_api_base_url` | `""` | str | Custom base URL (OpenRouter, proxy…) |
| `llm_timeout_s` | `30` | int | Global LLM call timeout (seconds) |
| `llm_max_tokens` | `500` | int | Max tokens for chat |
| `llm_comment_max_tokens` | `80` | int | Max tokens for EPD comments |
| `llm_chat_history_size` | `20` | int | Max messages per chat session |

### MCP Server section (`__title_mcp__`)

| Key | Default | Type | Description |
|-----|---------|------|-------------|
| `mcp_enabled` | `False` | bool | Enable MCP server |
| `mcp_transport` | `"http"` | str | `http` (SSE) \| `stdio` |
| `mcp_port` | `8765` | int | HTTP SSE port |
| `mcp_allowed_tools` | `[all]` | list | List of authorised MCP tools |

### LLM Orchestrator section (`__title_llm_orch__`)

| Key | Default | Type | Description |
|-----|---------|------|-------------|
| `llm_orchestrator_mode` | `"none"` | str | `none` \| `advisor` \| `autonomous` |
| `llm_orchestrator_interval_s` | `60` | int | Delay between autonomous cycles (min 30s) |
| `llm_orchestrator_max_actions` | `3` | int | Max actions per autonomous cycle |
| `llm_orchestrator_allowed_actions` | `[]` | list | Actions the LLM may queue (empty = mcp_allowed_tools) |
| `llm_orchestrator_skip_scheduler` | `False` | bool | Disable scheduler when autonomous is active |
| `llm_orchestrator_skip_if_no_change` | `True` | bool | Skip cycle if fingerprint unchanged |
| `llm_orchestrator_log_reasoning` | `False` | bool | Log full LLM reasoning |

---

## 10. HTTP Routes (`webapp.py`)

### GET routes

| Route | Handler | Description |
|-------|---------|-------------|
| `GET /api/llm/status` | `llm_utils.get_llm_status` | LLM bridge state |
| `GET /api/llm/config` | `llm_utils.get_llm_config` | LLM config (api_key masked) |
| `GET /api/llm/reasoning` | `llm_utils.get_llm_reasoning` | Orchestrator reasoning log |
| `GET /api/mcp/status` | `llm_utils.get_mcp_status` | MCP server state |

### POST routes (JSON data-only)

| Route | Handler | Description |
|-------|---------|-------------|
| `POST /api/llm/chat` | `llm_utils.handle_chat` | Send a message to the LLM |
| `POST /api/llm/clear_history` | `llm_utils.clear_chat_history` | Clear a session's history |
| `POST /api/llm/config` | `llm_utils.save_llm_config` | Save LLM config |
| `POST /api/mcp/toggle` | `llm_utils.toggle_mcp` | Enable/disable MCP |
| `POST /api/mcp/config` | `llm_utils.save_mcp_config` | Save MCP config |

All routes respect Bjorn's existing authentication (`webauth`).

---

## 11. Web interfaces

### `/chat.html`

Terminal-style chat interface (black/red, consistent with Bjorn).

**Features:**

- Auto-detects LLM state on load (`GET /api/llm/status`)
- Displays active backend (LaRuche URL, or mode)
- "Bjorn is thinking..." indicator during response
- Unique session ID per browser tab
- `Enter` = send, `Shift+Enter` = new line
- Textarea auto-resize
- **"Clear history"** button — clears server-side session
- **"Orch Log"** button — loads the autonomous orchestrator's reasoning
  - Calls `GET /api/llm/reasoning`
  - Renders each message (cycle prompt + LLM response) as chat bubbles
  - "← Back to chat" to return to normal chat
  - Helper message if log is empty (hint: enable `llm_orchestrator_log_reasoning`)

**Access:** `http://[bjorn-ip]:8000/chat.html`

### `/mcp-config.html`

Full LLM & MCP configuration page.

**LLM Bridge section:**

- Master enable/disable toggle
- EPD comments, chat, chat tool-calling toggles
- Backend selector (auto / laruche / ollama / api)
- LaRuche mDNS discovery toggle + manual URL
- Ollama configuration (URL + model)
- External API configuration (provider, key, model, custom URL)
- Timeout and token parameters
- "TEST CONNECTION" button

**MCP Server section:**

- Enable toggle with live start/stop
- Transport selector (HTTP SSE / stdio)
- HTTP port
- Per-tool checkboxes
- "RUNNING" / "OFF" indicator

**Access:** `http://[bjorn-ip]:8000/mcp-config.html`

---

## 12. Startup (`Bjorn.py`)

```python
# LLM Bridge — warm up singleton
try:
    from llm_bridge import LLMBridge
    LLMBridge()  # starts mDNS discovery if llm_laruche_discovery=True
    logger.info("LLM Bridge initialised")
except Exception as e:
    logger.warning("LLM Bridge init skipped: %s", e)

# MCP Server
try:
    import mcp_server
    if shared_data.config.get("mcp_enabled", False):
        mcp_server.start()  # daemon thread "MCPServer"
        logger.info("MCP server started")
    else:
        logger.info("MCP server loaded (disabled)")
except Exception as e:
    logger.warning("MCP server init skipped: %s", e)
```

The LLM Orchestrator is initialised inside `orchestrator.py` (not `Bjorn.py`), since it depends on the orchestrator loop cycle.

---

## 13. LaRuche / LAND Protocol compatibility

### LAND Protocol

LAND (Local AI Network Discovery) is the LaRuche protocol:

- **Discovery:** mDNS service type `_ai-inference._tcp.local.`
- **Inference:** `POST http://[node]:8419/infer`

### What Bjorn implements on the Python side

```python
import json
import urllib.request
from zeroconf import Zeroconf, ServiceBrowser

# mDNS listening (zeroconf) — auto-detects LaRuche nodes
zc = Zeroconf()
ServiceBrowser(zc, "_ai-inference._tcp.local.", listener)  # listener collects node URLs

# Inference call (urllib stdlib, no SDK dependency)
payload = {"prompt": "...", "capability": "llm", "max_tokens": 500}
req = urllib.request.Request(
    f"{url}/infer",
    data=json.dumps(payload).encode("utf-8"),   # urlopen needs bytes
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)
```


### Scenarios

| Scenario | Behaviour |
|----------|-----------|
| LaRuche node detected on LAN | Used automatically as priority backend |
| Multiple LaRuche nodes | First discovered is used |
| Manual URL configured | Used directly, discovery ignored |
| LaRuche node absent | Cascades to Ollama or external API |
| `zeroconf` not installed | Discovery silently disabled, DEBUG log |

---

## 14. Optional dependencies

| Package | Min version | Feature unlocked | Install command |
|---------|------------|------------------|----------------|
| `mcp[cli]` | ≥ 1.0.0 | Full MCP server | `pip install "mcp[cli]"` |
| `zeroconf` | ≥ 0.131.0 | LaRuche mDNS discovery | `pip install zeroconf` |

**No new dependencies** added for LLM backends:

- **LaRuche / Ollama**: uses `urllib.request` (Python stdlib)
- **Anthropic / OpenAI**: REST API via `urllib` — no SDK needed
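
As an illustration of the SDK-free approach, an Ollama `/api/chat` request can be built with nothing but the stdlib (the payload fields follow Ollama's public chat API; the exact fields the bridge sends are an assumption):

```python
import json
import urllib.request

def build_ollama_request(base_url, model, messages):
    """Return a ready-to-send urllib Request for Ollama's /api/chat."""
    payload = {"model": model, "messages": messages, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_ollama_request(
    "http://127.0.0.1:11434", "phi3:mini",
    [{"role": "user", "content": "status report"}])
print(req.full_url)  # → http://127.0.0.1:11434/api/chat
# Sending it would be: urllib.request.urlopen(req, timeout=30)
```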

---

## 15. Quick activation & configuration

### Basic LLM chat

```bash
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{"llm_enabled": true, "llm_backend": "ollama", "llm_ollama_model": "phi3:mini"}'
# → http://[bjorn-ip]:8000/chat.html
```

### Chat with tool-calling (LLM accesses live network data)

```bash
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{"llm_enabled": true, "llm_chat_tools_enabled": true}'
```

### LLM Orchestrator — advisor mode

```bash
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{
    "llm_enabled": true,
    "llm_orchestrator_mode": "advisor",
    "llm_orchestrator_allowed_actions": ["NmapScan", "SSHBruteforce"]
  }'
```

### LLM Orchestrator — autonomous mode (LLM as sole planner)

```bash
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{
    "llm_enabled": true,
    "llm_orchestrator_mode": "autonomous",
    "llm_orchestrator_skip_scheduler": true,
    "llm_orchestrator_max_actions": 5,
    "llm_orchestrator_interval_s": 120,
    "llm_orchestrator_allowed_actions": ["NmapScan", "SSHBruteforce", "SMBScan"],
    "llm_orchestrator_log_reasoning": true
  }'
# → View reasoning: http://[bjorn-ip]:8000/chat.html → Orch Log button
```

### With Anthropic API

```bash
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{
    "llm_enabled": true,
    "llm_backend": "api",
    "llm_api_provider": "anthropic",
    "llm_api_key": "sk-ant-...",
    "llm_api_model": "claude-haiku-4-5-20251001"
  }'
```

### With OpenRouter (access to all models)

```bash
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{
    "llm_enabled": true,
    "llm_backend": "api",
    "llm_api_provider": "openrouter",
    "llm_api_key": "sk-or-...",
    "llm_api_model": "meta-llama/llama-3.2-3b-instruct",
    "llm_api_base_url": "https://openrouter.ai/api"
  }'
```

### Model recommendations by scenario

| Scenario | Backend | Recommended model | Pi RAM |
|----------|---------|-------------------|--------|
| Autonomous orchestrator + LaRuche on LAN | laruche | Mistral/Phi on the node | 0 (remote inference) |
| Autonomous orchestrator offline | ollama | `qwen2.5:3b` | ~3 GB |
| Autonomous orchestrator cloud | api | `claude-haiku-4-5-20251001` | 0 |
| Chat + tools | ollama | `phi3:mini` | ~2 GB |
| EPD comments only | ollama | `smollm2:360m` | ~400 MB |

---

## 16. Complete API endpoint reference

### GET

```
GET /api/llm/status
  → {"enabled": bool, "backend": str, "laruche_url": str|null,
     "laruche_discovery": bool, "ollama_url": str, "ollama_model": str,
     "api_provider": str, "api_model": str, "api_key_set": bool}

GET /api/llm/config
  → {all llm_* keys except api_key, + "llm_api_key_set": bool}

GET /api/llm/reasoning
  → {"status": "ok", "messages": [{"role": str, "content": str}, ...], "count": int}
  → {"status": "error", "message": str, "messages": [], "count": 0}

GET /api/mcp/status
  → {"enabled": bool, "running": bool, "transport": str,
     "port": int, "allowed_tools": [str]}
```

### POST

```
POST /api/llm/chat
  Body: {"message": str, "session_id": str?}
  → {"status": "ok", "response": str, "session_id": str}
  → {"status": "error", "message": str}

POST /api/llm/clear_history
  Body: {"session_id": str?}
  → {"status": "ok"}

POST /api/llm/config
  Body: {any subset of llm_* and llm_orchestrator_* keys}
  → {"status": "ok"}
  → {"status": "error", "message": str}

POST /api/mcp/toggle
  Body: {"enabled": bool}
  → {"status": "ok", "enabled": bool, "started": bool?}

POST /api/mcp/config
  Body: {"allowed_tools": [str]?, "port": int?, "transport": str?}
  → {"status": "ok", "config": {...}}
```

---

## 17. Queue priority system
|
|
|
|
```
|
|
Priority Source Trigger
|
|
──────────────────────────────────────────────────────────────
|
|
85 LLM Advisor llm_orchestrator.advise()
|
|
82 LLM Autonomous _run_autonomous_cycle() via run_action tool
|
|
80 External MCP _impl_run_action() via MCP client or chat
|
|
50 Normal / manual queue_action() without explicit priority
|
|
40 Scheduler action_scheduler evaluates triggers
|
|
```
|
|
|
|
The scheduler always processes the highest-priority pending item first. LLM and MCP actions therefore preempt scheduler actions.
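One way to express that "highest priority first, oldest first within a priority" pick is a single ordered query. The table and column names below are illustrative assumptions, not Bjorn's actual schema.

```python
import sqlite3

def next_pending_action(conn: sqlite3.Connection):
    """Return the pending queue row to execute next, or None if idle."""
    return conn.execute(
        "SELECT id, action, target, priority FROM action_queue "
        "WHERE status = 'pending' "
        "ORDER BY priority DESC, queued_at ASC LIMIT 1"
    ).fetchone()
```

With this ordering, a priority-82 LLM action queued after a priority-40 scheduler action is still picked first.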
---

## 18. Fallbacks & graceful degradation

| Condition | Behaviour |
|-----------|-----------|
| `llm_enabled = False` | `complete()` returns `None` immediately — zero overhead |
| `llm_orchestrator_mode = "none"` | LLMOrchestrator not instantiated |
| `mcp` not installed | `_build_mcp_server()` returns `None`, WARNING log |
| `zeroconf` not installed | LaRuche discovery silently disabled, DEBUG log |
| LaRuche node timeout | Exception caught, cascade to next backend |
| Ollama not running | `URLError` caught, cascade to API |
| API key missing | `_call_api()` returns `None`, cascade |
| All backends fail | `complete()` returns `None` |
| LLM returns `None` for EPD | `comment.py` uses `_pick_text()` (original behaviour) |
| LLM advisor: invalid JSON | DEBUG log, returns `None`, next cycle |
| LLM advisor: disallowed action | WARNING log, ignored |
| LLM autonomous: no change | cycle skipped, zero API call |
| LLM autonomous: ≥6 tool turns | returns partial text + warning |
| Exception in LLM Bridge | `try/except` at every level, DEBUG log |
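The cascade rows above all reduce to one pattern: each backend call either raises or returns `None`, and the next backend is tried. A schematic (not Bjorn's actual code):

```python
def complete_with_fallback(prompt, backends):
    """backends: ordered callables, e.g. [laruche, ollama, api]."""
    for call in backends:
        try:
            text = call(prompt)
            if text:
                return text
        except Exception:
            # Timeout / connection error / missing key: cascade onward.
            continue
    return None  # all backends failed
```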
### Timeouts

```
Chat / complete()  → llm_timeout_s (default: 30s)
EPD comments       → 8s  (hardcoded, short to avoid blocking render)
Autonomous cycle   → 90s (long: may chain multiple tool calls)
Advisor            → 20s (short prompt + JSON response)
```
---

## 19. Call sequences

### Web chat with tool-calling

```
Browser → POST /api/llm/chat {"message": "which hosts are vulnerable?"}
└── LLMUtils.handle_chat(data)
    └── LLMBridge().chat(message, session_id)
        └── complete(messages, system, tools=_BJORN_TOOLS)
            └── _call_anthropic(messages, tools=[...])
                ├── POST /v1/messages → stop_reason=tool_use
                │   └── tool: get_hosts(alive_only=true)
                │       → _execute_tool → _impl_get_hosts()
                │       → JSON of hosts
                ├── POST /v1/messages [+ tool result] → end_turn
                └── returns "3 exposed SSH hosts: 192.168.1.10, ..."
← {"status": "ok", "response": "3 exposed SSH hosts..."}
```
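The repeated `POST /v1/messages` rounds above form a loop: keep posting until the model stops asking for tools, feeding each tool result back into the conversation. A provider-agnostic sketch of that loop (the callables and message shapes are simplifications, not `_call_anthropic` itself), with a turn budget matching the "≥6 tool turns" limit from section 18:

```python
def tool_loop(post_messages, execute_tool, messages, max_turns=6):
    """Run a tool-use conversation until the model ends its turn."""
    for _ in range(max_turns):
        reply = post_messages(messages)      # one POST round-trip
        if reply["stop_reason"] != "tool_use":
            return reply["text"]             # end_turn: final answer
        for call in reply["tool_calls"]:
            result = execute_tool(call["name"], call["input"])
            # Feed the tool result back so the next round can use it.
            messages.append({"role": "tool", "name": call["name"],
                             "content": result})
    return None  # turn budget exhausted
```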
### LLM autonomous cycle

```
Thread "LLMOrchestrator" (daemon, interval=60s)
└── _run_autonomous_cycle()
    ├── fp = _compute_fingerprint() → (12, 3, 1, 47)
    ├── _has_actionable_change(fp) → True (vuln_count 2→3)
    ├── self._last_fingerprint = fp
    │
    └── LLMBridge().complete(prompt, system, tools=[read-only + run_action])
        └── _call_anthropic(tools=[...])
            ├── POST → tool_use: get_hosts()
            │   → [{ip: "192.168.1.20", ports: "22,80,443"}]
            ├── POST → tool_use: get_action_history()
            │   → [...]
            ├── POST → tool_use: run_action("SSHBruteforce", "192.168.1.20")
            │   → _execute_tool → _impl_run_action()
            │   → db.queue_action(priority=82, trigger="llm_autonomous")
            │   → queue_event.set()
            └── POST → end_turn
→ "Queued SSHBruteforce on 192.168.1.20 (Mjolnir strikes the unguarded gate)"
→ [if log_reasoning=True] logger.info("[LLM_ORCH_REASONING]...")
→ [if log_reasoning=True] _push_to_chat(bridge, prompt, response)
```
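The fingerprint gate is what keeps the cycle free when nothing happens: a cheap tuple summarising the network state, with an LLM call only when it moves. The sketch below is illustrative; the tuple fields are assumptions inferred from the `(12, 3, 1, 47)` example, not Bjorn's actual `_compute_fingerprint`.

```python
class ChangeGate:
    """Skip LLM cycles unless the network state fingerprint changed."""

    def __init__(self):
        self._last = None

    def fingerprint(self, hosts, vulns, creds, ports):
        # Counts only: cheap to compute, enough to detect "something new".
        return (len(hosts), len(vulns), len(creds), len(ports))

    def has_actionable_change(self, fp):
        return fp != self._last

    def remember(self, fp):
        self._last = fp
```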
### Reading reasoning from chat.html

```
User clicks "Orch Log"
└── fetch GET /api/llm/reasoning
    └── LLMUtils.get_llm_reasoning(handler)
        └── LLMBridge()._chat_histories["llm_orchestrator"]
            → [{"role": "user", "content": "[Autonomous cycle]..."},
               {"role": "assistant", "content": "Queued SSHBruteforce..."}]
← {"status": "ok", "messages": [...], "count": 2}
→ Rendered as chat bubbles in #messages
```
### MCP from external client (Claude Desktop)

```
Claude Desktop → tool_call: run_action("NmapScan", "192.168.1.0/24")
└── FastMCP dispatch
    └── mcp_server.run_action(action_name, target_ip)
        └── _impl_run_action()
            ├── db.queue_action(priority=80, trigger="mcp")
            └── queue_event.set()
← {"status": "queued", "action": "NmapScan", "target": "192.168.1.0/24", "priority": 80}
```
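Stripped of the MCP plumbing, the guarded dispatch step reduces to: reject tools not on the allow-list, otherwise queue at priority 80 and wake the scheduler. The function and parameter names below are hypothetical placeholders for this sketch, not Bjorn's `_impl_run_action` signature.

```python
def dispatch_tool(name, args, allowed_tools, queue_action, queue_event):
    """Handle one external tool call against the allow-list."""
    if name not in allowed_tools:
        return {"status": "error", "message": f"tool '{name}' not allowed"}
    if name == "run_action":
        queue_action(args["action_name"], args["target_ip"],
                     priority=80, trigger="mcp")
        queue_event.set()  # wake the scheduler immediately
        return {"status": "queued", "action": args["action_name"],
                "target": args["target_ip"], "priority": 80}
    return {"status": "error", "message": f"unknown tool '{name}'"}
```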
### EPD comment with LLM

```
display.py → CommentAI.get_comment("SSHBruteforce", params={...})
└── delay elapsed OR status changed → proceed
    ├── llm_comments_enabled = True ?
    │   └── LLMBridge().generate_comment("SSHBruteforce", params)
    │       └── complete([{role:user, content:"Status: SSHBruteforce..."}],
    │                    max_tokens=80, timeout=8)
    │           ├── LaRuche → "Norse gods smell SSH credentials..." ✓
    │           └── [or timeout 8s] → None
    └── text = None → _pick_text("SSHBruteforce", lang, params)
        └── SELECT FROM comments WHERE status='SSHBruteforce'
            → "Processing authentication attempts..."
```
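The sequence above is the classic LLM-first, database-fallback pattern: try the bridge under a short budget, and on `None` or any exception fall through to the original `_pick_text()` behaviour. A sketch with the collaborators injected as callables (the helper names come from the trace, the bodies are placeholders):

```python
def get_comment(status, params, llm_generate, pick_text, llm_enabled=True):
    """Return an EPD comment, preferring the LLM but never requiring it."""
    text = None
    if llm_enabled:
        try:
            # Short 8s budget so a slow backend never blocks the render.
            text = llm_generate(status, params, timeout=8)
        except Exception:
            text = None
    if not text:
        text = pick_text(status, params)  # original DB-backed behaviour
    return text
```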