mirror of https://github.com/infinition/Bjorn.git (synced 2026-03-17 09:31:04 +00:00)

Add LLM configuration and MCP server management UI and backend functionality

- Implemented a new SPA page for LLM Bridge and MCP Server settings in `llm-config.js`.
- Added functionality for managing LLM and MCP configurations, including toggling, saving settings, and testing connections.
- Created HTTP endpoints in `llm_utils.py` for handling LLM chat, status checks, and MCP server configuration.
- Integrated model fetching from LaRuche and Ollama backends.
- Enhanced error handling and logging for better debugging and user feedback.

New file: `LLM_MCP_ARCHITECTURE.md` (916 lines)
# BJORN — LLM Bridge, MCP Server & LLM Orchestrator

## Complete architecture, operation, commands, fallbacks

---

## Table of contents

1. [Overview](#1-overview)
2. [Created / modified files](#2-created--modified-files)
3. [LLM Bridge (`llm_bridge.py`)](#3-llm-bridge-llm_bridgepy)
4. [MCP Server (`mcp_server.py`)](#4-mcp-server-mcp_serverpy)
5. [LLM Orchestrator (`llm_orchestrator.py`)](#5-llm-orchestrator-llm_orchestratorpy)
6. [Orchestrator & Scheduler integration](#6-orchestrator--scheduler-integration)
7. [Web Utils LLM (`web_utils/llm_utils.py`)](#7-web-utils-llm-web_utilsllm_utilspy)
8. [EPD comment integration (`comment.py`)](#8-epd-comment-integration-commentpy)
9. [Configuration (`shared.py`)](#9-configuration-sharedpy)
10. [HTTP Routes (`webapp.py`)](#10-http-routes-webapppy)
11. [Web interfaces](#11-web-interfaces)
12. [Startup (`Bjorn.py`)](#12-startup-bjornpy)
13. [LaRuche / LAND Protocol compatibility](#13-laruche--land-protocol-compatibility)
14. [Optional dependencies](#14-optional-dependencies)
15. [Quick activation & configuration](#15-quick-activation--configuration)
16. [Complete API endpoint reference](#16-complete-api-endpoint-reference)
17. [Queue priority system](#17-queue-priority-system)
18. [Fallbacks & graceful degradation](#18-fallbacks--graceful-degradation)
19. [Call sequences](#19-call-sequences)

---
## 1. Overview

```
┌─────────────────────────────────────────────────────────────────────┐
│                             BJORN (RPi)                             │
│                                                                     │
│  ┌─────────────┐   ┌──────────────────┐   ┌─────────────────────┐   │
│  │ Core BJORN  │   │ MCP Server       │   │ Web UI              │   │
│  │ (unchanged) │   │ (mcp_server.py)  │   │ /chat.html          │   │
│  │             │   │ 7 exposed tools  │   │ /mcp-config.html    │   │
│  │ comment.py  │   │ HTTP SSE / stdio │   │ ↳ Orch Log button   │   │
│  │  ↕ LLM hook │   │                  │   │                     │   │
│  └──────┬──────┘   └────────┬─────────┘   └──────────┬──────────┘   │
│         └───────────────────┼────────────────────────┘              │
│                             │                                       │
│  ┌──────────────────────────▼──────────────────────────────────┐    │
│  │                 LLM Bridge (llm_bridge.py)                  │    │
│  │                   Singleton · Thread-safe                   │    │
│  │                                                             │    │
│  │  Automatic cascade:                                         │    │
│  │   1. LaRuche node (LAND/mDNS → HTTP POST /infer)            │    │
│  │   2. Local Ollama (HTTP POST /api/chat)                     │    │
│  │   3. External API (Anthropic / OpenAI / OpenRouter)         │    │
│  │   4. None (→ fallback templates in comment.py)              │    │
│  │                                                             │    │
│  │  Agentic tool-calling loop (stop_reason=tool_use, ≤6 turns) │    │
│  │  _BJORN_TOOLS: 7 tools in Anthropic format                  │    │
│  └──────────────────────────┬──────────────────────────────────┘    │
│                             │                                       │
│  ┌──────────────────────────▼──────────────────────────────────┐    │
│  │           LLM Orchestrator (llm_orchestrator.py)            │    │
│  │                                                             │    │
│  │  mode = none       → LLM has no role in scheduling          │    │
│  │  mode = advisor    → LLM suggests 1 action/cycle (prio 85)  │    │
│  │  mode = autonomous → own thread, loop + tools (prio 82)     │    │
│  │                                                             │    │
│  │  Fingerprint (hosts↑, vulns↑, creds↑, queue_id↑)            │    │
│  │   → skip LLM if nothing new (token savings)                 │    │
│  └──────────────────────────┬──────────────────────────────────┘    │
│                             │                                       │
│  ┌──────────────────────────▼──────────────────────────────────┐    │
│  │                    Action Queue (SQLite)                    │    │
│  │  scheduler=40  normal=50  MCP=80  autonomous=82  advisor=85 │    │
│  └─────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────┘
              ↕ mDNS _ai-inference._tcp.local. (zeroconf)
        ┌──────────────────────────────────────────┐
        │           LaRuche Swarm (LAN)            │
        │   Node A → Mistral 7B        :8419       │
        │   Node B → DeepSeek Coder    :8419       │
        │   Node C → Phi-3 Mini        :8419       │
        └──────────────────────────────────────────┘
```
**Design principles:**

- Everything is **disabled by default** — zero impact if not configured
- All dependencies are **optional** — silent import if missing
- **Systematic fallback** at every level — Bjorn never crashes because of the LLM
- The bridge is a **singleton** — one instance per process, thread-safe
- EPD comments preserve their **exact original behaviour** if LLM is disabled
- The LLM is the **brain** (decides what to do), the orchestrator is the **arms** (executes)

---
## 2. Created / modified files

### Created files

| File | Approx. size | Role |
|------|--------------|------|
| `llm_bridge.py` | ~450 lines | LLM Singleton — backend cascade + agentic tool-calling loop |
| `mcp_server.py` | ~280 lines | FastMCP MCP Server — 7 Bjorn tools |
| `web_utils/llm_utils.py` | ~220 lines | LLM/MCP HTTP endpoints (web_utils pattern) |
| `llm_orchestrator.py` | ~410 lines | LLM Orchestrator — advisor & autonomous modes |
| `web/chat.html` | ~300 lines | Chat interface + Orch Log button |
| `web/mcp-config.html` | ~400 lines | LLM & MCP configuration page |

### Modified files

| File | What changed |
|------|--------------|
| `shared.py` | +45 config keys (LLM bridge, MCP, orchestrator) |
| `comment.py` | LLM hook in `get_comment()` — 12 lines added |
| `utils.py` | +1 entry in lazy WebUtils registry: `"llm_utils"` |
| `webapp.py` | +9 GET/POST routes in `_register_routes_once()` |
| `Bjorn.py` | LLM Bridge warm-up + conditional MCP server start |
| `orchestrator.py` | +`LLMOrchestrator` lifecycle + advisor call in background tasks |
| `action_scheduler.py` | +skip scheduler if LLM autonomous only (`llm_orchestrator_skip_scheduler`) |
| `requirements.txt` | +3 comment lines (optional dependencies documented) |

---
## 3. LLM Bridge (`llm_bridge.py`)

### Internal architecture

```
LLMBridge (Singleton)
├── __init__()                    Initialises singleton, launches LaRuche discovery
├── complete()                    Main API — cascades all backends
│     └── tools=None/[...]        Optional param to enable tool-calling
├── generate_comment()            Generates a short EPD comment (≤80 tokens)
├── chat()                        Stateful chat with per-session history
│     └── tools=_BJORN_TOOLS      if llm_chat_tools_enabled=True
├── clear_history()               Clears a session's history
├── status()                      Returns bridge state (for the UI)
│
├── _start_laruche_discovery()    Starts mDNS thread in background
├── _discover_laruche_mdns()      Listens to _ai-inference._tcp.local. continuously
│
├── _call_laruche()               Backend 1 — POST http://[node]:8419/infer
├── _call_ollama()                Backend 2 — POST http://localhost:11434/api/chat
├── _call_anthropic()             Backend 3a — POST api.anthropic.com + AGENTIC LOOP
│     └── loop ≤6 turns: send → tool_use → execute → feed result → repeat
├── _call_openai_compat()         Backend 3b — POST [base_url]/v1/chat/completions
│
├── _execute_tool(name, inputs)   Dispatches to mcp_server._impl_*
│     └── gate: checks mcp_allowed_tools before executing
│
└── _build_system_prompt()        Builds system prompt with live Bjorn context

_BJORN_TOOLS : List[Dict]         Anthropic-format definitions for the 7 MCP tools
```
### _BJORN_TOOLS — full list

```python
_BJORN_TOOLS = [
    {"name": "get_hosts",           "description": "...", "input_schema": {...}},
    {"name": "get_vulnerabilities", ...},
    {"name": "get_credentials",     ...},
    {"name": "get_action_history",  ...},
    {"name": "get_status",          ...},
    {"name": "run_action",          ...},  # gated by mcp_allowed_tools
    {"name": "query_db",            ...},  # SELECT only
]
```
### Backend cascade

```
llm_backend = "auto"     → LaRuche → Ollama → API → None
llm_backend = "laruche"  → LaRuche only
llm_backend = "ollama"   → Ollama only
llm_backend = "api"      → External API only
```

At each step, if a backend fails (timeout, network error, missing model), the next one is tried **silently**. If all fail, `complete()` returns `None`.
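Since `complete()` is the only public entry point, the cascade reduces to "first backend that returns text wins". A minimal sketch of that pattern — function names here are illustrative stand-ins, not the actual `llm_bridge.py` code:

```python
from typing import Callable, List, Optional

def cascade(backends: List[Callable[[str], Optional[str]]], prompt: str) -> Optional[str]:
    """Try each backend in order; a failing backend is skipped silently."""
    for backend in backends:
        try:
            result = backend(prompt)
        except Exception:
            result = None  # timeout / network error / missing model → try next
        if result is not None:
            return result
    return None  # all backends failed → caller falls back to templates

# Hypothetical stand-ins for _call_laruche / _call_ollama / _call_anthropic:
def laruche(prompt): raise ConnectionError("no LaRuche node on LAN")
def ollama(prompt):  return None  # model not pulled
def api(prompt):     return "Skål! 12 hosts in sight."

print(cascade([laruche, ollama, api], "status?"))  # falls through to the API backend
```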
### Agentic tool-calling loop (`_call_anthropic`)

When `tools` is passed to `complete()`, the Anthropic backend enters agentic mode:

```
_call_anthropic(messages, system, tools, max_tokens, timeout)
  │
  ├─ POST /v1/messages {tools: [...]}
  │
  ├─ [stop_reason = "tool_use"]
  │     for each tool_use block:
  │         result = _execute_tool(name, inputs)
  │         append {role: "tool", tool_use_id: ..., content: result}
  │     POST /v1/messages [messages + tool results]   ← next turn
  │
  └─ [stop_reason = "end_turn"] → returns final text
     [≥6 turns]                 → returns partial text + warning
```

`_execute_tool()` dispatches directly to `mcp_server._impl_*` (no network), checking `mcp_allowed_tools` for `run_action`.
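Stripped of HTTP details, the turn limit and the feed-the-result-back step look like this sketch, with the model stubbed out (names and message shapes are simplified here, not the real Anthropic payloads):

```python
# `fake_model` stands in for the POST to /v1/messages; tool names mirror the doc.
def fake_model(messages):
    # First turn: ask for a tool; once a tool result is present, finish.
    if not any(m["role"] == "tool" for m in messages):
        return {"stop_reason": "tool_use", "tool": "get_status", "inputs": {}, "id": "t1"}
    return {"stop_reason": "end_turn", "text": "2 hosts alive, queue idle."}

TOOLS = {"get_status": lambda inputs: {"hosts_alive": 2, "queue": "idle"}}
MAX_TURNS = 6

def agentic_loop(prompt):
    messages = [{"role": "user", "content": prompt}]
    for _ in range(MAX_TURNS):
        reply = fake_model(messages)
        if reply["stop_reason"] == "end_turn":
            return reply["text"]
        # stop_reason == "tool_use": execute the tool, feed the result back
        result = TOOLS[reply["tool"]](reply["inputs"])
        messages.append({"role": "tool", "tool_use_id": reply["id"], "content": str(result)})
    return "(partial answer: turn limit reached)"

print(agentic_loop("What is Bjorn doing?"))
```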
### Tool-calling in chat (`chat()`)

If `llm_chat_tools_enabled = True`, the chat passes `tools=_BJORN_TOOLS` to the backend, letting the LLM answer with real-time data (hosts, vulns, creds…) rather than relying only on its training knowledge.
### Chat history

- Each session has its own history (key = `session_id`)
- Special session `"llm_orchestrator"`: contains the autonomous orchestrator's reasoning
- Max size configurable: `llm_chat_history_size` (default: 20 messages)
- History is **in-memory only** — not persisted across restarts
- Thread-safe via `_hist_lock`
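A bounded, thread-safe per-session store like the one described can be sketched with one `deque` per session — a sketch of the pattern, not the actual `llm_bridge.py` internals:

```python
import threading
from collections import defaultdict, deque

HISTORY_SIZE = 20  # mirrors llm_chat_history_size

class ChatHistories:
    def __init__(self, max_messages=HISTORY_SIZE):
        self._lock = threading.Lock()
        # deque(maxlen=N) silently discards the oldest message on overflow
        self._histories = defaultdict(lambda: deque(maxlen=max_messages))

    def append(self, session_id, role, content):
        with self._lock:
            self._histories[session_id].append({"role": role, "content": content})

    def get(self, session_id):
        with self._lock:
            return list(self._histories[session_id])

    def clear(self, session_id):
        with self._lock:
            self._histories.pop(session_id, None)

h = ChatHistories(max_messages=3)
for i in range(5):
    h.append("tab-1", "user", f"msg {i}")
print([m["content"] for m in h.get("tab-1")])  # only the 3 most recent survive
```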
---
## 4. MCP Server (`mcp_server.py`)

### What is MCP?

The **Model Context Protocol** (Anthropic) is an open-source protocol that lets AI agents (Claude Desktop, custom agents, etc.) use external tools via a standardised interface.

By enabling Bjorn's MCP server, **any MCP client can query and control Bjorn** — without knowing the internal DB structure.
### Exposed tools

| Tool | Arguments | Description |
|------|-----------|-------------|
| `get_hosts` | `alive_only: bool = True` | Returns discovered hosts (IP, MAC, hostname, OS, ports) |
| `get_vulnerabilities` | `host_ip: str = ""`, `limit: int = 100` | Returns discovered CVE vulnerabilities |
| `get_credentials` | `service: str = ""`, `limit: int = 100` | Returns captured credentials (SSH, FTP, SMB…) |
| `get_action_history` | `limit: int = 50`, `action_name: str = ""` | History of executed actions |
| `get_status` | *(none)* | Real-time state: mode, active action, counters |
| `run_action` | `action_name: str`, `target_ip: str`, `target_mac: str = ""` | Queues a Bjorn action (MCP priority = 80) |
| `query_db` | `sql: str`, `params: str = "[]"` | Free SELECT against the SQLite DB (read-only) |

**Security:** each tool checks `mcp_allowed_tools` — unlisted tools return a clean error. `query_db` rejects anything that is not a `SELECT`.
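A SELECT-only gate of the kind `query_db` needs can be sketched as follows (the actual check in `mcp_server.py` may be stricter or structured differently):

```python
import re

# Reject write/DDL keywords anywhere in the statement, case-insensitively.
_FORBIDDEN = re.compile(r"\b(insert|update|delete|drop|alter|create|attach|pragma)\b", re.I)

def is_safe_select(sql: str) -> bool:
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:  # reject stacked statements like "SELECT 1; DELETE ..."
        return False
    if not stripped.lower().startswith("select"):
        return False
    return not _FORBIDDEN.search(stripped)

print(is_safe_select("SELECT ip, hostname FROM hosts WHERE alive = 1"))  # True
print(is_safe_select("DROP TABLE hosts"))                                # False
print(is_safe_select("SELECT 1; DELETE FROM hosts"))                     # False
```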
### `_impl_run_action` — priority detail

```python
_MCP_PRIORITY = 80  # > normal(50) > scheduler(40)

sd.db.queue_action(
    action_name=action_name,
    mac=mac,                  # resolved from hosts WHERE ip=? if not supplied
    ip=target_ip,
    priority=_MCP_PRIORITY,
    trigger="mcp",
    metadata={"decision_method": "mcp", "decision_origin": "mcp"},
)
sd.queue_event.set()          # wakes the orchestrator immediately
```
### Available transports

| Transport | Config | Usage |
|-----------|--------|-------|
| `http` (default) | `mcp_transport: "http"`, `mcp_port: 8765` | Accessible from any MCP client on LAN via SSE |
| `stdio` | `mcp_transport: "stdio"` | Claude Desktop, CLI agents |

---
## 5. LLM Orchestrator (`llm_orchestrator.py`)

The LLM Orchestrator transforms Bjorn from a scriptable tool into an autonomous agent. It is **entirely optional** and can be disabled via `llm_orchestrator_mode = "none"`.

### Operating modes

| Mode | Config value | Operation |
|------|--------------|-----------|
| Disabled | `"none"` (default) | LLM plays no role in planning |
| Advisor | `"advisor"` | LLM consulted periodically, suggests 1 action |
| Autonomous | `"autonomous"` | Own thread, LLM observes + plans with tools |
### Internal architecture

```
LLMOrchestrator
├── start()                      Starts autonomous thread if mode=autonomous
├── stop()                       Stops thread (join 15s max)
├── restart_if_mode_changed()    Called from orchestrator.run() each iteration
├── is_active()                  True if autonomous thread is alive
│
├── [ADVISOR MODE]
│     advise() → called from orchestrator._process_background_tasks()
│       ├── _build_snapshot()    → compact dict (hosts, vulns, creds, queue)
│       ├── LLMBridge().complete(prompt, system)
│       └── _apply_advisor_response(raw, allowed)
│             ├── parse JSON {"action": str, "target_ip": str, "reason": str}
│             ├── validate action ∈ allowed
│             └── db.queue_action(priority=85, trigger="llm_advisor")
│
└── [AUTONOMOUS MODE]
      _autonomous_loop()         Thread "LLMOrchestrator" (daemon)
        └── loop:
              _compute_fingerprint()    → (hosts, vulns, creds, max_queue_id)
              _has_actionable_change()  → skip if nothing increased
              _run_autonomous_cycle()
                ├── filter tools: read-only always + run_action if in allowed
                ├── LLMBridge().complete(prompt, system, tools=[...])
                │     └── _call_anthropic() agentic loop
                │           → LLM calls run_action via tools
                │           → _execute_tool → _impl_run_action → queue
                └── if llm_orchestrator_log_reasoning=True:
                      logger.info("[LLM_ORCH_REASONING]...")
                      _push_to_chat() → "llm_orchestrator" session in LLMBridge
              sleep(llm_orchestrator_interval_s)
```
### Fingerprint and smart skip

```python
def _compute_fingerprint(self) -> tuple:
    # (host_count, vuln_count, cred_count, max_completed_queue_id)
    return (hosts, vulns, creds, last_id)

def _has_actionable_change(self, fp: tuple) -> bool:
    if self._last_fingerprint is None:
        return True  # first cycle always runs
    # Triggers ONLY if something INCREASED
    # hosts going offline → not actionable
    return any(fp[i] > self._last_fingerprint[i] for i in range(len(fp)))
```

**Token savings:** if `llm_orchestrator_skip_if_no_change = True` (default), the LLM cycle is skipped if no new hosts/vulns/creds and no action completed since the last cycle.
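The increase-only comparison is easy to demo as a standalone function:

```python
# Standalone demo of the increase-only fingerprint comparison described above.
def has_actionable_change(last, current):
    if last is None:
        return True  # first cycle always runs
    return any(c > l for l, c in zip(last, current))

# Fingerprint shape: (hosts, vulns, creds, max_completed_queue_id)
print(has_actionable_change(None, (5, 0, 0, 12)))           # True  — first cycle
print(has_actionable_change((5, 0, 0, 12), (5, 0, 0, 12)))  # False — nothing new, skip
print(has_actionable_change((5, 0, 0, 12), (4, 0, 0, 12)))  # False — a host went offline
print(has_actionable_change((5, 0, 0, 12), (5, 1, 0, 12)))  # True  — new vulnerability
```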
### LLM priorities vs queue

```python
_ADVISOR_PRIORITY = 85     # advisor > autonomous(82) > MCP(80) > normal(50) > scheduler(40)
_AUTONOMOUS_PRIORITY = 82  # autonomous slightly below advisor
```
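Assuming the queue drains highest-priority-first (as the values imply), a mixed queue empties in this order:

```python
# Demo of the drain order implied by the priority constants above.
queue = [
    {"action": "NmapScan",      "priority": 40},  # scheduler
    {"action": "SMBScan",       "priority": 50},  # normal
    {"action": "SSHBruteforce", "priority": 80},  # MCP client
    {"action": "NmapScan",      "priority": 82},  # autonomous LLM
    {"action": "SSHBruteforce", "priority": 85},  # advisor LLM
]
order = [a["priority"] for a in sorted(queue, key=lambda a: a["priority"], reverse=True)]
print(order)  # [85, 82, 80, 50, 40]
```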
### Autonomous system prompt — example

```
You are Bjorn's autonomous orchestrator, running on a Raspberry Pi network security tool.
Current state: 12 hosts discovered, 3 vulnerabilities, 1 credential.
Operation mode: ATTACK. Hard limit: at most 3 run_action calls per cycle.
Only these action names may be queued: NmapScan, SSHBruteforce, SMBScan.
Strategy: prioritise unexplored services, hosts with high port counts, and hosts with no recent scans.
Do not queue duplicate actions already pending or recently successful.
Use Norse references occasionally. Be terse and tactical.
```
### Advisor response format

```json
// Action recommended:
{"action": "NmapScan", "target_ip": "192.168.1.42", "reason": "unexplored host, 0 open ports known"}

// Nothing to do:
{"action": null}
```
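Parsing and validating such a reply defensively can be sketched like this (names are illustrative, not the actual `_apply_advisor_response` implementation):

```python
import json

def parse_advisor(raw: str, allowed: set):
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None  # malformed reply → ignore; the scheduler keeps working
    action = data.get("action")
    if action is None:
        return None  # LLM explicitly says "nothing to do"
    if action not in allowed:
        return None  # never queue an action outside the allow-list
    return {"action": action, "ip": data.get("target_ip"), "reason": data.get("reason", "")}

allowed = {"NmapScan", "SSHBruteforce"}
print(parse_advisor('{"action": "NmapScan", "target_ip": "192.168.1.42", "reason": "unexplored"}', allowed))
print(parse_advisor('{"action": null}', allowed))     # None — nothing to do
print(parse_advisor('{"action": "rm_rf"}', allowed))  # None — not in the allow-list
```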
### Reasoning log

When `llm_orchestrator_log_reasoning = True`:

- Full reasoning is logged via `logger.info("[LLM_ORCH_REASONING]...")`
- It is also injected into the `"llm_orchestrator"` session in `LLMBridge._chat_histories`
- Viewable in real time in `chat.html` via the **Orch Log** button

---
## 6. Orchestrator & Scheduler integration

### `orchestrator.py`

```python
# __init__
self.llm_orchestrator = None
self._init_llm_orchestrator()

# _init_llm_orchestrator()
if shared_data.config.get("llm_enabled") and shared_data.config.get("llm_orchestrator_mode") != "none":
    from llm_orchestrator import LLMOrchestrator
    self.llm_orchestrator = LLMOrchestrator(shared_data)
    self.llm_orchestrator.start()

# run() — each iteration
self._sync_llm_orchestrator()  # starts/stops thread according to runtime config

# _process_background_tasks()
if self.llm_orchestrator and mode == "advisor":
    self.llm_orchestrator.advise()
```
### `action_scheduler.py` — skip option

```python
# In run(), each iteration:
_llm_skip = bool(
    shared_data.config.get("llm_orchestrator_skip_scheduler", False)
    and shared_data.config.get("llm_orchestrator_mode") == "autonomous"
    and shared_data.config.get("llm_enabled", False)
)

if not _llm_skip:
    self._publish_all_upcoming()     # step 2: publish due actions
    self._evaluate_global_actions()  # step 3: global evaluation
    self.evaluate_all_triggers()     # step 4: per-host triggers
# Steps 1 (promote due) and 5 (cleanup/priorities) always run
```

When `llm_orchestrator_skip_scheduler = True` + `mode = autonomous` + `llm_enabled = True`:

- The scheduler no longer publishes automatic actions (no more `B_require`, `B_trigger`, etc.)
- The autonomous LLM becomes **sole master of the queue**
- Queue hygiene (promotions, cleanup) remains active

---
## 7. Web Utils LLM (`web_utils/llm_utils.py`)

Follows the exact **same pattern** as all other `web_utils` (constructor `__init__(self, shared_data)`, methods called by `webapp.py`).

### Methods

| Method | Type | Description |
|--------|------|-------------|
| `get_llm_status(handler)` | GET | LLM bridge state (active backend, LaRuche URL…) |
| `get_llm_config(handler)` | GET | Current LLM config (api_key masked) |
| `get_llm_reasoning(handler)` | GET | `llm_orchestrator` session history (reasoning log) |
| `handle_chat(data)` | POST | Sends a message, returns LLM response |
| `clear_chat_history(data)` | POST | Clears a session's history |
| `get_mcp_status(handler)` | GET | MCP server state (running, port, transport) |
| `toggle_mcp(data)` | POST | Enables/disables MCP server + saves config |
| `save_mcp_config(data)` | POST | Saves MCP config (tools, port, transport) |
| `save_llm_config(data)` | POST | Saves LLM config (all parameters) |

---
## 8. EPD comment integration (`comment.py`)

### Behaviour before modification

```
get_comment(status, lang, params)
  └── if delay elapsed OR status changed
        └── _pick_text(status, lang, params)   ← SQLite DB
              └── returns weighted text
```

### Behaviour after modification

```
get_comment(status, lang, params)
  └── if delay elapsed OR status changed
        │
        ├── [if llm_comments_enabled = True]
        │     └── LLMBridge().generate_comment(status, params)
        │           ├── success         → LLM text (≤12 words, ~8s max)
        │           └── failure/timeout → text = None
        │
        └── [if text = None]   ← SYSTEMATIC FALLBACK
              └── _pick_text(status, lang, params)   ← original behaviour
                    └── returns weighted DB text
```

**Original behaviour preserved 100% if LLM disabled or failing.**
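The hook reduces to a try-LLM-then-fall-back wrapper. A self-contained sketch with stand-in helpers — `llm_generate` and `pick_text_from_db` are hypothetical names, not the real `comment.py` functions:

```python
def llm_generate(status):
    return None  # simulate LLM disabled / failed / timed out

def pick_text_from_db(status):
    # Stand-in for the weighted SQLite template lookup (_pick_text)
    return {"IDLE": "Resting by the fjord...", "SCANNING": "Raiding the subnet!"}[status]

def get_comment(status, llm_enabled=True):
    text = None
    if llm_enabled:
        text = llm_generate(status)   # may return None on failure/timeout
    if text is None:                  # systematic fallback to the DB templates
        text = pick_text_from_db(status)
    return text

print(get_comment("IDLE"))  # fallback text, since llm_generate returned None
```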
---
## 9. Configuration (`shared.py`)

### LLM Bridge section (`__title_llm__`)

| Key | Default | Type | Description |
|-----|---------|------|-------------|
| `llm_enabled` | `False` | bool | **Master toggle** — activates the entire bridge |
| `llm_comments_enabled` | `False` | bool | Use LLM for EPD comments |
| `llm_chat_enabled` | `True` | bool | Enable /chat.html interface |
| `llm_chat_tools_enabled` | `False` | bool | Enable tool-calling in web chat |
| `llm_backend` | `"auto"` | str | `auto` \| `laruche` \| `ollama` \| `api` |
| `llm_laruche_discovery` | `True` | bool | Auto-discover LaRuche nodes via mDNS |
| `llm_laruche_url` | `""` | str | Manual LaRuche URL (overrides discovery) |
| `llm_ollama_url` | `"http://127.0.0.1:11434"` | str | Local Ollama URL |
| `llm_ollama_model` | `"phi3:mini"` | str | Ollama model to use |
| `llm_api_provider` | `"anthropic"` | str | `anthropic` \| `openai` \| `openrouter` |
| `llm_api_key` | `""` | str | API key (masked in UI) |
| `llm_api_model` | `"claude-haiku-4-5-20251001"` | str | External API model |
| `llm_api_base_url` | `""` | str | Custom base URL (OpenRouter, proxy…) |
| `llm_timeout_s` | `30` | int | Global LLM call timeout (seconds) |
| `llm_max_tokens` | `500` | int | Max tokens for chat |
| `llm_comment_max_tokens` | `80` | int | Max tokens for EPD comments |
| `llm_chat_history_size` | `20` | int | Max messages per chat session |
### MCP Server section (`__title_mcp__`)

| Key | Default | Type | Description |
|-----|---------|------|-------------|
| `mcp_enabled` | `False` | bool | Enable MCP server |
| `mcp_transport` | `"http"` | str | `http` (SSE) \| `stdio` |
| `mcp_port` | `8765` | int | HTTP SSE port |
| `mcp_allowed_tools` | `[all]` | list | List of authorised MCP tools |
### LLM Orchestrator section (`__title_llm_orch__`)

| Key | Default | Type | Description |
|-----|---------|------|-------------|
| `llm_orchestrator_mode` | `"none"` | str | `none` \| `advisor` \| `autonomous` |
| `llm_orchestrator_interval_s` | `60` | int | Delay between autonomous cycles (min 30s) |
| `llm_orchestrator_max_actions` | `3` | int | Max actions per autonomous cycle |
| `llm_orchestrator_allowed_actions` | `[]` | list | Actions the LLM may queue (empty = mcp_allowed_tools) |
| `llm_orchestrator_skip_scheduler` | `False` | bool | Disable scheduler when autonomous is active |
| `llm_orchestrator_skip_if_no_change` | `True` | bool | Skip cycle if fingerprint unchanged |
| `llm_orchestrator_log_reasoning` | `False` | bool | Log full LLM reasoning |

---
## 10. HTTP Routes (`webapp.py`)

### GET routes

| Route | Handler | Description |
|-------|---------|-------------|
| `GET /api/llm/status` | `llm_utils.get_llm_status` | LLM bridge state |
| `GET /api/llm/config` | `llm_utils.get_llm_config` | LLM config (api_key masked) |
| `GET /api/llm/reasoning` | `llm_utils.get_llm_reasoning` | Orchestrator reasoning log |
| `GET /api/mcp/status` | `llm_utils.get_mcp_status` | MCP server state |

### POST routes (JSON data-only)

| Route | Handler | Description |
|-------|---------|-------------|
| `POST /api/llm/chat` | `llm_utils.handle_chat` | Send a message to the LLM |
| `POST /api/llm/clear_history` | `llm_utils.clear_chat_history` | Clear a session's history |
| `POST /api/llm/config` | `llm_utils.save_llm_config` | Save LLM config |
| `POST /api/mcp/toggle` | `llm_utils.toggle_mcp` | Enable/disable MCP |
| `POST /api/mcp/config` | `llm_utils.save_mcp_config` | Save MCP config |

All routes respect Bjorn's existing authentication (`webauth`).

---
## 11. Web interfaces

### `/chat.html`

Terminal-style chat interface (black/red, consistent with Bjorn).

**Features:**

- Auto-detects LLM state on load (`GET /api/llm/status`)
- Displays active backend (LaRuche URL, or mode)
- "Bjorn is thinking..." indicator during response
- Unique session ID per browser tab
- `Enter` = send, `Shift+Enter` = new line
- Textarea auto-resize
- **"Clear history"** button — clears server-side session
- **"Orch Log"** button — loads the autonomous orchestrator's reasoning
  - Calls `GET /api/llm/reasoning`
  - Renders each message (cycle prompt + LLM response) as chat bubbles
  - "← Back to chat" to return to normal chat
  - Helper message if log is empty (hint: enable `llm_orchestrator_log_reasoning`)

**Access:** `http://[bjorn-ip]:8000/chat.html`
### `/mcp-config.html`

Full LLM & MCP configuration page.

**LLM Bridge section:**

- Master enable/disable toggle
- EPD comments, chat, chat tool-calling toggles
- Backend selector (auto / laruche / ollama / api)
- LaRuche mDNS discovery toggle + manual URL
- Ollama configuration (URL + model)
- External API configuration (provider, key, model, custom URL)
- Timeout and token parameters
- "TEST CONNECTION" button

**MCP Server section:**

- Enable toggle with live start/stop
- Transport selector (HTTP SSE / stdio)
- HTTP port
- Per-tool checkboxes
- "RUNNING" / "OFF" indicator

**Access:** `http://[bjorn-ip]:8000/mcp-config.html`

---
## 12. Startup (`Bjorn.py`)

```python
# LLM Bridge — warm up singleton
try:
    from llm_bridge import LLMBridge
    LLMBridge()  # starts mDNS discovery if llm_laruche_discovery=True
    logger.info("LLM Bridge initialised")
except Exception as e:
    logger.warning("LLM Bridge init skipped: %s", e)

# MCP Server
try:
    import mcp_server
    if shared_data.config.get("mcp_enabled", False):
        mcp_server.start()  # daemon thread "MCPServer"
        logger.info("MCP server started")
    else:
        logger.info("MCP server loaded (disabled)")
except Exception as e:
    logger.warning("MCP server init skipped: %s", e)
```

The LLM Orchestrator is initialised inside `orchestrator.py` (not `Bjorn.py`), since it depends on the orchestrator loop cycle.

---
## 13. LaRuche / LAND Protocol compatibility

### LAND Protocol

LAND (Local AI Network Discovery) is the LaRuche protocol:

- **Discovery:** mDNS service type `_ai-inference._tcp.local.`
- **Inference:** `POST http://[node]:8419/infer`

### What Bjorn implements on the Python side

```python
# mDNS listening (zeroconf)
from zeroconf import Zeroconf, ServiceBrowser
zc = Zeroconf()
ServiceBrowser(zc, "_ai-inference._tcp.local.", listener)
# → auto-detects LaRuche nodes

# Inference call (urllib stdlib, zero dependency)
payload = {"prompt": "...", "capability": "llm", "max_tokens": 500}
req = urllib.request.Request(
    f"{url}/infer",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)
```
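The `/infer` request can be prepared entirely with the standard library; this sketch builds the request without sending it, so it can be inspected offline (the node URL is a placeholder):

```python
import json
import urllib.request

def build_infer_request(node_url: str, prompt: str, max_tokens: int = 500):
    """Prepare a LAND /infer POST request without sending it."""
    payload = {"prompt": prompt, "capability": "llm", "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{node_url}/infer",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_infer_request("http://192.168.1.50:8419", "Summarise the scan results")
print(req.full_url)                        # http://192.168.1.50:8419/infer
print(req.get_header("Content-type"))      # application/json
print(json.loads(req.data)["max_tokens"])  # 500
```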
### Scenarios

| Scenario | Behaviour |
|----------|-----------|
| LaRuche node detected on LAN | Used automatically as priority backend |
| Multiple LaRuche nodes | First discovered is used |
| Manual URL configured | Used directly, discovery ignored |
| LaRuche node absent | Cascades to Ollama or external API |
| `zeroconf` not installed | Discovery silently disabled, DEBUG log |

---
## 14. Optional dependencies

| Package | Min version | Feature unlocked | Install command |
|---------|-------------|------------------|-----------------|
| `mcp[cli]` | ≥ 1.0.0 | Full MCP server | `pip install "mcp[cli]"` |
| `zeroconf` | ≥ 0.131.0 | LaRuche mDNS discovery | `pip install zeroconf` |

**No new dependencies** added for LLM backends:

- **LaRuche / Ollama**: uses `urllib.request` (Python stdlib)
- **Anthropic / OpenAI**: REST API via `urllib` — no SDK needed
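The "silent import if missing" behaviour these optional packages rely on can be sketched as (the module-level names here are illustrative):

```python
import importlib

def try_import(name):
    try:
        return importlib.import_module(name)
    except ImportError:
        return None  # feature silently disabled

zeroconf = try_import("zeroconf")
MDNS_AVAILABLE = zeroconf is not None
print(MDNS_AVAILABLE)  # False if zeroconf is not installed
```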
---
## 15. Quick activation & configuration

### Basic LLM chat

```bash
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{"llm_enabled": true, "llm_backend": "ollama", "llm_ollama_model": "phi3:mini"}'
# → http://[bjorn-ip]:8000/chat.html
```
### Chat with tool-calling (LLM accesses live network data)

```bash
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{"llm_enabled": true, "llm_chat_tools_enabled": true}'
```
### LLM Orchestrator — advisor mode

```bash
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{
    "llm_enabled": true,
    "llm_orchestrator_mode": "advisor",
    "llm_orchestrator_allowed_actions": ["NmapScan", "SSHBruteforce"]
  }'
```
### LLM Orchestrator — autonomous mode (LLM as sole planner)

```bash
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{
    "llm_enabled": true,
    "llm_orchestrator_mode": "autonomous",
    "llm_orchestrator_skip_scheduler": true,
    "llm_orchestrator_max_actions": 5,
    "llm_orchestrator_interval_s": 120,
    "llm_orchestrator_allowed_actions": ["NmapScan", "SSHBruteforce", "SMBScan"],
    "llm_orchestrator_log_reasoning": true
  }'
# → View reasoning: http://[bjorn-ip]:8000/chat.html → Orch Log button
```
### With Anthropic API

```bash
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{
    "llm_enabled": true,
    "llm_backend": "api",
    "llm_api_provider": "anthropic",
    "llm_api_key": "sk-ant-...",
    "llm_api_model": "claude-haiku-4-5-20251001"
  }'
```
### With OpenRouter (access to all models)
|
||||
|
||||
```bash
|
||||
curl -X POST http://[bjorn-ip]:8000/api/llm/config \
|
||||
-d '{
|
||||
"llm_enabled": true,
|
||||
"llm_backend": "api",
|
||||
"llm_api_provider": "openrouter",
|
||||
"llm_api_key": "sk-or-...",
|
||||
"llm_api_model": "meta-llama/llama-3.2-3b-instruct",
|
||||
"llm_api_base_url": "https://openrouter.ai/api"
|
||||
}'
|
||||
```

### Model recommendations by scenario

| Scenario | Backend | Recommended model | Pi RAM |
|----------|---------|-------------------|--------|
| Autonomous orchestrator + LaRuche on LAN | laruche | Mistral/Phi on the node | 0 (remote inference) |
| Autonomous orchestrator offline | ollama | `qwen2.5:3b` | ~3 GB |
| Autonomous orchestrator cloud | api | `claude-haiku-4-5-20251001` | 0 |
| Chat + tools | ollama | `phi3:mini` | ~2 GB |
| EPD comments only | ollama | `smollm2:360m` | ~400 MB |

---

## 16. Complete API endpoint reference

### GET

```
GET /api/llm/status
  → {"enabled": bool, "backend": str, "laruche_url": str|null,
     "laruche_discovery": bool, "ollama_url": str, "ollama_model": str,
     "api_provider": str, "api_model": str, "api_key_set": bool}

GET /api/llm/config
  → {all llm_* keys except api_key, + "llm_api_key_set": bool}

GET /api/llm/reasoning
  → {"status": "ok", "messages": [{"role": str, "content": str}, ...], "count": int}
  → {"status": "error", "message": str, "messages": [], "count": 0}

GET /api/mcp/status
  → {"enabled": bool, "running": bool, "transport": str,
     "port": int, "allowed_tools": [str]}
```

### POST

```
POST /api/llm/chat
  Body: {"message": str, "session_id": str?}
  → {"status": "ok", "response": str, "session_id": str}
  → {"status": "error", "message": str}

POST /api/llm/clear_history
  Body: {"session_id": str?}
  → {"status": "ok"}

POST /api/llm/config
  Body: {any subset of llm_* and llm_orchestrator_* keys}
  → {"status": "ok"}
  → {"status": "error", "message": str}

POST /api/mcp/toggle
  Body: {"enabled": bool}
  → {"status": "ok", "enabled": bool, "started": bool?}

POST /api/mcp/config
  Body: {"allowed_tools": [str]?, "port": int?, "transport": str?}
  → {"status": "ok", "config": {...}}
```
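
The POST bodies above are plain JSON, so any HTTP client can drive them. A minimal sketch using only the standard library (the base URL is an example; `build_config_request` is an illustrative helper, not part of Bjorn):

```python
import json
from urllib import request

def build_config_request(base_url: str, updates: dict) -> request.Request:
    """Build a POST /api/llm/config request, rejecting keys that are not llm_* settings."""
    bad = [k for k in updates if not k.startswith("llm_")]
    if bad:
        raise ValueError(f"not llm_* config keys: {bad}")
    return request.Request(
        f"{base_url}/api/llm/config",
        data=json.dumps(updates).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_config_request("http://192.168.1.50:8000",
                           {"llm_enabled": True, "llm_backend": "ollama"})
# Sending it requires a running Bjorn:
# with request.urlopen(req, timeout=10) as resp:
#     print(json.load(resp))  # expect {"status": "ok"}
```

Checking key names client-side mirrors the server contract ("any subset of llm_* and llm_orchestrator_* keys") and catches typos locally.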

---

## 17. Queue priority system

```
Priority   Source             Trigger
──────────────────────────────────────────────────────────────
  85       LLM Advisor        llm_orchestrator.advise()
  82       LLM Autonomous     _run_autonomous_cycle() via run_action tool
  80       External MCP       _impl_run_action() via MCP client or chat
  50       Normal / manual    queue_action() without explicit priority
  40       Scheduler          action_scheduler evaluates triggers
```

The scheduler always processes the highest-priority pending item first. LLM and MCP actions therefore preempt scheduler actions.
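
The rule above can be modelled with a heap. A minimal sketch, not Bjorn's actual `queue_action` implementation (the class shape and tie-breaking counter are assumptions; the priority values follow the table):

```python
import heapq
import itertools

class ActionQueue:
    """Pop the highest-priority pending action first; FIFO within equal priorities."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # insertion order breaks priority ties

    def queue_action(self, name, target, priority=50):
        # heapq is a min-heap, so negate the priority to pop the largest first
        heapq.heappush(self._heap, (-priority, next(self._counter), name, target))

    def pop(self):
        _, _, name, target = heapq.heappop(self._heap)
        return name, target

q = ActionQueue()
q.queue_action("NmapScan", "192.168.1.0/24", priority=40)     # scheduler
q.queue_action("SSHBruteforce", "192.168.1.20", priority=82)  # LLM autonomous
q.queue_action("SMBScan", "192.168.1.30")                     # normal (50)
print(q.pop())  # → ('SSHBruteforce', '192.168.1.20')
```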

---

## 18. Fallbacks & graceful degradation

| Condition | Behaviour |
|-----------|-----------|
| `llm_enabled = False` | `complete()` returns `None` immediately — zero overhead |
| `llm_orchestrator_mode = "none"` | LLMOrchestrator not instantiated |
| `mcp` not installed | `_build_mcp_server()` returns `None`, WARNING log |
| `zeroconf` not installed | LaRuche discovery silently disabled, DEBUG log |
| LaRuche node timeout | Exception caught, cascade to next backend |
| Ollama not running | `URLError` caught, cascade to API |
| API key missing | `_call_api()` returns `None`, cascade |
| All backends fail | `complete()` returns `None` |
| LLM returns `None` for EPD | `comment.py` uses `_pick_text()` (original behaviour) |
| LLM advisor: invalid JSON | DEBUG log, returns `None`, next cycle |
| LLM advisor: disallowed action | WARNING log, ignored |
| LLM autonomous: no change | cycle skipped, zero API call |
| LLM autonomous: ≥6 tool turns | returns partial text + warning |
| Exception in LLM Bridge | `try/except` at every level, DEBUG log |
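
Every cascade row in the table reduces to the same shape: try each backend in order, treat an exception or a `None` as "move on", and return `None` only when all backends are exhausted. A sketch of that pattern (the backend callables are stand-ins, not `LLMBridge`'s real methods):

```python
import logging

logger = logging.getLogger("llm_bridge")

def complete_with_fallback(prompt, backends):
    """backends: ordered (name, callable) pairs, e.g. LaRuche, then Ollama, then API."""
    for name, call in backends:
        try:
            result = call(prompt)
        except Exception as exc:        # timeout, connection refused, bad key...
            logger.debug("backend %s failed: %s", name, exc)
            continue
        if result is not None:
            return result
        logger.debug("backend %s returned None, cascading", name)
    return None                         # all backends failed: caller falls back

def laruche_down(prompt):
    raise TimeoutError("node timeout")

backends = [
    ("laruche", laruche_down),
    ("ollama", lambda p: f"echo: {p}"),
    ("api", lambda p: None),            # never reached
]
print(complete_with_fallback("hi", backends))  # → echo: hi
```

Because every failure path converges on `None`, callers such as `comment.py` need only one check to trigger their own fallback.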

### Timeouts

```
Chat / complete()   → llm_timeout_s (default: 30s)
EPD comments        → 8s (hardcoded, short to avoid blocking render)
Autonomous cycle    → 90s (long: may chain multiple tool calls)
Advisor             → 20s (short prompt + JSON response)
```

---

## 19. Call sequences

### Web chat with tool-calling

```
Browser → POST /api/llm/chat {"message": "which hosts are vulnerable?"}
  └── LLMUtils.handle_chat(data)
        └── LLMBridge().chat(message, session_id)
              └── complete(messages, system, tools=_BJORN_TOOLS)
                    └── _call_anthropic(messages, tools=[...])
                          ├── POST /v1/messages → stop_reason=tool_use
                          │     └── tool: get_hosts(alive_only=true)
                          │           → _execute_tool → _impl_get_hosts()
                          │           → JSON of hosts
                          ├── POST /v1/messages [+ tool result] → end_turn
                          └── returns "3 exposed SSH hosts: 192.168.1.10, ..."
← {"status": "ok", "response": "3 exposed SSH hosts..."}
```
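
The two POSTs in this trace form the standard tool-calling loop: call the model, execute any tool it requests, append the result, and call again until `end_turn`. A simplified sketch with the HTTP call replaced by a `model` callable (the 6-turn cap mirrors the limit listed under fallbacks; the reply shape here is illustrative, not the exact Anthropic wire format):

```python
def run_tool_loop(model, tools, messages, max_turns=6):
    """Call the model; execute requested tools and feed results back until end_turn."""
    for _ in range(max_turns):
        reply = model(messages)                 # stand-in for POST /v1/messages
        if reply["stop_reason"] != "tool_use":
            return reply["text"]                # end_turn: final answer
        result = tools[reply["tool"]](**reply["args"])  # e.g. get_hosts(alive_only=True)
        messages = messages + [
            {"role": "assistant", "content": reply},
            {"role": "user", "content": {"tool_result": result}},
        ]
    return "(partial answer: tool-turn limit reached)"

# Usage with a stubbed model: one tool turn, then a final answer
def stub_model(messages):
    if len(messages) == 1:
        return {"stop_reason": "tool_use", "tool": "get_hosts",
                "args": {"alive_only": True}}
    return {"stop_reason": "end_turn", "text": "3 exposed SSH hosts"}

tools = {"get_hosts": lambda alive_only: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]}
print(run_tool_loop(stub_model, tools, [{"role": "user", "content": "hosts?"}]))
# → 3 exposed SSH hosts
```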

### LLM autonomous cycle

```
Thread "LLMOrchestrator" (daemon, interval=60s)
  └── _run_autonomous_cycle()
        ├── fp = _compute_fingerprint() → (12, 3, 1, 47)
        ├── _has_actionable_change(fp) → True (vuln_count 2→3)
        ├── self._last_fingerprint = fp
        │
        └── LLMBridge().complete(prompt, system, tools=[read-only + run_action])
              └── _call_anthropic(tools=[...])
                    ├── POST → tool_use: get_hosts()
                    │     → [{ip: "192.168.1.20", ports: "22,80,443"}]
                    ├── POST → tool_use: get_action_history()
                    │     → [...]
                    ├── POST → tool_use: run_action("SSHBruteforce", "192.168.1.20")
                    │     → _execute_tool → _impl_run_action()
                    │     → db.queue_action(priority=82, trigger="llm_autonomous")
                    │     → queue_event.set()
                    └── POST → end_turn
                          → "Queued SSHBruteforce on 192.168.1.20 (Mjolnir strikes the unguarded gate)"
→ [if log_reasoning=True] logger.info("[LLM_ORCH_REASONING]...")
→ [if log_reasoning=True] _push_to_chat(bridge, prompt, response)
```
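
The `_compute_fingerprint()` / `_has_actionable_change()` gate is what keeps this loop cheap: no LLM call unless the network summary actually changed since the last cycle. A sketch of that gate (the fingerprint fields and class shape are illustrative, not the real implementation):

```python
class ChangeGate:
    """Skip autonomous cycles while the network summary tuple is unchanged."""

    def __init__(self):
        self._last_fingerprint = None

    def compute_fingerprint(self, db):
        # illustrative counters, e.g. (alive_hosts, vuln_count, cred_count, open_ports)
        return (db["alive_hosts"], db["vuln_count"], db["cred_count"], db["open_ports"])

    def has_actionable_change(self, fp):
        changed = fp != self._last_fingerprint
        self._last_fingerprint = fp
        return changed

gate = ChangeGate()
db = {"alive_hosts": 12, "vuln_count": 2, "cred_count": 1, "open_ports": 47}
print(gate.has_actionable_change(gate.compute_fingerprint(db)))  # → True (first cycle)
print(gate.has_actionable_change(gate.compute_fingerprint(db)))  # → False (nothing new)
db["vuln_count"] = 3
print(gate.has_actionable_change(gate.compute_fingerprint(db)))  # → True (vuln 2→3)
```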

### Reading reasoning from chat.html

```
User clicks "Orch Log"
  └── fetch GET /api/llm/reasoning
        └── LLMUtils.get_llm_reasoning(handler)
              └── LLMBridge()._chat_histories["llm_orchestrator"]
                    → [{"role": "user", "content": "[Autonomous cycle]..."},
                       {"role": "assistant", "content": "Queued SSHBruteforce..."}]
  ← {"status": "ok", "messages": [...], "count": 2}
  → Rendered as chat bubbles in #messages
```

### MCP from external client (Claude Desktop)

```
Claude Desktop → tool_call: run_action("NmapScan", "192.168.1.0/24")
  └── FastMCP dispatch
        └── mcp_server.run_action(action_name, target_ip)
              └── _impl_run_action()
                    ├── db.queue_action(priority=80, trigger="mcp")
                    └── queue_event.set()
  ← {"status": "queued", "action": "NmapScan", "target": "192.168.1.0/24", "priority": 80}
```

### EPD comment with LLM

```
display.py → CommentAI.get_comment("SSHBruteforce", params={...})
  └── delay elapsed OR status changed → proceed
        ├── llm_comments_enabled = True ?
        │     └── LLMBridge().generate_comment("SSHBruteforce", params)
        │           └── complete([{role: user, content: "Status: SSHBruteforce..."}],
        │                        max_tokens=80, timeout=8)
        │                 ├── LaRuche → "Norse gods smell SSH credentials..." ✓
        │                 └── [or timeout 8s] → None
        └── text = None → _pick_text("SSHBruteforce", lang, params)
              └── SELECT FROM comments WHERE status='SSHBruteforce'
                    → "Processing authentication attempts..."
```
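
This trace is an LLM-first, table-fallback pattern: try `generate_comment()` under its short timeout, and on `None` (or any exception) fall back to the original `_pick_text()` lookup, so the EPD render can never be blocked by the LLM. A sketch with stub callables, not the real `comment.py` code:

```python
def get_comment(status, params, llm_generate, pick_text, llm_enabled=True):
    """Prefer an LLM-written comment; fall back to the stock comment table."""
    text = None
    if llm_enabled:
        try:
            text = llm_generate(status, params)  # None on timeout / all backends failing
        except Exception:
            text = None                          # never let the LLM break the render
    if text is None:
        text = pick_text(status, params)         # original pre-LLM behaviour
    return text

# Usage: the LLM path times out (returns None), so the stock table answers
stock = lambda status, params: "Processing authentication attempts..."
print(get_comment("SSHBruteforce", {}, lambda s, p: None, stock))
# → Processing authentication attempts...
```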