Bjorn/LLM_MCP_ARCHITECTURE.md
infinition b759ab6d4b Add LLM configuration and MCP server management UI and backend functionality
- Implemented a new SPA page for LLM Bridge and MCP Server settings in `llm-config.js`.
- Added functionality for managing LLM and MCP configurations, including toggling, saving settings, and testing connections.
- Created HTTP endpoints in `llm_utils.py` for handling LLM chat, status checks, and MCP server configuration.
- Integrated model fetching from LaRuche and Ollama backends.
- Enhanced error handling and logging for better debugging and user feedback.
2026-03-16 20:33:22 +01:00


BJORN — LLM Bridge, MCP Server & LLM Orchestrator

Complete architecture, operation, commands, fallbacks


Table of contents

  1. Overview
  2. Created / modified files
  3. LLM Bridge (llm_bridge.py)
  4. MCP Server (mcp_server.py)
  5. LLM Orchestrator (llm_orchestrator.py)
  6. Orchestrator & Scheduler integration
  7. Web Utils LLM (web_utils/llm_utils.py)
  8. EPD comment integration (comment.py)
  9. Configuration (shared.py)
  10. HTTP Routes (webapp.py)
  11. Web interfaces
  12. Startup (Bjorn.py)
  13. LaRuche / LAND Protocol compatibility
  14. Optional dependencies
  15. Quick activation & configuration
  16. Complete API endpoint reference
  17. Queue priority system
  18. Fallbacks & graceful degradation
  19. Call sequences

1. Overview

┌─────────────────────────────────────────────────────────────────────┐
│                           BJORN (RPi)                               │
│                                                                     │
│  ┌─────────────┐  ┌──────────────────┐  ┌─────────────────────┐   │
│  │ Core BJORN  │  │   MCP Server     │  │ Web UI              │   │
│  │ (unchanged) │  │ (mcp_server.py)  │  │ /chat.html          │   │
│  │             │  │ 7 exposed tools  │  │ /mcp-config.html    │   │
│  │ comment.py  │  │ HTTP SSE / stdio │  │  ↳ Orch Log button  │   │
│  │  ↕ LLM hook │  │                  │  │                     │   │
│  └──────┬──────┘  └────────┬─────────┘  └──────────┬──────────┘   │
│         └─────────────────────────────────────────────┘            │
│                             │                                       │
│  ┌──────────────────────────▼─────────────────────────────────┐   │
│  │                 LLM Bridge (llm_bridge.py)                  │   │
│  │                   Singleton · Thread-safe                   │   │
│  │                                                             │   │
│  │  Automatic cascade:                                         │   │
│  │  1. LaRuche node  (LAND/mDNS → HTTP POST /infer)           │   │
│  │  2. Local Ollama  (HTTP POST /api/chat)                     │   │
│  │  3. External API  (Anthropic / OpenAI / OpenRouter)         │   │
│  │  4. None          (→ fallback templates in comment.py)      │   │
│  │                                                             │   │
│  │  Agentic tool-calling loop (stop_reason=tool_use, ≤6 turns) │   │
│  │  _BJORN_TOOLS: 7 tools in Anthropic format                 │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                             │                                       │
│  ┌──────────────────────────▼─────────────────────────────────┐   │
│  │              LLM Orchestrator (llm_orchestrator.py)         │   │
│  │                                                             │   │
│  │  mode = none      → LLM has no role in scheduling           │   │
│  │  mode = advisor   → LLM suggests 1 action/cycle (prio 85)  │   │
│  │  mode = autonomous→ own thread, loop + tools (prio 82)     │   │
│  │                                                             │   │
│  │  Fingerprint (hosts↑, vulns↑, creds↑, queue_id↑)          │   │
│  │  → skip LLM if nothing new (token savings)                 │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                             │                                       │
│  ┌──────────────────────────▼─────────────────────────────────┐   │
│  │                Action Queue (SQLite)                        │   │
│  │  scheduler=40  normal=50  MCP=80  autonomous=82  advisor=85│   │
│  └─────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘
          ↕ mDNS  _ai-inference._tcp.local.  (zeroconf)
┌──────────────────────────────────────────┐
│         LaRuche Swarm (LAN)              │
│  Node A → Mistral 7B   :8419             │
│  Node B → DeepSeek Coder :8419           │
│  Node C → Phi-3 Mini   :8419             │
└──────────────────────────────────────────┘

Design principles:

  • Everything is disabled by default — zero impact if not configured
  • All dependencies are optional — silent import if missing
  • Systematic fallback at every level — Bjorn never crashes because of the LLM
  • The bridge is a singleton — one instance per process, thread-safe
  • EPD comments preserve their exact original behaviour if LLM is disabled
  • The LLM is the brain (decides what to do), the orchestrator is the arms (executes)

2. Created / modified files

Created files

| File | Approx. size | Role |
| --- | --- | --- |
| llm_bridge.py | ~450 lines | LLM singleton — backend cascade + agentic tool-calling loop |
| mcp_server.py | ~280 lines | FastMCP MCP server — 7 Bjorn tools |
| web_utils/llm_utils.py | ~220 lines | LLM/MCP HTTP endpoints (web_utils pattern) |
| llm_orchestrator.py | ~410 lines | LLM orchestrator — advisor & autonomous modes |
| web/chat.html | ~300 lines | Chat interface + Orch Log button |
| web/mcp-config.html | ~400 lines | LLM & MCP configuration page |

Modified files

| File | What changed |
| --- | --- |
| shared.py | +45 config keys (LLM bridge, MCP, orchestrator) |
| comment.py | LLM hook in get_comment() — 12 lines added |
| utils.py | +1 entry in lazy WebUtils registry: "llm_utils" |
| webapp.py | +9 GET/POST routes in _register_routes_once() |
| Bjorn.py | LLM Bridge warm-up + conditional MCP server start |
| orchestrator.py | +LLMOrchestrator lifecycle + advisor call in background tasks |
| action_scheduler.py | Skip scheduler steps when autonomous LLM is sole planner (llm_orchestrator_skip_scheduler) |
| requirements.txt | +3 comment lines (optional dependencies documented) |

3. LLM Bridge (llm_bridge.py)

Internal architecture

LLMBridge (Singleton)
├── __init__()              Initialises singleton, launches LaRuche discovery
├── complete()              Main API — cascades all backends
│     └── tools=None/[...]  Optional param to enable tool-calling
├── generate_comment()      Generates a short EPD comment (≤80 tokens)
├── chat()                  Stateful chat with per-session history
│     └── tools=_BJORN_TOOLS if llm_chat_tools_enabled=True
├── clear_history()         Clears a session's history
├── status()                Returns bridge state (for the UI)
│
├── _start_laruche_discovery()   Starts mDNS thread in background
├── _discover_laruche_mdns()     Listens to _ai-inference._tcp.local. continuously
│
├── _call_laruche()         Backend 1 — POST http://[node]:8419/infer
├── _call_ollama()          Backend 2 — POST http://localhost:11434/api/chat
├── _call_anthropic()       Backend 3a — POST api.anthropic.com + AGENTIC LOOP
│     └── loop ≤6 turns: send → tool_use → execute → feed result → repeat
├── _call_openai_compat()   Backend 3b — POST [base_url]/v1/chat/completions
│
├── _execute_tool(name, inputs)  Dispatches to mcp_server._impl_*
│     └── gate: checks mcp_allowed_tools before executing
│
└── _build_system_prompt()  Builds system prompt with live Bjorn context

_BJORN_TOOLS : List[Dict]   Anthropic-format definitions for the 7 MCP tools

_BJORN_TOOLS — full list

_BJORN_TOOLS = [
    {"name": "get_hosts",           "description": "...", "input_schema": {...}},
    {"name": "get_vulnerabilities", ...},
    {"name": "get_credentials",     ...},
    {"name": "get_action_history",  ...},
    {"name": "get_status",          ...},
    {"name": "run_action",          ...},  # gated by mcp_allowed_tools
    {"name": "query_db",            ...},  # SELECT only
]

Backend cascade

llm_backend = "auto"    →  LaRuche → Ollama → API → None
llm_backend = "laruche" →  LaRuche only
llm_backend = "ollama"  →  Ollama only
llm_backend = "api"     →  External API only

At each step, if a backend fails (timeout, network error, missing model), the next one is tried silently. If all fail, complete() returns None.

Agentic tool-calling loop (_call_anthropic)

When tools is passed to complete(), the Anthropic backend enters agentic mode:

_call_anthropic(messages, system, tools, max_tokens, timeout)
  │
  ├─ POST /v1/messages {tools: [...]}
  │
  ├─ [stop_reason = "tool_use"]
  │     for each tool_use block:
  │       result = _execute_tool(name, inputs)
  │       append {role: "user", content: [{type: "tool_result", tool_use_id, ...}]}
  │     POST /v1/messages [messages + tool results]  ← next turn
  │
  └─ [stop_reason = "end_turn"]  → returns final text
     [≥6 turns]                  → returns partial text + warning

_execute_tool() dispatches directly to mcp_server._impl_* (no network), checking mcp_allowed_tools for run_action.
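The loop above can be condensed into a few lines, with the request function and tool executor injected. This is an illustrative sketch (block shapes follow the Anthropic Messages API; the helper names are invented):

```python
def agentic_loop(call_api, execute_tool, messages, tools, max_turns=6):
    """Drive the send → tool_use → execute → feed-result loop (≤ max_turns)."""
    for _ in range(max_turns):
        resp = call_api(messages=messages, tools=tools)
        if resp["stop_reason"] != "tool_use":
            # end_turn: concatenate the text blocks and return the final answer
            return "".join(b["text"] for b in resp["content"] if b["type"] == "text")
        # Echo the assistant turn, then feed tool results back as a user turn
        messages.append({"role": "assistant", "content": resp["content"]})
        results = [{"type": "tool_result", "tool_use_id": b["id"],
                    "content": execute_tool(b["name"], b["input"])}
                   for b in resp["content"] if b["type"] == "tool_use"]
        messages.append({"role": "user", "content": results})
    return None  # turn budget exhausted → caller returns partial text + warning
```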

Tool-calling in chat (chat())

If llm_chat_tools_enabled = True, the chat passes tools=_BJORN_TOOLS to the backend, letting the LLM answer with real-time data (hosts, vulns, creds…) rather than relying only on its training knowledge.

Chat history

  • Each session has its own history (key = session_id)
  • Special session "llm_orchestrator": contains the autonomous orchestrator's reasoning
  • Max size configurable: llm_chat_history_size (default: 20 messages)
  • History is in-memory only — not persisted across restarts
  • Thread-safe via _hist_lock
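That bookkeeping fits in a small thread-safe structure. An illustrative reimplementation (not Bjorn's actual class):

```python
import threading
from collections import defaultdict, deque

class ChatHistories:
    """Per-session, size-bounded, in-memory chat history."""
    def __init__(self, max_messages: int = 20):  # llm_chat_history_size
        self._hist = defaultdict(lambda: deque(maxlen=max_messages))
        self._hist_lock = threading.Lock()

    def append(self, session_id: str, role: str, content: str) -> None:
        with self._hist_lock:
            self._hist[session_id].append({"role": role, "content": content})

    def get(self, session_id: str) -> list:
        with self._hist_lock:
            return list(self._hist[session_id])  # copy: safe to read outside lock

    def clear(self, session_id: str) -> None:
        with self._hist_lock:
            self._hist.pop(session_id, None)
```

The `deque(maxlen=…)` silently evicts the oldest messages, which matches the documented cap without any manual trimming.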

4. MCP Server (mcp_server.py)

What is MCP?

The Model Context Protocol (Anthropic) is an open-source protocol that lets AI agents (Claude Desktop, custom agents, etc.) use external tools via a standardised interface.

By enabling Bjorn's MCP server, any MCP client can query and control Bjorn — without knowing the internal DB structure.

Exposed tools

| Tool | Arguments | Description |
| --- | --- | --- |
| get_hosts | alive_only: bool = True | Returns discovered hosts (IP, MAC, hostname, OS, ports) |
| get_vulnerabilities | host_ip: str = "", limit: int = 100 | Returns discovered CVE vulnerabilities |
| get_credentials | service: str = "", limit: int = 100 | Returns captured credentials (SSH, FTP, SMB…) |
| get_action_history | limit: int = 50, action_name: str = "" | History of executed actions |
| get_status | (none) | Real-time state: mode, active action, counters |
| run_action | action_name: str, target_ip: str, target_mac: str = "" | Queues a Bjorn action (MCP priority = 80) |
| query_db | sql: str, params: str = "[]" | Free-form SELECT against the SQLite DB (read-only) |

Security: each tool checks mcp_allowed_tools — unlisted tools return a clean error. query_db rejects anything that is not a SELECT.
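The SELECT-only gate could look like the sketch below (the actual check in mcp_server.py may differ):

```python
import sqlite3

def safe_query_db(conn: sqlite3.Connection, sql: str, params=()):
    """Run a read-only query; anything that is not a SELECT is rejected."""
    if not sql.lstrip().lower().startswith("select"):
        return {"status": "error", "message": "query_db accepts SELECT only"}
    try:
        rows = conn.execute(sql, params).fetchall()
        return {"status": "ok", "rows": rows}
    except sqlite3.Error as e:
        return {"status": "error", "message": str(e)}
```

A prefix check alone is naive — it rejects legitimate `WITH … SELECT` queries and does not by itself stop stacked statements; opening the connection read-only (`file:…?mode=ro`) is the stronger guarantee.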

_impl_run_action — priority detail

_MCP_PRIORITY = 80  # > normal(50) > scheduler(40)

sd.db.queue_action(
    action_name=action_name,
    mac=mac,          # resolved from hosts WHERE ip=? if not supplied
    ip=target_ip,
    priority=_MCP_PRIORITY,
    trigger="mcp",
    metadata={"decision_method": "mcp", "decision_origin": "mcp"},
)
sd.queue_event.set()  # wakes the orchestrator immediately

Available transports

| Transport | Config | Usage |
| --- | --- | --- |
| http (default) | mcp_transport: "http", mcp_port: 8765 | Accessible from any MCP client on the LAN via SSE |
| stdio | mcp_transport: "stdio" | Claude Desktop, CLI agents |

5. LLM Orchestrator (llm_orchestrator.py)

The LLM Orchestrator transforms Bjorn from a scriptable tool into an autonomous agent. It is completely optional and can be disabled via llm_orchestrator_mode = "none".

Operating modes

| Mode | Config value | Operation |
| --- | --- | --- |
| Disabled | "none" (default) | LLM plays no role in planning |
| Advisor | "advisor" | LLM consulted periodically, suggests 1 action |
| Autonomous | "autonomous" | Own thread; LLM observes and plans with tools |

Internal architecture

LLMOrchestrator
├── start()                    Starts autonomous thread if mode=autonomous
├── stop()                     Stops thread (join 15s max)
├── restart_if_mode_changed()  Called from orchestrator.run() each iteration
├── is_active()                True if autonomous thread is alive
│
├── [ADVISOR MODE]
│   advise()                   → called from orchestrator._process_background_tasks()
│     ├── _build_snapshot()    → compact dict (hosts, vulns, creds, queue)
│     ├── LLMBridge().complete(prompt, system)
│     └── _apply_advisor_response(raw, allowed)
│           ├── parse JSON {"action": str, "target_ip": str, "reason": str}
│           ├── validate action ∈ allowed
│           └── db.queue_action(priority=85, trigger="llm_advisor")
│
└── [AUTONOMOUS MODE]
    _autonomous_loop()         Thread "LLMOrchestrator" (daemon)
      └── loop:
            _compute_fingerprint()   → (hosts, vulns, creds, max_queue_id)
            _has_actionable_change() → skip if nothing increased
            _run_autonomous_cycle()
              ├── filter tools: read-only always + run_action if in allowed
              ├── LLMBridge().complete(prompt, system, tools=[...])
              │     └── _call_anthropic() agentic loop
              │           → LLM calls run_action via tools
              │                → _execute_tool → _impl_run_action → queue
              └── if llm_orchestrator_log_reasoning=True:
                    logger.info("[LLM_ORCH_REASONING]...")
                    _push_to_chat()  → "llm_orchestrator" session in LLMBridge
            sleep(llm_orchestrator_interval_s)

Fingerprint and smart skip

def _compute_fingerprint(self) -> tuple:
    # (host_count, vuln_count, cred_count, max_completed_queue_id)
    return (hosts, vulns, creds, last_id)

def _has_actionable_change(self, fp: tuple) -> bool:
    if self._last_fingerprint is None:
        return True  # first cycle always runs
    # Triggers ONLY if something INCREASED
    # hosts going offline → not actionable
    return any(fp[i] > self._last_fingerprint[i] for i in range(len(fp)))

Token savings: if llm_orchestrator_skip_if_no_change = True (default), the LLM cycle is skipped if no new hosts/vulns/creds and no action completed since the last cycle.
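A worked example of the rule — only an increase counts as actionable (standalone mirror of the method above, for illustration):

```python
def has_actionable_change(last, fp):
    """True on the first cycle, or when any counter strictly increased."""
    if last is None:
        return True                      # first cycle always runs
    return any(fp[i] > last[i] for i in range(len(fp)))

# Fingerprint shape: (hosts, vulns, creds, max_completed_queue_id)
assert has_actionable_change(None, (12, 3, 1, 47))                 # first cycle
assert not has_actionable_change((12, 3, 1, 47), (11, 3, 1, 47))   # host offline → skip
assert has_actionable_change((12, 3, 1, 47), (12, 4, 1, 47))       # new vuln → run
```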

LLM priorities vs queue

_ADVISOR_PRIORITY    = 85  # advisor > MCP(80) > normal(50) > scheduler(40)
_AUTONOMOUS_PRIORITY = 82  # autonomous slightly below advisor

Autonomous system prompt — example

"You are Bjorn's autonomous orchestrator, running on a Raspberry Pi network security tool.
Current state: 12 hosts discovered, 3 vulnerabilities, 1 credentials.
Operation mode: ATTACK. Hard limit: at most 3 run_action calls per cycle.
Only these action names may be queued: NmapScan, SSHBruteforce, SMBScan.
Strategy: prioritise unexplored services, hosts with high port counts, and hosts with no recent scans.
Do not queue duplicate actions already pending or recently successful.
Use Norse references occasionally. Be terse and tactical."

Advisor response format

// Action recommended:
{"action": "NmapScan", "target_ip": "192.168.1.42", "reason": "unexplored host, 0 open ports known"}

// Nothing to do:
{"action": null}
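Parsing and gating that response reduces to a few lines. A hypothetical helper mirroring _apply_advisor_response (names invented):

```python
import json
from typing import Optional

def parse_advisor_response(raw: str, allowed: set) -> Optional[dict]:
    """Return a validated suggestion, or None (invalid JSON, null, or disallowed)."""
    try:
        data = json.loads(raw.strip())
    except (json.JSONDecodeError, TypeError):
        return None                      # invalid JSON → DEBUG log, next cycle
    action = data.get("action") if isinstance(data, dict) else None
    if not action:
        return None                      # {"action": null} → nothing to do
    if action not in allowed:
        return None                      # disallowed action → WARNING log, ignored
    return {"action": action,
            "target_ip": data.get("target_ip", ""),
            "reason": data.get("reason", "")}
```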

Reasoning log

When llm_orchestrator_log_reasoning = True:

  • Full reasoning is logged via logger.info("[LLM_ORCH_REASONING]...")
  • It is also injected into the "llm_orchestrator" session in LLMBridge._chat_histories
  • Viewable in real time in chat.html via the Orch Log button

6. Orchestrator & Scheduler integration

orchestrator.py

# __init__
self.llm_orchestrator = None
self._init_llm_orchestrator()

# _init_llm_orchestrator()
if shared_data.config.get("llm_enabled") and shared_data.config.get("llm_orchestrator_mode") != "none":
    from llm_orchestrator import LLMOrchestrator
    self.llm_orchestrator = LLMOrchestrator(shared_data)
    self.llm_orchestrator.start()

# run() — each iteration
self._sync_llm_orchestrator()   # starts/stops thread according to runtime config

# _process_background_tasks()
if self.llm_orchestrator and mode == "advisor":
    self.llm_orchestrator.advise()

action_scheduler.py — skip option

# In run(), each iteration:
_llm_skip = bool(
    shared_data.config.get("llm_orchestrator_skip_scheduler", False)
    and shared_data.config.get("llm_orchestrator_mode") == "autonomous"
    and shared_data.config.get("llm_enabled", False)
)

if not _llm_skip:
    self._publish_all_upcoming()    # step 2: publish due actions
    self._evaluate_global_actions() # step 3: global evaluation
    self.evaluate_all_triggers()    # step 4: per-host triggers
# Steps 1 (promote due) and 5 (cleanup/priorities) always run

When llm_orchestrator_skip_scheduler = True + mode = autonomous + llm_enabled = True:

  • The scheduler no longer publishes automatic actions (no more B_require, B_trigger, etc.)
  • The autonomous LLM becomes sole master of the queue
  • Queue hygiene (promotions, cleanup) remains active

7. Web Utils LLM (web_utils/llm_utils.py)

Follows the exact same pattern as all other web_utils (constructor __init__(self, shared_data), methods called by webapp.py).

Methods

| Method | Type | Description |
| --- | --- | --- |
| get_llm_status(handler) | GET | LLM bridge state (active backend, LaRuche URL…) |
| get_llm_config(handler) | GET | Current LLM config (api_key masked) |
| get_llm_reasoning(handler) | GET | llm_orchestrator session history (reasoning log) |
| handle_chat(data) | POST | Sends a message, returns the LLM response |
| clear_chat_history(data) | POST | Clears a session's history |
| get_mcp_status(handler) | GET | MCP server state (running, port, transport) |
| toggle_mcp(data) | POST | Enables/disables the MCP server + saves config |
| save_mcp_config(data) | POST | Saves MCP config (tools, port, transport) |
| save_llm_config(data) | POST | Saves LLM config (all parameters) |

8. EPD comment integration (comment.py)

Behaviour before modification

get_comment(status, lang, params)
  └── if delay elapsed OR status changed
        └── _pick_text(status, lang, params)  ← SQLite DB
              └── returns weighted text

Behaviour after modification

get_comment(status, lang, params)
  └── if delay elapsed OR status changed
        │
        ├── [if llm_comments_enabled = True]
        │     └── LLMBridge().generate_comment(status, params)
        │           ├── success → LLM text (≤12 words, ~8s max)
        │           └── failure/timeout → text = None
        │
        └── [if text = None]  ← SYSTEMATIC FALLBACK
              └── _pick_text(status, lang, params)  ← original behaviour
                    └── returns weighted DB text

Original behaviour preserved 100% if LLM disabled or failing.
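The hook boils down to a guarded call plus a systematic fallback. A sketch with injected callables (function names here are illustrative, not comment.py's actual signatures):

```python
def comment_with_fallback(status, lang, params,
                          llm_generate, pick_text, llm_comments_enabled):
    """Any LLM failure — disabled, exception, timeout — yields the DB text."""
    text = None
    if llm_comments_enabled:
        try:
            text = llm_generate(status, params)   # ≤80 tokens, ~8 s budget
        except Exception:
            text = None                           # failure → fall back silently
    if text is None:
        text = pick_text(status, lang, params)    # original weighted-DB path
    return text

def broken_llm(status, params):
    raise TimeoutError("backend unreachable")

def db_text(status, lang, params):
    return "Processing authentication attempts..."

print(comment_with_fallback("SSHBruteforce", "en", {},
                            broken_llm, db_text, llm_comments_enabled=True))
# → Processing authentication attempts...
```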


9. Configuration (shared.py)

LLM Bridge section (__title_llm__)

| Key | Default | Type | Description |
| --- | --- | --- | --- |
| llm_enabled | False | bool | Master toggle — activates the entire bridge |
| llm_comments_enabled | False | bool | Use the LLM for EPD comments |
| llm_chat_enabled | True | bool | Enable the /chat.html interface |
| llm_chat_tools_enabled | False | bool | Enable tool-calling in the web chat |
| llm_backend | "auto" | str | auto \| laruche \| ollama \| api |
| llm_laruche_discovery | True | bool | Auto-discover LaRuche nodes via mDNS |
| llm_laruche_url | "" | str | Manual LaRuche URL (overrides discovery) |
| llm_ollama_url | "http://127.0.0.1:11434" | str | Local Ollama URL |
| llm_ollama_model | "phi3:mini" | str | Ollama model to use |
| llm_api_provider | "anthropic" | str | anthropic \| openai \| openrouter |
| llm_api_key | "" | str | API key (masked in the UI) |
| llm_api_model | "claude-haiku-4-5-20251001" | str | External API model |
| llm_api_base_url | "" | str | Custom base URL (OpenRouter, proxy…) |
| llm_timeout_s | 30 | int | Global LLM call timeout (seconds) |
| llm_max_tokens | 500 | int | Max tokens for chat |
| llm_comment_max_tokens | 80 | int | Max tokens for EPD comments |
| llm_chat_history_size | 20 | int | Max messages per chat session |

MCP Server section (__title_mcp__)

| Key | Default | Type | Description |
| --- | --- | --- | --- |
| mcp_enabled | False | bool | Enable the MCP server |
| mcp_transport | "http" | str | http (SSE) \| stdio |
| mcp_port | 8765 | int | HTTP SSE port |
| mcp_allowed_tools | [all] | list | List of authorised MCP tools |

LLM Orchestrator section (__title_llm_orch__)

| Key | Default | Type | Description |
| --- | --- | --- | --- |
| llm_orchestrator_mode | "none" | str | none \| advisor \| autonomous |
| llm_orchestrator_interval_s | 60 | int | Delay between autonomous cycles (min 30 s) |
| llm_orchestrator_max_actions | 3 | int | Max actions per autonomous cycle |
| llm_orchestrator_allowed_actions | [] | list | Actions the LLM may queue (empty = mcp_allowed_tools) |
| llm_orchestrator_skip_scheduler | False | bool | Disable the scheduler while autonomous mode is active |
| llm_orchestrator_skip_if_no_change | True | bool | Skip a cycle if the fingerprint is unchanged |
| llm_orchestrator_log_reasoning | False | bool | Log the full LLM reasoning |

10. HTTP Routes (webapp.py)

GET routes

| Route | Handler | Description |
| --- | --- | --- |
| GET /api/llm/status | llm_utils.get_llm_status | LLM bridge state |
| GET /api/llm/config | llm_utils.get_llm_config | LLM config (api_key masked) |
| GET /api/llm/reasoning | llm_utils.get_llm_reasoning | Orchestrator reasoning log |
| GET /api/mcp/status | llm_utils.get_mcp_status | MCP server state |

POST routes (JSON data-only)

| Route | Handler | Description |
| --- | --- | --- |
| POST /api/llm/chat | llm_utils.handle_chat | Send a message to the LLM |
| POST /api/llm/clear_history | llm_utils.clear_chat_history | Clear a session's history |
| POST /api/llm/config | llm_utils.save_llm_config | Save the LLM config |
| POST /api/mcp/toggle | llm_utils.toggle_mcp | Enable/disable MCP |
| POST /api/mcp/config | llm_utils.save_mcp_config | Save the MCP config |

All routes respect Bjorn's existing authentication (webauth).


11. Web interfaces

/chat.html

Terminal-style chat interface (black/red, consistent with Bjorn).

Features:

  • Auto-detects LLM state on load (GET /api/llm/status)
  • Displays active backend (LaRuche URL, or mode)
  • "Bjorn is thinking..." indicator during response
  • Unique session ID per browser tab
  • Enter = send, Shift+Enter = new line
  • Textarea auto-resize
  • "Clear history" button — clears server-side session
  • "Orch Log" button — loads the autonomous orchestrator's reasoning
    • Calls GET /api/llm/reasoning
    • Renders each message (cycle prompt + LLM response) as chat bubbles
    • "← Back to chat" to return to normal chat
    • Helper message if log is empty (hint: enable llm_orchestrator_log_reasoning)

Access: http://[bjorn-ip]:8000/chat.html

/mcp-config.html

Full LLM & MCP configuration page.

LLM Bridge section:

  • Master enable/disable toggle
  • EPD comments, chat, chat tool-calling toggles
  • Backend selector (auto / laruche / ollama / api)
  • LaRuche mDNS discovery toggle + manual URL
  • Ollama configuration (URL + model)
  • External API configuration (provider, key, model, custom URL)
  • Timeout and token parameters
  • "TEST CONNECTION" button

MCP Server section:

  • Enable toggle with live start/stop
  • Transport selector (HTTP SSE / stdio)
  • HTTP port
  • Per-tool checkboxes
  • "RUNNING" / "OFF" indicator

Access: http://[bjorn-ip]:8000/mcp-config.html


12. Startup (Bjorn.py)

# LLM Bridge — warm up singleton
try:
    from llm_bridge import LLMBridge
    LLMBridge()  # Starts mDNS discovery if llm_laruche_discovery=True
    logger.info("LLM Bridge initialised")
except Exception as e:
    logger.warning("LLM Bridge init skipped: %s", e)

# MCP Server
try:
    import mcp_server
    if shared_data.config.get("mcp_enabled", False):
        mcp_server.start()      # Daemon thread "MCPServer"
        logger.info("MCP server started")
    else:
        logger.info("MCP server loaded (disabled)")
except Exception as e:
    logger.warning("MCP server init skipped: %s", e)

The LLM Orchestrator is initialised inside orchestrator.py (not Bjorn.py), since it depends on the orchestrator loop cycle.


13. LaRuche / LAND Protocol compatibility

LAND Protocol

LAND (Local AI Network Discovery) is the LaRuche protocol:

  • Discovery: mDNS service type _ai-inference._tcp.local.
  • Inference: POST http://[node]:8419/infer

What Bjorn implements on the Python side

# mDNS listening (zeroconf)
from zeroconf import Zeroconf, ServiceBrowser
zc = Zeroconf()
ServiceBrowser(zc, "_ai-inference._tcp.local.", listener)
# → Auto-detects LaRuche nodes

# Inference call (urllib stdlib, zero dependency)
payload = {"prompt": "...", "capability": "llm", "max_tokens": 500}
req = urllib.request.Request(
    f"{url}/infer",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req, timeout=timeout)

Scenarios

| Scenario | Behaviour |
| --- | --- |
| LaRuche node detected on the LAN | Used automatically as the priority backend |
| Multiple LaRuche nodes | The first one discovered is used |
| Manual URL configured | Used directly; discovery ignored |
| LaRuche node absent | Cascades to Ollama or the external API |
| zeroconf not installed | Discovery silently disabled, DEBUG log |

14. Optional dependencies

| Package | Min version | Feature unlocked | Install command |
| --- | --- | --- | --- |
| mcp[cli] | ≥ 1.0.0 | Full MCP server | pip install "mcp[cli]" |
| zeroconf | ≥ 0.131.0 | LaRuche mDNS discovery | pip install zeroconf |

No new dependencies added for LLM backends:

  • LaRuche / Ollama: uses urllib.request (Python stdlib)
  • Anthropic / OpenAI: REST API via urllib — no SDK needed
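For illustration, the stdlib-only request shape for the OpenAI-compatible path (the endpoint path comes from section 3; the helper name is invented):

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str,
                       messages: list) -> urllib.request.Request:
    """Build the POST for [base_url]/v1/chat/completions — no SDK needed."""
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
        method="POST",
    )

# The caller would then do:
#   with urllib.request.urlopen(req, timeout=llm_timeout_s) as resp:
#       text = json.loads(resp.read())["choices"][0]["message"]["content"]
```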

15. Quick activation & configuration

Basic LLM chat

curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -H "Content-Type: application/json" \
  -d '{"llm_enabled": true, "llm_backend": "ollama", "llm_ollama_model": "phi3:mini"}'
# → http://[bjorn-ip]:8000/chat.html

Chat with tool-calling (LLM accesses live network data)

curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -d '{"llm_enabled": true, "llm_chat_tools_enabled": true}'

LLM Orchestrator — advisor mode

curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -d '{
    "llm_enabled": true,
    "llm_orchestrator_mode": "advisor",
    "llm_orchestrator_allowed_actions": ["NmapScan", "SSHBruteforce"]
  }'

LLM Orchestrator — autonomous mode (LLM as sole planner)

curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -d '{
    "llm_enabled": true,
    "llm_orchestrator_mode": "autonomous",
    "llm_orchestrator_skip_scheduler": true,
    "llm_orchestrator_max_actions": 5,
    "llm_orchestrator_interval_s": 120,
    "llm_orchestrator_allowed_actions": ["NmapScan", "SSHBruteforce", "SMBScan"],
    "llm_orchestrator_log_reasoning": true
  }'
# → View reasoning: http://[bjorn-ip]:8000/chat.html  → Orch Log button

With Anthropic API

curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -d '{
    "llm_enabled": true,
    "llm_backend": "api",
    "llm_api_provider": "anthropic",
    "llm_api_key": "sk-ant-...",
    "llm_api_model": "claude-haiku-4-5-20251001"
  }'

With OpenRouter (access to all models)

curl -X POST http://[bjorn-ip]:8000/api/llm/config \
  -d '{
    "llm_enabled": true,
    "llm_backend": "api",
    "llm_api_provider": "openrouter",
    "llm_api_key": "sk-or-...",
    "llm_api_model": "meta-llama/llama-3.2-3b-instruct",
    "llm_api_base_url": "https://openrouter.ai/api"
  }'

Model recommendations by scenario

| Scenario | Backend | Recommended model | Pi RAM |
| --- | --- | --- | --- |
| Autonomous orchestrator + LaRuche on LAN | laruche | Mistral/Phi on the node | 0 (remote inference) |
| Autonomous orchestrator, offline | ollama | qwen2.5:3b | ~3 GB |
| Autonomous orchestrator, cloud | api | claude-haiku-4-5-20251001 | 0 |
| Chat + tools | ollama | phi3:mini | ~2 GB |
| EPD comments only | ollama | smollm2:360m | ~400 MB |

16. Complete API endpoint reference

GET

GET /api/llm/status
→ {"enabled": bool, "backend": str, "laruche_url": str|null,
   "laruche_discovery": bool, "ollama_url": str, "ollama_model": str,
   "api_provider": str, "api_model": str, "api_key_set": bool}

GET /api/llm/config
→ {all llm_* keys except api_key, + "llm_api_key_set": bool}

GET /api/llm/reasoning
→ {"status": "ok", "messages": [{"role": str, "content": str}, ...], "count": int}
→ {"status": "error", "message": str, "messages": [], "count": 0}

GET /api/mcp/status
→ {"enabled": bool, "running": bool, "transport": str,
   "port": int, "allowed_tools": [str]}

POST

POST /api/llm/chat
Body: {"message": str, "session_id": str?}
→ {"status": "ok", "response": str, "session_id": str}
→ {"status": "error", "message": str}

POST /api/llm/clear_history
Body: {"session_id": str?}
→ {"status": "ok"}

POST /api/llm/config
Body: {any subset of llm_* and llm_orchestrator_* keys}
→ {"status": "ok"}
→ {"status": "error", "message": str}

POST /api/mcp/toggle
Body: {"enabled": bool}
→ {"status": "ok", "enabled": bool, "started": bool?}

POST /api/mcp/config
Body: {"allowed_tools": [str]?, "port": int?, "transport": str?}
→ {"status": "ok", "config": {...}}

17. Queue priority system

Priority  Source              Trigger
──────────────────────────────────────────────────────────────
   85     LLM Advisor         llm_orchestrator.advise()
   82     LLM Autonomous      _run_autonomous_cycle() via run_action tool
   80     External MCP        _impl_run_action() via MCP client or chat
   50     Normal / manual     queue_action() without explicit priority
   40     Scheduler           action_scheduler evaluates triggers

The queue always serves the highest-priority pending item first. LLM and MCP actions therefore preempt scheduler-published actions.
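A minimal demonstration of that ordering over a queue table (the schema here is invented for illustration; Bjorn's real table differs):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE action_queue (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    action_name TEXT, priority INTEGER)""")
conn.executemany(
    "INSERT INTO action_queue (action_name, priority) VALUES (?, ?)",
    [("NmapScan", 40),        # scheduler
     ("SMBScan", 80),         # external MCP
     ("SSHBruteforce", 85)])  # LLM advisor

# Highest priority first; FIFO (lowest id) among equal priorities.
row = conn.execute("""SELECT action_name FROM action_queue
                      ORDER BY priority DESC, id ASC LIMIT 1""").fetchone()
print(row[0])  # → SSHBruteforce
```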


18. Fallbacks & graceful degradation

| Condition | Behaviour |
| --- | --- |
| llm_enabled = False | complete() returns None immediately — zero overhead |
| llm_orchestrator_mode = "none" | LLMOrchestrator not instantiated |
| mcp not installed | _build_mcp_server() returns None, WARNING log |
| zeroconf not installed | LaRuche discovery silently disabled, DEBUG log |
| LaRuche node timeout | Exception caught, cascade to the next backend |
| Ollama not running | URLError caught, cascade to the API |
| API key missing | _call_api() returns None, cascade |
| All backends fail | complete() returns None |
| LLM returns None for EPD | comment.py uses _pick_text() (original behaviour) |
| LLM advisor: invalid JSON | DEBUG log, returns None, next cycle |
| LLM advisor: disallowed action | WARNING log, action ignored |
| LLM autonomous: no change | Cycle skipped, zero API calls |
| LLM autonomous: ≥6 tool turns | Returns partial text + warning |
| Exception in LLM Bridge | try/except at every level, DEBUG log |

Timeouts

Chat / complete()     → llm_timeout_s (default: 30s)
EPD comments          → 8s (hardcoded, short to avoid blocking render)
Autonomous cycle      → 90s (long: may chain multiple tool calls)
Advisor               → 20s (short prompt + JSON response)

19. Call sequences

Web chat with tool-calling

Browser → POST /api/llm/chat {"message": "which hosts are vulnerable?"}
  └── LLMUtils.handle_chat(data)
        └── LLMBridge().chat(message, session_id)
              └── complete(messages, system, tools=_BJORN_TOOLS)
                    └── _call_anthropic(messages, tools=[...])
                          ├── POST /v1/messages → stop_reason=tool_use
                          │     └── tool: get_hosts(alive_only=true)
                          │           → _execute_tool → _impl_get_hosts()
                          │                 → JSON of hosts
                          ├── POST /v1/messages [+ tool result] → end_turn
                          └── returns "3 exposed SSH hosts: 192.168.1.10, ..."
← {"status": "ok", "response": "3 exposed SSH hosts..."}

LLM autonomous cycle

Thread "LLMOrchestrator" (daemon, interval=60s)
  └── _run_autonomous_cycle()
        ├── fp = _compute_fingerprint()  → (12, 3, 1, 47)
        ├── _has_actionable_change(fp)   → True (vuln_count 2→3)
        ├── self._last_fingerprint = fp
        │
        └── LLMBridge().complete(prompt, system, tools=[read-only + run_action])
              └── _call_anthropic(tools=[...])
                    ├── POST → tool_use: get_hosts()
                    │     → [{ip: "192.168.1.20", ports: "22,80,443"}]
                    ├── POST → tool_use: get_action_history()
                    │     → [...]
                    ├── POST → tool_use: run_action("SSHBruteforce", "192.168.1.20")
                    │     → _execute_tool → _impl_run_action()
                    │           → db.queue_action(priority=82, trigger="llm_autonomous")
                    │           → queue_event.set()
                    └── POST → end_turn
                          → "Queued SSHBruteforce on 192.168.1.20 (Mjolnir strikes the unguarded gate)"
              → [if log_reasoning=True] logger.info("[LLM_ORCH_REASONING]...")
              → [if log_reasoning=True] _push_to_chat(bridge, prompt, response)

Reading reasoning from chat.html

User clicks "Orch Log"
  └── fetch GET /api/llm/reasoning
        └── LLMUtils.get_llm_reasoning(handler)
              └── LLMBridge()._chat_histories["llm_orchestrator"]
                    → [{"role": "user",      "content": "[Autonomous cycle]..."},
                       {"role": "assistant", "content": "Queued SSHBruteforce..."}]
← {"status": "ok", "messages": [...], "count": 2}
→ Rendered as chat bubbles in #messages

MCP from external client (Claude Desktop)

Claude Desktop → tool_call: run_action("NmapScan", "192.168.1.0/24")
  └── FastMCP dispatch
        └── mcp_server.run_action(action_name, target_ip)
              └── _impl_run_action()
                    ├── db.queue_action(priority=80, trigger="mcp")
                    └── queue_event.set()
← {"status": "queued", "action": "NmapScan", "target": "192.168.1.0/24", "priority": 80}

EPD comment with LLM

display.py → CommentAI.get_comment("SSHBruteforce", params={...})
  └── delay elapsed OR status changed → proceed
        ├── llm_comments_enabled = True ?
        │     └── LLMBridge().generate_comment("SSHBruteforce", params)
        │           └── complete([{role:user, content:"Status: SSHBruteforce..."}],
        │                        max_tokens=80, timeout=8)
        │                 ├── LaRuche → "Norse gods smell SSH credentials..."  ✓
        │                 └── [or timeout 8s] → None
        └── text = None → _pick_text("SSHBruteforce", lang, params)
              └── SELECT FROM comments WHERE status='SSHBruteforce'
                    → "Processing authentication attempts..."