Lumi AI

lumi_ai is a standalone Lumi plugin that manages a local llama.cpp inference process and adds a scoped AI Assistant to the WebUI.

Install and configure

  1. Place this directory at plugins/lumi_ai/.
  2. Restart Lumi.
  3. Open Plugins -> Lumi AI in the sidebar.
  4. Download the managed runtime and a compatible model.
  5. Select the model, configure visibility and instructions, then save.
  6. Start the runtime and enable AI.

The settings page is always registered as an admin-only item in the Plugins sidebar section. The assistant pill is injected separately above the profile footer and follows the configured admin, moderator, and user visibility controls.

Storage

Every writable path is confined to plugins/lumi_ai/data/:

  • config/: settings and runtime state
  • models/: verified GGUF models
  • runtime/: extracted llama.cpp runtime
  • logs/: runtime logs
  • metrics/: usage and audit records
  • rag/, cache/, tmp/: plugin-local working data
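
The confinement rule can be illustrated with a small path check. This is a minimal sketch, not the plugin's actual code: `resolveInside` is a hypothetical helper name, and the sketch does not follow symlinks, which a real implementation would also need to consider.

```javascript
const path = require("node:path");

// Hypothetical helper: resolve a relative path under the data root and
// refuse anything that would land outside of it.
function resolveInside(root, relative) {
  const base = path.resolve(root);
  const target = path.resolve(base, relative);
  const rel = path.relative(base, target);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`Path escapes data root: ${relative}`);
  }
  return target;
}

module.exports = { resolveInside };
```

For example, `resolveInside("plugins/lumi_ai/data", "models/a.gguf")` returns an absolute path under the data directory, while `resolveInside("plugins/lumi_ai/data", "../plugin.json")` throws.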

Downloads are written to data/tmp/, verified against a pinned SHA-256 digest, and only then moved or extracted into their final plugin-local directory.
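
The verify-then-move flow could look roughly like the following. It is a sketch under assumptions: `sha256File` and `promoteVerified` are hypothetical names, and the expected digest is assumed to come from the relevant manifest entry.

```javascript
const crypto = require("node:crypto");
const fs = require("node:fs");

// Hash a file in fixed-size chunks so multi-gigabyte models are never
// loaded into memory at once.
function sha256File(filePath) {
  const hash = crypto.createHash("sha256");
  const buffer = Buffer.alloc(1024 * 1024);
  const fd = fs.openSync(filePath, "r");
  try {
    let bytesRead;
    while ((bytesRead = fs.readSync(fd, buffer, 0, buffer.length, null)) > 0) {
      hash.update(buffer.subarray(0, bytesRead));
    }
  } finally {
    fs.closeSync(fd);
  }
  return hash.digest("hex");
}

// Hypothetical helper: move a download to its final location only if the
// digest matches the pinned value; otherwise delete it and fail.
function promoteVerified(tmpPath, finalPath, expectedSha256) {
  const actual = sha256File(tmpPath);
  if (actual !== expectedSha256.toLowerCase()) {
    fs.rmSync(tmpPath, { force: true });
    throw new Error(`SHA-256 mismatch: expected ${expectedSha256}, got ${actual}`);
  }
  fs.renameSync(tmpPath, finalPath);
  return finalPath;
}

module.exports = { sha256File, promoteVerified };
```

Keeping the temporary and final paths on the same volume lets `renameSync` act as a single move, so a partially written file never appears at the final location.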

Runtime and downloads

Models use pinned Hugging Face repository commits. The runtime uses a pinned official ggml-org/llama.cpp GitHub release because the llama.cpp project does not publish authoritative multi-platform runtime archives on Hugging Face. This is the only download-source exception; the archive URL, version, size, and SHA-256 are pinned in runtime_manifest.json.

The runtime binds only to 127.0.0.1 on an ephemeral port. It is never exposed on 0.0.0.0.
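
One way to obtain a loopback-only ephemeral port in Node.js is sketched below. The helper names are hypothetical and this is not necessarily how Lumi AI does it; `--model`, `--host`, and `--port` are standard llama-server options.

```javascript
const net = require("node:net");

// Ask the OS for a free port on the loopback interface by binding to port 0.
// Another process could claim the port between this probe and the real
// launch, so a caller would normally retry on failure.
function findFreeLoopbackPort() {
  return new Promise((resolve, reject) => {
    const probe = net.createServer();
    probe.once("error", reject);
    probe.listen(0, "127.0.0.1", () => {
      const { port } = probe.address();
      probe.close(() => resolve(port));
    });
  });
}

// Build llama-server arguments with the host fixed to loopback so that a
// caller cannot widen the bind address by accident.
function buildServerArgs(modelPath, port) {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`Invalid port: ${port}`);
  }
  return ["--model", modelPath, "--host", "127.0.0.1", "--port", String(port)];
}

module.exports = { findFreeLoopbackPort, buildServerArgs };
```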

Before loading a model, Lumi AI runs llama-server --help as a smoke test. Failed launches and exits are decoded into plugin-local diagnostics, including Windows NTSTATUS values such as 0xC0000005 / STATUS_ACCESS_VIOLATION. The admin page provides remediation steps, raw stdout/stderr tails, model verification, and a redacted diagnostics bundle.
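
Decoding an exit code into an NTSTATUS name can be sketched as follows. The lookup table is a small, non-exhaustive illustration and `describeExitCode` is a hypothetical name. A Windows exit code may be reported either as a large unsigned integer or as a negative signed one, so the sketch normalizes both forms.

```javascript
// A few well-known NTSTATUS values; a real table would be longer.
const NTSTATUS_NAMES = new Map([
  [0xc0000005, "STATUS_ACCESS_VIOLATION"],
  [0xc000001d, "STATUS_ILLEGAL_INSTRUCTION"],
  [0xc0000135, "STATUS_DLL_NOT_FOUND"],
  [0xc0000409, "STATUS_STACK_BUFFER_OVERRUN"],
]);

function describeExitCode(code) {
  if (code === null || code === undefined) {
    return "no exit code";
  }
  // Reinterpret the value as an unsigned 32-bit integer.
  const unsigned = code >>> 0;
  const hex = "0x" + unsigned.toString(16).toUpperCase().padStart(8, "0");
  const name = NTSTATUS_NAMES.get(unsigned);
  return name ? `${hex} / ${name}` : `exit code ${code} (${hex})`;
}

module.exports = { describeExitCode };
```

With this table, both `describeExitCode(3221225477)` and `describeExitCode(-1073741819)` yield `0xC0000005 / STATUS_ACCESS_VIOLATION`.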

Model tiers are capability-based: Tiny, Small, Medium, Large, General, Power, and Extreme. Lumi AI detects compatible GPUs and selects a pinned Vulkan or Metal runtime when available, with CPU as the fallback. The GPU Acceleration setting maps a percentage to the model's supported offload layers and automatically limits the selectable range using model size, context size, and available VRAM.
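
The percentage-to-layers mapping might be sketched like this. The rounding rule and the parameter names are assumptions. `maxLayersThatFit` stands for whatever limit the plugin derives from model size, context size, and available VRAM, and the result would typically feed an option such as llama.cpp's `--n-gpu-layers`.

```javascript
// Map a 0-100 percentage onto a whole number of offload layers, never
// exceeding the model's layer count or the number of layers that fit.
function percentToOffloadLayers(percent, totalLayers, maxLayersThatFit) {
  const clampedPercent = Math.min(100, Math.max(0, percent));
  const requested = Math.round((clampedPercent / 100) * totalLayers);
  return Math.max(0, Math.min(requested, totalLayers, maxLayersThatFit));
}

module.exports = { percentToOffloadLayers };
```

For a 32-layer model, 50% maps to 16 layers, while 100% with only 20 layers fitting in VRAM maps to 20.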

GPU allocation stores administrator intent separately from the live allocation. Managed runtime VRAM is treated as Lumi-owned usage, while external VRAM pressure can clamp the actual allocation without overwriting the saved intent. The admin page also provides safe model deletion, category-based storage cleanup, paged metrics, structured runtime logs, and assistant-pill visibility diagnostics.
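
The separation of intent from live allocation can be illustrated with a pure function. All field names are hypothetical and the per-layer cost model is a deliberate simplification. VRAM already held by the managed runtime is not counted as external pressure, and the saved intent is returned unchanged.

```javascript
// Compute the live layer allocation without mutating the stored intent.
function resolveLiveAllocation({
  intentLayers,   // what the administrator asked for
  bytesPerLayer,  // assumed approximate VRAM cost of one offloaded layer
  totalVram,
  usedVram,       // total VRAM in use right now, including Lumi's own share
  lumiOwnedVram,  // portion of usedVram held by the managed runtime
}) {
  const externalUsage = Math.max(0, usedVram - lumiOwnedVram);
  const budget = Math.max(0, totalVram - externalUsage);
  const affordableLayers = Math.floor(budget / bytesPerLayer);
  return {
    intentLayers,
    actualLayers: Math.min(intentLayers, affordableLayers),
  };
}

module.exports = { resolveLiveAllocation };
```

When external usage later drops, recomputing with the same `intentLayers` restores the full allocation, because the clamp never overwrote the intent.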

Lumi Assistant uses an immutable identity and safety policy, with administrator-configurable support topics, domains, style, links, clarification behavior, answer length, and role-specific overrides. A plugin-local repository index under data/repo_index/ provides verified Lumi WebUI routes and support context. The admin visibility debugger reports backend eligibility and frontend slot, loader, response, and mount conditions.

Assistant replies normalize verified Lumi routes into safe links for the active WebUI host. Repository paths and implementation details are restricted to administrators; moderator code help is separately opt-in. Reply length limits are applied after inference, to the delivered text only, not to the prompt or the retrieved context.
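
Both post-processing steps can be sketched in a few lines. The function names, the route value in the example, and the ellipsis truncation rule are assumptions for illustration only.

```javascript
// Turn a route into an absolute link only if it is on the verified list
// and still resolves to the active WebUI origin.
function toSafeLink(route, verifiedRoutes, webuiOrigin) {
  if (!verifiedRoutes.has(route)) {
    return null;
  }
  const url = new URL(route, webuiOrigin);
  return url.origin === new URL(webuiOrigin).origin ? url.href : null;
}

// Limit the delivered text only; the prompt and retrieved context are
// never passed through this function.
function limitReply(text, maxChars) {
  if (text.length <= maxChars) {
    return text;
  }
  return text.slice(0, Math.max(0, maxChars - 1)).trimEnd() + "…";
}

module.exports = { toSafeLink, limitReply };
```

The origin comparison guards against values such as `//other.example/x`, which would otherwise resolve to a different host even if they were listed by mistake.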

The WebUI assistant keeps a bounded per-user conversation and panel preference in browser storage so navigation does not discard the active chat. The panel opens at one-sixth of the viewport by default, supports vertical resizing, and restores its previous height and open state. Assistant Markdown is rendered as safe DOM content, including fenced code blocks with exact-copy controls.
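
A bounded per-user history in browser storage could be kept along these lines. The key format and the 20-message bound are assumptions. `storage` is any object with `getItem` and `setItem`, such as `window.localStorage`.

```javascript
const MAX_MESSAGES = 20; // hypothetical bound

function historyKey(userId) {
  return `lumi_ai.chat.${userId}`;
}

// Read the stored history, falling back to an empty list when the entry
// is missing or corrupted.
function loadHistory(storage, userId) {
  try {
    const parsed = JSON.parse(storage.getItem(historyKey(userId)) ?? "[]");
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}

// Append one message and keep only the most recent MAX_MESSAGES entries.
function appendMessage(storage, userId, message) {
  const bounded = [...loadHistory(storage, userId), message].slice(-MAX_MESSAGES);
  storage.setItem(historyKey(userId), JSON.stringify(bounded));
  return bounded;
}

module.exports = { loadHistory, appendMessage, MAX_MESSAGES };
```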

The !assistant command and its default !lumi alias use the same provider, identity, scope, access restrictions, and rate limits as the WebUI assistant. Replies are normalized for the originating platform. AI bans, timeouts, and recent rate-limit denials are managed from the plugin settings page and stored under data/config/.

The access-control picker searches known Lumi profiles and linked platform identities. Runtime log listings and metrics use 25-row pages, while individual log views read only the latest bounded chunk.
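
Fixed-size paging of this kind is straightforward to sketch. `paginate` is a hypothetical helper; only the 25-row page size comes from the description above.

```javascript
const PAGE_SIZE = 25;

// Return one page of rows, clamping the requested page into the valid range.
function paginate(rows, page) {
  const totalPages = Math.max(1, Math.ceil(rows.length / PAGE_SIZE));
  const requested = Math.trunc(Number(page)) || 1;
  const current = Math.min(Math.max(1, requested), totalPages);
  const start = (current - 1) * PAGE_SIZE;
  return {
    page: current,
    totalPages,
    rows: rows.slice(start, start + PAGE_SIZE),
  };
}

module.exports = { paginate, PAGE_SIZE };
```

With 60 rows, page 3 contains the last 10 rows, and any out-of-range page number is clamped to the nearest valid page.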

Assistant panel rendering is validated before the core reports the panel as available. Template, locals, endpoint, HTML, and mount diagnostics are available to administrators, with failures recorded in data/logs/assistant-panel.log.

The test console no longer exposes a user-editable scope label. Clearly unrelated requests are rejected deterministically, while ambiguous requests are passed to the scoped Lumi system prompt instead of being rejected by a fixed keyword list.

Plugin API

Other Lumi plugins can use:

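// The framework is only present when lumi_ai is installed and loaded.
const ai = global.lumiFrameworks?.ai;
if (!ai) {
  throw new Error("Lumi AI framework is not available");
}

// Check the runtime before sending work to the model.
const health = await ai.health();
const result = await ai.generate({
  message: "Summarize this Lumi event.",
  user: requestingUser,
  sessionId: requestSessionId,
  scope: "my_plugin"
});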

Available functions:

  • generate
  • classify
  • summarize
  • route_tool
  • health
  • capabilities
  • metrics_summary
  • registerContext
  • unregisterContext
  • registerTool

AI tools must provide an owning plugin, a synchronous permission check, a fixed argument schema, and an established workflow handler. Model output cannot execute SQL, shell commands, file operations, or arbitrary URLs.

Tool registration

ai.registerTool({
  tool_id: "example.action",
  display_name: "Example action",
  description: "Runs an existing plugin workflow.",
  owning_plugin: "example",
  required_role: "user",
  required_permission: "example.action.self",
  permission_check: ({ user, arguments: args }) => canRunWorkflow(user, args),
  schema: { target: "string", amount: "integer" },
  confirmation_required: true,
  risk_level: "sensitive",
  audit_category: "example",
  workflow_handler: ({ arguments: args, user, initiated_via_ai, ai_request_id }) =>
    existingWorkflow({ ...args, actor: user, initiated_via_ai, ai_request_id })
});
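
A fixed argument schema of the shape shown above could be enforced with a check like the following before the workflow handler runs. This is a sketch: `validateArguments` is a hypothetical name, and only the `"string"` and `"integer"` type names from the example are handled.

```javascript
// Validate model-proposed arguments against a flat { name: type } schema.
// Returns a list of problems; an empty list means the arguments are valid.
function validateArguments(schema, args) {
  const errors = [];
  for (const key of Object.keys(args)) {
    if (!Object.hasOwn(schema, key)) {
      errors.push(`unexpected argument: ${key}`);
    }
  }
  for (const [key, type] of Object.entries(schema)) {
    const value = args[key];
    const ok =
      type === "string" ? typeof value === "string"
      : type === "integer" ? Number.isInteger(value)
      : false; // unknown type names never validate
    if (!ok) {
      errors.push(`${key} must be ${type}`);
    }
  }
  return errors;
}

module.exports = { validateArguments };
```

Rejecting unexpected keys as well as wrong types keeps model output from smuggling extra fields into an existing workflow.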

Verification

Run:

node plugins/lumi_ai/tests/verify.js

The verification covers path confinement, size formatting, GPU intent and actual allocation, pagination, model and log deletion safety, assistant role access, tool schema and permission checks, queue limits, refusal behavior, and runtime resume persistence.