Lumi AI

AI tool plugins

Administrators can open Tools from the Lumi AI settings title bar. The manager combines installed tools with remote plugins/lumi_ai_* directories from the core-configured Git repository and branch. Remote metadata is cached under data/tools/ for five minutes.
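
The five-minute cache window can be illustrated with a small freshness check. This is only a sketch: the `fetchedAt` field and the `isCacheFresh` helper are hypothetical names, not taken from the plugin source.

```javascript
// Hypothetical helper: the `fetchedAt` field and this function are
// assumptions for illustration, not part of the lumi_ai source.
const CACHE_TTL_MS = 5 * 60 * 1000; // five minutes, as stated in this README

function isCacheFresh(entry, now = Date.now()) {
  // A missing or malformed entry is treated as stale so the caller refetches.
  if (!entry || typeof entry.fetchedAt !== "number") return false;
  const age = now - entry.fetchedAt;
  // Negative ages (clock moved backwards) are also treated as stale.
  return age >= 0 && age < CACHE_TTL_MS;
}
```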

Each AI tool plugin lives directly under plugins/lumi_ai_{name}/ and requires:

  • tool_info.json with tool_id, display_name, version, description, scope, permissions, capabilities, and limitations
  • readme.md for the separate documentation inspector
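
A presence check for the required tool_info.json fields might look like the following. The field names come from the list above; the `missingToolInfoFields` helper is hypothetical, and the real loader may validate types and values as well.

```javascript
// Field names are taken from the list above; the validator is a sketch.
const REQUIRED_FIELDS = [
  "tool_id", "display_name", "version", "description",
  "scope", "permissions", "capabilities", "limitations"
];

function missingToolInfoFields(toolInfo) {
  if (toolInfo === null || typeof toolInfo !== "object") return [...REQUIRED_FIELDS];
  // Report every absent field so all problems can be fixed in one pass.
  return REQUIRED_FIELDS.filter(
    (field) => !Object.prototype.hasOwnProperty.call(toolInfo, field)
  );
}
```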

Optional backend code is configured with entrypoints.backend and exports register() or init():

module.exports.register = ({ registerTool, metadata, paths, assetUrl }) => {
  registerTool({
    tool_id: `${metadata.tool_id}.lookup`,
    display_name: "Example lookup",
    description: "Runs a read-only lookup.",
    required_role: "user",
    required_permission: "example.lookup",
    audit_category: "lookup",
    confirmation_required: false,
    risk_level: "low",
    schema: { query: "string" },
    permission_check: ({ user }) => Boolean(user?.id),
    workflow_handler: async ({ arguments: args }) => ({ query: args.query })
  });
};

The loader exposes no generic shell, SQL, filesystem, network, or code-execution API. Tool IDs must use the owning plugin namespace. Metadata roles can only make backend roles stricter, and sensitive or mutating definitions always require the existing Lumi AI confirmation flow.
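
The namespace and role rules can be sketched as two small checks. The role ranking below is an assumption: this README names the admin, moderator, and user roles but does not state how the loader orders them, and both helper names are hypothetical.

```javascript
// Assumed ordering from least to most privileged.
const ROLE_RANK = new Map([["user", 0], ["moderator", 1], ["admin", 2]]);

// True when a tool ID sits inside the owning plugin's namespace,
// e.g. "example.lookup" for a plugin whose tool_id is "example".
function isInNamespace(toolId, pluginToolId) {
  return typeof toolId === "string"
    && toolId.startsWith(`${pluginToolId}.`)
    && toolId.length > pluginToolId.length + 1;
}

// Metadata may raise the required role but never lower it.
function effectiveRole(backendRole, metadataRole) {
  const backendRank = ROLE_RANK.get(backendRole);
  const metadataRank = ROLE_RANK.get(metadataRole);
  if (backendRank === undefined) throw new Error(`Unknown role: ${backendRole}`);
  if (metadataRank === undefined) return backendRole;
  return metadataRank > backendRank ? metadataRole : backendRole;
}
```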

Enable installs remote tools atomically and registers valid definitions. Disable unregisters them while retaining files. Update preserves data/ and config/ by default and rolls back to the previous directory if validation or swapping fails. Delete uses the shared three-second destructive confirmation and removes only the selected plugins/lumi_ai_* directory.
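
The update rollback can be pictured as a directory swap that keeps the previous install until validation passes. This is a minimal sketch, not the manager's implementation: `swapWithRollback` and the `.previous` suffix are hypothetical, and preserving data/ and config/ is left out for brevity.

```javascript
const fs = require("node:fs");

// Replace `liveDir` with `stagedDir`, restoring the previous directory if
// validation or the swap itself fails. `validate` is a caller-supplied check
// that throws on an invalid install.
function swapWithRollback(liveDir, stagedDir, validate) {
  const backupDir = `${liveDir}.previous`;
  fs.rmSync(backupDir, { recursive: true, force: true });
  fs.renameSync(liveDir, backupDir);
  try {
    fs.renameSync(stagedDir, liveDir);
    validate(liveDir);
    fs.rmSync(backupDir, { recursive: true, force: true });
  } catch (error) {
    // Roll back: discard the failed install and restore the previous one.
    fs.rmSync(liveDir, { recursive: true, force: true });
    fs.renameSync(backupDir, liveDir);
    throw error;
  }
}
```

Both directories need to live on the same filesystem for the renames to be cheap and atomic.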

Tools may declare a settings_schema in tool_info.json. The manager renders an admin-only Settings modal, validates and stores values under that tool's data/settings.json, redacts secret fields on reads, and reloads enabled tools after a save so availability and behavior update immediately.
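
Redacting secret fields on reads could work roughly as follows. The schema shape used here (`{ fieldName: { secret: true } }`) is an assumption, since this README does not show the settings_schema format, and `redactSecrets` is a hypothetical name.

```javascript
// Assumed schema shape: { fieldName: { type: "string", secret: true } }.
const REDACTED = "********";

function redactSecrets(schema, values) {
  const result = {};
  for (const [key, value] of Object.entries(values)) {
    const isSecret = Boolean(schema[key] && schema[key].secret);
    // Keep empty secrets empty so a UI can tell "unset" from "hidden".
    result[key] = isSecret && value !== "" && value != null ? REDACTED : value;
  }
  return result;
}
```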

Tools may also declare a constrained tool_namespace, default-enabled installation state, capability diagnostics, a settings migrator, and tool-owned settings view/assets. Declared assets remain path-confined to the tool directory, and backend permission checks remain authoritative for every capability.

Improvement Center

The Improvement Center at /plugins/lumi_ai/improvement_center stores end-user response feedback, supports moderator verification with an administrator-managed trusted reviewer list, and reserves approval, editing, deletion, promotion, eval runs, and exports for administrators.

Approved corrections are staged until an administrator selects Save Corrections. Active entries are constrained by minimum role, origin, and platform, and verified route links are checked against the local Lumi repository index. Manual instruction and DPO JSONL exports include approved examples only and never start training.
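
An approved-only JSONL export can be sketched in a few lines. The entry shape (`status`, `prompt`, `response`) and the function name are assumptions for illustration; only the rule that exports contain approved examples comes from this README.

```javascript
// Hypothetical entry shape: { status, prompt, response }.
function toInstructionJsonl(entries) {
  return entries
    .filter((entry) => entry.status === "approved")
    // JSON Lines: one JSON object per line, with no array wrapper.
    .map((entry) => JSON.stringify({ prompt: entry.prompt, response: entry.response }))
    .join("\n");
}
```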

Lightweight request gate

When Lumi AI is enabled and its runtime is running, a separate CPU-oriented gate model stays active beside the main model. The gate can serve high-confidence answers from the verified Lumi route/help/plugin index and the plugin-local safe cache. Otherwise it routes the request to a clarification, refusal, or unavailable response, or passes it to the main LLM.

The gate never executes tools. Repeated or explicitly forced prompts, low-confidence requests, and requests involving user-specific data, economy, moderation, permissions, or actions always continue to the main LLM after normal access controls and rate limits pass.
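
The "always continue to the main LLM" rule can be sketched as a decision function. The request fields, category labels, and the 0.9 threshold are illustrative assumptions, and the clarification, refusal, and unavailable outcomes are omitted to keep the sketch short.

```javascript
// Category labels mirror the paragraph above; their exact spelling in the
// plugin is not documented here.
const ALWAYS_MAIN_LLM = new Set([
  "user_data", "economy", "moderation", "permissions", "action"
]);

function gateDecision(request, minConfidence = 0.9) {
  if (request.repeated || request.forced) return "main_llm";
  if (ALWAYS_MAIN_LLM.has(request.category)) return "main_llm";
  // A missing or low confidence value never qualifies for a gate answer.
  if (!(request.confidence >= minConfidence)) return "main_llm";
  return "gate_answer";
}
```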

Rate-limited WebUI users receive a live retry countdown. Send and retry controls remain disabled until the server-provided cooldown expires.

Gate inference is bounded to the configured 1-5 second timeout and uses a compact classification-only prompt. Complex, ambiguous, code, and troubleshooting requests bypass the gate. WebUI requests use background jobs with live stage polling, so cold starts and long main-model generations do not hold a proxy-facing request open.
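
Bounding gate inference to the 1-5 second window could be done with a clamp plus a race against a timer. Both helpers are hypothetical, and falling back to one second for an invalid setting is an assumption.

```javascript
// Clamp a configured timeout (in seconds) into the 1-5 second window and
// return it in milliseconds.
function clampGateTimeoutMs(configuredSeconds) {
  const seconds = Number.isFinite(configuredSeconds) ? configuredSeconds : 1;
  return Math.min(5, Math.max(1, seconds)) * 1000;
}

// Resolve with the gate result, or with `fallback` once the timeout elapses.
function withTimeout(promise, timeoutMs, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), timeoutMs);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Note that `Promise.race` only stops waiting; cancelling the underlying inference request would need separate handling.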

lumi_ai is a standalone Lumi plugin that manages a local llama.cpp inference process and adds a scoped AI Assistant to the WebUI.

Install and configure

  1. Place this directory at plugins/lumi_ai/.
  2. Restart Lumi.
  3. Open Plugins -> Lumi AI in the sidebar.
  4. Download the managed runtime and a compatible model.
  5. Select the model, configure visibility and instructions, then save.
  6. Start the runtime and enable AI.

The settings page is always registered as an admin-only item in the Plugins sidebar section. The assistant pill is injected separately above the profile footer and follows the configured admin, moderator, and user visibility controls.

Storage

Every writable path is confined to plugins/lumi_ai/data/:

  • config/: settings and runtime state
  • models/: verified GGUF models
  • runtime/: extracted llama.cpp runtime
  • logs/: runtime logs
  • metrics/: usage and audit records
  • rag/, cache/, tmp/: plugin-local working data
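
Path confinement of this kind is commonly implemented by resolving the candidate path and checking that it stays under the root. The sketch below uses a hypothetical `resolveInside` helper; it does not follow symlinks, which a production check would also need to consider.

```javascript
const path = require("node:path");

// Resolve `relativePath` inside `dataRoot` and refuse anything that escapes it.
function resolveInside(dataRoot, relativePath) {
  const root = path.resolve(dataRoot);
  const target = path.resolve(root, relativePath);
  const rel = path.relative(root, target);
  const escapes = rel === ".." || rel.startsWith(`..${path.sep}`) || path.isAbsolute(rel);
  // An empty relative path means the root itself, which is not a writable file.
  if (rel === "" || escapes) {
    throw new Error(`Path escapes data directory: ${relativePath}`);
  }
  return target;
}
```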

Downloads are written to data/tmp/, verified against a pinned SHA-256 digest, and only then moved or extracted into their final plugin-local directory.
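
The verify-then-move step can be sketched with Node's built-in crypto and fs modules. The helper names are hypothetical; the file is hashed in chunks so that multi-gigabyte model files do not need to fit in memory.

```javascript
const crypto = require("node:crypto");
const fs = require("node:fs");

// Compute the SHA-256 digest of a file as lowercase hex, reading 1 MiB at a time.
function sha256File(filePath) {
  const hash = crypto.createHash("sha256");
  const fd = fs.openSync(filePath, "r");
  const buffer = Buffer.alloc(1024 * 1024);
  try {
    let bytesRead;
    while ((bytesRead = fs.readSync(fd, buffer, 0, buffer.length, null)) > 0) {
      hash.update(buffer.subarray(0, bytesRead));
    }
  } finally {
    fs.closeSync(fd);
  }
  return hash.digest("hex");
}

// Move a verified download into place; delete it if the digest does not match.
function promoteVerifiedDownload(tmpPath, finalPath, expectedSha256) {
  const actual = sha256File(tmpPath);
  if (actual !== expectedSha256.toLowerCase()) {
    fs.rmSync(tmpPath, { force: true });
    throw new Error(`SHA-256 mismatch: expected ${expectedSha256}, got ${actual}`);
  }
  // A rename is atomic when both paths are on the same filesystem.
  fs.renameSync(tmpPath, finalPath);
}
```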

Runtime and downloads

Models use pinned Hugging Face repository commits. The runtime uses a pinned official ggml-org/llama.cpp GitHub release because the llama.cpp project does not publish authoritative multi-platform runtime archives on Hugging Face. This is the only download-source exception; the archive URL, version, size, and SHA-256 are pinned in runtime_manifest.json.

Long main-model requests run as cancellable background jobs. The WebUI polls job state, shows a recoverable soft-timeout panel, and leaves generation running until the configured hard generation timeout or an explicit cancel.

GPU diagnostics separate total/free VRAM, Lumi-managed model allocation, and estimated external VRAM pressure. A loaded managed model is not counted as external pressure when calculating the safe GPU allocation.
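
The accounting described above can be illustrated with simple arithmetic. The formula, the `vramBudget` name, and the 512 MiB reserve are assumptions chosen for the example, not the plugin's actual calculation.

```javascript
// All values in MiB.
function vramBudget({ totalMiB, freeMiB, managedMiB, reserveMiB = 512 }) {
  const usedMiB = totalMiB - freeMiB;
  // VRAM held by the Lumi-managed model is not counted as external pressure.
  const externalMiB = Math.max(0, usedMiB - managedMiB);
  const safeMiB = Math.max(0, totalMiB - externalMiB - reserveMiB);
  return { externalMiB, safeMiB };
}
```

With 8192 MiB total, 2048 MiB free, and a 5000 MiB managed model, only the remaining 1144 MiB of used memory counts as external pressure.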

Main-model requests use configurable token budgets by request class: navigation/help, simple answers, code/custom commands, admin diagnostics, and explicitly requested long answers. Polling shows live elapsed time and the selected budget, while metrics record the request class, token speeds, and validated stage timings.

The runtime binds only to 127.0.0.1 on an ephemeral port. It is never exposed on 0.0.0.0.

Before loading a model, Lumi AI runs llama-server --help as a smoke test. Failed launches and exits are decoded into plugin-local diagnostics, including Windows NTSTATUS values such as 0xC0000005 / STATUS_ACCESS_VIOLATION. The admin page provides remediation steps, raw stdout/stderr tails, model verification, and a redacted diagnostics bundle.
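
Decoding a Windows exit code into an NTSTATUS label can be sketched as below. The lookup table is a small illustrative subset of well-known values, and `describeExitCode` is a hypothetical name.

```javascript
// A few well-known NTSTATUS values; extend as needed.
const NTSTATUS_NAMES = new Map([
  [0xC0000005, "STATUS_ACCESS_VIOLATION"],
  [0xC000001D, "STATUS_ILLEGAL_INSTRUCTION"],
  [0xC0000135, "STATUS_DLL_NOT_FOUND"]
]);

function describeExitCode(code) {
  // Exit codes may arrive as signed 32-bit values; normalize to unsigned.
  const unsigned = code >>> 0;
  const hex = `0x${unsigned.toString(16).toUpperCase().padStart(8, "0")}`;
  const name = NTSTATUS_NAMES.get(unsigned);
  return name ? `${hex} / ${name}` : hex;
}
```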

Model tiers are capability-based: Tiny, Small, Medium, Large, General, Power, and Extreme. Lumi AI detects compatible GPUs and selects a pinned Vulkan or Metal runtime when available, with CPU as the fallback. The GPU Acceleration setting maps a percentage to the model's supported offload layers and automatically limits the selectable range using model size, context size, and available VRAM.
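
Mapping the GPU Acceleration percentage to a layer count might look like this. The parameter names are illustrative, and how the plugin derives the VRAM-based cap is not shown here. In llama.cpp, a layer count like this is typically passed through the `--n-gpu-layers` option.

```javascript
// Map a 0-100 percentage to a number of offloaded layers, capped by the
// model's layer count and by what the VRAM estimate allows.
function offloadLayers(percent, modelLayers, maxLayersForVram = modelLayers) {
  const clamped = Math.min(100, Math.max(0, percent));
  const requested = Math.round((clamped / 100) * modelLayers);
  return Math.min(requested, maxLayersForVram, modelLayers);
}
```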

GPU allocation stores administrator intent separately from the live allocation. Managed runtime VRAM is treated as Lumi-owned usage, while external VRAM pressure can clamp the actual allocation without overwriting the saved intent. The admin page also provides safe model deletion, category-based storage cleanup, paged metrics, structured runtime logs, and assistant-pill visibility diagnostics.

Lumi Assistant uses an immutable identity and safety policy, with administrator-configurable support topics, domains, style, links, clarification behavior, answer length, and role-specific overrides. A plugin-local repository index under data/repo_index/ provides verified Lumi WebUI routes and support context. The admin visibility debugger reports backend eligibility and frontend slot, loader, response, and mount conditions.

Assistant replies normalize verified Lumi routes into safe links for the active WebUI host. Repository paths and implementation details are restricted to administrators; moderator code help is separately opt-in. Reply length limits are applied after inference to delivered text rather than prompt or retrieved context.

The WebUI assistant keeps a bounded per-user conversation and panel preference in browser storage so navigation does not discard the active chat. The panel opens at one-sixth of the viewport by default, supports vertical resizing, and restores its previous height and open state. Assistant Markdown is rendered as safe DOM content, including fenced code blocks with exact-copy controls.

The !assistant command and its default !lumi alias use the same provider, identity, scope, access restrictions, and rate limits as the WebUI assistant. Replies are normalized for the originating platform. AI bans, timeouts, and recent rate-limit denials are managed from the plugin settings page and stored under data/config/.

The access-control picker searches known Lumi profiles and linked platform identities. Runtime log listings and metrics use 25-row pages, while individual log views read only the latest bounded chunk.
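
Fixed 25-row paging can be sketched as follows. The `paginate` helper and its return shape are hypothetical; only the page size comes from this README.

```javascript
const PAGE_SIZE = 25; // row count stated in this README

function paginate(rows, page) {
  const totalPages = Math.max(1, Math.ceil(rows.length / PAGE_SIZE));
  // Clamp out-of-range or non-numeric page numbers to a valid page.
  const current = Math.min(totalPages, Math.max(1, Math.trunc(Number(page)) || 1));
  const start = (current - 1) * PAGE_SIZE;
  return { page: current, totalPages, rows: rows.slice(start, start + PAGE_SIZE) };
}
```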

Assistant panel rendering is validated before the core reports the panel as available. Template, locals, endpoint, HTML, and mount diagnostics are available to administrators, with failures recorded in data/logs/assistant-panel.log.

The test console no longer exposes a user-editable scope label. Clearly unrelated requests are rejected deterministically, while ambiguous requests are passed to the scoped Lumi system prompt instead of being rejected by a fixed keyword list.

Plugin API

Other Lumi plugins can use:

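// The framework is absent when lumi_ai is not installed or not loaded.
const ai = global.lumiFrameworks?.ai;
if (!ai) throw new Error("Lumi AI framework is unavailable");
const health = await ai.health();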
const result = await ai.generate({
  message: "Summarize this Lumi event.",
  user: requestingUser,
  sessionId: requestSessionId,
  scope: "my_plugin"
});

Available functions:

  • generate
  • classify
  • summarize
  • route_tool
  • health
  • capabilities
  • metrics_summary
  • registerContext
  • unregisterContext
  • registerTool

AI tools must provide an owning plugin, a synchronous permission check, a fixed argument schema, and an established workflow handler. Model output cannot execute SQL, shell commands, file operations, or arbitrary URLs.

Tool registration

ai.registerTool({
  tool_id: "example.action",
  display_name: "Example action",
  description: "Runs an existing plugin workflow.",
  owning_plugin: "example",
  required_role: "user",
  required_permission: "example.action.self",
  permission_check: ({ user, arguments: args }) => canRunWorkflow(user, args),
  schema: { target: "string", amount: "integer" },
  confirmation_required: true,
  risk_level: "sensitive",
  audit_category: "example",
  workflow_handler: ({ arguments: args, user, initiated_via_ai, ai_request_id }) =>
    existingWorkflow({ ...args, actor: user, initiated_via_ai, ai_request_id })
});
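
A fixed argument schema such as `{ target: "string", amount: "integer" }` could be enforced with a check like the one below. Only these two type names appear in this README; rejecting undeclared keys is an assumption, and `validateArguments` is a hypothetical helper rather than the framework's own validator.

```javascript
// Return a list of human-readable errors; an empty list means the arguments
// match the schema.
function validateArguments(schema, args) {
  const errors = [];
  for (const [key, type] of Object.entries(schema)) {
    const value = args[key];
    const ok =
      (type === "string" && typeof value === "string") ||
      (type === "integer" && Number.isInteger(value));
    if (!ok) errors.push(`${key} must be of type ${type}`);
  }
  for (const key of Object.keys(args)) {
    if (!Object.prototype.hasOwnProperty.call(schema, key)) {
      errors.push(`${key} is not allowed`);
    }
  }
  return errors;
}
```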

Verification

Run:

node plugins/lumi_ai/tests/verify.js
node plugins/lumi_ai/tests/verify-tools.js

The verification covers path confinement, size formatting, GPU intent and actual allocation, pagination, model and log deletion safety, assistant role access, Improvement Center permissions and activation, approved-only exports, tool schema and permission checks, queue limits, refusal behavior, and runtime resume persistence.