Lumi/plugins/lumi_ai_web_search
2026-06-14 04:18:36 +02:00
backend Add Lumi AI web search tool 2026-06-13 21:32:36 +02:00
data Add Lumi AI web search tool 2026-06-13 21:32:36 +02:00
public Add Lumi AI web search tool 2026-06-13 21:32:36 +02:00
tests Fix Lumi AI tool discovery and execution 2026-06-14 04:18:36 +02:00
views Add Lumi AI web search tool 2026-06-13 21:32:36 +02:00
index.js Fix Lumi AI tool discovery and execution 2026-06-14 04:18:36 +02:00
readme.md Fix Lumi AI tool discovery and execution 2026-06-14 04:18:36 +02:00
tool_info.json Fix Lumi AI tool discovery and execution 2026-06-14 04:18:36 +02:00

Lumi AI Web Search

lumi_ai_web_search is an AI tool plugin for controlled lookup of current information. It is loaded only by Lumi AI's tool manager, not as an ordinary core plugin.

Installation and enablement

  1. Install this directory as plugins/lumi_ai_web_search/.
  2. Install Lumi AI 0.8.0 or newer.
  3. Open Plugins -> Lumi AI -> Tools.
  4. Select Settings for Lumi AI Web Search.
  5. Configure the provider and URL policy, turn on Web search enabled, and save.
  6. Select Enable in the Tools list.

The tool is not registered with the assistant while disabled. If the plugin's own enabled setting or its provider endpoint is missing, Lumi AI marks the tool unavailable, and Lumi still starts normally.

Provider

The initial adapters accept JSON from a configured public endpoint:

  • searxng_json reads the SearxNG results array.
  • generic_json reads results, items, or web.results.value.
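
As a rough illustration of the two adapters, the sketch below pulls the result array out of a parsed provider response. It assumes that generic_json tries results, items, and the nested path web.results.value in that order. The helper names getPath and extractResults are hypothetical and are not taken from the plugin source.

```javascript
// Hypothetical helper: walk a dotted path such as "web.results.value".
function getPath(obj, path) {
  return path.split('.').reduce(
    (node, key) => (node !== null && typeof node === 'object' ? node[key] : undefined),
    obj
  );
}

// Assumption: generic_json tries these three paths in this order.
const GENERIC_PATHS = ['results', 'items', 'web.results.value'];

function extractResults(adapter, body) {
  const paths = adapter === 'searxng_json' ? ['results'] : GENERIC_PATHS;
  for (const path of paths) {
    const value = getPath(body, path);
    if (Array.isArray(value)) return value;
  }
  // Unknown response shape: report "no results" instead of throwing.
  return [];
}
```

Returning an empty array for an unrecognized shape is a design choice of this sketch. The plugin may instead report a provider failure.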

The configured query parameter defaults to q. The adapter adds format=json, safe-search level, and result count. API keys are stored in data/settings.json with restricted file permissions where supported. The settings API never returns the key, and saving with the key field left blank keeps the existing secret.
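
The secret-handling rules can be sketched as two small functions. This is a minimal illustration, not the plugin's code: the field names apiKey and hasApiKey, and the function names, are assumptions.

```javascript
// Assumption: the secret lives in a field called apiKey.
function mergeSettings(stored, submitted) {
  const next = { ...stored, ...submitted };
  // A blank or absent key in the submitted form keeps the stored secret.
  if (typeof submitted.apiKey !== 'string' || submitted.apiKey.trim() === '') {
    next.apiKey = stored.apiKey;
  }
  return next;
}

// What a settings response could expose: a flag instead of the key itself.
function toPublicSettings(settings) {
  const { apiKey, ...rest } = settings;
  return { ...rest, hasApiKey: Boolean(apiKey) };
}
```

With stored settings `{ queryParam: 'q', apiKey: 'secret' }`, saving `{ queryParam: 'query', apiKey: '' }` changes the query parameter and leaves the key untouched.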

Provider requests use a strict timeout, a 2 MiB response limit, and at most three redirects. No page JavaScript is executed.
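
One way to enforce the response limit and redirect cap is to count bytes as chunks arrive and to count hops explicitly. The sketch below assumes chunks are Uint8Array values (Node.js Buffers qualify). The class and function names are hypothetical.

```javascript
const MAX_RESPONSE_BYTES = 2 * 1024 * 1024; // 2 MiB, as stated above
const MAX_REDIRECTS = 3;

// Collects body chunks and throws before the running total exceeds the cap.
class BoundedBody {
  constructor(limit = MAX_RESPONSE_BYTES) {
    this.limit = limit;
    this.size = 0;
    this.chunks = [];
  }

  push(chunk) {
    if (this.size + chunk.byteLength > this.limit) {
      throw new Error(`response exceeds ${this.limit} bytes`);
    }
    this.size += chunk.byteLength;
    this.chunks.push(chunk);
  }

  text() {
    const all = new Uint8Array(this.size);
    let offset = 0;
    for (const chunk of this.chunks) {
      all.set(chunk, offset);
      offset += chunk.byteLength;
    }
    return new TextDecoder().decode(all);
  }
}

// Call before following a redirect; hopsSoFar is the number already followed.
function assertRedirectAllowed(hopsSoFar) {
  if (hopsSoFar >= MAX_REDIRECTS) throw new Error('too many redirects');
}
```

Checking the size before storing each chunk means an oversized response is rejected without buffering more than the limit.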

URL policy

The default is an empty whitelist, so no result URL is usable until an administrator adds explicit rules. Rules support:

  • Domain: docs.example.com
  • Domain and subdomains: example.com
  • Subdomain wildcard: *.example.com
  • Path prefix: example.com/docs
  • Full wildcard pattern: https://*.example.com/resources/*

Whitelist mode permits only matching result, page, and redirect URLs. Blacklist mode permits public URLs except matching rules.
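
The rule forms above could be matched along the following lines. This sketch makes several assumptions that the README does not confirm: a bare domain also matches its subdomains, *.example.com matches subdomains only, a path prefix matches on whole path segments, and in a full pattern the host and path wildcards are matched separately so that a wildcard in the host cannot swallow a slash. All function names are hypothetical.

```javascript
// Turn a glob with "*" wildcards into an anchored regular expression.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*');
  return new RegExp('^' + escaped + '$');
}

function hostMatches(host, ruleHost) {
  if (ruleHost.startsWith('*.')) {
    return host.endsWith(ruleHost.slice(1)); // subdomains only
  }
  return host === ruleHost || host.endsWith('.' + ruleHost);
}

function matchesRule(urlString, rule) {
  const url = new URL(urlString);
  const host = url.hostname.toLowerCase();

  if (rule.includes('://')) {
    // Full wildcard pattern: scheme://host-glob/path-glob
    const m = /^([a-z]+):\/\/([^/]+)(\/.*)?$/i.exec(rule);
    if (!m) return false;
    const [, scheme, hostGlob, pathGlob = '/*'] = m;
    return url.protocol === scheme.toLowerCase() + ':' &&
      globToRegExp(hostGlob.toLowerCase()).test(host) &&
      globToRegExp(pathGlob).test(url.pathname);
  }

  // Domain, subdomain wildcard, or domain plus path prefix.
  const slash = rule.indexOf('/');
  const ruleHost = (slash === -1 ? rule : rule.slice(0, slash)).toLowerCase();
  const prefix = slash === -1 ? '' : rule.slice(slash).replace(/\/+$/, '');
  if (!hostMatches(host, ruleHost)) return false;
  return prefix === '' || url.pathname === prefix || url.pathname.startsWith(prefix + '/');
}

function isPermitted(urlString, mode, rules) {
  let matched;
  try {
    matched = rules.some((rule) => matchesRule(urlString, rule));
  } catch (err) {
    return false; // unparseable URLs are never permitted
  }
  return mode === 'whitelist' ? matched : !matched;
}
```

With an empty rule list, isPermitted returns false for every URL in whitelist mode, which mirrors the default described above. These policy rules are separate from the hard network blocks.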

Independent hard network rules always block:

  • localhost, .localhost, .local, and known metadata hostnames
  • Private, loopback, carrier-grade NAT, link-local, multicast, and reserved IP ranges
  • DNS names resolving to private or otherwise unsafe addresses
  • URL credentials
  • Non-HTTP/HTTPS protocols

The same checks run before each page fetch and after every redirect. Administrator rules cannot override these blocks.
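
A minimal sketch of the literal-URL part of these hard blocks follows. It covers protocol, credentials, blocked hostnames, and IPv4 ranges only. IPv6 ranges and the DNS-resolution check are deliberately left out, and the metadata hostname shown is one example entry chosen for illustration, not the plugin's actual list. The function names are hypothetical.

```javascript
// [network, prefix length] pairs for IPv4 ranges that are never fetchable.
const BLOCKED_V4 = [
  ['0.0.0.0', 8], ['10.0.0.0', 8], ['100.64.0.0', 10], ['127.0.0.0', 8],
  ['169.254.0.0', 16], ['172.16.0.0', 12], ['192.168.0.0', 16],
  ['224.0.0.0', 4], ['240.0.0.0', 4],
];

// Assumption: example entries only; the real hostname list is not shown in this README.
const BLOCKED_HOSTS = new Set(['localhost', 'metadata.google.internal']);

function ipv4ToInt(ip) {
  if (!/^\d{1,3}(\.\d{1,3}){3}$/.test(ip)) return null;
  const parts = ip.split('.').map(Number);
  if (parts.some((p) => p > 255)) return null;
  return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

function isBlockedIPv4(ip) {
  const value = ipv4ToInt(ip);
  if (value === null) return false; // not an IPv4 literal
  return BLOCKED_V4.some(([net, bits]) => {
    const mask = (0xffffffff << (32 - bits)) >>> 0;
    return ((value & mask) >>> 0) === ((ipv4ToInt(net) & mask) >>> 0);
  });
}

// Returns a short reason string when the URL is blocked, or null when it passes.
function hardBlockReason(urlString) {
  let url;
  try {
    url = new URL(urlString);
  } catch (err) {
    return 'invalid URL';
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return 'protocol';
  if (url.username || url.password) return 'credentials';
  const host = url.hostname.toLowerCase().replace(/\.$/, '');
  if (BLOCKED_HOSTS.has(host) || host.endsWith('.localhost') || host.endsWith('.local')) {
    return 'hostname';
  }
  if (isBlockedIPv4(host)) return 'address';
  return null;
}
```

Running the same function on the initial URL and on every redirect target matches the rule that the checks apply before each fetch and after every redirect. A complete implementation would also resolve the hostname and test each resolved address.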

Tool behavior

The registered tool ID is lumi_ai_web_search.search. It accepts:

  • query
  • reason: fact_lookup, resource_lookup, troubleshooting, documentation_lookup, news_or_recent, or general_lookup
  • Optional requested_depth: search or full_page
  • Optional freshness

The assistant should use this tool only for current or external information that is not available in verified local Lumi context.

The tool registers as a read-only lookup. This allows permitted Discord, Twitch, YouTube, Kick, and other contexts to use it even though those contexts cannot run WebUI action tools. Lumi AI evaluates the configured origin allowlist before including the tool in the model prompt and again before execution.

Results are sanitized and returned as structured data rather than raw provider JSON. Each result contains a title, a permitted URL (or none when links are disabled), domain, condensed snippet, source type, date, and relevance score. Documentation and troubleshooting searches prioritize authoritative sources; recent searches prioritize dated sources.

Optional full-page mode extracts bounded visible text only when the administrator enables it. It does not automate a browser, submit forms, execute scripts, or follow unrestricted links.

Origin limits and rate limits

Allowed origins and output budgets are independently configurable for WebUI, Discord, Twitch, YouTube, Kick, and other sources. Trusted runtime context determines the origin; a model-provided origin cannot elevate access.
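
The trust rule can be illustrated with a small gate that reads the origin only from the runtime context. The context shape, the lowercase origin identifiers, and the function names are assumptions made for this sketch.

```javascript
// Assumption: origins are identified by these lowercase strings.
const KNOWN_ORIGINS = new Set(['webui', 'discord', 'twitch', 'youtube', 'kick', 'other']);

// The origin comes from the trusted runtime context only.
function resolveOrigin(runtimeContext) {
  const origin = String(runtimeContext.origin ?? '').toLowerCase();
  return KNOWN_ORIGINS.has(origin) ? origin : 'other';
}

// modelArgs is accepted but its origin field is deliberately never read,
// so a model-provided origin cannot widen access.
function isToolAllowed(runtimeContext, modelArgs, allowedOrigins) {
  return allowedOrigins.includes(resolveOrigin(runtimeContext));
}
```

A call from Twitch that carries `{ origin: 'webui' }` in its model arguments is still evaluated as twitch. The same gate could run once when the prompt is assembled and again immediately before execution.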

Twitch is limited to compact output and at most two source references. Discord permits moderate detail. WebUI permits richer summaries and more results. The tool also applies a per-actor, per-origin, per-server/channel rolling request limit.
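
A rolling request limit keyed by actor, origin, and server or channel might look like the sketch below. The class name, the in-memory Map, and the example limits are assumptions; the README does not state the plugin's actual limits or storage.

```javascript
class RollingLimiter {
  constructor(maxRequests, windowMs) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.hits = new Map(); // key -> timestamps of accepted requests
  }

  // One bucket per actor, origin, and server/channel combination.
  static keyFor({ actor, origin, scope }) {
    return JSON.stringify([actor, origin, scope]);
  }

  tryAcquire(ctx, now = Date.now()) {
    const key = RollingLimiter.keyFor(ctx);
    // Keep only the requests that are still inside the rolling window.
    const recent = (this.hits.get(key) ?? []).filter((t) => now - t < this.windowMs);
    const allowed = recent.length < this.maxRequests;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

With `new RollingLimiter(2, 1000)`, a third request from the same actor in the same channel within one second is refused, while a different actor in that channel is unaffected.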

Auditing and storage

All writable data remains under this plugin:

  • data/settings.json: normalized settings and provider secret
  • data/audit.jsonl: query, reason, actor, origin, server/channel, policy outcome, result count, cache status, and timing

Provider credentials are not written to audit records or returned in tool results. Updates preserve data/ by default.

Security boundary

The plugin has no shell, SQL, arbitrary filesystem, browser automation, or code-execution feature. Network access is limited to the configured public search provider and policy-approved public result pages. Lumi AI's backend role and permission checks remain authoritative.

Verification

Run:

node plugins/lumi_ai_web_search/tests/verify.js

The suite covers whitelist/blacklist matching, hard private-network blocks, redirect checks, reason-aware formatting, origin budgets, provider failures, settings effects, registration availability, and audits.