Garage Door Science

Architecture

The brain, documented.

Garage Door Science is a small RAG system with a public API around it. This page is the map: every surface an agent or developer can reach, what each one returns, and how the pieces fit together.

For the TL;DR tailored to AI agents, see /ai. For machine-readable endpoint lists, see /llms.txt.

The shape

[System diagram] A homeowner question flows into either the chat/voice technician or the tool registry; both query the RAG corpus (pgvector on Neon — 11 labs · 22 articles · 36 videos); the corpus returns a grounded response with source URLs. The tool registry is also reachable via /mcp, /mcp-pro, and /api/v1/<tool>.

Discoverable surfaces

Nine public entry points, grouped by what they’re for. Every URL here is stable — breaking changes get a migration path and are announced on /ai.

Retrieval — where data comes from

Three ways to call the same 8 tools. Same business logic, different transports. Pick whichever fits your client.

  • /mcp — Public Model Context Protocol endpoint. All 8 tools, retrieval capped at top-3. JSON-RPC 2.0 over HTTPS.

    Auth
    None. Per-IP rate limit (60/min, 2000/day).
    For
    Claude Desktop, ChatGPT agents, MCP-compatible clients.
  • /mcp-pro — Higher-tier MCP endpoint. All 8 tools, retrieval up to top-10, 4× rate limit.

    Auth
    Bearer gds_live_… — self-serve at /developers.
    For
    Production agent deployments, volume API consumers.
  • /api/v1/<tool> — Thin REST wrapper over the tool registry. POST JSON, get the tool's raw return value. Tier resolution mirrors MCP.

    Auth
    Same as MCP — anonymous → public tier, gds_live_ bearer → pro tier.
    For
    Non-MCP agent frameworks, scripts, curl, anything HTTP.
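For the REST transport, a call is just a POST with the tool's input as the JSON body. A minimal sketch, assuming the tool registry shape described on this page — the base URL, query text, and key value are illustrative placeholders:

```typescript
// Build a request against the /api/v1/<tool> surface.
// Base URL is a placeholder; substitute the real host.
interface ToolRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildToolRequest(
  tool: string,
  input: unknown,
  apiKey?: string // gds_live_… bearer upgrades the call to the pro tier
): ToolRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`; // omit → public tier
  return {
    url: `https://example.com/api/v1/${tool}`,
    init: { method: "POST", headers, body: JSON.stringify(input) },
  };
}

// Anonymous call → public tier, top-3 retrieval cap.
const req = buildToolRequest("retrieveLabContext", { query: "opener won't close" });
// fetch(req.url, req.init) would return the tool's raw JSON return value.
```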

Discovery — how you learn what’s here

Four surfaces that describe the surface. All machine-readable, all public. Prefer OpenAPI for codegen, llms.txt for crawling.

  • /openapi.json — OpenAPI 3.1 spec, auto-generated from the tool registry. Every tool appears as a POST path with its full input schema.

    Auth
    Publicly readable. Cache-Control: 5m / s-maxage 1h.
    For
    Code generators, Postman, Scalar, LangChain OpenAPI loaders.
  • /developers/api — Interactive Scalar docs. Renders the OpenAPI spec in-browser with a Try-It panel that hits the live API.

    Auth
    Public page; Try-It inherits the auth header you paste.
    For
    Human developers evaluating the API.
  • /llms.txt — Plaintext machine index following the emerging llms.txt convention. Lists every URL grouped by section (tools, labs, articles, videos).

    Auth
    Public.
    For
    Crawling agents, LLM training pipelines, sitemap enrichers.
  • /ai — Human-readable agent guide. TL;DR block at top, tool list, corpus counts, rate limits, partnership contact.

    Auth
    Public.
    For
    Agent builders reading in-browser; LLMs summarizing the product.
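A crawler consuming the machine index only needs to split it by section and collect URLs. A small sketch, assuming the llms.txt convention's usual shape (markdown-style section headers over link lists — the exact layout of this site's file may differ, and the sample below is hypothetical):

```typescript
// Group URLs in an llms.txt-style index by section header.
function parseLlmsTxt(text: string): Map<string, string[]> {
  const sections = new Map<string, string[]>();
  let current = "";
  for (const line of text.split("\n")) {
    const header = line.match(/^##\s+(.*)/); // "## Labs", "## Tools", …
    if (header) {
      current = header[1].trim();
      sections.set(current, []);
      continue;
    }
    const url = line.match(/https?:\/\/[^\s)]+/); // first URL on the line
    if (url && current) sections.get(current)!.push(url[0]);
  }
  return sections;
}

// Hypothetical excerpt in the convention's shape:
const sample = [
  "# Garage Door Science",
  "## Labs",
  "- [Spring balance](https://example.com/labs/spring-balance)",
  "## Tools",
  "- [MCP](https://example.com/mcp)",
].join("\n");
const index = parseLlmsTxt(sample);
```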

Operator — humans wiring things up

  • /developers — Self-serve API key portal. Email → magic-link → dashboard. Plaintext key shown once at creation.

    Auth
    Magic-link (15 min TTL). Session cookie is HMAC-signed; 7-day TTL.
    For
    Humans provisioning their own pro-tier key.
  • /api/ask-llm/chat/completions — Voice BYO-LLM endpoint. OpenAI Chat Completions SSE format, Anthropic under the hood. Persona resolves from body.model.

    Auth
    Bearer token (separate from gds_live_). Per-session + per-IP + $2/week global caps.
    For
    ElevenLabs Custom LLM, other voice agents that speak OpenAI SSE.
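The voice endpoint streams in the OpenAI Chat Completions SSE format, so any client that can parse `data:` events and concatenate delta fragments can consume it. A minimal sketch of that parsing step — no network, no multi-line data fields, and the event payloads below are illustrative:

```typescript
// Extract assistant text from OpenAI-format Chat Completions SSE events.
function extractDeltas(sse: string): string {
  let out = "";
  for (const line of sse.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice(6).trim();
    if (payload === "[DONE]") break; // stream terminator sentinel
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (typeof delta === "string") out += delta;
  }
  return out;
}

// Hypothetical two-chunk stream:
const stream = [
  'data: {"choices":[{"delta":{"content":"Check the "}}]}',
  'data: {"choices":[{"delta":{"content":"safety sensors."}}]}',
  "data: [DONE]",
  "",
].join("\n");
// extractDeltas(stream) → "Check the safety sensors."
```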

One registry, three transports

Every tool is defined once in lib/mcp/tools.ts: name, description, Zod input schema, and handler. From there it fans out automatically to:

  • /mcp and /mcp-pro — surfaced via JSON-RPC tools/list.
  • /api/v1/<name> — a dynamic REST route resolves the tool by path segment.
  • /openapi.json — the spec is generated at request-time from the registry; adding a tool adds a path.
  • /llms.txt and /ai — both enumerate the registry so docs never drift from code.
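The fan-out pattern above can be sketched in miniature. The real registry uses Zod schemas; here a plain JSON-Schema-ish object stands in so the sketch stays dependency-free, and the tool's description and handler are placeholders:

```typescript
// Define-once, fan-out-everywhere: one ToolDef drives every surface.
interface ToolDef {
  name: string;
  description: string;
  inputSchema: object; // Zod in the real registry
  handler: (input: unknown) => unknown;
}

const registry: ToolDef[] = [
  {
    name: "retrieveLabContext", // real tool name from this page
    description: "Hybrid-retrieve corpus passages for a query", // placeholder text
    inputSchema: { type: "object", properties: { query: { type: "string" } } },
    handler: (input) => ({ hits: [], input }), // placeholder handler
  },
];

// Fan-out 1: MCP tools/list response body.
const toolsList = registry.map(({ name, description, inputSchema }) => ({
  name, description, inputSchema,
}));

// Fan-out 2: OpenAPI paths — every tool becomes a POST path.
const openapiPaths = Object.fromEntries(
  registry.map((t) => [
    `/api/v1/${t.name}`,
    { post: { summary: t.description, requestBody: t.inputSchema } },
  ])
);
```

Adding one entry to `registry` is the whole change: tools/list, the REST route, and the spec all pick it up on the next request.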

Why this matters: the tool list is the contract. If you’re writing an agent against our API, enumerate via tools/list (MCP) or GET /api/v1 (REST) rather than hardcoding names. New tools land without versioning breakage.

Want to see it in action? The use-cases page walks through four distinct callers — Claude Desktop, ChatGPT Custom GPT, LangChain agents, and a contractor-site embed widget — each with a runnable recipe in the cookbook repo.

What the brain knows

The retrieval corpus is built from seven kinds of sources, indexed into pgvector on Neon. Everything you see cited in a chat answer points back to one of these.

  • 11 interactive labs — /labs/<slug>. Three.js explainers with embedded text transcripts.
  • 22 articles — /blog/<slug>. MDX, JSON-LD, inline citations back to source labs.
  • 36 videos — /videos/<slug>. YouTube embeds with transcripts; ≥ 200 chars of transcript triggers RAG indexing.
  • Canonical Q&A — curated from user feedback. Only served via the retrieveLabContext tool today, not as standalone pages.
  • Inspection items — 24-point checklist entries with safety metadata. Grounds safety-critical advice in structured data rather than free text.
  • Partner info — Guild Garage Group member profiles. Informational only; routing is separate via routeByZip.
  • 8 opener model specs — LiftMaster, Chamberlain, Genie spec sheets with error codes, common failures, control locations, and common-alias disambiguation (so “LiftMaster 8500” routes to the 8500W entry).

Indexing runs on publish + on a scheduled reindex. A drift detector flags chunks whose source text has diverged from the stored embedding so the next reindex catches them.
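One way to implement such a drift check — the detector's actual mechanism isn't specified here, so treat this as an assumed content-hash approach: store a hash of the chunk's source text at embed time, then compare on each sweep.

```typescript
import { createHash } from "node:crypto";

// A chunk remembers the hash of the text it was embedded from.
interface Chunk { id: string; sourceText: string; embeddedTextHash: string }

const sha256 = (s: string) => createHash("sha256").update(s).digest("hex");

// Flag chunks whose current source text no longer matches the text
// behind the stored embedding; the next reindex re-embeds them.
function findDrifted(chunks: Chunk[]): string[] {
  return chunks
    .filter((c) => sha256(c.sourceText) !== c.embeddedTextHash)
    .map((c) => c.id);
}

const fresh = { id: "a", sourceText: "v1 text", embeddedTextHash: sha256("v1 text") };
const stale = { id: "b", sourceText: "v2 text", embeddedTextHash: sha256("v1 text") };
// findDrifted([fresh, stale]) → ["b"]
```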

The retrieval path

Every tool call that needs context uses the same hybrid pipeline:

  1. Embed the query with a small OpenAI embedding model. (Cheap and deterministic; we cache in-process.)
  2. Hybrid retrieval in parallel — pgvector kNN (top-20, for concept + paraphrase matches) and Postgres full-text search (top-20, for exact tokens like model numbers and error codes that embeddings tokenize into mush).
  3. Reciprocal Rank Fusion merge (k=60). Rank-based, so the two scoring systems don’t need calibration; a chunk that ranks well in both accumulates boost naturally.
  4. Clamp top-K by tier — public callers get ≤ 3 hits per query, pro callers get ≤ 10. Prevents a free firehose while keeping the free tier genuinely useful.
  5. Return pre-chunked passages with source slug, score, and section heading. The agent decides how to cite; we never hide the source URL.
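Steps 3–4 can be sketched directly: RRF uses only ranks, so the vector and full-text scores never need to share a scale. The document IDs below are illustrative:

```typescript
// Reciprocal Rank Fusion over two ranked hit lists, k = 60 as in step 3.
function rrfMerge(vectorHits: string[], textHits: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const hits of [vectorHits, textHits]) {
    hits.forEach((id, rank) => {
      // rank is 0-based here, so the contribution is 1/(k + rank + 1)
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// "b" appears in both lists, so it accumulates boost and ranks first:
const fused = rrfMerge(["a", "b", "c"], ["b", "d"]);
const topK = fused.slice(0, 3); // step 4's public-tier clamp (≤ 3; pro gets ≤ 10)
```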

The chat layer

Between the homeowner and the tools sit the Virtual Technician (warm, diagnostic) and the Lab Trainer (pedagogical, for technician-training scenarios). Both are framed honestly as AI — we don’t simulate a named human staffer.

Internally there’s a second persona called the Expert that never surfaces directly — it’s a safety-and-scope layer the Technician calls as a tool when a conversation edges toward liability (springs, cables, electrical). You won’t see Expert output, but it shapes what the Technician is allowed to say.

For voice, any ElevenLabs Custom-LLM-compatible agent can call /api/ask-llm/chat/completions and stream a response in one of four voice-tuned persona slugs (maya, sara, rick, margaret).

How it gets smarter

Learning is corpus growth + curated Q&A, not model fine-tuning. Every chat turn is logged (with consent) alongside the retrieval hits that fed it and the user’s up/down feedback. An admin surface reviews low-score turns and either rewrites the chunk, adds a canonical Q&A entry, or marks the topic as out-of-scope.

The reindex pipeline picks up changes nightly. Out-of-scope questions produce a “that’s outside what I’m trained on” redirect rather than a guess — keeps the authority signal sharp and the corpus honest.
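One plausible shape for that out-of-scope gate — the real criteria aren't published, so this sketch assumes a simple rule that a weak top retrieval score means the corpus has nothing authoritative to say, and the threshold value is illustrative:

```typescript
interface Hit { slug: string; score: number }

const OUT_OF_SCOPE_THRESHOLD = 0.02; // illustrative value, not the real one

// Redirect instead of guessing when retrieval finds nothing strong.
function answerOrRedirect(hits: Hit[]): { kind: "answer" | "redirect"; hits: Hit[] } {
  const top = hits[0]?.score ?? 0;
  if (top < OUT_OF_SCOPE_THRESHOLD) {
    return { kind: "redirect", hits: [] }; // "that's outside what I'm trained on"
  }
  return { kind: "answer", hits }; // ground the reply in these sources
}
```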

Content generation

Three pipelines produce the videos and articles that feed the corpus:

  • HeyGen Video Agent — avatar-driven short videos for blog companions. Scripts live in data/heygen-scripts/; each script assigns an editorial persona (Maya, Rick, Sara, Margaret) with a specific HeyGen avatar group and voice. The Video Agent renders scenes in chat mode, then the resulting MP4 is downloaded to public/videos/heygen/ and uploaded to YouTube via the Data API v3 OAuth pipeline.
  • Remotion + ElevenLabs — a local-only toolchain in remotion/ that renders photo-slideshow compositions with synthesized voiceover and topic-aware background music. Outputs MP4s and MP3s into public/videos/ and public/audio/.
  • Auto-content — research → draft → edit → fact-check → URL validate → gate → publish → reindex. Runs on a cron schedule (11pm M/W/F publish + Sun/Thu discovery). First auto-article is live; Phase 2 adds Ollama Cloud for local research synthesis.

Build on it

Agent builders, start at /ai. Human developers wanting an API key, start at /developers. Anyone evaluating the REST surface, try /developers/api and hit a tool live from your browser.

Commercial partnerships (home-services aggregators, smart-home makers, insurance carriers, white-label operators): seth@smartwebutah.com.


Last updated: (added costEstimate + submitInspection tools; expanded corpus to seven source kinds). Maintained alongside the code — if a surface is listed here, it’s live.