
AI Agent Trucking V1: Implementation Plan

Companion to: docs/proposals/ai-agent-trucking-v1.md. The proposal explains what and why. This plan tracks how and in what order.

A live execution copy of this plan also exists locally at ~/.cursor/plans/ai_agent_trucking_v1_*.plan.md for in-IDE todo tracking. The version here in the docs repo is the canonical, reviewable copy.

Field | Value
Status | Decisions locked; code not started, kickoff deferred behind feature/tenant-roles-permissions and Anthropic key acquisition
Owner | Scott Asher
Repos touched | attunelogic-api, attunelogic-service, attunelogic-docs
Branches | feature/ai-agent-trucking-v1 (already created in api + service, per 44-release-branch-policy)
Last updated | 2026-05-16

Locked decisions live in the proposal doc's "Locked decisions" section. This plan file inherits them; if a phase task and the proposal disagree, the proposal wins.

Pre-build decisions (snapshot from 2026-05-16 review)

Mirrored from the proposal for quick reference while building. Authoritative copy lives in the proposal doc.

  • Build Phase 1+2 now; defer Phase 3 until Anthropic key + data-policy sign-off.
  • Deploy posture: AI_AGENT_GLOBAL_ENABLED=true in beta only; false in alpha/main.
  • Internal tenant cost handling: no auto-disable ceiling, alert only at $10/day and $50/month per tenant. External pilot ceilings remain $25/day and $200/mo.
  • AgentSession retention: 90-day TTL on startedAt (Mongo TTL index).
  • Dry-run default: per-tenant, no global default. Internal tenants start with dryRun = true, flip deliberately.
  • Names, paths, schemas, env vars, routes: locked as proposed โ€” see proposal for the full list.
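
The internal-tenant cost rule and the session retention window above can be sketched as plain constants plus one check. This is only an illustration: the helper name, the spend object shape, and the "at or above the threshold" comparison are assumptions, not names from the codebase.

```javascript
// Hypothetical constants and helper; only the dollar amounts and the
// 90-day window come from the locked decisions.
const INTERNAL_ALERT_THRESHOLDS_USD = { perDay: 10, perMonth: 50 };

// 90-day retention expressed in seconds, the unit a Mongo TTL index expects.
// In a Mongoose schema this would typically be passed as
// schema.index({ startedAt: 1 }, { expireAfterSeconds: AGENT_SESSION_TTL_SECONDS }).
const AGENT_SESSION_TTL_SECONDS = 90 * 24 * 60 * 60;

// Internal tenants are never auto-disabled, so `disable` is always false
// and the result only carries the alerts that should be raised.
function internalCostAlerts(spend, thresholds = INTERNAL_ALERT_THRESHOLDS_USD) {
  const alerts = [];
  if (spend.dayUsd >= thresholds.perDay) {
    alerts.push({ window: "day", spentUsd: spend.dayUsd, thresholdUsd: thresholds.perDay });
  }
  if (spend.monthUsd >= thresholds.perMonth) {
    alerts.push({ window: "month", spentUsd: spend.monthUsd, thresholdUsd: thresholds.perMonth });
  }
  return { disable: false, alerts };
}
```

Keeping the check free of side effects means the alerting transport (email, Sentry, or otherwise) can be decided separately.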

Overview

Add an in-app AI assistant on the service web that lets trucking customers create multi-leg Jobs from natural language. The agent runs as an orchestrator on the API using Anthropic Claude with tool-use, resolves entities through scoped read-only tools, and commits via the existing handleExtractedJobCreate pipeline so created records flow through current validation, tenancy, and audit. Created jobs are flagged as AI-originated and surfaced in the drawer with a 5-minute Undo window.
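
As a rough illustration of the 5-minute Undo window, the check below decides whether a job is still undoable. It assumes the job carries the aiCreated and aiCreatedAt fields planned in Phase 1; the function name and return shape are hypothetical.

```javascript
// Default window from the plan: 5 minutes. A per-tenant override
// (defaultUndoWindowMs) could be passed in as `windowMs`.
const UNDO_WINDOW_MS = 5 * 60 * 1000;

// Returns whether the Undo action should still be offered and how long
// remains. Jobs that were not AI-created are never undoable here.
function undoState(job, nowMs = Date.now(), windowMs = UNDO_WINDOW_MS) {
  if (!job.aiCreated || !job.aiCreatedAt) {
    return { canUndo: false, remainingMs: 0 };
  }
  const elapsedMs = nowMs - new Date(job.aiCreatedAt).getTime();
  // Clamp to [0, windowMs] so clock skew cannot extend the window.
  const remainingMs = Math.min(windowMs, Math.max(0, windowMs - elapsedMs));
  return { canUndo: remainingMs > 0, remainingMs };
}
```

The same function could drive both the countdown in the banner and a server-side guard, so the two cannot disagree about when the window closes.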

For the full architectural context, decisions, kill-switch hierarchy, no-address PII model, and pre-release safety checklist, see the proposal doc.


Build phases

We build in 3 phases. Phases 1 and 2 do not require an Anthropic API key or the data-policy review; only Phase 3 does. This means we can land all the safety infrastructure, gates, and scaffolding while the policy review is in flight.

Phase 1: Infrastructure & kill switches (no LLM, no Anthropic key needed)

Goal: a fully gated, observable, killable system before a single token is spent.

  • api_global_kill_switch_env: Add AI_AGENT_GLOBAL_ENABLED env var (default false). When false, every agent route returns 503 immediately, before tenant flag/industry checks and before any DB or LLM work. The launcher in service hides as well, driven by a public, read-only /system/ai-status endpoint that exposes the global flag.
  • api_global_kill_switch_runtime: Add a runtime kill switch backed by a system-level Config doc (type=system, key=aiAgent.globallyEnabled). Super-admin can flip it from the UI without a redeploy. Evaluation order in middleware is env → system runtime → tenant flag → industry gate. Cached per-process for 30s with explicit cache bust on toggle.
  • api_feature_flag_registry: Register aiAgent.enabled (lifecycle beta, defaultEnabled false, description, tenantAdminToggleable: true) in src/services/config/default-configs/feature-flags.js so it shows up in the SuperAdmin Feature Flags UI as a per-tenant Inherit/Force On/Force Off control.
  • Industry gate (L4): Add requireAppType("trucking") middleware. 403 for any non-trucking tenant even with the flag forced on.
  • api_aiagent_config_block: Add aiAgent config block (model, monthlyTokenCap, perUserDailyMessageCap, monthlyCostCeilingUsd, allowedTools, defaultUndoWindowMs, dryRun) to default-configs so SuperAdmin/ops can tune limits per tenant alongside the on/off toggle.
  • api_pending_ai_review_flag: Add Job.pendingAiReview boolean (default false) and an opt-in filter in schedule controllers (?excludePendingAiReview=true or default-exclude when reading on driver client). Keeps AI-created jobs out of dispatch schedules until the user approves, since approval.status alone does not gate schedule visibility today.
  • api_job_flags: Add optional aiCreated + aiCreatedAt fields to Job model (backwards compatible).
  • api_tenant_admin_toggle: Add tenant-scoped endpoint PATCH /api/v1/account/feature-flags (admin role, NOT superAdmin) that only accepts flags marked tenantAdminToggleable: true. Writes to Config.configs.featureFlagOverrides for the caller's parentCompany.
  • api_health_endpoint: Add GET /api/v1/admin/ai-agent/health (super-admin only) returning env/runtime status (full payload populated in Phase 3). Used by the panic-button UI and as a manual smoke check.
  • /api/v1/system/ai-status: No-auth endpoint returning { globalEnabled } (L1+L2 only).
  • api_feature_flag_enforcement: Verify requireFeature("aiAgent.enabled") blocks both POST /messages and GET /sessions/:id with 403 when the tenant has the flag off (Inherit when default off, or Force Off).
  • service_useconfig_live_refresh: Fix ConfigProvider in src/hooks/useConfig.tsx so it re-syncs configs state whenever configData changes (not just on first init). Required so the AI Assistant launcher shows/hides live after admin toggles the flag, without a page reload. Backwards compatible: only changes the dependency on the existing useEffect.
  • service_tenant_admin_toggle: Add an "AI Assistant" card to the existing tenant settings/account page (admin-visible) with an on/off toggle that calls PATCH /account/feature-flags. Optimistically updates useConfig so the launcher shows/hides immediately.
  • service_panic_button: Add a "System Status" panel under SuperAdmin (or atop the Feature Flags page) showing the AI health endpoint output (global env/runtime status, Anthropic reachable, error rates, MTD cost, circuit breaker state) and a prominent "Disable AI globally" panic button that flips the runtime kill switch with a confirm dialog. Clearly displays current state at all times.
  • service_launcher_global_gate: Drawer/launcher must hide when EITHER the per-tenant flag OR the global runtime status is off. Add a /system/ai-status RTK Query that polls every 60s and on focus, so a global kill propagates to active sessions without requiring re-login.
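
The evaluation order named in the list above (env → system runtime → tenant flag → industry gate) might look roughly like the following. This is a sketch under assumptions: the function names, the input object, and the reason strings are hypothetical; the plan states 503 for the env switch and 403 for the tenant flag and industry gate, while 503 for the runtime switch and the label "L3" for the tenant flag are inferred here.

```javascript
// Evaluates the four gates in the documented order and stops at the first
// failure, so a global kill never reaches tenant or industry checks.
function evaluateAgentGates({ envEnabled, runtimeEnabled, tenantFlagEnabled, appType }) {
  if (!envEnabled) {
    return { allowed: false, level: "L1", status: 503, reason: "env_disabled" };
  }
  if (!runtimeEnabled) {
    // Assumption: the runtime switch mirrors the env switch and returns 503.
    return { allowed: false, level: "L2", status: 503, reason: "runtime_disabled" };
  }
  if (!tenantFlagEnabled) {
    return { allowed: false, level: "L3", status: 403, reason: "tenant_flag_off" };
  }
  if (appType !== "trucking") {
    return { allowed: false, level: "L4", status: 403, reason: "industry_gate" };
  }
  return { allowed: true, level: null, status: 200, reason: null };
}

// Hypothetical Express-style wrapper. `readGateState(req)` is assumed to
// resolve the four inputs for the current request.
function agentGateMiddleware(readGateState) {
  return async function agentGate(req, res, next) {
    const verdict = evaluateAgentGates(await readGateState(req));
    if (verdict.allowed) return next();
    return res.status(verdict.status).json({ error: verdict.reason, level: verdict.level });
  };
}
```

Separating the pure decision from the middleware keeps the L1-L4 ordering testable without a running server.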

Ship value: every kill switch in place and verifiable. The platform can guarantee "AI is off" before any AI exists.
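
One way to picture the runtime kill switch's "cached per-process for 30s with explicit cache bust on toggle" behaviour is the small cache below. The factory name, the `loadFlag` callback (standing in for the Config doc read), and the injectable clock are all assumptions for illustration.

```javascript
// Per-process cache for the runtime kill switch value.
// `loadFlag` is a hypothetical async function that reads the system Config doc.
function createKillSwitchCache(loadFlag, { ttlMs = 30 * 1000, now = Date.now } = {}) {
  let cached = null; // { value, fetchedAt } or null when empty/busted

  return {
    // Returns the cached value while it is fresh, otherwise reloads it.
    async isEnabled() {
      const t = now();
      if (cached && t - cached.fetchedAt < ttlMs) return cached.value;
      const value = Boolean(await loadFlag());
      cached = { value, fetchedAt: t };
      return value;
    },
    // Called by the toggle handler so the flipping process sees the new
    // value immediately; other processes catch up within `ttlMs`.
    bust() {
      cached = null;
    },
  };
}
```

Note that an explicit bust only clears the cache in the process that handled the toggle, so in a multi-process deployment the worst-case propagation delay is still the TTL.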

Phase 2: Tools & scaffolding with stubbed LLM (still no real Anthropic call)

  • api_address_redaction: Implement an addressRedactor utility used by every tool that returns location data. Tool responses returned to the LLM contain ONLY { id, name, city, state }, never street, postalCode, country, lat/lng, or formatted address strings. The full Location doc stays server-side and is referenced by id when createDraftJob runs. Add unit tests asserting redaction (snapshot tests + a positive assertion that no address-shaped strings leak).
  • api_prompt_address_guard: Add a lightweight server-side address detector that scans inbound user messages for address-shaped tokens (street suffixes, ZIP/postal, lat/lng pairs). On detection: (a) return a friendly 400 to the client asking the user to use a saved location name instead, (b) do NOT forward the message to Anthropic, (c) increment a metric. Pure regex, no PII leaves the API.
  • api_address_leak_tests: Add a contract test suite tests/services/ai/no-address-leak.test.js that runs every tool against fixture data containing recognizable street addresses, then asserts none of those strings appear in the JSON sent back to the orchestrator. Also assert AgentSession persistence omits them. This is a regression gate: it runs in CI on every PR that touches src/services/ai/**.
  • api_session_model: Create AgentSession model for transcripts, tool events, token usage, createdJobIds (tenant-scoped). IMPORTANT: tool result snapshots stored on the session must use the redacted (id+name+city+state) shape; never persist full addresses on AgentSession. User prompts are stored verbatim, relying on the user-facing input warning to keep them address-free.
  • api_tools: Implement scoped tools: searchClients (calls existing GET /clients?search=true), searchLocations (calls existing GET /locations?search=true&clientId=), createDraftJob (builds extractedData.legs payload using pre-resolved location IDs and delegates to existing createJob → handleExtractedJobCreate). NOTE: searchDrivers and any driver assignment dropped from v1: assignment triggers handleUserAssignedToLeg notifications.
  • api_dry_run_mode: Add a per-tenant aiAgent.dryRun config flag. When true, the createDraftJob tool returns a preview payload instead of creating a Job. Lets us pilot in production with zero write risk. Default false; super-admin only setting.
  • api_route_controller: Add controller src/controllers/ai/agent/index.js and route src/routes/api/v1/ai/agent.js (POST /messages, GET /sessions/:id) with verifyToken+verifyParent+requireFeature+rate limit; wire in src/routes/api/v1/index.js. Use a stubbed LLM provider so end-to-end tests can fire without an Anthropic key.
  • api_rate_limit: Add aiAgentLimiter in src/middlewares/rateLimiting.js with per-user and per-tenant caps.
  • api_circuit_breaker (stub): Add an automatic circuit breaker that flips the runtime kill switch to OFF when the rolling 5-minute Anthropic error rate exceeds threshold (e.g. >25% errors over >=10 calls) OR when 24h platform-wide cost exceeds the platform ceiling. Sends Sentry alert + super-admin email. There is no auto-recovery; the switch must be flipped back on manually. Wired against the stubbed provider in Phase 2 for unit testing; sees real Anthropic in Phase 3.
  • api_tests: Add tests under tests/controllers/ai/agent for tool loop, tenancy isolation, feature flag, rate limit, AI-created Job flag, kill switches L1-L4.
  • service_rtk_slice: Create src/redux/services/ai/agentApi.js with sendAgentMessage mutation and session query; invalidate Jobs/Schedule tags.
  • service_components: Build src/components/AIAgent/{AgentLauncher,AgentDrawer,MessageList,InputBar,ToolCallPill,UndoBanner}.jsx using shared/Drawer + Button.
  • service_no_address_ux: InputBar shows persistent helper text: "Refer to locations by name (e.g. 'Acme Dallas DC') - please don't paste addresses." If the API returns the 400 ADDRESS_DETECTED error code, show an inline error nudging the user to use a saved location name and offer a quick link to create one. Tool-call pills render only redacted fields (name + city/state).
  • service_layout_wire: Mount AgentLauncher + AgentDrawer in src/layouts/Dashboard/index.jsx alongside ChatWidget/ChatLauncher; respect --right-sidebar-offset and feature flag.
  • service_undo_flow: Implement 5-minute undo banner using existing useDeleteJobMutation; convert to passive 'view job' link after expiry.
  • service_ai_activity_widget: Add an "AI Activity" card to SuperAdmin > Feature Flags (rendered when a tenant is selected and aiAgent.enabled is in the registry). Shows month-to-date token usage + estimated cost, last 10 sessions table (time, user, prompt preview, tools used, jobsCreatedCount, status), and a 30-day sparkline. Uses GET /admin/ai-agent/:parentCompanyId/activity.
  • api_ai_activity_endpoint: Add GET /api/v1/admin/ai-agent/:parentCompanyId/activity returning { recentSessions(10), tokenUsage: { monthToDate: { input, output, estimatedCostUsd }, perDay[] }, jobsCreatedCount }. Aggregates from AgentSession model. SuperAdmin only.
  • service_tests: Add component tests for AgentDrawer message rendering, tool-call pills, and undo countdown behavior.
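
The redaction rule from api_address_redaction (tool responses carry only id, name, city, state) can be sketched as an allow-list projection. The function names are hypothetical, and the fallback from `_id` to `id` is an assumption about how Location docs are shaped.

```javascript
// Builds the redacted shape by copying only the allowed fields. An
// allow-list means any field added to Location later is excluded by
// default, rather than leaking until someone remembers to block it.
function redactLocation(location) {
  // Assumption: Mongo-style docs may expose `_id` instead of `id`.
  const rawId = location.id !== undefined ? location.id : location._id;
  return {
    id: rawId === undefined ? undefined : String(rawId),
    name: location.name,
    city: location.city,
    state: location.state,
  };
}

// Convenience wrapper for tools that return a list of locations.
function redactLocations(locations) {
  return locations.map(redactLocation);
}
```

This does not inspect the `name` value itself, so a location whose saved name contains a street address would still pass through; the leak tests in api_address_leak_tests are the place to catch that case.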

Ship value: entire system testable, killable, observable, and reviewable without any real LLM call.
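
For the inbound address guard described in api_prompt_address_guard, a pure-regex detector could look something like this. The specific patterns, the suffix list, and the function names are illustrative assumptions and would need tuning against real prompts; only the three token classes (street suffixes, ZIP/postal, lat/lng pairs) and the ADDRESS_DETECTED code come from the plan.

```javascript
// Illustrative patterns only. Each targets one address-shaped token class.
const ADDRESS_PATTERNS = {
  // A house number, up to four words, then a common street suffix.
  street: /\b\d{1,6}\s+(?:[A-Za-z0-9.'-]+\s+){0,4}(?:st|street|ave|avenue|blvd|boulevard|rd|road|dr|drive|ln|lane|hwy|highway|ct|court|pkwy|parkway)\b/i,
  // Two-letter state code followed by a US ZIP or ZIP+4.
  zip: /\b[A-Z]{2},?\s+\d{5}(?:-\d{4})?\b/,
  // Decimal latitude/longitude pair such as "32.7767, -96.7970".
  latLng: /-?\d{1,2}\.\d{3,}\s*,\s*-?\d{1,3}\.\d{3,}/,
};

// Returns which pattern classes matched, without echoing the matched text,
// so the result is safe to log or count in a metric.
function detectAddress(message) {
  const reasons = Object.keys(ADDRESS_PATTERNS).filter((key) => ADDRESS_PATTERNS[key].test(message));
  return { detected: reasons.length > 0, reasons };
}

// Shapes the friendly 400 response; the message wording is a placeholder.
function buildAddressRejection() {
  return {
    status: 400,
    body: {
      code: "ADDRESS_DETECTED",
      message: "Please refer to a saved location by name instead of pasting an address.",
    },
  };
}
```

A detector like this will have false positives and false negatives, so it works best as one layer alongside the helper text in the input bar rather than as the only safeguard.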

Phase 3: Wire Anthropic (requires data-policy decision + API key)

  • api_deps_env: Add @anthropic-ai/sdk and ANTHROPIC_API_KEY / AI_AGENT_DEFAULT_MODEL to config/keys.js, config/index.js, and .env.example.
  • api_anthropic_wrapper: Create src/services/ai/anthropic.js (client singleton, default model, token usage helper). Fail-closed when ANTHROPIC_API_KEY missing → 503.
  • api_agent_orchestrator: Build src/services/ai/agent/{index.js,systemPrompt.js} implementing the Claude tool-use loop with iteration cap.
  • Cost estimation against pricing.js table; enforce per-tenant monthly $ ceiling and platform 24h ceiling; circuit breaker flips L2 on threshold breach.
  • Wire health endpoint to report anthropicReachable, lastSuccessfulCallAt, real error rates, MTD cost.
  • Run an internal smoke test in dry-run mode against a test tenant, then move to an internal employee tenant with real writes, then to 1 friendly external pilot tenant.
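
The tool-use loop with an iteration cap from api_agent_orchestrator could be sketched as below, written against a provider interface so the Phase 2 stub and the real client are interchangeable. The `provider.createMessage` method, the `tools` map, and the result shape are hypothetical; the `stop_reason`, `tool_use`, and `tool_result` fields follow the block structure of Anthropic's Messages API, which the real wrapper would be expected to pass through.

```javascript
// Runs the model/tool exchange until the model stops requesting tools or
// the iteration cap is hit. `tools` maps allowed tool names to async
// functions; anything not in the map is rejected back to the model.
async function runAgentLoop({ provider, tools, messages, maxIterations = 5 }) {
  const transcript = [...messages];

  for (let i = 0; i < maxIterations; i += 1) {
    const reply = await provider.createMessage({ messages: transcript });
    transcript.push({ role: "assistant", content: reply.content });

    if (reply.stop_reason !== "tool_use") {
      return { status: "completed", iterations: i + 1, transcript };
    }

    // Execute every requested tool and collect the results for one user turn.
    const results = [];
    for (const block of reply.content) {
      if (block.type !== "tool_use") continue;
      let content;
      let isError = false;
      try {
        const tool = tools[block.name];
        if (!tool) throw new Error(`Tool not allowed: ${block.name}`);
        content = JSON.stringify(await tool(block.input));
      } catch (err) {
        content = err.message;
        isError = true;
      }
      results.push({ type: "tool_result", tool_use_id: block.id, content, is_error: isError });
    }
    transcript.push({ role: "user", content: results });
  }

  // The cap bounds token spend if the model keeps asking for tools.
  return { status: "iteration_cap_reached", iterations: maxIterations, transcript };
}
```

Returning a distinct status when the cap is reached lets the controller show a clear message to the user instead of silently truncating the conversation.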

Pre-release & rollout

  • prerelease_checklist: Run the full pre-release safety checklist (kill-switch hierarchy verified end-to-end, dry-run mode tested, circuit breaker unit-tested, panic button verified in alpha, runbook published in attunelogic-docs, load test against rate limits, cost ceiling triggers tested, all logs verified PII-free, no-address tests green) before flipping any production tenant on. Full checklist lives in the proposal doc.
  • prerelease_runbook: Add docs/operations/ai-agent-runbook.md covering the kill-switch hierarchy, panic button procedure, circuit breaker recovery, common failure modes, on-call escalation, and how to interpret AI Activity widget signals.
  • rollout: Branch feature/ai-agent-trucking-v1 in both repos; enable flag per pilot tenant via Config.featureFlagOverrides; promote feature -> beta -> alpha -> main. Pilot stages: internal alpha (dry-run) → internal beta → 1 friendly external → 3-5 → GA.
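
The checklist item "circuit breaker unit-tested" refers to the rolling error-rate rule from Phase 2. A minimal sketch of that rule follows, using the example threshold given there (more than 25% errors over at least 10 calls in a 5-minute window). The factory name and method names are hypothetical, and the cost-ceiling trigger is left out.

```javascript
// Tracks call outcomes in a rolling window and reports when the error
// rate crosses the threshold. Flipping the kill switch and alerting are
// left to the caller.
function createErrorRateBreaker({ windowMs = 5 * 60 * 1000, minCalls = 10, maxErrorRate = 0.25 } = {}) {
  let calls = []; // entries of { at, ok }

  return {
    // Record one provider call; `at` is injectable for tests.
    record(ok, at = Date.now()) {
      calls.push({ at, ok });
    },
    // True when the window holds enough calls and too many of them failed.
    shouldTrip(now = Date.now()) {
      calls = calls.filter((call) => now - call.at <= windowMs);
      if (calls.length < minCalls) return false;
      const errors = calls.filter((call) => !call.ok).length;
      return errors / calls.length > maxErrorRate;
    },
  };
}
```

The minimum-call guard matters during low traffic: without it, a single failed call out of two would read as a 50% error rate and disable the agent for everyone.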

Cross-references