Technical Requirements
API Specifications

API Specifications

Relationship to other TRD sections:

Section 3 (Component Specifications) defines the logic that calls and is called by these APIs.

Section 4 (Data Models) defines the schemas (SessionState, ContextPacket, HandoffRequest) referenced here.

Section 6 (Infrastructure Requirements) defines the environment variables that configure these endpoints.

Section 8 (Security Requirements) defines the rate limiting and authentication rules enforced at the API layer.

What this section specifies and what it does not:
This section specifies the contracts — request/response schemas, SSE event formats, error codes,
and internal interface signatures — that allow frontend and backend to build in parallel.
It does not specify implementation detail (FastAPI route handlers, middleware order) or
retry logic (defined per-component in Section 3). It does not repeat rationale for technology
choices already recorded in ADRs.

Chat Endpoint

The primary conversation interface. Accepts a visitor message and returns a streaming
Server-Sent Events response. Each turn maps to one request-response cycle. There is no
persistent WebSocket connection — the visitor’s next message is a new HTTP request.

Request

POST /chat
Content-Type: application/json
Accept: text/event-stream
ZGC-Session-ID: <uuid-v4>
ZGC-API-KEY: <static key>

Headers:

Header	Required	Description
`ZGC-Session-ID`	Yes	UUID v4 generated client-side on `connectedCallback`. Used as the LangGraph `thread_id` for checkpointer lookup. Sent on every request. No cross-session persistence (FR-07a).
`ZGC-API-KEY`	Yes	Static key issued per widget deployment. Validated before the request reaches the orchestrator. Rotated per environment.
`Content-Type`	Yes	Must be `application/json`.
`Accept`	Yes	Must be `text/event-stream`. If absent, the server returns 400.

Body:

{
  "message": "string — visitor's message text"
}

Field	Type	Required	Constraints
`message`	string	Yes	Non-empty. Max 2,000 characters. Stripped of leading/trailing whitespace before processing.

Response

Content-Type: text/event-stream
Transfer-Encoding: chunked
Cache-Control: no-cache
Connection: keep-alive

The response is a stream of SSE events. Each event is a line of the form:

data: <json-payload>\n\n

SSE event types:

`token`

Emitted for each LLM response token as it arrives from the Anthropic streaming API.

data: {"type": "token", "content": "..."}

Field	Type	Description
`type`	`"token"`	Event type discriminator
`content`	string	One or more characters of the LLM response

The widget appends content to the displayed message in order. No buffering — tokens are
forwarded as received.

`done`

Emitted once, when the turn is complete and write_state has persisted the updated
SessionState. This event carries the session state metadata the widget needs to fire
analytics events (Section 3 — Analytics Events).

data: {
  "type": "done",
  "session_id": "uuid-v4",
  "lead_level": "hot" | "warm" | "cold",
  "current_stage": 1 | 2 | 3,
  "stage3_proposal_issued": true | false,
  "handoff_reason": "hot_lead" | "explicit_request" | "stall" | "llm_failure" | null,
  "turn_count": integer
}

Field	Type	Description
`type`	`"done"`	Event type discriminator
`session_id`	string	Echo of the `ZGC-Session-ID` header
`lead_level`	enum	Current lead classification after this turn
`current_stage`	1 \| 2 \| 3	Conversation stage at end of turn
`stage3_proposal_issued`	boolean	`true` if `propose_handoff` executed on this turn. Widget uses this to fire `zgc:escalation_triggered`.
`handoff_reason`	enum \| null	Populated when `stage3_proposal_issued` is `true`; null otherwise
`turn_count`	integer	Absolute turn index for this session

Widget behaviour on done:

If lead_level differs from the previously received value → fire zgc:qualification_state_changed
If stage3_proposal_issued == true → fire zgc:escalation_triggered
Re-enable the message input

`error`

Emitted when a non-recoverable error occurs during the turn. The stream closes after this event.

data: {"type": "error", "code": "string", "message": "string"}

Code	Condition	Widget behaviour
`LLM_UNAVAILABLE`	Anthropic API unreachable or returning 5xx	Show inline error message for this turn; session continues
`STREAM_TIMEOUT`	First token not received within `LLM_STREAM_TIMEOUT_MS`	Same as `LLM_UNAVAILABLE`
`ORCHESTRATOR_ERROR`	Unhandled exception in the orchestration graph	Show inline error; session continues
`SESSION_CORRUPTED`	Checkpointer read returns an unresolvable state	Show inline error; widget reloads session (new `session_id`)

Important: error events are turn-level failures. They do not activate the widget’s
fallback state (Section 3 — Graceful Degradation). Fallback is only activated by
HTTP-level failures before the stream opens (see HTTP Error Responses below).

HTTP Error Responses (pre-stream)

These are returned as standard JSON responses before the SSE stream is opened. On receiving
any of these, the widget does not attempt to render a partial stream.

{
  "error": {
    "code": "string",
    "message": "string"
  }
}

Status	Code	Condition	Widget behaviour
400	`INVALID_MESSAGE`	`message` empty, exceeds 2,000 characters, or `session_id` not a valid UUID v4	Show inline validation message; do not activate fallback
400	`MISSING_ACCEPT_HEADER`	`Accept: text/event-stream` absent	Configuration error — log to console
401	`INVALID_API_KEY`	`ZGC-API-KEY` absent or does not match any issued key	Activate fallback (widget deployment misconfiguration)
429	`RATE_LIMITED`	Per-IP or per-session limit exceeded (EC-12; limits defined in Section 8)	Show rate limit message for this turn; include `retry_after_seconds` in the response body
503	`SERVICE_UNAVAILABLE`	AI backend not reachable (upstream health check failed)	Activate fallback state

429 response body extension:

{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many messages. Please wait before sending another.",
    "retry_after_seconds": 30
  }
}

The Chat API is a transport layer. It does not inspect or transform message content.
PII scrubbing (Section 4) is applied inside the orchestrator before content reaches
the Anthropic API. The Chat API does not log message body content — it logs only
session_id, request timestamp, HTTP status, and latency.

No visitor data is stored by the API layer. The single source of storage for session
content is the PostgreSQL checkpointer, governed by the retention rules in Section 4.

Chat Endpoint Configuration

Variable	Description	Section reference
`ZGC_API_KEY`	Static key for widget authentication	Section 8
`LLM_STREAM_TIMEOUT_MS`	Max ms to wait for first token before emitting `STREAM_TIMEOUT` error event	Section 3 — Orchestrator Configuration
Rate limiting variables	Per-IP and per-session limits	Section 8 — Security Requirements

Handoff Delivery Interfaces

The Human Handoff Subsystem (Section 3) delivers the ContextPacket to two external
destinations. This section specifies the outbound contracts — the payloads sent by the
system, not endpoints the system exposes. Both deliveries are triggered by a HandoffRequest
from the propose_handoff node and are dispatched in parallel.

Full delivery logic (retry, partial failure, email fallback) is specified in Section 3.
This section specifies only the payload schemas and confirmation criteria.

Slack — `#new-leads` Webhook

Destination: Incoming webhook URL, configured via SLACK_WEBHOOK_URL.

Method: POST

Confirmation criterion: HTTP 200 response from the Slack webhook endpoint. Any non-200
response triggers the retry sequence defined in Section 3.

Payload (Slack Block Kit):

This is the canonical Slack payload. Section 3 references this specification.

{
  "blocks": [
    {
      "type": "header",
      "text": {
        "type": "plain_text",
        "text": "{emoji} {lead_level_label} Lead — {visitor_company}"
      }
    },
    {
      "type": "section",
      "fields": [
        { "type": "mrkdwn", "text": "*Email:*\n{visitor_email}" },
        { "type": "mrkdwn", "text": "*Role:*\n{visitor_role}" },
        { "type": "mrkdwn", "text": "*Trigger:*\n{handoff_reason}" },
        { "type": "mrkdwn", "text": "*Turns:*\n{turn_count}" }
      ]
    },
    {
      "type": "section",
      "text": { "type": "mrkdwn", "text": "*Summary:*\n{conversation_summary}" }
    },
    {
      "type": "section",
      "text": {
        "type": "mrkdwn",
        "text": "*Qualification:* Problem: {problem_fit} | Authority: {authority_fit} | Company: {company_fit} | Timing: {timing_fit}"
      }
    },
    {
      "type": "actions",
      "elements": [
        {
          "type": "button",
          "text": { "type": "plain_text", "text": "View CRM Record" },
          "url": "{crm_record_url}"
        }
      ]
    }
  ]
}

Field rendering rules:

Template field	Source	Fallback if absent
`{emoji}`	Derived from `lead_level`: hot → 🔥, warm → 🌡️, cold → ❄️. Outside hours prefix: 📬	—
`{lead_level_label}`	`lead_level` capitalised: `Hot`, `Warm`, `Cold`	—
`{visitor_company}`	`ContextPacket.visitor.company`	`"Unknown company"`
`{visitor_email}`	`ContextPacket.visitor.email`	`"Not captured"`
`{visitor_role}`	`ContextPacket.visitor.role`	`"Unknown"`
`{handoff_reason}`	`HandoffRequest.handoff_reason`	—
`{turn_count}`	`ContextPacket.conversation.turn_count`	—
`{conversation_summary}`	`ContextPacket.conversation_summary`	—
`{problem_fit}` / `{authority_fit}` / `{company_fit}` / `{timing_fit}`	`ContextPacket.qualification.*_fit`	—
`{crm_record_url}`	`LeadCreationResult.crm_record_url` — available only after CRM delivery succeeds	Button omitted if CRM fails

CRM URL availability: Slack and CRM are dispatched in parallel. If the Slack delivery
completes before the CRM record ID is available, the message is sent without the button.
Once the CRM record ID is received, the Slack message is updated via chat.update API
to add the button. If CRM delivery fails entirely, the button is permanently omitted.
This requires the SLACK_BOT_TOKEN environment variable in addition to SLACK_WEBHOOK_URL
(the chat.update API requires a bot token; the incoming webhook alone is insufficient).

CRM Delivery

Platform (v1): PostgreSQL leads table in the existing Neon instance, via PostgresCRMClient — a concrete implementation of the CRMClient interface (resolved by ADR-009). No external CRM is integrated in v1.

Integration pattern: The Human Handoff Subsystem calls the abstract CRMClient interface. The v1 concrete implementation (PostgresCRMClient) writes the ContextPacket to the leads table instead of calling an external HTTP API. In v2, swapping to an external CRM requires only a new implementation class — no changes to the subsystem or interface.

CRMClient interface:

class CRMClient(Protocol):
    async def create_lead(self, payload: CRMLeadPayload) -> LeadCreationResult:
        """
        Create a lead record in the CRM.
        Returns LeadCreationResult on success.
        Raises CRMDeliveryError on failure — caller handles retry.
        """
        ...

@dataclass
class LeadCreationResult:
    crm_record_id:  str   # CRM-assigned record identifier
    crm_record_url: str   # Direct URL to the record in the CRM UI

@dataclass
class CRMDeliveryError(Exception):
    http_status: int | None
    message:     str

v1 concrete implementation:

class PostgresCRMClient(CRMClient):
    """v1 concrete implementation — writes to the `leads` table in Neon PostgreSQL."""

    async def create_lead(self, payload: CRMLeadPayload) -> LeadCreationResult:
        leads_id = await _insert_lead(payload)   # raises CRMDeliveryError on DB error
        return LeadCreationResult(
            crm_record_id=str(leads_id),
            crm_record_url="",                   # no CRM UI in v1
        )

Confirmation criterion: create_lead() returns a LeadCreationResult with a
non-null crm_record_id. For PostgresCRMClient, a successful INSERT into the
leads table returns the row’s id (cast to str) as crm_record_id; a database
error raises CRMDeliveryError. The crm_record_url field is an empty string in v1
(no CRM UI to link to), so the Slack “View in CRM” button is permanently omitted.

Canonical CRM payload schema (platform-agnostic):

This is the input to create_lead(). The concrete CRM adapter maps these fields to
the platform’s own schema.

{
  "contact": {
    "email":   "string | null",
    "name":    "string | null",
    "company": "string | null",
    "role":    "string | null"
  },
  "lead": {
    "source":         "website-chat",
    "lead_level":     "hot | warm | cold",
    "handoff_reason": "hot_lead | explicit_request | stall | llm_failure",
    "triggered_at":   "ISO 8601 UTC datetime",
    "session_id":     "uuid-v4"
  },
  "qualification": {
    "problem_fit":        "not_detected | partially_confirmed | confirmed",
    "authority_fit":      "not_detected | partially_confirmed | confirmed",
    "company_fit":        "not_detected | partially_confirmed | confirmed",
    "timing_fit":         "not_detected | partially_confirmed | confirmed",
    "is_consultant":      "boolean",
    "referral_mentioned": "boolean"
  },
  "notes": {
    "summary":          "string — output of build_summary()",
    "signals_observed": "serialised list[SignalEntry]",
    "turn_count":       "integer"
  }
}

ContextPacket schema: see Data Models — ContextPacket.

Email Fallback

Trigger: Both Slack and CRM have exhausted retries (total failure path — Section 3).

Destination: sales@ — configured via HANDOFF_FALLBACK_EMAIL.

Implementation: SMTP via the configured mail provider. This is not a visitor-facing
email — it is an internal operational fallback. Configured via SMTP_HOST, SMTP_PORT,
SMTP_USER, SMTP_PASSWORD.

Content: Plain text summary derived from the ContextPacket:

Subject: [CHAT FALLBACK] {lead_level} Lead — {visitor_company or Unknown}

This lead notification was delivered by email because both Slack and CRM delivery failed.

Session ID:   {session_id}
Lead level:   {lead_level}
Trigger:      {handoff_reason}
Timestamp:    {triggered_at}

Visitor:
  Email:      {visitor_email or Not captured}
  Name:       {visitor_name or Unknown}
  Company:    {visitor_company or Unknown}
  Role:       {visitor_role or Unknown}

Qualification:
  Problem:    {problem_fit}
  Authority:  {authority_fit}
  Company:    {company_fit}
  Timing:     {timing_fit}

Summary:
{conversation_summary}

---
Slack delivery status:  FAILED
CRM delivery status:    FAILED

This email is the last-resort channel. It has no confirmation mechanism beyond SMTP
delivery. If SMTP also fails, the failure is logged at CRITICAL and no further
delivery is attempted. The HandoffRecord records all three channels’ final status.

Fallback Form

There is no fallback form endpoint in this system.

EC-07 is resolved by design: the widget’s graceful degradation state displays a link to fallback-url — an HTML attribute pointing to the existing company contact form or any external form URL. Form submission is handled entirely by the host site’s own infrastructure.

This is an explicit architectural boundary: the fallback submission path has zero dependency on the AI backend. If the AI backend is down, the fallback form still works because it does not interact with this system at all.

Implications:

No backend endpoint is built for fallback form submission.
The fallback-url attribute is validated as a non-empty string on widget
connectedCallback. If absent, the widget logs a ConfigurationError and renders
a fallback message without a link (degraded but functional — Section 3).
Leads submitted via the fallback form are not automatically created in the CRM by
this system. They are handled by whatever process currently handles the company
contact form. This is a known gap — the sales team has been informed (human-handoff.md).

Internal Component Interfaces

These are the contracts between internal components — not HTTP APIs, but typed function
interfaces. Specifying them here allows components to be developed and tested in isolation
against stubs before integration.

`retrieve_knowledge` — LLM Tool Definition

This is the tool specification passed to Claude Haiku 4.5 in the generate_response node.
It is reproduced here as the canonical definition; Section 3.1 references this section.

{
  "name": "retrieve_knowledge",
  "description": "Retrieve relevant information from the company knowledge base. Call this tool when the visitor asks about company services, case studies, team expertise, engagement models, or any question that requires specific company information beyond what is in your instructions. Do not call this tool for pricing questions, handoff mechanics, or general conversation process — those are handled from your instructions.",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query to use for retrieval. Should be a precise restatement of what the visitor needs to know, not a copy of the visitor's exact words."
      }
    },
    "required": ["query"]
  }
}

Invocation contract:

# The orchestrator registers this as a LangGraph tool callback:
async def retrieve_knowledge(query: str) -> RetrievalResult:
    """
    Called by the LLM when it issues a retrieve_knowledge tool call.
    Delegates to the RAG Triage Module (Section 3).
    Returns RetrievalResult — see Section 3 for the full schema.
    """
    ...

Single call per turn: MAX_TOOL_CALLS_PER_TURN = 1 is enforced by the orchestrator.
If the LLM issues a second retrieve_knowledge call within the same turn, the call is
ignored and logged as rag_extra_tool_call_ignored at WARN. The generate_response node
proceeds with the result from the first call only.

`dispatch_handoff` — Orchestrator → Human Handoff Subsystem

async def dispatch_handoff(request: HandoffRequest) -> None:
    """
    Called by the propose_handoff node when a handoff is triggered.
    Fire-and-forget from the orchestrator's perspective — the orchestrator
    does not await delivery confirmation. Delivery status is tracked
    independently by the Human Handoff Subsystem and persisted to HandoffRecord.
    """
    ...

HandoffRequest schema (canonical definition — Section 3 and Section 4 reference this):

@dataclass
class HandoffRequest:
    session_id:     str
    handoff_reason: Literal["hot_lead", "explicit_request", "stall", "llm_failure"]
    lead_level:     Literal["hot", "warm", "cold"]
    business_hours: bool          # from Business Hours Detection Module (Section 3)
    session_state:  SessionState  # full snapshot at point of handoff trigger
    triggered_at:   datetime      # UTC

Fire-and-forget rationale: The propose_handoff node streams a response to the visitor
before dispatch_handoff completes. Awaiting delivery confirmation would block the response
stream and add 1–10s of latency (including retry wait time) to the visitor’s perceived
response time. The delivery outcome does not affect what the visitor sees.

`is_business_hours` — Business Hours Detection Module

def is_business_hours(at: datetime | None = None) -> bool:
    """
    Returns True if the given UTC datetime falls within Company business hours:
    Monday–Friday, 09:00–18:00 CET/CEST.

    Uses IANA timezone identifier 'Europe/Madrid' via Python zoneinfo.
    DST transitions are handled automatically.

    If `at` is None, uses datetime.now(UTC).
    No public holiday awareness in v1 — documented as a known limitation (EC-04).
    """
    ...

Called by: propose_handoff node to determine which proposal template to use
(in-hours direct connection offer vs. outside-hours capture flow — Section 3).

Implementation note (EC-04): The function uses zoneinfo.ZoneInfo("Europe/Madrid")
and Python’s standard datetime library for DST-aware conversion. It does not use a
fixed UTC offset. Public holiday awareness is not implemented in v1 — the service
operates as if every weekday is a working day. This is a known limitation, documented
in Section 11.

`emit_event` — Analytics Event Interface

async def emit_event(event: AnalyticsEvent) -> None:
    """
    Emits a backend analytics event to the analytics pipeline.
    Called by the write_state node at the end of each turn.
    Fire-and-forget — analytics failures do not affect the session.
    """
    ...

Backend analytics events (complement to the client-side events defined in Section 3):

Event name	Trigger	Backend fields
`qualification_state_changed`	A `QualificationState` dimension changed level on this turn	`session_id`, `dimension`, `from_level`, `to_level`, `signal_type`, `turn_index`, `timestamp`
`handoff_dispatched`	`dispatch_handoff` called	`session_id`, `handoff_reason`, `lead_level`, `business_hours`, `timestamp`
`handoff_delivered`	Both channels confirmed	`session_id`, `slack_ok`, `crm_ok`, `timestamp`
`handoff_partial_failure`	One channel failed after retries	`session_id`, `failed_channel`, `timestamp`
`handoff_total_failure`	Both channels failed	`session_id`, `timestamp`
`rag_retrieved`	`retrieve_knowledge` returned results above threshold	`session_id`, `query_length`, `chunks_returned`, `top_score`, `turn_index`, `timestamp`
`rag_no_result`	`retrieve_knowledge` returned no results above threshold	`session_id`, `turn_index`, `timestamp`
`prompt_compliance_violation`	LLM generated a Stage 3 proposal outside `propose_handoff`	`session_id`, `turn_index`, `timestamp`

Client-side analytics events (fired by the widget) are specified in Section 3 —
Analytics Events. Backend events defined here are complementary and fired server-side.

SSE Format Agreement (Blocker for Phase 2)

ADR-005 explicitly leaves the SSE event format as an open item to be agreed between
frontend and backend engineers before Phase 2 integration begins. Previous section defines
this format. The items below must be confirmed by both teams before Phase 2 starts:

Item	Specification (this document)	Action required
SSE event format	`data: <json>\n\n` — one JSON object per event, no `id:` or `event:` fields	Confirm with frontend
Event types	`token`, `done`, `error`	Confirm with frontend
`done` event fields	`lead_level`, `current_stage`, `stage3_proposal_issued`, `handoff_reason`, `turn_count`	Confirm widget analytics event mapping (Section 3)
LangGraph `astream_events` → SSE mapping	`on_chat_model_stream` events → `token` events; graph completion → `done` event	Confirm with backend implementer
Error event scope	Turn-level only; does not activate widget fallback	Confirm with frontend
HTTP 503 activates fallback	Yes — pre-stream HTTP errors trigger fallback state	Confirm with frontend

This agreement must be documented as a comment block in the assistant-ui runtime adapter implementation.

Engineering concerns resolved by this section:
— EC-07: No fallback form endpoint exists in this system. The fallback is a widget UI state linking to fallback-url. The submission path is entirely independent of the AI backend.

API Specifications

Chat Endpoint

Request

Response

token

done

error

HTTP Error Responses (pre-stream)

GDPR and PII at the API layer