# Blob Backend — Companion Mode POC

> **Branch:** `feat/companion-mode`
> **Review Date:** May 2026
> **Architecture:** v4 — FastAPI + WebSocket + Redis + Firebase + Gemini

---

## Table of Contents

1. [Sprint Goal & Success Criteria](#1-sprint-goal--success-criteria)
2. [Architecture Overview](#2-architecture-overview)
3. [Tech Stack](#3-tech-stack)
4. [Data Flows & Workflows](#4-data-flows--workflows)
5. [REST API Endpoints](#5-rest-api-endpoints)
6. [WebSocket Connection & Message Contract](#6-websocket-connection--message-contract)
7. [Dev Environment Configuration](#7-dev-environment-configuration)
8. [GitHub Secrets & Environments](#8-github-secrets--environments)
9. [Docker Reference](#9-docker-reference)
10. [Run Locally](#10-run-locally)
11. [VM Initial Setup](#11-vm-initial-setup)
12. [CI/CD Deployment Pipeline](#12-cicd-deployment-pipeline)
13. [Testing & Debugging in VM](#13-testing--debugging-in-vm)

---

## 1. Sprint Goal & Success Criteria

### Sprint Goal

Build a **stable Companion Mode POC** that supports:

| # | Requirement |
|---|-------------|
| 1 | Parent logs in **once** — child never logs in |
| 2 | Continuous free-form conversation between Blob and child |
| 3 | Pause/Resume support via WebSocket messages |
| 4 | Session continuity across WebSocket disconnects |
| 5 | Conversation continuity across app close/reopen |
| 6 | WebSocket-first real-time audio communication |
| 7 | STT → LLM → Client text pipeline |
| 8 | JWT refresh **without** forcing parent re-login |
| 9 | App reopen within 5 minutes → same session continues |

### ✅ Success Criteria

> **A child can open the app → talk to Blob → close the app → reopen later → continue the same conversation — without requiring parent re-authentication.**

### What Companion Mode Is NOT

- Not a lesson execution engine
- Not a quiz runner or progress tracker
- Not a scored experience
- No lesson curriculum enforcement

### Critical Bugs Fixed in This Sprint

| Bug | Severity | Fix |
|-----|----------|-----|
| JWT `children` claim from stale Firebase custom claims → all child endpoints return 403 | CRITICAL | Claims now built from Firestore `family_links` on every token exchange |
| `session.stop()` deletes Redis state instead of suspending | CRITICAL | Replaced with `SessionRepository.suspend()` |
| Heartbeat pong handler never wired → sessions die after 60s | HIGH | `pong` handler registered in `MessageRouter` |
| REST session and WebSocket session disconnected (new UUID each WS connect) | HIGH | WebSocket accepts `session_id` param; loads from Redis |
| Conversation not reloaded on reconnect | HIGH | `_reload_conversation()` called on WS reconnect |

---

## 2. Architecture Overview

### High-Level Diagram

```
┌─────────────────────────────────────────────────────────────────────┐
│  Flutter Client                                                      │
│  (Parent's device)                                                   │
└───────────────────┬─────────────────────┬───────────────────────────┘
                    │ REST API             │ WebSocket (binary PCM)
                    ▼                     ▼
┌─────────────────────────────────────────────────────────────────────┐
│  FastAPI Backend  (Nginx :80/:443 → port 8000)                      │
│                                                                      │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────────────────┐│
│  │ Auth Router  │   │ REST Routers │   │   WebSocket /ws/voice    ││
│  │ /auth/*      │   │ /api/v1/*    │   │   (VoiceSession)         ││
│  └──────────────┘   └──────────────┘   └──────────┬───────────────┘│
│                                                    │                 │
│                                         ┌──────────▼───────────┐   │
│                                         │  Pipeline Workers     │   │
│                                         │  ┌────┐ ┌────┐ ┌────┐│   │
│                                         │  │STT │ │LLM │ │TTS ││   │
│                                         │  │wrkr│ │wrkr│ │wrkr││   │
│                                         │  └────┘ └────┘ └────┘│   │
│                                         └──────────────────────┘   │
└────────────────────┬────────────────────────────────────────────────┘
                     │
          ┌──────────┴──────────┐
          ▼                     ▼
  ┌──────────────┐    ┌──────────────────┐
  │    Redis     │    │    Firestore     │
  │  (hot state) │    │   (cold state)   │
  │  session TTL │    │  users, children │
  │  1h active   │    │  family_links    │
  │  5m suspended│    │  lesson_progress │
  │  24h conv    │    └──────────────────┘
  └──────────────┘
          │                     │
          └──────────┬──────────┘
                     ▼
          ┌──────────────────┐
          │   Google Gemini  │    ← LLM (gemini-3.1-flash-lite-preview)
          │   faster-whisper │    ← STT (local, CPU, int8, base model)
          │   Edge TTS       │    ← TTS (server sends text; client synthesises)
          └──────────────────┘
```

### Redis Key Schema

| Key Pattern | TTL | Content |
|-------------|-----|---------|
| `session:{session_id}` | 1 hour | Full session JSON (user_id, child_id, state, conv_id, etc.) |
| `suspended:{session_id}` | 5 minutes | Session JSON with `state: suspended` + pending events |
| `child_session:{child_id}` | 1 hour | Pointer → active `session_id` for the child |
| `conversation:{conv_id}` | 24 hours | `ConversationSnapshot` JSON (all turns + `_owner_user_id`) |
| `idempotency:{key}` | 1 hour | Idempotency record for dedup |
| `rl:{user_id}` | 1 minute | Rate-limit counter |
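
The key patterns and TTLs in the table can be kept in one place so that readers and writers never drift apart. The following is a minimal sketch; the helper and constant names are hypothetical and not taken from the codebase.

```python
# Hypothetical helpers mirroring the Redis key schema table above.
SESSION_TTL_S = 3600         # session:{id} and child_session:{id}
SUSPENDED_TTL_S = 300        # suspended:{id}
CONVERSATION_TTL_S = 86400   # conversation:{id}


def session_key(session_id: str) -> str:
    """Key holding the full session JSON while the session is active."""
    return f"session:{session_id}"


def suspended_key(session_id: str) -> str:
    """Key holding the session JSON during the reconnect window."""
    return f"suspended:{session_id}"


def child_pointer_key(child_id: str) -> str:
    """Pointer from a child to that child's active session_id."""
    return f"child_session:{child_id}"


def conversation_key(conv_id: str) -> str:
    """Key holding the ConversationSnapshot JSON."""
    return f"conversation:{conv_id}"
```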

### Firestore Collections

| Collection | Purpose |
|------------|---------|
| `users/{uid}` | Parent user profile |
| `users/{uid}/family_links/{child_id}` | Parent→child permission mapping |
| `children/{child_id}` | Child profile (name, grade_level, current_level) |
| `lesson_progress/{child_id}_{lesson_id}` | Per-child lesson progress |
| `sessions/{session_id}` | Archived (ended) sessions |

---

## 3. Tech Stack

| Layer | Technology | Notes |
|-------|-----------|-------|
| Web Framework | FastAPI (Python 3.12) | Async/await throughout |
| Real-time | WebSocket via Starlette | Binary + JSON frames |
| Auth (Firebase) | Firebase Admin SDK (RS256) | ID token verification only |
| Auth (Backend) | HS256 JWT | `BACKEND_JWT_SECRET`, 24h TTL for POC |
| Hot State | Redis 7 (asyncio client) | Sessions, conversations, rate limits |
| Cold State | Firestore (`gyfsoo-dev` project) | Users, children, progress |
| STT | faster-whisper (local, CPU, int8) | `base` model, min 1.0s utterance |
| LLM | Google Gemini API | `gemini-3.1-flash-lite-preview`, 10s timeout |
| TTS | Client-side | Server sends text + `tts_hint`; Flutter synthesises |
| Event Loop | uvloop | Higher performance than asyncio default |
| Container | Docker (multi-stage) + Docker Compose | Python 3.12-slim image |
| Reverse Proxy | Nginx | Port 80/443 → localhost:8000 |
| Registry | GitHub Container Registry (GHCR) | `ghcr.io/gyfsoo/blob-backend/backend` |

---

## 4. Data Flows & Workflows

### 4.1 Parent Authentication Flow (First Launch)

```
1. Parent opens app (no stored JWT)
2. Firebase Auth: email/password or social sign-in
   └─ Firebase returns Firebase ID Token (RS256, short-lived)

3. POST /api/v1/auth/token-exchange
   Body: { firebase_token: "<Firebase ID token>" }
   ──────────────────────────────────────────────►
   Server:
     a. verify_id_token(firebase_token)         ← Firebase Admin SDK
     b. Get or create user in Firestore users/{uid}
     c. update_last_login(uid)
     d. Load family_links from Firestore         ← NEVER from Firebase custom claims
     e. Build children claims: { child_id: permission }
     f. Issue Backend JWT (HS256, 24h TTL for POC)
   ◄──────────────────────────────────────────────
   Response: {
     success: true,
     data: {
       token: "<Backend JWT>",
       user: { uid, email, display_name, role },
       children: [{ child_id, name, grade_level, permission }]
     }
   }

4. Client stores Backend JWT securely (e.g., secure storage)
```

### 4.2 JWT Refresh Flow (Transparent to Child)

```
1. Backend JWT expires (24h for POC; 1h in production)
2. Next REST call returns 401 TOKEN_EXPIRED
3. Client: Firebase getIdToken(forceRefresh: true)
4. POST /api/v1/auth/refresh
   Body: { firebase_token: "<fresh Firebase token>" }
   Server issues a new Backend JWT — NO session state is touched.
5. Client replaces stored JWT and retries the REST call.

NOTE: The WebSocket connection is NOT affected. The WS is authenticated at
      handshake time; the JWT is NOT re-verified on each message.
```
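
The retry-once behaviour in steps 2 to 5 can be sketched as follows. The real client is Flutter; this Python version only illustrates the control flow, and `TokenExpired`, `call_api`, and `refresh_jwt` are hypothetical stand-ins for the 401 response, the REST call, and the `/api/v1/auth/refresh` round trip.

```python
from typing import Callable, Tuple


class TokenExpired(Exception):
    """Stands in for a 401 TOKEN_EXPIRED response (hypothetical)."""


def call_with_refresh(call_api: Callable[[str], dict],
                      refresh_jwt: Callable[[], str],
                      jwt: str) -> Tuple[dict, str]:
    """Call the API; on expiry, refresh the JWT once and retry.

    Returns the response and the JWT that should now be stored.
    """
    try:
        return call_api(jwt), jwt
    except TokenExpired:
        new_jwt = refresh_jwt()  # no session state is touched here
        return call_api(new_jwt), new_jwt
```

A second `TokenExpired` from the retry is deliberately not caught, so a broken refresh surfaces as an error instead of looping.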

### 4.3 Session Creation Flow

```
POST /api/v1/sessions
Body: { child_id: "child_xxx", mode: "companion" }
Authorization: Bearer <Backend JWT>
──────────────────────────────────────────────────────►
Server:
  a. Verify JWT (HS256)
  b. Check permission: UserRepository.get_family_link(uid, child_id)
     └─ NEVER from JWT children claims (those can be stale)
  c. find_active_for_child(child_id):
     - If active session found     → return existing session
     - If suspended session found  → resume() it → return resumed session
     - If none found               → create new session
  d. Generate conversation_id = "conv_" + uuid.hex[:12]
  e. Store in Redis: session:{sess_id} (1h TTL)
  f. Set child_session:{child_id} pointer
◄──────────────────────────────────────────────────────
Response: {
  session_id: "sess_abc123",
  session_token: "stok_xyz",
  child_id: "child_xxx",
  mode: "companion",
  active_conversation_id: "conv_abc456",
  state: "active" | "suspended",
  ws_url: "wss://api.dev.gyfsoo.com/ws/voice",
  ttl_seconds: 3600,
  reconnect_window_seconds: 300
}
```
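
The create-or-discover decision in steps (c) to (f) might look roughly like this. A plain dict stands in for Redis, TTLs and the permission check from step (b) are omitted, and the function name and `sess_` ID format are assumptions made for the example.

```python
import uuid


def create_or_discover(store: dict, user_id: str, child_id: str) -> dict:
    """Return the child's existing session, resuming it if suspended, else create one."""
    session_id = store.get(f"child_session:{child_id}")
    if session_id:
        active = store.get(f"session:{session_id}")
        if active:
            return active
        suspended = store.pop(f"suspended:{session_id}", None)
        if suspended:
            # Resume: move the record back under the active key.
            suspended["state"] = "active"
            store[f"session:{session_id}"] = suspended
            return suspended
    session = {
        "session_id": f"sess_{uuid.uuid4()}",
        "user_id": user_id,
        "child_id": child_id,
        "mode": "companion",
        "active_conversation_id": "conv_" + uuid.uuid4().hex[:12],
        "state": "active",
    }
    store[f"session:{session['session_id']}"] = session
    store[f"child_session:{child_id}"] = session["session_id"]
    return session
```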

### 4.4 WebSocket Connection & Session Binding

```
WebSocket: wss://api.dev.gyfsoo.com/ws/voice?token=<jwt>&session_id=<sess_id>
────────────────────────────────────────────────────────────────────────────►
Server:
  a. Verify Backend JWT (HS256 signature + expiry)
     - Close WS with code 4001 if expired
  b. Load session from Redis: session:{sess_id} or suspended:{sess_id}
     - Close WS with code 4002 if not found
  c. Verify session.user_id == JWT.sub (anti-hijack)
     - Close WS with code 4003 if mismatch
  d. If suspended: SessionRepository.resume()
  e. Load conversation snapshot from Redis → inject into ConversationSessionManager
  f. Accept WebSocket
  g. Send session event:
     - New session:    { action: "session.created", payload: { session_id } }
     - Resumed session: { action: "session.resumed", payload: { resumed_turns_count: N } }
  h. Start pipeline workers: stt_worker, llm_worker
  i. Start HeartbeatManager (ping every 30s, kill after 2 missed pongs)
◄────────────────────────────────────────────────────────────────────────────
```
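
The order of checks in steps (a) to (c) determines which close code the client sees. A minimal sketch, assuming the JWT signature has already been verified and its claims decoded into a dict with `sub` and `exp`; the function name is hypothetical.

```python
import time
from typing import Optional


def handshake_close_code(claims: dict, session: Optional[dict],
                         now: Optional[float] = None) -> Optional[int]:
    """Return the WebSocket close code to use, or None if the handshake may proceed."""
    current = time.time() if now is None else now
    if claims["exp"] <= current:
        return 4001  # JWT expired at connect time
    if session is None:
        return 4002  # neither session:{id} nor suspended:{id} exists
    if session["user_id"] != claims["sub"]:
        return 4003  # anti-hijack: session belongs to another user
    return None
```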

### 4.5 Audio Pipeline Flow (Real-Time Voice Turn)

```
Client sends binary v4 frames:
  [0xB1][0x04][seq_uint32_le][thread_uuid_16bytes][PCM s16le body]

Server receive_audio():
  1. Parse binary header → extract sequence_index, thread_id
  2. Sequence gap check → warning log if gap detected
  3. validate_pcm_audio() → reject if not 16-bit aligned
  4. Push PCM bytes to audio_queue
  5. Send audio_ack: { sequence, buffer_state: "buffering" }

stt_worker():
  - Session paused? → skip chunk (PAUSE-02)
  - Emit conversation.state = LISTENING
  - Buffer PCM with VAD (energy threshold 500 RMS)
  - On silence (800ms gap) or EOS → trigger transcription:
    a. Emit conversation.state = TRANSCRIBING
    b. pcm_to_wav(pcm_bytes) → bytes with RIFF header
    c. faster-whisper transcribe → { text, confidence, no_speech_prob }
    d. Apply confidence gate (threshold 0.6) → discard low-confidence results
    e. Emit stt_result: { transcript, confidence, language, duration_ms }
    f. Push text to text_queue

llm_worker():
  - Session paused? → skip (PAUSE-02)
  - Emit conversation.state = THINKING
  - Build conversation history (last 8 turns)
  - Gemini API call (10s timeout)
  - Emit conversation.state = RESPONDING
  - Push ai_response to text_out_queue
  - Persist conversation snapshot to Redis
  - Emit conversation.state = IDLE

send_text():
  - Read from text_out_queue
  - Send ai_response envelope to client:
    {
      action: "ai_response",
      payload: { text, tts_hint: { voice, speed, language } },
      meta: { session_id, conversation_id, request_id, thread_id, timestamp }
    }
```
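
The two numeric gates in `stt_worker` (VAD energy and STT confidence) can be illustrated with the standard library. This is a sketch under assumptions: the function names are hypothetical, and whether the real comparisons are inclusive (`>=`) is not stated in this document.

```python
import math
import struct

VAD_ENERGY_THRESHOLD = 500       # RMS, matches VAD_ENERGY_THRESHOLD
STT_CONFIDENCE_THRESHOLD = 0.6   # matches STT_CONFIDENCE_THRESHOLD


def rms_energy(pcm: bytes) -> float:
    """Root-mean-square amplitude of a PCM s16le buffer."""
    if len(pcm) % 2 != 0:
        raise ValueError("PCM s16le must be 16-bit aligned")
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm)
    return math.sqrt(sum(s * s for s in samples) / count)


def is_speech(pcm: bytes) -> bool:
    """Energy-based voice activity check (inclusive comparison is an assumption)."""
    return rms_energy(pcm) >= VAD_ENERGY_THRESHOLD


def passes_confidence_gate(confidence: float) -> bool:
    """Keep a transcript only if its confidence reaches the threshold."""
    return confidence >= STT_CONFIDENCE_THRESHOLD
```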

### 4.6 Pause / Resume Flow

```
Client → Server (WebSocket JSON):
  { type: "request", action: "pause", payload: { session_id: "sess_xxx" } }

Server PauseHandler:
  - Sets session.is_paused = True
  - stt_worker: sees is_paused → skips processing (continues loop, discards audio)
  - llm_worker: sees is_paused → skips processing
  - Incoming audio_stream messages: rejected with a SESSION_PAUSED error
  - Sends session.paused event:
    { action: "session.paused", payload: { session_id } }

Client → Server:
  { type: "request", action: "resume", payload: { session_id: "sess_xxx" } }

Server ResumeHandler:
  - Sets session.is_paused = False
  - Processing resumes on next audio chunk
  - Sends session.resumed event:
    { action: "session.resumed", payload: { session_id } }
```
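
The pause flag and its effect on the workers can be sketched as a small state object. The class and function names are hypothetical; only the `is_paused` flag and the event shapes come from the flow above.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    session_id: str
    is_paused: bool = False
    processed: list = field(default_factory=list)


def handle_control(session: SessionState, action: str) -> dict:
    """Apply a pause/resume request and return the event to send to the client."""
    if action == "pause":
        session.is_paused = True
        return {"action": "session.paused", "payload": {"session_id": session.session_id}}
    if action == "resume":
        session.is_paused = False
        return {"action": "session.resumed", "payload": {"session_id": session.session_id}}
    raise ValueError(f"unsupported action: {action}")


def stt_step(session: SessionState, chunk: bytes) -> bool:
    """One worker iteration: discard the chunk while paused, otherwise process it."""
    if session.is_paused:
        return False  # loop continues, audio is dropped
    session.processed.append(chunk)
    return True
```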

### 4.7 App Close & Reconnect Flow

```
[App close / WS disconnect]
  Server detects WebSocketDisconnect
  VoiceSession.stop(reason="disconnect"):
    1. Persist all conversation snapshots to Redis (24h TTL)
       └─ Each snapshot includes _owner_user_id for REST API ownership check
    2. SessionRepository.suspend(session_id)
       └─ Moves: session:{id} → suspended:{id} (5-minute TTL)
    3. Heartbeat stopped; pipeline workers cancelled

[0 – 5 minutes later: App reopens]
  POST /api/v1/sessions { child_id }
  → find_active_for_child → finds suspended:{id}
  → SessionRepository.resume() → restores session:{id} with full 1h TTL
  → Returns same session_id + conversation_id

  WebSocket: /ws/voice?token=<jwt>&session_id=<same_session_id>
  → _reload_conversation() loads snapshot from conversation:{conv_id}
  → ConversationSessionManager has all prior turns
  → Sends session.resumed { resumed_turns_count: N }
  → Blob continues with full context

[> 5 minutes later: Reconnect window expired]
  Suspended session TTL elapsed; suspended:{id} gone
  POST /api/v1/sessions { child_id }
  → No active or suspended session found
  → Creates new session
  → conversation:{conv_id} may still be in Redis (24h TTL)
  → If client provides conversation_id in request → can reattach context
  → Or: child starts a fresh conversation (Blob has no memory of prior session)
```
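
The 5-minute reconnect window follows directly from moving the record between two keys with different TTLs. Below is a sketch using an in-memory store with an injectable clock in place of Redis; `TtlStore`, `suspend`, and `resume` are illustrative names, not the actual `SessionRepository` API.

```python
class TtlStore:
    """Tiny in-memory stand-in for Redis set-with-expiry, get, and delete."""

    def __init__(self, clock):
        self._clock = clock
        self._data = {}

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= self._clock():
            self._data.pop(key, None)  # expired entries behave as missing
            return None
        return entry[0]

    def delete(self, key):
        self._data.pop(key, None)


def suspend(store, session_id, window=300):
    """Move session:{id} to suspended:{id} with the reconnect-window TTL."""
    session = store.get(f"session:{session_id}")
    if session is None:
        return False
    store.delete(f"session:{session_id}")
    store.set(f"suspended:{session_id}", {**session, "state": "suspended"}, window)
    return True


def resume(store, session_id, ttl=3600):
    """Restore suspended:{id} to session:{id} with a full TTL, if still present."""
    session = store.get(f"suspended:{session_id}")
    if session is None:
        return None
    store.delete(f"suspended:{session_id}")
    restored = {**session, "state": "active"}
    store.set(f"session:{session_id}", restored, ttl)
    return restored
```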

### 4.8 GET /users/me — App Bootstrap Flow

```
On app launch (after JWT obtained):

GET /api/v1/users/me
Authorization: Bearer <Backend JWT>
──────────────────────────────────────────────────────►
Server (parallel fetches):
  a. UserRepository.get_by_uid(uid) → user profile
  b. UserRepository.get_family_links(uid) → list of children
  c. ChildRepository.list_by_ids(child_ids) → child profiles (batched)
  d. SessionRepository.find_active_for_child(cid) for each child (parallel)
◄──────────────────────────────────────────────────────
Response: {
  success: true,
  data: {
    user: {
      uid, email, display_name, role, is_active, created_at, last_login_at
    },
    children: [
      { child_id, name, grade_level, current_level, permission }
    ],
    active_child: { child_id } | null,
    active_session: {
      session_id, status, child_id, conversation_id,
      started_at, last_activity_at, ws_required: true
    } | null,
    feature_flags: {
      companion_mode: true,
      biometric_enabled: false,
      lesson_mode_enabled: false
    }
  }
}
```
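
The parallel fetches can be expressed with `asyncio.gather`. In this sketch the four repository calls are passed in as async callables so that the example stays self-contained; the function name and the way permissions are merged into child profiles are assumptions.

```python
import asyncio


async def build_bootstrap(uid, get_user, get_family_links,
                          list_children, find_active_session):
    """Assemble the /users/me payload with two rounds of concurrent fetches."""
    # Round 1: user profile and family links do not depend on each other.
    user, links = await asyncio.gather(get_user(uid), get_family_links(uid))
    child_ids = [link["child_id"] for link in links]

    # Round 2: batched child profiles plus one session lookup per child.
    profiles, sessions = await asyncio.gather(
        list_children(child_ids),
        asyncio.gather(*(find_active_session(cid) for cid in child_ids)),
    )

    permission_by_child = {link["child_id"]: link["permission"] for link in links}
    children = [
        {**profile, "permission": permission_by_child[profile["child_id"]]}
        for profile in profiles
    ]
    active = next((s for s in sessions if s is not None), None)
    return {
        "user": user,
        "children": children,
        "active_child": {"child_id": active["child_id"]} if active else None,
        "active_session": active,
    }
```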

---

## 5. REST API Endpoints

**Base URL:**
- Local: `http://localhost:8000`
- Dev VM: `https://api.dev.gyfsoo.com`

All responses use the v4 envelope:
```json
{ "success": true, "data": {}, "meta": { "request_id": "...", "timestamp": "...", "version": "4.0" } }
```
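
A helper that wraps handler results in this envelope could look like the following sketch. The function name is hypothetical, and generating `request_id` as a UUID is an assumption.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def envelope(data: dict, request_id: Optional[str] = None) -> dict:
    """Wrap a payload in the v4 response envelope."""
    return {
        "success": True,
        "data": data,
        "meta": {
            "request_id": request_id or str(uuid.uuid4()),
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "version": "4.0",
        },
    }
```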

### Authentication

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| `POST` | `/api/v1/auth/token-exchange` | Firebase ID token in body | Exchange Firebase token for Backend JWT |
| `POST` | `/api/v1/auth/refresh` | Firebase ID token in body | Refresh Backend JWT (session unaffected) |

**POST /api/v1/auth/token-exchange**
```json
Request:  { "firebase_token": "<Firebase ID token>" }
Response: {
  "token": "<Backend JWT>",
  "user": { "uid": "...", "email": "...", "display_name": "...", "role": "parent" },
  "children": [{ "child_id": "...", "name": "...", "grade_level": "...", "permission": "manage" }]
}
```

### Users

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| `GET` | `/api/v1/users/me` | Backend JWT | Enriched profile: user + children + active session + feature flags |
| `PATCH` | `/api/v1/users/me` | Backend JWT | Update display_name, locale, timezone |

### Children

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| `POST` | `/api/v1/children` | Backend JWT | Create a child profile |
| `GET` | `/api/v1/children` | Backend JWT | List all children the user has access to |
| `GET` | `/api/v1/children/{child_id}` | Backend JWT | Enriched: child + latest_progress + active_session |
| `PATCH` | `/api/v1/children/{child_id}` | Backend JWT (manage) | Update child profile |
| `DELETE` | `/api/v1/children/{child_id}` | Backend JWT (manage) | Soft-delete child |
| `POST` | `/api/v1/children/{child_id}/permissions` | Backend JWT (manage) | Grant another user access |

**GET /api/v1/children/{child_id} — Response:**
```json
{
  "child": { "child_id": "...", "name": "...", "grade_level": "pre_k", "current_level": "beginner" },
  "latest_progress": {
    "mastery_score": 0.72,
    "lessons_completed": 3,
    "last_lesson_id": "lesson_abc",
    "last_accessed": "2026-05-24T10:00:00Z"
  },
  "active_session": { "session_id": "sess_...", "conversation_id": "conv_...", "state": "active" }
}
```

**Permission Hierarchy:** `manage > interact > view`
All permission checks use Firestore `family_links` — never stale JWT children claims.
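
The hierarchy can be checked by ranking the three levels. A minimal sketch with hypothetical names; `granted` is the permission read from the Firestore `family_links` entry, or `None` when no link exists.

```python
from typing import Optional

# Higher rank implies every lower permission: manage > interact > view.
PERMISSION_RANK = {"view": 1, "interact": 2, "manage": 3}


def has_permission(granted: Optional[str], required: str) -> bool:
    """True if the granted permission is at least the required one."""
    if granted is None:
        return False
    return PERMISSION_RANK[granted] >= PERMISSION_RANK[required]
```

For example, an endpoint marked `view+` would call `has_permission(link_permission, "view")`, while one marked `manage` would pass `"manage"`.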

### Sessions

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| `POST` | `/api/v1/sessions` | Backend JWT | Create or discover session for a child |
| `GET` | `/api/v1/sessions/{session_id}` | Backend JWT (owner) | Get session snapshot |
| `POST` | `/api/v1/sessions/{session_id}/end` | Backend JWT (owner) | Gracefully end session |
| `POST` | `/api/v1/sessions/{session_id}/pause` | Backend JWT (owner) | Pause session |
| `POST` | `/api/v1/sessions/{session_id}/resume` | Backend JWT (owner) | Resume session |

**POST /api/v1/sessions — Request:**
```json
{ "child_id": "child_xxx", "mode": "companion" }
```

**POST /api/v1/sessions — Response:**
```json
{
  "session_id": "sess_abc123-def4-5678",
  "session_token": "stok_abc123",
  "child_id": "child_xxx",
  "mode": "companion",
  "active_conversation_id": "conv_abc123456789",
  "state": "active",
  "ttl_seconds": 3600,
  "reconnect_window_seconds": 300,
  "ws_url": "wss://api.dev.gyfsoo.com/ws/voice",
  "created_at": "2026-05-24T10:00:00Z",
  "expires_at": "2026-05-24T11:00:00Z"
}
```

### Conversations

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| `GET` | `/api/v1/conversations/{conversation_id}` | Backend JWT (owner) | Get conversation snapshot |

**GET /api/v1/conversations/{conversation_id} — Response:**
```json
{
  "conversation_id": "conv_abc123456789",
  "session_id": "sess_abc123-def4-5678",
  "status": "idle",
  "message_count": 10,
  "turn_count": 5,
  "last_activity_at": "2026-05-24T10:30:00Z",
  "created_at": "2026-05-24T10:00:00Z",
  "summary": "5 exchange(s) — started with: \"tell me a story\" / last response: \"once upon a time...\""
}
```

### Progress

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| `GET` | `/api/v1/children/{child_id}/progress` | Backend JWT (view+) | All lesson progress for a child |
| `GET` | `/api/v1/children/{child_id}/progress/{lesson_id}` | Backend JWT (view+) | Progress for specific lesson |

### Lessons

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| `GET` | `/api/v1/lessons` | Backend JWT | List lessons (filter: subject, grade, difficulty) |
| `GET` | `/api/v1/lessons/{lesson_id}` | Backend JWT | Get lesson detail with steps |

### Health

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| `GET` | `/health` | None | Basic health check |
| `GET` | `/health/services` | None | Detailed: Firebase, Firestore, Redis, STT, LLM status |

---

## 6. WebSocket Connection & Message Contract

### Connection

```
URL: ws[s]://<host>/ws/voice?token=<Backend JWT>&session_id=<session_id>
```

### Close Codes

| Code | Meaning |
|------|---------|
| `1000` | Normal closure (session ended cleanly) |
| `1001` | Server going away (shutdown) |
| `1008` | Policy violation (auth failed) |
| `4001` | JWT expired at connect time |
| `4002` | Session not found in Redis |
| `4003` | Permission denied (user_id mismatch) |
| `4429` | Rate limited (25 WS messages/second) |

### Binary Audio Frame (v4 — Primary Path)

```
Byte  [0]:     Magic byte   = 0xB1
Byte  [1]:     Version      = 0x04
Bytes [2–5]:   sequence_index (uint32, little-endian)
Bytes [6–21]:  thread_id (UUID, 16 bytes, big-endian)
Bytes [22+]:   PCM s16le audio body (16 kHz, mono, 16-bit signed)
```
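
The 22-byte header maps cleanly onto a `struct` format string. The sketch below packs and parses a frame according to the layout above; the function names are hypothetical, and the error handling is only one possible choice.

```python
import struct
import uuid

# "<" = little-endian, no padding: magic (B), version (B),
# sequence_index (I, uint32), thread_id (16 raw bytes). Total: 22 bytes.
HEADER = struct.Struct("<BBI16s")
MAGIC = 0xB1
VERSION = 0x04


def pack_frame(sequence_index: int, thread_id: uuid.UUID, pcm: bytes) -> bytes:
    """Build a v4 binary audio frame. UUID.bytes is already big-endian."""
    return HEADER.pack(MAGIC, VERSION, sequence_index, thread_id.bytes) + pcm


def parse_frame(frame: bytes):
    """Return (sequence_index, thread_id, pcm) or raise ValueError."""
    if len(frame) < HEADER.size:
        raise ValueError("frame shorter than 22-byte header")
    magic, version, sequence_index, raw_thread = HEADER.unpack_from(frame)
    if magic != MAGIC or version != VERSION:
        raise ValueError("not a v4 audio frame")
    pcm = frame[HEADER.size:]
    if len(pcm) % 2 != 0:
        raise ValueError("PCM body is not 16-bit aligned")
    return sequence_index, uuid.UUID(bytes=raw_thread), pcm
```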

### Message Catalog

#### Client → Server

| Action | Type | Description |
|--------|------|-------------|
| `audio_stream` | request | Binary v4 frame or JSON/base64 audio chunk |
| `pause` | request | Pause session — suspend audio processing |
| `resume` | request | Resume session |
| `pong` | response | Heartbeat pong reply to server ping |

#### Server → Client

| Action | Type | Description |
|--------|------|-------------|
| `session.created` | event | New session established |
| `session.paused` | event | Session paused |
| `session.resumed` | event | Session resumed, either on reconnect from suspension (payload: `resumed_turns_count`) or after a pause (payload: `session_id`) |
| `audio_ack` | response | Acknowledge received audio chunk |
| `stt_result` | event | Transcription result (before LLM) |
| `conversation.state` | event | Pipeline state: LISTENING / TRANSCRIBING / THINKING / RESPONDING / IDLE |
| `ai_response` | response | LLM text response + TTS hint |
| `tts_stream` | event | TTS audio chunks (if server-side TTS enabled) |
| `ping` | request | Heartbeat ping (client must reply with pong) |
| `error` | error | Protocol or processing error |

### Key Message Shapes

**ai_response (Server → Client):**
```json
{
  "type": "response",
  "action": "ai_response",
  "payload": {
    "text": "Once upon a time...",
    "tts_hint": { "voice": "en-IN-NeerjaNeural", "speed": 1.0, "language": "en-IN" }
  },
  "meta": {
    "session_id": "sess_abc123",
    "conversation_id": "conv_abc456",
    "request_id": "uuid",
    "thread_id": "uuid",
    "timestamp": "2026-05-24T10:05:02Z"
  }
}
```

**stt_result (Server → Client):**
```json
{
  "type": "event",
  "action": "stt_result",
  "payload": {
    "transcript": "Tell me a story",
    "confidence": 0.94,
    "language": "en",
    "duration_ms": 1800
  }
}
```

**conversation.state (Server → Client):**
```json
{
  "type": "event",
  "action": "conversation.state",
  "payload": { "state": "THINKING", "conversation_id": "conv_abc456" }
}
```

**audio_ack (Server → Client):**
```json
{
  "type": "response",
  "action": "audio_ack",
  "payload": { "sequence": 42, "buffer_state": "buffering" }
}
```

---

## 7. Dev Environment Configuration

### Prerequisites

- Python 3.12
- Docker + Docker Compose plugin
- `gcloud` CLI (for Firebase ADC)
- Redis 7 (via Docker or local)

### Step 1: Firebase Application Default Credentials (ADC)

This is required for Firebase Auth + Firestore to work locally.

```bash
# Authenticate with a Google account that has access to the gyfsoo-dev project
gcloud auth application-default login --project=gyfsoo-dev

# Verify credentials were saved (lists the file without printing its secrets)
ls -l ~/.config/gcloud/application_default_credentials.json
```

The ADC file is automatically mounted into Docker containers via the `docker-compose.yml` volume:
```yaml
volumes:
  - ${HOME}/.config/gcloud:/root/.config/gcloud:ro
```

### Step 2: Environment File

```bash
cp .env.example .env
```

Then edit `.env` with your actual values:

```bash
# ── Critical fields to set ────────────────────────────────────────────
AUTH_PROVIDER=firebase               # "firebase" (hybrid) or "jwt" (legacy)
FIREBASE_PROJECT_ID=gyfsoo-dev

BACKEND_JWT_SECRET=<run: openssl rand -hex 32>
BACKEND_JWT_TTL_SECONDS=86400        # 24h for POC

GOOGLE_API_KEY=<your Gemini API key>
GEMINI_MODEL=gemini-3.1-flash-lite-preview

REDIS_URL=redis://redis:6379/0       # Inside Docker; use redis://localhost:6379/0 when running locally

# ── STT / Audio ───────────────────────────────────────────────────────
STT_MIN_SECONDS=1.0                  # Min utterance length (1.0s for children)
STT_CONFIDENCE_THRESHOLD=0.6
VAD_ENERGY_THRESHOLD=500
VAD_SILENCE_MS=800
SAMPLE_RATE=16000

# ── Protocol flags ────────────────────────────────────────────────────
ENABLE_STANDARD_PROTOCOL=True
LEGACY_BINARY_SUPPORT=True
BYPASSLLM=False

# ── Session ───────────────────────────────────────────────────────────
SESSION_TTL_SECONDS=3600
SESSION_RECONNECT_WINDOW_SECONDS=300

# ── Admin Console ─────────────────────────────────────────────────────
ADMIN_AUTH_MODE=firebase_password
FIREBASE_WEB_API_KEY=<from Firebase Console → Project Settings>
ADMIN_SESSION_SECRET=<run: openssl rand -hex 32>
```

### Key Environment Variables Reference

| Variable | Default | Description |
|----------|---------|-------------|
| `AUTH_PROVIDER` | `jwt` | `firebase` = hybrid Firebase+Backend JWT; `jwt` = legacy only |
| `FIREBASE_PROJECT_ID` | — | GCP project ID (e.g., `gyfsoo-dev`) |
| `FIREBASE_SERVICE_ACCOUNT_PATH` | — | Optional: path to service account JSON key (leave empty for ADC) |
| `BACKEND_JWT_SECRET` | `change-me` | HS256 signing key — must be 32+ chars in production |
| `BACKEND_JWT_TTL_SECONDS` | `3600` | JWT lifetime (set to `86400` for POC) |
| `GOOGLE_API_KEY` | — | Gemini API key |
| `GEMINI_MODEL` | `gemini-2.0-flash` | Gemini model name |
| `REDIS_URL` | `redis://localhost:6379/0` | Redis connection string |
| `STT_MIN_SECONDS` | `2.5` | Min audio duration before STT (use `1.0` for children) |
| `STT_CONFIDENCE_THRESHOLD` | `0.6` | Discard STT results below this confidence |
| `VAD_ENERGY_THRESHOLD` | `500` | RMS energy threshold for voice activity |
| `VAD_SILENCE_MS` | `800` | Silence duration before triggering STT |
| `ENABLE_STANDARD_PROTOCOL` | `False` | Enable v4 message envelope for WS messages |
| `LEGACY_BINARY_SUPPORT` | `True` | Accept legacy binary audio frames |
| `BYPASSLLM` | `False` | If `True`, return a static joke instead of calling Gemini |
| `SESSION_TTL_SECONDS` | `3600` | Active session Redis TTL |
| `SESSION_RECONNECT_WINDOW_SECONDS` | `300` | Suspended session Redis TTL (5-minute reconnect window) |
| `ML_SERVICE_URL` | — | Leave empty to use local STT+LLM pipeline (Channel B disabled) |

---

## 8. GitHub Secrets & Environments

### Repository Secrets (Settings → Secrets and variables → Actions)

| Secret | Description |
|--------|-------------|
| `GOOGLE_API_KEY` | Gemini API key (used in CI tests) |
| `GHCR_PAT` | GitHub Personal Access Token with `write:packages` scope |
| `DEV_VM_USER` | SSH username on the dev VM (e.g., `ubuntu` or `sreelesh`) |
| `DEV_VM_SSH_KEY` | Private SSH key corresponding to the key added by `vm-setup.sh` |

### Repository Variables (Settings → Secrets and variables → Actions → Variables)

| Variable | Description |
|----------|-------------|
| `DEV_VM_IP` | External IP of the dev VM (e.g., `34.170.250.232`) |

### GitHub Environment: `dev`

Navigate to **Settings → Environments → dev** and add:
- **URL:** `https://api.dev.gyfsoo.com/health`
- **Required reviewers:** (optional — remove for auto-deploy on push to `dev`)

### Secret Generation Commands

```bash
# Generate BACKEND_JWT_SECRET
openssl rand -hex 32

# Generate ADMIN_SESSION_SECRET
openssl rand -hex 32

# Create GHCR_PAT: GitHub → Settings → Developer settings → Personal access tokens → Tokens (classic)
# Scopes needed: read:packages, write:packages, delete:packages
```

---

## 9. Docker Reference

### Image

```
ghcr.io/gyfsoo/blob-backend/backend:<tag>
```

Tags:
- `dev-latest` — latest `dev` branch build
- `sha-<7char>` — per-commit SHA tag
- `v1.2.3` — production release tags

### Docker Compose Files

| File | Use |
|------|-----|
| `docker-compose.yml` | Local development (binds to all interfaces, mounts source code) |
| `docker-compose.dev.yml` | Dev VM (uses pre-built GHCR image, binds only to `127.0.0.1`) |
| `docker-compose.prod.yml` | Production (uses release-tagged image) |

### Common Docker Commands

```bash
# ── Local development ─────────────────────────────────────────────────

# Start all services (backend + redis)
docker compose up --build

# Start in background
docker compose up -d --build

# View logs (follow)
docker compose logs -f backend

# View Redis logs
docker compose logs -f redis

# Stop all
docker compose down

# Stop and remove volumes (wipes Redis data)
docker compose down -v

# Rebuild backend image only
docker compose build backend

# ── Dev VM (docker-compose.dev.yml) ──────────────────────────────────

# Start (uses pre-built GHCR image)
IMAGE_TAG=dev-latest docker compose -f docker-compose.dev.yml up -d

# Pull latest image and restart
IMAGE_TAG=dev-latest docker compose -f docker-compose.dev.yml pull && \
  IMAGE_TAG=dev-latest docker compose -f docker-compose.dev.yml up -d

# View live logs
docker compose -f docker-compose.dev.yml logs -f backend

# Show running containers
docker compose -f docker-compose.dev.yml ps

# Restart backend only (keeps Redis running)
docker compose -f docker-compose.dev.yml restart backend

# ── Exec into running container ───────────────────────────────────────

# Open a shell
docker exec -it blob-backend-backend-1 bash

# List a directory
docker compose -f docker-compose.dev.yml exec backend ls -la /root/.config/gcloud/

# Inspect ADC credentials file
docker exec -it blob-backend-backend-1 cat /root/.config/gcloud/application_default_credentials.json

# Check that a Python package is installed
docker exec -it blob-backend-backend-1 pip show faster-whisper

# Run a one-off script inside the container
docker compose -f docker-compose.dev.yml exec backend python scripts/reseed_family_mode.py

# ── Redis CLI ─────────────────────────────────────────────────────────

# Open Redis CLI
docker compose exec redis redis-cli

# List all session keys
docker exec -it blob-backend-redis-1 redis-cli KEYS "session:*"

# Inspect a session
docker exec -it blob-backend-redis-1 redis-cli GET "session:<sess_id>"

# List conversation keys
docker exec -it blob-backend-redis-1 redis-cli KEYS "conversation:*"

# Check TTL of a key
docker exec -it blob-backend-redis-1 redis-cli TTL "conversation:<conv_id>"

# Flush all Redis data (⚠ destructive)
docker exec -it blob-backend-redis-1 redis-cli FLUSHALL

# ── Image management ──────────────────────────────────────────────────

# List all blob images
docker images | grep blob-backend

# Remove dangling images
docker image prune -f

# Log into GHCR
echo "<GHCR_PAT>" | docker login ghcr.io -u <github_username> --password-stdin

# Pull specific tag
docker pull ghcr.io/gyfsoo/blob-backend/backend:dev-latest
```

### Dockerfile Summary

The Dockerfile uses a **two-stage build** to keep the final image small:
- **Stage 1 (builder):** `python:3.12-slim` — installs all pip packages
- **Stage 2 (runtime):** `python:3.12-slim` — copies only site-packages + app code; installs `libgomp1` for faster-whisper

Key env vars set in the image:
```
PYTHONUNBUFFERED=1
PYTHONDONTWRITEBYTECODE=1
PYTHONPATH=/app
PYTHONNOUSERSITE=1   ← prevents split-site-packages issues
```

---

## 10. Run Locally

### Option A: Python directly (fastest for development)

```bash
# 1. Clone the repo
git clone git@github.com:GYFSOO/blob-backend.git
cd blob-backend
git checkout feat/companion-mode

# 2. Create virtual environment
python3.12 -m venv .venv
source .venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Set up environment
cp .env.example .env
# Edit .env with real values (see Section 7)

# 5. Start Redis (via Docker)
docker run -d -p 6379:6379 --name blob-redis redis:7-alpine

# 6. Create Application Default Credentials (ADC) for Firebase / Firestore access
gcloud auth application-default login --project=gyfsoo-dev

# 7. Run the backend
export PYTHONPATH=$PYTHONPATH:.
python3 -m app.main

# App is now running at http://localhost:8000
# API docs: http://localhost:8000/docs
# Health:   http://localhost:8000/health
```

### Option B: Docker Compose (closest to production)

```bash
# 1. Ensure ADC credentials exist
gcloud auth application-default login --project=gyfsoo-dev

# 2. Set up environment
cp .env.example .env
# Edit .env

# 3. Start everything
docker compose up --build

# App: http://localhost:8000
# Redis: localhost:6379
```

### Run Tests

```bash
# All tests (142 tests, ~4s)
python3 -m pytest tests/ -v

# Single test file
python3 -m pytest tests/test_auth.py -v

# With coverage
python3 -m pytest tests/ --cov=app --cov-report=term-missing

# Stop on first failure
python3 -m pytest tests/ -x -q
```

---

## 11. VM Initial Setup

Run this **once** on a fresh GCP VM before the first deployment.

```bash
# SSH into the VM
ssh <user>@<vm_ip>

# Clone the repo (or copy the script)
git clone git@github.com:GYFSOO/blob-backend.git
cd blob-backend

# Make executable and run (defaults to 'dev' environment)
chmod +x scripts/vm-setup.sh
./scripts/vm-setup.sh dev
```

**What `vm-setup.sh` does:**

| Step | Action |
|------|--------|
| 1 | Install system packages: `git`, `curl`, `python3`, `nginx`, `docker-compose-plugin`, `docker.io` |
| 2 | Add current user to `docker` group |
| 3 | Add GitHub Actions SSH public key to `~/.ssh/authorized_keys` |
| 4 | Clone / update the repository to `~/blob-backend` |
| 5 | `chmod +x` all deploy scripts |
| 6 | Configure Nginx: port `80` → `127.0.0.1:8000` (with WebSocket upgrade headers) |
| 7 | Create `.env.dev` from `.env.example` with environment-specific defaults |
| 8 | Print VM external IP and SSH username for GitHub Secrets |

### Post-Setup: Fill in Secrets

```bash
nano ~/blob-backend/.env.dev
```

Required fields:
```bash
GOOGLE_API_KEY=<Gemini API key>
FIREBASE_WEB_API_KEY=<Firebase Console → Project Settings → Web API Key>
BACKEND_JWT_SECRET=<openssl rand -hex 32>
ADMIN_SESSION_SECRET=<openssl rand -hex 32>
```
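
Before the first deployment it can help to confirm that none of these fields was left empty or as a `<...>` placeholder. The sketch below is one way to do that, assuming the file contains simple `KEY=value` lines. The helper names `new_secret` and `missing_required` are illustrative and not part of the repository.

```python
import secrets

# The four fields listed above as required.
REQUIRED = (
    "GOOGLE_API_KEY",
    "FIREBASE_WEB_API_KEY",
    "BACKEND_JWT_SECRET",
    "ADMIN_SESSION_SECRET",
)


def new_secret() -> str:
    """Return 64 hex characters (32 random bytes), like `openssl rand -hex 32`."""
    return secrets.token_hex(32)


def missing_required(env_text: str) -> list[str]:
    """List required keys that are absent, empty, or still a <placeholder>."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return [
        key for key in REQUIRED
        if not values.get(key) or values[key].startswith("<")
    ]
```

Calling `missing_required(open(".env.dev").read())` from `~/blob-backend` should return an empty list once every field is filled in.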

### Set up Firebase ADC on the VM

```bash
# Install gcloud CLI on the VM if not present
curl https://sdk.cloud.google.com | bash
exec -l $SHELL
gcloud init

# Authenticate (opens browser link — copy/paste if headless)
gcloud auth application-default login --project=gyfsoo-dev

# Verify credentials
cat ~/.config/gcloud/application_default_credentials.json
```

### Verify the VM is ready

```bash
# After first deployment (see Section 12):
curl http://localhost/health
curl http://localhost/health/services
```

---

## 12. CI/CD Deployment Pipeline

### Overview

```
┌──────────────┐     push to dev     ┌─────────────────────────────────────┐
│  Developer   │ ─────────────────► │  GitHub Actions: deploy-dev.yml      │
│  (local dev) │                     │                                       │
└──────────────┘                     │  1. Run CI tests (ci.yml)            │
                                     │  2. Build Docker image               │
                                     │  3. Push to GHCR as :dev-latest      │
                                     │  4. SSH to Dev VM                     │
                                     │     git pull origin dev              │
                                     │     ./scripts/deploy.sh dev dev-latest│
                                     └──────────────────┬────────────────────┘
                                                        │
                                                        ▼
                                               Dev VM runs deploy.sh:
                                                 1. docker login ghcr.io
                                                 2. docker compose pull
                                                 3. docker compose up -d
                                                 4. Wait 60s (Whisper model load)
                                                 5. health-check.sh (6 retries × 15s)
                                                 6. ✅ Deployed or ❌ Rollback
```

### Workflow Files

| File | Trigger | Action |
|------|---------|--------|
| `.github/workflows/ci.yml` | Called by other workflows | Run pytest, Docker build smoke test |
| `.github/workflows/pr-checks.yml` | PR to `main` or `dev` | Run CI, lint |
| `.github/workflows/deploy-dev.yml` | Push to `dev` branch | Build + deploy to dev VM |
| `.github/workflows/deploy-production.yml` | Push to `main` or `v*` tag | Build + deploy to production |

### Manual Deployment to Dev VM

```bash
# SSH into VM
ssh <user>@<vm_ip>
cd ~/blob-backend

# Deploy a specific image tag
GHCR_PAT=<token> GHCR_USER=<github_username> \
  ./scripts/deploy.sh dev dev-latest

# Or a specific SHA tag
GHCR_PAT=<token> GHCR_USER=<github_username> \
  ./scripts/deploy.sh dev sha-abc1234
```

### Rollback

```bash
# SSH into VM
ssh <user>@<vm_ip>
cd ~/blob-backend

# Roll back to the automatically recorded previous tag
./scripts/rollback.sh dev

# Or specify an explicit tag
./scripts/rollback.sh dev sha-previouscommit
```

The `deploy.sh` script records the previously running image tag in `/tmp/blob-previous-tag-dev` before updating, enabling instant one-command rollback.
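
The record-then-update pattern can be sketched as follows. This is an illustration of the idea, not the contents of `deploy.sh` or `rollback.sh`: the function names are hypothetical, and the state file path is a parameter so the example can run against a temporary directory.

```python
from pathlib import Path
from typing import Optional


def record_previous_tag(state_file: Path, running_tag: Optional[str]) -> None:
    """Save the tag that is running now, before the new image is pulled."""
    if running_tag:  # nothing to record on the very first deployment
        state_file.write_text(running_tag + "\n")


def rollback_target(state_file: Path, explicit_tag: Optional[str] = None) -> str:
    """Pick the tag to roll back to: an explicit tag wins over the recorded one."""
    if explicit_tag:
        return explicit_tag
    if not state_file.exists():
        raise FileNotFoundError(
            f"no recorded tag in {state_file}; pass a tag explicitly"
        )
    return state_file.read_text().strip()
```

Because the state file lives under `/tmp`, it may be gone after a reboot on systems that clear `/tmp` at boot. Passing an explicit tag covers that case.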

### deploy.sh Step-by-Step

```
1. Verify .env.dev exists on the VM (fails if not)
2. Check for port conflicts (Docker containers on :80, processes on :8000)
3. docker login ghcr.io (using GHCR_PAT)
4. Record current image tag to /tmp/blob-previous-tag-dev
5. docker compose -f docker-compose.dev.yml pull
6. docker compose -f docker-compose.dev.yml up -d --remove-orphans
7. sleep 60 (faster-whisper model load time)
8. ./scripts/health-check.sh localhost 8000 6 15
   - 6 attempts × 15s interval = 90s maximum wait
   - Checks GET /health → { "status": "ok" }
   - On success: cleanup old images (>24h)
   - On failure: auto-rollback to previous tag
```
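
The retry logic in step 8 can be sketched in Python as shown below. It is not the actual `health-check.sh`. `wait_healthy` and its injectable `fetch` and `sleep` parameters are hypothetical names, chosen so the loop can be exercised without a running server. The only details taken from the steps above are the attempt count, the interval, and the expected `{"status": "ok"}` body.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(url: str) -> dict:
    """GET the health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_healthy(
    url: str = "http://localhost:8000/health",
    attempts: int = 6,
    interval: float = 15.0,
    fetch: Callable[[str], dict] = fetch_health,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as /health reports ok, False after all attempts fail."""
    for attempt in range(1, attempts + 1):
        try:
            if fetch(url).get("status") == "ok":
                return True
        except (OSError, ValueError, AttributeError):
            pass  # connection refused, timeout, or an unexpected body: retry
        if attempt < attempts:
            sleep(interval)
    return False
```

This sketch sleeps only between attempts, so its worst case is five intervals plus request timeouts; the real script may count the wait differently.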

---

## 13. Testing & Debugging in VM

### View Live Logs

```bash
# Follow all backend logs
docker compose -f docker-compose.dev.yml logs -f backend

# Show the last 200 lines, then keep following
docker compose -f docker-compose.dev.yml logs --tail=200 -f backend

# Structured JSON logs (if log driver is json-file)
docker logs blob-backend-backend-1 --follow

# Docker's rotating log files (on host)
sudo tail -f /var/lib/docker/containers/<container_id>/<container_id>-json.log
```

### Search / Filter Logs with grep

```bash
# Container name helper
CONTAINER="blob-backend-backend-1"

# ── Auth & Firebase errors ────────────────────────────────────────────
docker logs $CONTAINER 2>&1 | grep -i "firebase\|auth\|403\|forbidden\|token"

# ── Firestore errors ──────────────────────────────────────────────────
docker logs $CONTAINER 2>&1 | grep -i "firestore\|firestore_unavailable\|503"

# ── Session errors ────────────────────────────────────────────────────
docker logs $CONTAINER 2>&1 | grep -i "session\|redis\|suspend\|resume"

# ── WebSocket errors ──────────────────────────────────────────────────
docker logs $CONTAINER 2>&1 | grep -i "websocket\|ws\|disconnect\|close"

# ── STT pipeline ──────────────────────────────────────────────────────
docker logs $CONTAINER 2>&1 | grep -i "stt\|whisper\|transcrib\|confidence"

# ── LLM / Gemini ──────────────────────────────────────────────────────
docker logs $CONTAINER 2>&1 | grep -i "llm\|gemini\|timeout\|bypassllm"

# ── Errors only ───────────────────────────────────────────────────────
docker logs $CONTAINER 2>&1 | grep -i '"level":"error"\|ERROR\|Exception\|Traceback'

# ── Rate limiting hits ────────────────────────────────────────────────
docker logs $CONTAINER 2>&1 | grep -i "rate_limit\|429\|too many"

# ── Specific session_id ───────────────────────────────────────────────
docker logs $CONTAINER 2>&1 | grep "sess_abc123"

# ── Combine filters with AND ──────────────────────────────────────────
docker logs $CONTAINER 2>&1 | grep -i "firestore" | grep -i "error"

# ── Last 100 lines, then live tail with filter ───────────────────────
docker logs $CONTAINER --tail 100 --follow 2>&1 | grep -i "error\|warning"

# ── Export logs to a file for analysis ───────────────────────────────
docker logs $CONTAINER > /tmp/backend-$(date +%Y%m%d-%H%M%S).log 2>&1
```
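
The grep patterns treat the logs as plain text. If the backend emits one JSON object per line (the `"level":"error"` pattern above suggests it may), a small filter can match on fields instead of substrings. The sketch below assumes field names `level` and `session_id`, which are guesses rather than the confirmed log schema, and `filter_logs` is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator, Optional


def filter_logs(
    lines: Iterable[str],
    level: Optional[str] = None,
    session_id: Optional[str] = None,
) -> Iterator[dict]:
    """Yield parsed log records that match every filter that was given."""
    for line in lines:
        try:
            record = json.loads(line)
        except ValueError:
            continue  # skip tracebacks and other non-JSON lines
        if not isinstance(record, dict):
            continue  # valid JSON, but not a log record
        if level and str(record.get("level", "")).lower() != level.lower():
            continue
        if session_id and record.get("session_id") != session_id:
            continue
        yield record
```

Feeding it `sys.stdin` while piping `docker logs $CONTAINER 2>&1` into the script gives a field-aware equivalent of the "errors only" and "specific session_id" filters.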

### Check Firebase / ADC Credentials in Container

```bash
# Verify ADC file is mounted
docker compose -f docker-compose.dev.yml exec backend ls -la /root/.config/gcloud/

# View ADC credentials (⚠ prints secrets). User ADC shows "type": "authorized_user";
# project_id and client_email appear only in service-account key files
docker exec -it blob-backend-backend-1 cat /root/.config/gcloud/application_default_credentials.json

# Check which Firebase project is configured
docker compose -f docker-compose.dev.yml exec backend env | grep FIREBASE

# Test Firebase connection from inside the container
docker exec -it blob-backend-backend-1 python3 -c "
from app.core.firebase import initialize_firebase, get_firestore_client
initialize_firebase()
db = get_firestore_client()
print('Firestore connected:', db is not None)
"
```

### Check Health Endpoints

```bash
# Basic health (from VM host)
curl http://localhost/health

# Detailed health (Firebase, Firestore, Redis status); -s hides curl's progress meter
curl -s http://localhost/health/services | python3 -m json.tool

# From outside (replace with actual domain)
curl -s https://api.dev.gyfsoo.com/health/services | python3 -m json.tool
```

### Inspect Redis State

```bash
# All active sessions
docker exec -it blob-backend-redis-1 redis-cli KEYS "session:*"

# All suspended sessions
docker exec -it blob-backend-redis-1 redis-cli KEYS "suspended:*"

# All conversation snapshots
docker exec -it blob-backend-redis-1 redis-cli KEYS "conversation:*"

# Read session data (pretty print)
SESSION_ID="sess_abc123-def4-5678"
docker exec blob-backend-redis-1 redis-cli GET "session:${SESSION_ID}" | python3 -m json.tool

# Check child→session pointer
CHILD_ID="child_64d8d568a237"
docker exec -it blob-backend-redis-1 redis-cli GET "child_session:${CHILD_ID}"

# TTL of a conversation
CONV_ID="conv_abc123456789"
docker exec -it blob-backend-redis-1 redis-cli TTL "conversation:${CONV_ID}"

# Count active sessions
docker exec -it blob-backend-redis-1 redis-cli SCARD "active_sessions"
```
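
`KEYS` walks the whole keyspace in one blocking call, which is fine on a dev VM with a handful of sessions. On a busier instance, `SCAN` is the gentler option. The sketch below assumes a redis-py style client (its `scan_iter` and `ttl` methods) and the key prefixes shown above; `summarize_keys` is a hypothetical helper, not part of the repository.

```python
from typing import Any, Dict, Iterable


def summarize_keys(
    client: Any, prefixes: Iterable[str], count: int = 100
) -> Dict[str, dict]:
    """Count keys per prefix with SCAN and note how many have no expiry set."""
    summary = {}
    for prefix in prefixes:
        total = no_expiry = 0
        for key in client.scan_iter(match=f"{prefix}:*", count=count):
            total += 1
            if client.ttl(key) == -1:  # -1: key exists but never expires
                no_expiry += 1
        summary[prefix] = {"keys": total, "no_expiry": no_expiry}
    return summary
```

With redis-py installed and the Redis port reachable, `summarize_keys(redis.Redis(host="localhost", port=6379), ["session", "suspended", "conversation"])` gives a quick overview. A non-zero `no_expiry` count for a prefix that is expected to expire may point at a missing TTL.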

### Test the API Manually

```bash
# 1. Token exchange (replace with real Firebase ID token)
TOKEN_RESPONSE=$(curl -sX POST http://localhost:8000/api/v1/auth/token-exchange \
  -H "Content-Type: application/json" \
  -d '{"firebase_token": "<your Firebase ID token>"}')
echo "$TOKEN_RESPONSE" | python3 -m json.tool

# Extract the backend JWT
JWT=$(echo "$TOKEN_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['data']['token'])")

# 2. Get current user profile
curl -s http://localhost:8000/api/v1/users/me \
  -H "Authorization: Bearer $JWT" | python3 -m json.tool

# 3. Create session
SESSION_RESPONSE=$(curl -sX POST http://localhost:8000/api/v1/sessions \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"child_id": "<child_id>", "mode": "companion"}')
echo "$SESSION_RESPONSE" | python3 -m json.tool

SESSION_ID=$(echo "$SESSION_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['data']['session_id'])")

# 4. WebSocket test (requires wscat: npm install -g wscat)
wscat -c "ws://localhost:8000/ws/voice?token=${JWT}&session_id=${SESSION_ID}"
```
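
When scripting this flow instead of typing it, the two fiddly parts are pulling fields out of the response envelope and assembling the WebSocket URL. The sketch below covers both with the standard library only. It assumes the `data.token` / `data.session_id` envelope used by the extraction commands above; `extract` and `voice_ws_url` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode


def extract(response_body: str, field: str) -> str:
    """Pull one field out of the `data` object of a JSON response envelope."""
    return json.loads(response_body)["data"][field]


def voice_ws_url(
    jwt: str, session_id: str, base: str = "ws://localhost:8000"
) -> str:
    """Build the /ws/voice URL with both query parameters percent-encoded."""
    query = urlencode({"token": jwt, "session_id": session_id})
    return f"{base}/ws/voice?{query}"
```

JWTs are normally URL-safe already, so the encoding step is a precaution rather than a fix; it matters mainly if a session ID or token ever contains characters such as `+`, `/`, or `=`.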

### Common Issues and Fixes

| Symptom | Likely Cause | Fix |
|---------|--------------|-----|
| `503 Database temporarily unavailable` | Firestore / ADC not configured | Run `gcloud auth application-default login`, verify mount in docker-compose |
| `401 TOKEN_INVALID` on API calls | Wrong JWT or not using Backend JWT | Use token from `/auth/token-exchange`, not Firebase token |
| `403 FORBIDDEN` on child endpoints | Permission check failed | Verify child was created + family_link exists in Firestore |
| `4002 Session Not Found` on WS | session_id not in Redis or expired | Run `POST /sessions` again to get a fresh session |
| `4001 Token Expired` on WS | Backend JWT expired | Call `/auth/refresh` to get a new JWT |
| Container keeps restarting | faster-whisper model load OOM or crash | Check `docker logs` for Python traceback; ensure VM has ≥ 2 GB RAM |
| Port 8000 conflict | Another process using port 8000 | `ss -tlnp \| grep 8000`, kill the process or change APP_PORT |
| `No speech recognized` in STT | Audio too short or too quiet | Check `STT_MIN_SECONDS=1.0` and `VAD_ENERGY_THRESHOLD` |
| Gemini API timeout | `GOOGLE_API_KEY` missing or quota exceeded | Check `.env` for key; check GCP console for quota |

### Container Resource Check

```bash
# CPU/Memory usage of all containers
docker stats --no-stream

# Disk usage of Docker images, containers, volumes, and build cache
docker system df

# Check container restart count (high = crash loop)
docker inspect blob-backend-backend-1 | python3 -c "
import sys,json
data = json.load(sys.stdin)[0]
state = data['State']
print('Status:', state['Status'])
print('Restarts:', data['RestartCount'])
print('ExitCode:', state['ExitCode'])
"
```

---

*For architecture deep-dives, see the [Documents/DocsLates/Companion-POC/](Documents/DocsLates/Companion-POC/) folder:*

- `01_Architecture_Assessment.md` — full bug analysis + as-implemented data flow
- `02_Companion_Mode_Architecture.md` — session/conversation state machines
- `05_WebSocket_Message_Contract.md` — complete WS message catalog
- `07_Pause_Resume_Architecture.md` — pause/resume state machine
- `08_STT_LLM_Pipeline_Architecture.md` — audio pipeline deep-dive
- `10_POC_Implementation_Plan.md` — phase-by-phase implementation plan
- `11_Detailed_TODO_List.md` — full task list with completion status