Deployment & Security

The AI Agent Service runs as two Docker containers — the Python agent service and the Qdrant vector store — alongside your existing AutoCom services.

Docker Setup

Services in docker-compose.prod.yml

qdrant:
  image: qdrant/qdrant:latest
  container_name: autocom-qdrant
  restart: always
  expose:
    - "6333"
    - "6334"
  volumes:
    - qdrant_data:/qdrant/storage
  networks:
    - autocom

agent:
  build:
    context: ./services/ai-agent
    dockerfile: Dockerfile
  container_name: autocom-agent
  restart: always
  expose:
    - "8100"
  environment:
    - AGENT_SHARED_SECRET=${AGENT_SHARED_SECRET}
    - AGENT_LOG_LEVEL=INFO
    - AGENT_QDRANT_URL=http://qdrant:6333
    - AGENT_QDRANT_COLLECTION=autocom_kb
  depends_on:
    app:
      condition: service_healthy
    qdrant:
      # service_healthy only works if the qdrant service defines a healthcheck; none is shown above
      condition: service_healthy
  networks:
    - autocom

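Neither service publishes ports to the host. The expose entries only make the listed ports reachable by other containers on the internal autocom Docker bridge network, which is how the services communicate.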

Build and Deploy

# Build the agent image
docker compose -f docker-compose.prod.yml build agent

# Start both services
docker compose -f docker-compose.prod.yml up -d qdrant agent

# Verify health
docker ps --filter "name=autocom-agent|autocom-qdrant"

Run Migrations

After deploying, run the new tenant migrations:

docker compose -f docker-compose.prod.yml exec app php artisan tenants:migrate

This creates the ai_conversations, ai_conversation_messages, and ai_graph_runs tables in each tenant database.

Environment Variables

Agent Service

Variable	Default	Description
`AGENT_SHARED_SECRET`	(required)	HMAC key for service-to-service auth. Must match Laravel's config.
`AGENT_LOG_LEVEL`	`INFO`	Python log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`
`AGENT_QDRANT_URL`	`http://qdrant:6333`	Qdrant service URL
`AGENT_QDRANT_COLLECTION`	`autocom_kb`	Qdrant collection name
`AGENT_DEFAULT_EMBEDDING_MODEL`	`text-embedding-3-small`	Default model for RAG embeddings
`AGENT_EMBEDDING_DIMENSIONS`	`1536`	Vector dimensions (must match model)
`AGENT_MAX_CONVERSATION_HISTORY`	`50`	Max messages loaded from history
`AGENT_MAX_RAG_RESULTS`	`10`	Max KB chunks retrieved per query
`AGENT_CALLBACK_TIMEOUT`	`30`	Timeout (seconds) for callbacks to Laravel
`AGENT_DEBUG`	`false`	Enable FastAPI docs at `/docs`
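
The table above maps naturally onto a small settings object. The following is a minimal, stdlib-only sketch rather than the service's actual configuration code (which may well use a settings library instead). The AgentSettings class and load_settings function are hypothetical names; only the variable names and defaults are taken from the table.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSettings:
    """Hypothetical container for the AGENT_* variables documented above."""
    shared_secret: str
    log_level: str = "INFO"
    qdrant_url: str = "http://qdrant:6333"
    qdrant_collection: str = "autocom_kb"
    default_embedding_model: str = "text-embedding-3-small"
    embedding_dimensions: int = 1536
    max_conversation_history: int = 50
    max_rag_results: int = 10
    callback_timeout: int = 30
    debug: bool = False


def load_settings(env=None) -> AgentSettings:
    """Read AGENT_* variables from a mapping, applying the documented defaults."""
    env = os.environ if env is None else env

    secret = env.get("AGENT_SHARED_SECRET", "")
    if not secret:
        # The only variable without a default: fail fast rather than run unauthenticated.
        raise RuntimeError("AGENT_SHARED_SECRET is required")

    def get_int(name: str, default: int) -> int:
        return int(env.get(name, default))

    return AgentSettings(
        shared_secret=secret,
        log_level=env.get("AGENT_LOG_LEVEL", "INFO").upper(),
        qdrant_url=env.get("AGENT_QDRANT_URL", "http://qdrant:6333"),
        qdrant_collection=env.get("AGENT_QDRANT_COLLECTION", "autocom_kb"),
        default_embedding_model=env.get(
            "AGENT_DEFAULT_EMBEDDING_MODEL", "text-embedding-3-small"
        ),
        embedding_dimensions=get_int("AGENT_EMBEDDING_DIMENSIONS", 1536),
        max_conversation_history=get_int("AGENT_MAX_CONVERSATION_HISTORY", 50),
        max_rag_results=get_int("AGENT_MAX_RAG_RESULTS", 10),
        callback_timeout=get_int("AGENT_CALLBACK_TIMEOUT", 30),
        debug=env.get("AGENT_DEBUG", "false").strip().lower() in ("1", "true", "yes"),
    )
```

Passing a plain dict instead of os.environ, for example load_settings({"AGENT_SHARED_SECRET": "x"}), makes the defaults easy to inspect in a test.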

Laravel (.env)

# Add to your backend .env
AGENT_SHARED_SECRET=your-random-64-character-secret-key-here
AGENT_SERVICE_URL=http://agent:8100

Generate a secure secret:

openssl rand -hex 32

Security Model

Three Security Boundaries

1. Client → Laravel
   • Passport OAuth token
   • X-Tenant header
   • RBAC permission check

2. Laravel → Agent Service
   • X-Service-Token header (HMAC-signed timestamp)
   • Full TenantContext in body
   • Internal Docker network only

3. Agent Service → Laravel (callbacks)
   • X-Agent-Callback-Token (HMAC-signed, tenant-scoped, 15-min TTL)
   • X-Tenant header
   • Laravel re-validates permissions via ModuleApiBus

Service Token (Laravel → Python)

Laravel generates an HMAC-signed timestamp:

X-Service-Token: {timestamp}:{hmac_sha256(timestamp, shared_secret)}

The Python service validates:

1. The token format is valid.
2. The timestamp is within 5 minutes of the current time (limits the window for replay attacks).
3. The HMAC signature matches one computed with the shared secret.
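
The three checks above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the service's actual code: it assumes the timestamp is Unix seconds, the signature is a hex digest, and the shared secret is the HMAC key with the timestamp string as the message. The function names are hypothetical.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 5 * 60  # the 5-minute window described above


def make_service_token(secret: str, now=None) -> str:
    """Build '{timestamp}:{signature}' the way the caller is assumed to."""
    timestamp = str(int(time.time() if now is None else now))
    signature = hmac.new(secret.encode(), timestamp.encode(), hashlib.sha256).hexdigest()
    return f"{timestamp}:{signature}"


def verify_service_token(token: str, secret: str, now=None) -> bool:
    """Apply the format, freshness, and signature checks in that order."""
    now = time.time() if now is None else now

    # 1. Format: exactly a numeric timestamp, a colon, and a signature.
    timestamp, sep, signature = token.partition(":")
    if not sep or not signature or not (timestamp.isascii() and timestamp.isdigit()):
        return False

    # 2. Freshness: reject timestamps outside the allowed window.
    if abs(now - int(timestamp)) > MAX_SKEW_SECONDS:
        return False

    # 3. Signature: recompute and compare in constant time.
    expected = hmac.new(secret.encode(), timestamp.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Using hmac.compare_digest instead of == avoids leaking, through timing differences, how many leading characters of a forged signature were correct.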

Callback Token (Python → Laravel)

For each request, Laravel generates a short-lived callback token:

X-Agent-Callback-Token: {tenant_id}:{expiry_timestamp}:{hmac_sha256}

The Python service includes this on every callback to Laravel. Laravel validates:

1. The token's tenant matches the X-Tenant header.
2. The token hasn't expired (15-minute TTL).
3. The HMAC signature is valid.
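
The callback-token checks follow the same pattern. In the real system the issuing and validating both happen in Laravel (PHP); the Python sketch below only illustrates the logic. It assumes the signed payload is the string tenant_id:expiry_timestamp, that the expiry is Unix seconds, and that the signature is a hex digest. All function names are hypothetical.

```python
import hashlib
import hmac
import time

CALLBACK_TTL_SECONDS = 15 * 60  # the 15-minute TTL described above


def _sign(secret: str, tenant_id: str, expiry: int) -> str:
    # Assumption: the HMAC covers both the tenant ID and the expiry.
    payload = f"{tenant_id}:{expiry}".encode()
    return hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()


def issue_callback_token(secret: str, tenant_id: str, now=None) -> str:
    """Build '{tenant_id}:{expiry_timestamp}:{signature}'."""
    expiry = int(time.time() if now is None else now) + CALLBACK_TTL_SECONDS
    return f"{tenant_id}:{expiry}:{_sign(secret, tenant_id, expiry)}"


def validate_callback_token(token: str, x_tenant: str, secret: str, now=None) -> bool:
    """Check tenant match, expiry, and signature."""
    now = time.time() if now is None else now

    # Split from the right so a tenant ID containing ':' still parses.
    parts = token.rsplit(":", 2)
    if len(parts) != 3:
        return False
    tenant_id, expiry, signature = parts
    if not (expiry.isascii() and expiry.isdigit()):
        return False

    if tenant_id != x_tenant:  # token must belong to the tenant in X-Tenant
        return False
    if now >= int(expiry):  # token has expired
        return False

    expected = _sign(secret, tenant_id, int(expiry))
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Because the tenant ID is part of the signed payload in this sketch, a token issued for one tenant cannot be replayed against another by editing the prefix.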

What's NOT Persisted in Python

The Python service never stores:

• Tenant API keys (received per-request, discarded after)
• Tenant database credentials (never has them)
• User data (all data access goes through Laravel callbacks)

The only persistent data in the Python service is:

• Qdrant vectors (tenant-isolated via payload filters)
• Application logs
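
Tenant isolation via payload filters means every vector search carries a filter that restricts results to one tenant's points. The sketch below builds the JSON body for Qdrant's REST search endpoint (POST /collections/{collection}/points/search). The payload field name tenant_id and the helper name are assumptions for illustration; the source does not state how the field is named.

```python
import json


def build_tenant_search_body(tenant_id: str, query_vector, limit: int = 10) -> dict:
    """Return a Qdrant search request body restricted to a single tenant."""
    if not tenant_id:
        # Never fall back to an unfiltered search: that would cross tenant boundaries.
        raise ValueError("tenant_id is required for every search")
    return {
        "vector": list(query_vector),
        "limit": limit,
        "with_payload": True,
        "filter": {
            # 'must' conditions are ANDed; only points whose payload matches are returned.
            "must": [{"key": "tenant_id", "match": {"value": tenant_id}}]
        },
    }


body = build_tenant_search_body("tenant-a", [0.1, 0.2, 0.3])
print(json.dumps(body["filter"]))
```

Building the filter in one helper and refusing an empty tenant ID keeps a missing filter from silently becoming a cross-tenant query.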

Monitoring

Health Checks

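# Run these from a container on the autocom network; the service hostnames do not resolve on the host
# Agent service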
curl http://agent:8100/api/v1/health
# {"status": "ok", "service": "autocom-agent"}

# Qdrant
curl http://qdrant:6333/
# {"title": "qdrant", "version": "..."}
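
For scripted monitoring, the agent health check can be wrapped in a small probe. This is a minimal sketch using only the standard library; is_healthy is a hypothetical helper, and it relies only on the response shape shown above (a JSON object whose status is "ok"). It does not apply to the Qdrant root endpoint, which returns a different document.

```python
import json
import urllib.request


def is_healthy(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """Return True if `url` answers 200 with a JSON object whose "status" is "ok"."""
    try:
        with opener(url, timeout=timeout) as response:
            if response.status != 200:
                return False
            return json.loads(response.read()).get("status") == "ok"
    except (OSError, ValueError, AttributeError):
        # OSError covers connection and DNS failures (URLError subclasses it);
        # ValueError covers invalid JSON; AttributeError covers non-object JSON.
        return False
```

Called as is_healthy("http://agent:8100/api/v1/health") from a container on the autocom network, it returns False rather than raising when the service is unreachable. The opener parameter exists so the probe can be tested without a network.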

Logs

# Agent service logs
docker logs autocom-agent -f

# Qdrant logs
docker logs autocom-qdrant -f

Usage Analytics

All agent operations are logged to ai_usage_logs in the tenant database. View via:

GET /api/v1/ai/usage?operation=agent_chat:support
GET /api/v1/ai/usage/summary?period=week

Scaling

Horizontal Scaling

The agent service is stateless — run multiple replicas behind a load balancer:

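agent:
  # Remove container_name from this service first: Compose cannot scale a service with a fixed container name
  deploy:
    replicas: 3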

Each replica can handle any tenant's request since all context comes in the request body.

Qdrant Scaling

For small-to-medium deployments (< 1M vectors), single-node Qdrant is sufficient. For larger deployments, Qdrant supports distributed mode with sharding.

Troubleshooting

Agent Returns "Invalid service token"

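The shared secret doesn't match between Laravel and the Python service. Ensure AGENT_SHARED_SECRET is identical in both .env files. If the secrets match, check that the clocks on both sides agree, since a token whose timestamp is more than 5 minutes off also fails validation.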

Callback Fails with 401

The callback token has expired (15-minute TTL) or the tenant ID doesn't match. This can happen if an agent request takes longer than 15 minutes — increase callback_token_ttl_seconds if needed.

RAG Returns No Results

1. Check whether articles are indexed: GET /embeddings/stats?tenant_id=your-tenant
2. If the vector count is 0, trigger re-indexing: POST /ai/agent/knowledge/reindex
3. Ensure the tenant has an OpenAI-compatible provider configured (Anthropic's Claude API does not offer an embeddings endpoint)

Qdrant Connection Failed

Verify Qdrant is running and the agent can reach it:

docker exec autocom-agent python -c "
from qdrant_client import QdrantClient
c = QdrantClient(url='http://qdrant:6333')
print(c.get_collections())
"