Deployment & Security

The AI Agent Service runs as two Docker containers, the Python agent service and the Qdrant vector store, alongside your existing AutoCom services.

Docker Setup

Services in docker-compose.prod.yml

qdrant:
  image: qdrant/qdrant:latest
  container_name: autocom-qdrant
  restart: always
  expose:
    - "6333"
    - "6334"
  volumes:
    - qdrant_data:/qdrant/storage
  networks:
    - autocom

agent:
  build:
    context: ./services/ai-agent
    dockerfile: Dockerfile
  container_name: autocom-agent
  restart: always
  expose:
    - "8100"
  environment:
    - AGENT_SHARED_SECRET=${AGENT_SHARED_SECRET}
    - AGENT_LOG_LEVEL=INFO
    - AGENT_QDRANT_URL=http://qdrant:6333
    - AGENT_QDRANT_COLLECTION=autocom_kb
  depends_on:
    app:
      condition: service_healthy
    qdrant:
      condition: service_healthy
  networks:
    - autocom

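Neither service publishes a port to the host: expose only makes the ports reachable by other containers on the autocom network, and the services communicate over that internal Docker bridge network. The service_healthy conditions under depends_on assume that app and qdrant each define a healthcheck (not shown in this excerpt); Compose refuses to start the agent if a dependency has none.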

Build and Deploy

# Build the agent image
docker compose -f docker-compose.prod.yml build agent

# Start both services
docker compose -f docker-compose.prod.yml up -d qdrant agent

# Verify health
docker ps --filter "name=autocom-agent|autocom-qdrant"

Run Migrations

After deploying, run the new tenant migrations:

docker compose -f docker-compose.prod.yml exec app php artisan tenants:migrate

This creates the ai_conversations, ai_conversation_messages, and ai_graph_runs tables in each tenant database.

Environment Variables

Agent Service

| Variable | Default | Description |
|---|---|---|
| AGENT_SHARED_SECRET | (required) | HMAC key for service-to-service auth. Must match Laravel's config. |
| AGENT_LOG_LEVEL | INFO | Python log level: DEBUG, INFO, WARNING, ERROR |
| AGENT_QDRANT_URL | http://qdrant:6333 | Qdrant service URL |
| AGENT_QDRANT_COLLECTION | autocom_kb | Qdrant collection name |
| AGENT_DEFAULT_EMBEDDING_MODEL | text-embedding-3-small | Default model for RAG embeddings |
| AGENT_EMBEDDING_DIMENSIONS | 1536 | Vector dimensions (must match model) |
| AGENT_MAX_CONVERSATION_HISTORY | 50 | Max messages loaded from history |
| AGENT_MAX_RAG_RESULTS | 10 | Max KB chunks retrieved per query |
| AGENT_CALLBACK_TIMEOUT | 30 | Timeout (seconds) for callbacks to Laravel |
| AGENT_DEBUG | false | Enable FastAPI docs at /docs |
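
As an illustration of how these variables and defaults fit together, the sketch below loads them into a typed settings object. The AgentSettings class and load_settings function are hypothetical names used only for this example; the agent service's actual configuration code is not shown in this document and may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AgentSettings:
    """Hypothetical container mirroring the documented AGENT_* variables."""
    shared_secret: str
    log_level: str
    qdrant_url: str
    qdrant_collection: str
    default_embedding_model: str
    embedding_dimensions: int
    max_conversation_history: int
    max_rag_results: int
    callback_timeout: int
    debug: bool


def load_settings(env: Mapping[str, str] = os.environ) -> AgentSettings:
    """Read AGENT_* variables, falling back to the documented defaults."""
    secret = env.get("AGENT_SHARED_SECRET", "")
    if not secret:
        # The only variable without a default.
        raise ValueError("AGENT_SHARED_SECRET is required")
    return AgentSettings(
        shared_secret=secret,
        log_level=env.get("AGENT_LOG_LEVEL", "INFO"),
        qdrant_url=env.get("AGENT_QDRANT_URL", "http://qdrant:6333"),
        qdrant_collection=env.get("AGENT_QDRANT_COLLECTION", "autocom_kb"),
        default_embedding_model=env.get(
            "AGENT_DEFAULT_EMBEDDING_MODEL", "text-embedding-3-small"
        ),
        embedding_dimensions=int(env.get("AGENT_EMBEDDING_DIMENSIONS", "1536")),
        max_conversation_history=int(env.get("AGENT_MAX_CONVERSATION_HISTORY", "50")),
        max_rag_results=int(env.get("AGENT_MAX_RAG_RESULTS", "10")),
        callback_timeout=int(env.get("AGENT_CALLBACK_TIMEOUT", "30")),
        # Assumption: common truthy spellings are accepted for the boolean flag.
        debug=env.get("AGENT_DEBUG", "false").strip().lower() in {"1", "true", "yes"},
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests without touching the real environment.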

Laravel (.env)

# Add to your backend .env
AGENT_SHARED_SECRET=your-random-64-character-secret-key-here
AGENT_SERVICE_URL=http://agent:8100

Generate a secure secret (32 random bytes, printed as 64 hex characters):

openssl rand -hex 32
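
Where openssl is not available, Python's standard library can produce an equivalent value. The generate_shared_secret helper name is made up for this example.

```python
import secrets


def generate_shared_secret(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure randomness as hex text."""
    # token_hex emits two hex characters per byte, so 32 bytes -> 64 characters.
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    print(generate_shared_secret())
```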

Security Model

Three Security Boundaries

1. Client → Laravel
   • Passport OAuth token
   • X-Tenant header
   • RBAC permission check

2. Laravel → Agent Service
   • X-Service-Token header (HMAC-signed timestamp)
   • Full TenantContext in body
   • Internal Docker network only

3. Agent Service → Laravel (callbacks)
   • X-Agent-Callback-Token (HMAC-signed, tenant-scoped, 15-min TTL)
   • X-Tenant header
   • Laravel re-validates permissions via ModuleApiBus

Service Token (Laravel → Python)

Laravel generates an HMAC-signed timestamp:

X-Service-Token: {timestamp}:{hmac_sha256(timestamp, shared_secret)}

The Python service validates:

  • Token format is valid
  • Timestamp is within 5 minutes (prevents replay attacks)
  • HMAC signature matches using the shared secret
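
The three checks above can be sketched in Python as follows. This is a minimal illustration, not the service's actual code: it assumes the timestamp is Unix time in seconds and the signature is a hex-encoded HMAC-SHA256 digest, neither of which this document specifies. The function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_SECONDS = 5 * 60  # the documented 5-minute window


def sign_service_token(secret: str, now: Optional[int] = None) -> str:
    """Build '{timestamp}:{signature}' (assumed: Unix seconds, hex digest)."""
    timestamp = str(int(time.time()) if now is None else now)
    signature = hmac.new(secret.encode(), timestamp.encode(), hashlib.sha256).hexdigest()
    return f"{timestamp}:{signature}"


def verify_service_token(token: str, secret: str, now: Optional[int] = None) -> bool:
    """Apply the format, freshness, and signature checks in that order."""
    timestamp, separator, signature = token.partition(":")
    # 1. Format: exactly 'digits:signature'.
    if not separator or not (timestamp.isascii() and timestamp.isdigit()):
        return False
    # 2. Freshness: reject timestamps outside the window to limit replay.
    current = int(time.time()) if now is None else now
    if abs(current - int(timestamp)) > MAX_SKEW_SECONDS:
        return False
    # 3. Signature: recompute and compare in constant time.
    expected = hmac.new(secret.encode(), timestamp.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Using hmac.compare_digest rather than == avoids leaking, through timing differences, how much of a forged signature was correct.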

Callback Token (Python → Laravel)

For each request, Laravel generates a short-lived callback token:

X-Agent-Callback-Token: {tenant_id}:{expiry_timestamp}:{hmac_sha256}

The Python service includes this on every callback to Laravel. Laravel validates:

  • Token tenant matches the X-Tenant header
  • Token hasn't expired (15-minute TTL)
  • HMAC signature is valid
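
To make the validation logic concrete, here is a Python sketch of issuing and checking such a token. In the real system Laravel performs both steps, so this only illustrates the logic. It assumes the signed message is "{tenant_id}:{expiry_timestamp}", the expiry is Unix time in seconds, and the digest is hex-encoded; the document does not state these details, and the function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional

CALLBACK_TTL_SECONDS = 15 * 60  # the documented 15-minute TTL


def _sign(tenant_id: str, expiry: int, secret: str) -> str:
    # Assumption: the HMAC covers the tenant ID and the expiry together.
    message = f"{tenant_id}:{expiry}".encode()
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


def issue_callback_token(tenant_id: str, secret: str, now: Optional[int] = None) -> str:
    """Build '{tenant_id}:{expiry_timestamp}:{signature}'."""
    issued_at = int(time.time()) if now is None else now
    expiry = issued_at + CALLBACK_TTL_SECONDS
    return f"{tenant_id}:{expiry}:{_sign(tenant_id, expiry, secret)}"


def validate_callback_token(
    token: str, x_tenant: str, secret: str, now: Optional[int] = None
) -> bool:
    """Check tenant match, expiry, and signature."""
    # Split from the right so a tenant ID containing ':' still parses.
    parts = token.rsplit(":", 2)
    if len(parts) != 3:
        return False
    tenant_id, expiry_text, signature = parts
    if tenant_id != x_tenant:
        return False
    if not (expiry_text.isascii() and expiry_text.isdigit()):
        return False
    expiry = int(expiry_text)
    current = int(time.time()) if now is None else now
    if current >= expiry:
        return False
    expected = _sign(tenant_id, expiry, secret)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Because the tenant ID is part of the signed message in this sketch, a token issued for one tenant cannot be replayed under a different X-Tenant header without invalidating the signature.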

What's NOT Persisted in Python

The Python service never stores:

  • Tenant API keys (received per-request, discarded after)
  • Tenant database credentials (never has them)
  • User data (all data access goes through Laravel callbacks)

The only data the agent deployment persists is:

  • Qdrant vectors (stored in the qdrant_data volume, tenant-isolated via payload filters)
  • Application logs
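
Tenant isolation via payload filters means every vector search carries a filter restricting results to one tenant's points. The sketch below builds such a request body in the shape Qdrant's REST search endpoint accepts. The payload key tenant_id and the helper name are assumptions; the document does not say which payload field the service actually filters on.

```python
import json
from typing import Any, Dict, List


def tenant_search_body(
    tenant_id: str, query_vector: List[float], limit: int = 10
) -> Dict[str, Any]:
    """Build a Qdrant search body that only matches one tenant's points."""
    return {
        "vector": query_vector,
        "limit": limit,
        "with_payload": True,
        "filter": {
            # 'must' conditions are ANDed; points whose payload lacks a
            # matching tenant_id (assumed key name) are never returned.
            "must": [{"key": "tenant_id", "match": {"value": tenant_id}}]
        },
    }


if __name__ == "__main__":
    # A 3-dimensional vector keeps the example short; real vectors would
    # have AGENT_EMBEDDING_DIMENSIONS entries.
    print(json.dumps(tenant_search_body("acme", [0.1, 0.2, 0.3]), indent=2))
```

Applying the filter inside a single helper, rather than at each call site, makes it harder to issue an unfiltered query by accident.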

Monitoring

Health Checks

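# The agent and qdrant hostnames resolve only on the autocom Docker network,
# so run these checks from a container attached to it.

# Agent service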
curl http://agent:8100/api/v1/health
# {"status": "ok", "service": "autocom-agent"}

# Qdrant
curl http://qdrant:6333/
# {"title": "qdrant", "version": "..."}
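
The same checks can be scripted, for example as a probe run from another container on the autocom network. This is a sketch using only the standard library; the looks_healthy and check helpers are hypothetical, and treating any JSON object with a version field as a healthy Qdrant response is an assumption based on the sample output above.

```python
import json
import urllib.request

ENDPOINTS = {
    "agent": "http://agent:8100/api/v1/health",
    "qdrant": "http://qdrant:6333/",
}


def looks_healthy(service: str, body: bytes) -> bool:
    """Interpret a response body according to the documented sample outputs."""
    try:
        data = json.loads(body)
    except ValueError:
        return False
    if not isinstance(data, dict):
        return False
    if service == "agent":
        return data.get("status") == "ok"
    # Assumption: Qdrant's root endpoint is healthy if it reports a version.
    return "version" in data


def check(service: str, timeout: float = 5.0) -> bool:
    """Fetch the endpoint and return False on any network or HTTP error."""
    try:
        with urllib.request.urlopen(ENDPOINTS[service], timeout=timeout) as response:
            return response.status == 200 and looks_healthy(service, response.read())
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
```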

Logs

# Agent service logs
docker logs autocom-agent -f

# Qdrant logs
docker logs autocom-qdrant -f

Usage Analytics

All agent operations are logged to ai_usage_logs in the tenant database. View via:

GET /api/v1/ai/usage?operation=agent_chat:support
GET /api/v1/ai/usage/summary?period=week

Scaling

Horizontal Scaling

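The agent service is stateless, so you can run multiple replicas behind a load balancer. A service that sets container_name cannot be scaled past one container, so remove container_name: autocom-agent from the agent service before adding replicas: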

agent:
  deploy:
    replicas: 3

Each replica can handle any tenant's request since all context comes in the request body.

Qdrant Scaling

For small-to-medium deployments (< 1M vectors), single-node Qdrant is sufficient. For larger deployments, Qdrant supports distributed mode with sharding.

Troubleshooting

Agent Returns "Invalid service token"

The shared secret doesn't match between Laravel and the Python service. Ensure AGENT_SHARED_SECRET is identical in both .env files.

Callback Fails with 401

The callback token has expired (15-minute TTL) or its tenant ID doesn't match the X-Tenant header. Expiry can occur when an agent request runs longer than 15 minutes; increase callback_token_ttl_seconds if needed.

RAG Returns No Results

  1. Check if articles are indexed: GET /embeddings/stats?tenant_id=your-tenant
  2. If vector count is 0, trigger re-indexing: POST /ai/agent/knowledge/reindex
  3. Ensure the tenant has an OpenAI-compatible provider configured for embeddings (Anthropic's API does not offer an embeddings endpoint, so a Claude-only configuration cannot index articles)

Qdrant Connection Failed

Verify Qdrant is running and the agent can reach it:

docker exec autocom-agent python -c "
from qdrant_client import QdrantClient
c = QdrantClient(url='http://qdrant:6333')
print(c.get_collections())
"