Deployment & Security
The AI Agent Service runs as two Docker containers — the Python agent service and the Qdrant vector store — alongside your existing AutoCom services.
Docker Setup
Services in docker-compose.prod.yml
qdrant:
  image: qdrant/qdrant:latest
  container_name: autocom-qdrant
  restart: always
  expose:
    - "6333"
    - "6334"
  volumes:
    - qdrant_data:/qdrant/storage
  networks:
    - autocom
agent:
  build:
    context: ./services/ai-agent
    dockerfile: Dockerfile
  container_name: autocom-agent
  restart: always
  expose:
    - "8100"
  environment:
    - AGENT_SHARED_SECRET=${AGENT_SHARED_SECRET}
    - AGENT_LOG_LEVEL=INFO
    - AGENT_QDRANT_URL=http://qdrant:6333
    - AGENT_QDRANT_COLLECTION=autocom_kb
  depends_on:
    app:
      condition: service_healthy
    qdrant:
      condition: service_healthy
  networks:
    - autocom
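Neither service is exposed externally; they communicate over the internal Docker bridge network. Because agent waits on condition: service_healthy, both app and qdrant need a healthcheck defined in the full Compose file (the excerpt above does not show one).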
Build and Deploy
# Build the agent image
docker compose -f docker-compose.prod.yml build agent
# Start both services
docker compose -f docker-compose.prod.yml up -d qdrant agent
# Verify health
docker ps --filter "name=autocom-agent|autocom-qdrant"
Run Migrations
After deploying, run the new tenant migrations:
docker compose -f docker-compose.prod.yml exec app php artisan tenants:migrate
This creates the ai_conversations, ai_conversation_messages, and ai_graph_runs tables in each tenant database.
Environment Variables
Agent Service
| Variable | Default | Description |
|---|---|---|
| AGENT_SHARED_SECRET | (required) | HMAC key for service-to-service auth. Must match Laravel's config. |
| AGENT_LOG_LEVEL | INFO | Python log level: DEBUG, INFO, WARNING, ERROR |
| AGENT_QDRANT_URL | http://qdrant:6333 | Qdrant service URL |
| AGENT_QDRANT_COLLECTION | autocom_kb | Qdrant collection name |
| AGENT_DEFAULT_EMBEDDING_MODEL | text-embedding-3-small | Default model for RAG embeddings |
| AGENT_EMBEDDING_DIMENSIONS | 1536 | Vector dimensions (must match model) |
| AGENT_MAX_CONVERSATION_HISTORY | 50 | Max messages loaded from history |
| AGENT_MAX_RAG_RESULTS | 10 | Max KB chunks retrieved per query |
| AGENT_CALLBACK_TIMEOUT | 30 | Timeout (seconds) for callbacks to Laravel |
| AGENT_DEBUG | false | Enable FastAPI docs at /docs |
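To make the defaults in the table concrete, here is a minimal standard-library sketch of how such settings could be loaded. The `AgentSettings` class and `load_settings` function are hypothetical names used for illustration; the actual service may load its configuration differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSettings:
    """Hypothetical settings container mirroring the documented AGENT_* variables."""
    shared_secret: str
    log_level: str = "INFO"
    qdrant_url: str = "http://qdrant:6333"
    qdrant_collection: str = "autocom_kb"
    default_embedding_model: str = "text-embedding-3-small"
    embedding_dimensions: int = 1536
    max_conversation_history: int = 50
    max_rag_results: int = 10
    callback_timeout: int = 30
    debug: bool = False


def load_settings(env=None) -> AgentSettings:
    """Read AGENT_* variables from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    secret = env.get("AGENT_SHARED_SECRET", "")
    if not secret:
        # The table marks this variable as required, so fail fast.
        raise ValueError("AGENT_SHARED_SECRET is required")
    return AgentSettings(
        shared_secret=secret,
        log_level=env.get("AGENT_LOG_LEVEL", "INFO"),
        qdrant_url=env.get("AGENT_QDRANT_URL", "http://qdrant:6333"),
        qdrant_collection=env.get("AGENT_QDRANT_COLLECTION", "autocom_kb"),
        default_embedding_model=env.get(
            "AGENT_DEFAULT_EMBEDDING_MODEL", "text-embedding-3-small"
        ),
        embedding_dimensions=int(env.get("AGENT_EMBEDDING_DIMENSIONS", "1536")),
        max_conversation_history=int(env.get("AGENT_MAX_CONVERSATION_HISTORY", "50")),
        max_rag_results=int(env.get("AGENT_MAX_RAG_RESULTS", "10")),
        callback_timeout=int(env.get("AGENT_CALLBACK_TIMEOUT", "30")),
        debug=env.get("AGENT_DEBUG", "false").strip().lower() in {"1", "true", "yes"},
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test.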
Laravel (.env)
# Add to your backend .env
AGENT_SHARED_SECRET=your-random-64-character-secret-key-here
AGENT_SERVICE_URL=http://agent:8100
Generate a secure secret:
openssl rand -hex 32
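If openssl is not available, an equivalent 64-character hex secret can be produced with Python's standard library. The function name `generate_shared_secret` is only illustrative.

```python
import secrets


def generate_shared_secret(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure randomness as a hex string."""
    # 32 random bytes encode to 64 hex characters, like `openssl rand -hex 32`.
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    print(generate_shared_secret())
```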
Security Model
Three Security Boundaries
1. Client → Laravel
• Passport OAuth token
• X-Tenant header
• RBAC permission check
2. Laravel → Agent Service
• X-Service-Token header (HMAC-signed timestamp)
• Full TenantContext in body
• Internal Docker network only
3. Agent Service → Laravel (callbacks)
• X-Agent-Callback-Token (HMAC-signed, tenant-scoped, 15-min TTL)
• X-Tenant header
• Laravel re-validates permissions via ModuleApiBus
Service Token (Laravel → Python)
Laravel generates an HMAC-signed timestamp:
X-Service-Token: {timestamp}:{hmac_sha256(timestamp, shared_secret)}
The Python service validates:
- Token format is valid
- Timestamp is within 5 minutes (prevents replay attacks)
- HMAC signature matches using the shared secret
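The three checks above can be sketched as follows. This is an illustration under assumptions, not the service's actual code: it assumes the timestamp is Unix seconds, the signature is a hex-encoded HMAC-SHA256 of the timestamp string, and the function names are hypothetical.

```python
import hashlib
import hmac
import time

MAX_AGE_SECONDS = 5 * 60  # documented 5-minute replay window


def sign_timestamp(timestamp: str, shared_secret: str) -> str:
    """Hex HMAC-SHA256 of the timestamp string (hex encoding is an assumption)."""
    return hmac.new(
        shared_secret.encode(), timestamp.encode(), hashlib.sha256
    ).hexdigest()


def make_service_token(shared_secret: str, now=None) -> str:
    """Build a token in the documented {timestamp}:{hmac} layout."""
    timestamp = str(int(time.time() if now is None else now))
    return f"{timestamp}:{sign_timestamp(timestamp, shared_secret)}"


def validate_service_token(token: str, shared_secret: str, now=None) -> bool:
    # 1. Token format is valid.
    timestamp, sep, signature = token.partition(":")
    if not sep or not signature:
        return False
    if not (timestamp.isascii() and timestamp.isdigit()):
        return False
    # 2. Timestamp is within the allowed window.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > MAX_AGE_SECONDS:
        return False
    # 3. Signature matches, compared in constant time.
    expected = sign_timestamp(timestamp, shared_secret)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Using `hmac.compare_digest` rather than `==` avoids leaking how many leading characters of the signature matched.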
Callback Token (Python → Laravel)
For each request, Laravel generates a short-lived callback token:
X-Agent-Callback-Token: {tenant_id}:{expiry_timestamp}:{hmac_sha256}
The Python service includes this on every callback to Laravel. Laravel validates:
- Token tenant matches the X-Tenant header
- Token hasn't expired (15-minute TTL)
- HMAC signature is valid
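A rough sketch of the callback token lifecycle is shown below. Laravel performs these steps in PHP; Python is used here only to illustrate the logic. It assumes the signed message is the string `{tenant_id}:{expiry_timestamp}` with a hex HMAC-SHA256 signature, which the source does not specify, and all function names are hypothetical.

```python
import hashlib
import hmac
import time

CALLBACK_TTL_SECONDS = 15 * 60  # documented 15-minute TTL


def _sign(message: str, shared_secret: str) -> str:
    return hmac.new(
        shared_secret.encode(), message.encode(), hashlib.sha256
    ).hexdigest()


def make_callback_token(tenant_id: str, shared_secret: str, now=None) -> str:
    """Build a token in the documented {tenant_id}:{expiry_timestamp}:{hmac} layout."""
    expiry = int(time.time() if now is None else now) + CALLBACK_TTL_SECONDS
    payload = f"{tenant_id}:{expiry}"  # assumed signing input
    return f"{payload}:{_sign(payload, shared_secret)}"


def validate_callback_token(
    token: str, x_tenant: str, shared_secret: str, now=None
) -> bool:
    # Split from the right so a tenant ID containing ":" still parses.
    parts = token.rsplit(":", 2)
    if len(parts) != 3:
        return False
    tenant_id, expiry, signature = parts
    # Token tenant must match the X-Tenant header.
    if tenant_id != x_tenant:
        return False
    # Token must not have expired.
    if not (expiry.isascii() and expiry.isdigit()):
        return False
    current = time.time() if now is None else now
    if current > int(expiry):
        return False
    # Signature must be valid.
    expected = _sign(f"{tenant_id}:{expiry}", shared_secret)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Because the tenant ID is part of the signed message, a token issued for one tenant cannot be replayed with a different X-Tenant header.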
What's NOT Persisted in Python
The Python service never stores:
- Tenant API keys (received per-request, discarded after)
- Tenant database credentials (never has them)
- User data (all data access goes through Laravel callbacks)
The only persistent data in the Python service is:
- Qdrant vectors (tenant-isolated via payload filters)
- Application logs
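Tenant isolation via payload filters means every vector search carries a condition on the tenant. The sketch below builds a search request body in the shape of Qdrant's filter syntax. The payload key `tenant_id` and the function name are assumptions; the source does not name the field the service uses.

```python
def build_tenant_search_body(query_vector, tenant_id: str, limit: int = 10) -> dict:
    """Return a Qdrant-style search body restricted to one tenant's points."""
    if not tenant_id:
        # Never fall back to an unfiltered search across all tenants.
        raise ValueError("tenant_id is required")
    return {
        "vector": list(query_vector),
        "filter": {
            "must": [
                # "tenant_id" is an assumed payload key.
                {"key": "tenant_id", "match": {"value": tenant_id}}
            ]
        },
        "limit": limit,
        "with_payload": True,
    }
```

Refusing to build a body without a tenant ID makes a missing filter a loud error rather than a silent cross-tenant read.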
Monitoring
Health Checks
# Agent service (run from a container on the autocom network; these hostnames do not resolve on the host)
curl http://agent:8100/api/v1/health
# {"status": "ok", "service": "autocom-agent"}
# Qdrant
curl http://qdrant:6333/
# {"title": "qdrant", "version": "..."}
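For automated monitoring, the agent health check could be scripted along these lines. This is a sketch that assumes the response body shown above; `parse_agent_health` and `check_agent` are hypothetical helper names, and the script must run somewhere that can resolve the `agent` hostname.

```python
import json
import urllib.request


def parse_agent_health(body: str) -> bool:
    """Return True if the body is JSON with status == "ok"."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and data.get("status") == "ok"


def check_agent(url: str = "http://agent:8100/api/v1/health", timeout: float = 5.0) -> bool:
    """Fetch the health endpoint; any network error counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return resp.status == 200 and parse_agent_health(body)
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```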
Logs
# Agent service logs
docker logs autocom-agent -f
# Qdrant logs
docker logs autocom-qdrant -f
Usage Analytics
All agent operations are logged to ai_usage_logs in the tenant database. View via:
GET /api/v1/ai/usage?operation=agent_chat:support
GET /api/v1/ai/usage/summary?period=week
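These endpoints sit behind the Client → Laravel boundary, so a request presumably needs the Passport token and the X-Tenant header. The sketch below builds such a request with the standard library. The base URL, the Bearer scheme, and the function name `build_usage_request` are assumptions for illustration.

```python
import urllib.request
from urllib.parse import urlencode


def build_usage_request(
    base_url: str,
    access_token: str,
    tenant: str,
    path: str = "/api/v1/ai/usage",
    **params: str,
) -> urllib.request.Request:
    """Build (but do not send) a GET request for the usage endpoints."""
    query = urlencode(params)  # percent-encodes values such as "agent_chat:support"
    url = f"{base_url.rstrip('/')}{path}" + (f"?{query}" if query else "")
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",  # assumed Passport bearer token
            "X-Tenant": tenant,
            "Accept": "application/json",
        },
    )


# Example: weekly summary for a hypothetical tenant "acme".
# req = build_usage_request("https://autocom.example", token, "acme",
#                           path="/api/v1/ai/usage/summary", period="week")
# with urllib.request.urlopen(req) as resp: print(resp.read().decode())
```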
Scaling
Horizontal Scaling
The agent service is stateless — run multiple replicas behind a load balancer:
agent:
  deploy:
    replicas: 3
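Each replica can handle any tenant's request since all context comes in the request body. Note that Compose cannot run more than one container for a service that sets container_name, so remove container_name: autocom-agent from the agent service before scaling.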
Qdrant Scaling
For small-to-medium deployments (< 1M vectors), single-node Qdrant is sufficient. For larger deployments, Qdrant supports distributed mode with sharding.
Troubleshooting
Agent Returns "Invalid service token"
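The shared secret doesn't match between Laravel and the Python service. Ensure AGENT_SHARED_SECRET is identical in both .env files. If the secrets match and the services run on different hosts, also check for clock drift, since the Python service rejects timestamps outside the 5-minute window.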
Callback Fails with 401
The callback token has expired (15-minute TTL) or the tenant ID doesn't match. This can happen if an agent request takes longer than 15 minutes — increase callback_token_ttl_seconds if needed.
RAG Returns No Results
- Check if articles are indexed: GET /embeddings/stats?tenant_id=your-tenant
- If vector count is 0, trigger re-indexing: POST /ai/agent/knowledge/reindex
- Ensure the tenant has an OpenAI-compatible provider configured (Claude doesn't support embeddings)
Qdrant Connection Failed
Verify Qdrant is running and the agent can reach it:
docker exec autocom-agent python -c "
from qdrant_client import QdrantClient
c = QdrantClient(url='http://qdrant:6333')
print(c.get_collections())
"