AI Agent Service
AutoCom's AI Agent Service is a Python microservice that provides multi-step AI agent orchestration with tool use, conversation memory, RAG-powered knowledge retrieval, and graph-based workflows with human-in-the-loop approval gates.
Why a Separate Service?
The existing Laravel AI module handles simple, single-call operations well — classify text, extract an order, generate a product description. These stay in PHP.
The agent service handles complex operations where:
- An AI needs to call multiple tools, reason about results, and decide next steps
- Conversations need to persist across requests (memory)
- Responses should be grounded in your knowledge base articles (RAG)
- Multi-step workflows need human approval gates (graph flows)
- Different tenants need different AI configurations
Architecture
```
Client → Laravel API → Agent Service (Python) → LLM (OpenAI/Claude/Groq/...)
             ↑                 ↓
             └─ tool callbacks → ModuleApiBus → Orders/Products/Customers
                               → Qdrant (vector search for KB articles)
```
Key design principle: The Python service is stateless regarding tenant configuration. Laravel sends a complete TenantContext payload with every request — provider credentials, available tools, permissions, callback URLs. The Python service executes and discards. No tenant config is persisted in Python.
What Lives Where
| Laravel AI Module (unchanged) | Python Agent Service (new) |
|---|---|
| Provider CRUD, credential storage | Multi-step agent chat with tools + memory |
| Chat playground (single-turn) | RAG-augmented conversations |
| Order extraction, classification | Graph flows (NDR, fraud, refund) |
| Product description generation | Human-in-the-loop workflows |
| KB article CRUD | Semantic KB search (vector embeddings) |
| LLM-powered translation (44+ langs) | Cross-conversation memory |
| Usage logging (AIUsageLog) | Agent role management |
| Workflow nodes (ai:chat, etc.) | |
Docker Services
```yaml
# In docker-compose.prod.yml
qdrant:   # Vector store for RAG embeddings (port 6333 internal)
agent:    # PydanticAI agent service (port 8100 internal)
```
Both services run on the internal Docker network — not exposed externally. Laravel calls http://agent:8100 directly.
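A fuller compose fragment for this setup might look like the sketch below. Only the service names, internal ports, and the internal-network requirement come from this page; the image name, build path, and network name are illustrative assumptions.

```yaml
# Hypothetical sketch — the actual docker-compose.prod.yml may differ.
services:
  qdrant:
    image: qdrant/qdrant
    networks: [internal]     # port 6333 reachable only in-network
  agent:
    build: ./agent           # assumed build context
    networks: [internal]     # Laravel reaches it at http://agent:8100
networks:
  internal:
    internal: true           # no external exposure
```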
Core Concepts
TenantContext
Every request from Laravel carries a TenantContext — the complete tenant configuration:
```json
{
  "tenant_id": "acme-corp",
  "tenant_name": "Acme Corp",
  "user_id": "user-uuid",
  "user_permissions": ["orders.view", "orders.create", "ai.view"],
  "providers": [{
    "id": "openai",
    "type": "openai_compatible",
    "base_url": "https://api.openai.com/v1",
    "api_key": "sk-...",
    "model": "gpt-4o"
  }],
  "default_provider_id": "openai",
  "agent": {
    "role": "support",
    "system_prompt": "You are a support agent for Acme Corp...",
    "allowed_tools": [
      {"name": "orders.get", "module": "Orders", "description": "Get order by ID"}
    ],
    "temperature": 0.3
  },
  "callback": {
    "base_url": "http://app:9000/api/v1/internal/agent",
    "token": "acme-corp:1711234567:hmac_signature"
  }
}
```
The Python service uses this to:
- Create a PydanticAI model instance with the tenant's API key
- Generate tools filtered by the tenant's permissions
- Set the system prompt for the active role
- Know where to call back for tool execution and data persistence
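One way the service might materialize the payload per request is a thin typed wrapper that is built, used, and discarded. A minimal sketch using stdlib dataclasses follows; the real service presumably uses Pydantic models, and the class and method names here are illustrative, not the actual codebase's.

```python
# Illustrative sketch: parse the TenantContext payload into typed objects
# and resolve the provider to use for this request. Nothing is persisted.
from dataclasses import dataclass
from typing import Optional
import json

@dataclass
class Provider:
    id: str
    type: str
    base_url: str
    api_key: str
    model: str

@dataclass
class TenantContext:
    tenant_id: str
    user_permissions: list
    providers: list            # list of Provider
    default_provider_id: str

    @classmethod
    def from_payload(cls, payload: dict) -> "TenantContext":
        return cls(
            tenant_id=payload["tenant_id"],
            user_permissions=payload.get("user_permissions", []),
            providers=[Provider(**p) for p in payload["providers"]],
            default_provider_id=payload["default_provider_id"],
        )

    def resolve_provider(self, provider_id: Optional[str] = None) -> Provider:
        """Pick an explicit provider, else fall back to the tenant default."""
        wanted = provider_id or self.default_provider_id
        for p in self.providers:
            if p.id == wanted:
                return p
        raise KeyError(f"unknown provider {wanted!r}")

payload = json.loads("""{
  "tenant_id": "acme-corp",
  "user_permissions": ["orders.view", "ai.view"],
  "providers": [{"id": "openai", "type": "openai_compatible",
                 "base_url": "https://api.openai.com/v1",
                 "api_key": "sk-test", "model": "gpt-4o"}],
  "default_provider_id": "openai"
}""")
ctx = TenantContext.from_payload(payload)
provider = ctx.resolve_provider()   # the "openai" entry
```

Because the context is rebuilt from the request body every time, rotating a tenant's API key or changing its default provider in Laravel takes effect on the very next request, with no cache to invalidate in Python.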
Tool Bridge
When an agent decides to call a tool (e.g., "look up order ORD-123"), the request flows back to Laravel:
```
Agent decides to call orders.get
  → Python POSTs to http://app:9000/api/v1/internal/agent/tools/orders.get
  → Laravel validates callback token
  → Laravel calls Bus::call('Orders', 'orders.get', $params) in tenant context
  → Result flows back to Python → agent continues reasoning
```
This keeps all database access, permission enforcement, and business logic in Laravel.
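The Python side of this round trip can be sketched as below. The callback URL shape and the `tenant:timestamp:hmac` token layout follow the examples above; the header name and the exact HMAC input are assumptions, and `verify_callback_token` only mirrors in Python the check Laravel would perform on its side.

```python
# Illustrative sketch of the tool bridge, not the service's actual code.
import hashlib
import hmac
import json

def build_tool_request(callback: dict, tool_name: str, params: dict):
    """Assemble the POST Laravel receives when the agent invokes a tool."""
    url = f"{callback['base_url']}/tools/{tool_name}"
    headers = {
        "Content-Type": "application/json",
        "X-Agent-Callback-Token": callback["token"],  # assumed header name
    }
    return url, headers, json.dumps({"params": params})

def sign_token(tenant_id: str, timestamp: int, secret: bytes) -> str:
    """Produce a tenant:timestamp:hmac token (assumed HMAC input)."""
    mac = hmac.new(secret, f"{tenant_id}:{timestamp}".encode(), hashlib.sha256)
    return f"{tenant_id}:{timestamp}:{mac.hexdigest()}"

def verify_callback_token(token: str, secret: bytes, now: int,
                          max_age: int = 300) -> bool:
    """Receiver-side check: recompute the HMAC and reject stale timestamps."""
    tenant_id, ts, _sig = token.split(":")
    if now - int(ts) > max_age:
        return False
    expected = sign_token(tenant_id, int(ts), secret)
    return hmac.compare_digest(token, expected)

url, headers, body = build_tool_request(
    {"base_url": "http://app:9000/api/v1/internal/agent", "token": "..."},
    "orders.get",
    {"order_id": "ORD-123"},
)
# url is "http://app:9000/api/v1/internal/agent/tools/orders.get"
```

A timestamp in the token plus `hmac.compare_digest` gives replay resistance and constant-time comparison; the actual validation logic lives in Laravel, so this Python version is purely explanatory.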
Agent Roles
Roles define what an agent can do — system prompt, allowed tools, temperature, and LLM provider:
| Role | Purpose | Tools | Temperature |
|---|---|---|---|
| support | Customer inquiries | orders.get, customers.* | 0.3 |
| operations | Order management | orders.* | 0.1 |
| analytics | Business questions | orders.getStats | 0.2 |
Tenants can customize roles and create new ones via the API.
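Since the table above lists tools both by exact name (orders.get) and by wildcard (customers.*), the service needs to expand role patterns against the tenant's tool catalog somewhere. A minimal sketch of that expansion, assuming glob-style patterns (the role map and catalog contents are illustrative):

```python
# Illustrative sketch: expand a role's tool patterns against a tool catalog.
from fnmatch import fnmatch

ROLE_TOOLS = {
    "support":    ["orders.get", "customers.*"],
    "operations": ["orders.*"],
    "analytics":  ["orders.getStats"],
}

def expand_role_tools(role: str, catalog: list[str]) -> list[str]:
    """Return the catalog tools a role may call, honoring * wildcards."""
    patterns = ROLE_TOOLS.get(role, [])
    return [t for t in catalog if any(fnmatch(t, p) for p in patterns)]

catalog = ["orders.get", "orders.create", "orders.getStats",
           "customers.get", "customers.search", "products.get"]
expand_role_tools("support", catalog)
# → ["orders.get", "customers.get", "customers.search"]
```

An unknown role expands to no tools at all, which fails closed: an agent with a misconfigured role simply has nothing to call.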
In This Section
- Configuration — Set up LLM, STT, and TTS providers, credentials, and defaults
- Chat & Streaming — Agent chat with tool use, conversation memory, and SSE streaming
- Voice Assistant — Voice-powered AI with STT, TTS, and Voice Activity Detection
- RAG & Knowledge — Vector embeddings, semantic search, and KB-grounded responses
- Graph Flows — Multi-step workflows with human-in-the-loop
- Roles & Tools — Agent role configuration, tool permissions, and RBAC
- Adding Tools — How to expose module functionality as agent tools
- Adding Flows — How to build new graph flows
- Deployment — Docker setup, environment variables, and security