AI Agent Service
AutoCom's AI Agent Service is a Python microservice that provides multi-step AI agent orchestration with tool use, conversation memory, RAG-powered knowledge retrieval, and graph-based workflows with human-in-the-loop approval gates.
Why a Separate Service?
The existing Laravel AI module handles simple, single-call operations well — classify text, extract an order, generate a product description. These stay in PHP.
The agent service handles complex operations where:
- An AI needs to call multiple tools, reason about results, and decide next steps
- Conversations need to persist across requests (memory)
- Responses should be grounded in your knowledge base articles (RAG)
- Multi-step workflows need human approval gates (graph flows)
- Different tenants need different AI configurations
Architecture
```
Client → Laravel API → Agent Service (Python) → LLM (OpenAI/Claude/Groq/...)
             ↑                 ↓
             └─ tool callbacks → ModuleApiBus → Orders/Products/Customers
                               → Qdrant (vector search for KB articles)
```
Key design principle: The Python service is stateless regarding tenant configuration. Laravel sends a complete TenantContext payload with every request — provider credentials, available tools, permissions, callback URLs. The Python service executes and discards. No tenant config is persisted in Python.
What Lives Where
| Laravel AI Module (unchanged) | Python Agent Service (new) |
|---|---|
| Provider CRUD, credential storage | Multi-step agent chat with tools + memory |
| Chat playground (single-turn) | RAG-augmented conversations |
| Order extraction, classification | Graph flows (NDR, fraud, refund) |
| Product description generation | Human-in-the-loop workflows |
| KB article CRUD | Semantic KB search (vector embeddings) |
| LLM-powered translation (44+ langs) | Cross-conversation memory |
| Usage logging (AIUsageLog) | Agent role management |
| Workflow nodes (ai:chat, etc.) | |
Docker Services
```yaml
# In docker-compose.prod.yml
qdrant:   # Vector store for RAG embeddings (port 6333 internal)
agent:    # PydanticAI agent service (port 8100 internal)
```
Both services run on the internal Docker network — not exposed externally. Laravel calls http://agent:8100 directly.
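A fuller compose fragment for this setup might look like the sketch below. Only the service names, internal ports, and the internal-network requirement come from this page; the image name, build path, and network name are illustrative assumptions.

```yaml
# Hypothetical sketch — the actual docker-compose.prod.yml may differ.
services:
  qdrant:
    image: qdrant/qdrant
    networks: [internal]     # port 6333 reachable only in-network
  agent:
    build: ./agent           # assumed build context
    networks: [internal]     # Laravel reaches it at http://agent:8100
networks:
  internal:
    internal: true           # no external exposure
```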
Core Concepts
TenantContext
Every request from Laravel carries a TenantContext — the complete tenant configuration:
```json
{
  "tenant_id": "acme-corp",
  "tenant_name": "Acme Corp",
  "user_id": "user-uuid",
  "user_permissions": ["orders.view", "orders.create", "ai.view"],
  "providers": [{
    "id": "openai",
    "type": "openai_compatible",
    "base_url": "https://api.openai.com/v1",
    "api_key": "sk-...",
    "model": "gpt-4o"
  }],
  "default_provider_id": "openai",
  "agent": {
    "role": "support",
    "system_prompt": "You are a support agent for Acme Corp...",
    "allowed_tools": [
      {"name": "orders.get", "module": "Orders", "description": "Get order by ID"}
    ],
    "temperature": 0.3
  },
  "callback": {
    "base_url": "http://app:9000/api/v1/internal/agent",
    "token": "acme-corp:1711234567:hmac_signature"
  }
}
```
The Python service uses this to:
- Create a PydanticAI model instance with the tenant's API key
- Generate tools filtered by the tenant's permissions
- Set the system prompt for the active role
- Know where to call back for tool execution and data persistence
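One way the service might materialize the payload per request is a thin typed wrapper that is built, used, and discarded. A minimal sketch using stdlib dataclasses follows; the real service presumably uses Pydantic models, and the class and method names here are illustrative, not the actual codebase's.

```python
# Illustrative sketch: parse the TenantContext payload into typed objects
# and resolve the provider to use for this request. Nothing is persisted.
from dataclasses import dataclass
from typing import Optional
import json

@dataclass
class Provider:
    id: str
    type: str
    base_url: str
    api_key: str
    model: str

@dataclass
class TenantContext:
    tenant_id: str
    user_permissions: list
    providers: list            # list of Provider
    default_provider_id: str

    @classmethod
    def from_payload(cls, payload: dict) -> "TenantContext":
        return cls(
            tenant_id=payload["tenant_id"],
            user_permissions=payload.get("user_permissions", []),
            providers=[Provider(**p) for p in payload["providers"]],
            default_provider_id=payload["default_provider_id"],
        )

    def resolve_provider(self, provider_id: Optional[str] = None) -> Provider:
        """Pick an explicit provider, else fall back to the tenant default."""
        wanted = provider_id or self.default_provider_id
        for p in self.providers:
            if p.id == wanted:
                return p
        raise KeyError(f"unknown provider {wanted!r}")

payload = json.loads("""{
  "tenant_id": "acme-corp",
  "user_permissions": ["orders.view", "ai.view"],
  "providers": [{"id": "openai", "type": "openai_compatible",
                 "base_url": "https://api.openai.com/v1",
                 "api_key": "sk-test", "model": "gpt-4o"}],
  "default_provider_id": "openai"
}""")
ctx = TenantContext.from_payload(payload)
provider = ctx.resolve_provider()   # the "openai" entry
```

Because the context is rebuilt from the request body every time, rotating a tenant's API key or changing its default provider in Laravel takes effect on the very next request, with no cache to invalidate in Python.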
Tool Bridge
When an agent decides to call a tool (e.g., "look up order ORD-123"), the request flows back to Laravel:
```
Agent decides to call orders.get
  → Python POSTs to http://app:9000/api/v1/internal/agent/tools/orders.get
  → Laravel validates callback token
  → Laravel calls Bus::call('Orders', 'orders.get', $params) in tenant context
  → Result flows back to Python → agent continues reasoning
```
This keeps all database access, permission enforcement, and business logic in Laravel.
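The Python side of this round trip can be sketched as below. The callback URL shape and the `tenant:timestamp:hmac` token layout follow the examples above; the header name and the exact HMAC input are assumptions, and `verify_callback_token` only mirrors in Python the check Laravel would perform on its side.

```python
# Illustrative sketch of the tool bridge, not the service's actual code.
import hashlib
import hmac
import json

def build_tool_request(callback: dict, tool_name: str, params: dict):
    """Assemble the POST Laravel receives when the agent invokes a tool."""
    url = f"{callback['base_url']}/tools/{tool_name}"
    headers = {
        "Content-Type": "application/json",
        "X-Agent-Callback-Token": callback["token"],  # assumed header name
    }
    return url, headers, json.dumps({"params": params})

def sign_token(tenant_id: str, timestamp: int, secret: bytes) -> str:
    """Produce a tenant:timestamp:hmac token (assumed HMAC input)."""
    mac = hmac.new(secret, f"{tenant_id}:{timestamp}".encode(), hashlib.sha256)
    return f"{tenant_id}:{timestamp}:{mac.hexdigest()}"

def verify_callback_token(token: str, secret: bytes, now: int,
                          max_age: int = 300) -> bool:
    """Receiver-side check: recompute the HMAC and reject stale timestamps."""
    tenant_id, ts, _sig = token.split(":")
    if now - int(ts) > max_age:
        return False
    expected = sign_token(tenant_id, int(ts), secret)
    return hmac.compare_digest(token, expected)

url, headers, body = build_tool_request(
    {"base_url": "http://app:9000/api/v1/internal/agent", "token": "..."},
    "orders.get",
    {"order_id": "ORD-123"},
)
# url is "http://app:9000/api/v1/internal/agent/tools/orders.get"
```

A timestamp in the token plus `hmac.compare_digest` gives replay resistance and constant-time comparison; the actual validation logic lives in Laravel, so this Python version is purely explanatory.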
Agent Roles
Roles define what an agent can do — system prompt, allowed tools, temperature, and LLM provider:
| Role | Purpose | Tools | Temperature |
|---|---|---|---|
| support | Customer inquiries | orders.get, customers.* | 0.3 |
| operations | Order management | orders.* | 0.1 |
| analytics | Business questions | orders.getStats | 0.2 |
Tenants can customize roles and create new ones via the API.
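Since the table above lists tools both by exact name (orders.get) and by wildcard (customers.*), the service needs to expand role patterns against the tenant's tool catalog somewhere. A minimal sketch of that expansion, assuming glob-style patterns (the role map and catalog contents are illustrative):

```python
# Illustrative sketch: expand a role's tool patterns against a tool catalog.
from fnmatch import fnmatch

ROLE_TOOLS = {
    "support":    ["orders.get", "customers.*"],
    "operations": ["orders.*"],
    "analytics":  ["orders.getStats"],
}

def expand_role_tools(role: str, catalog: list[str]) -> list[str]:
    """Return the catalog tools a role may call, honoring * wildcards."""
    patterns = ROLE_TOOLS.get(role, [])
    return [t for t in catalog if any(fnmatch(t, p) for p in patterns)]

catalog = ["orders.get", "orders.create", "orders.getStats",
           "customers.get", "customers.search", "products.get"]
expand_role_tools("support", catalog)
# → ["orders.get", "customers.get", "customers.search"]
```

An unknown role expands to no tools at all, which fails closed: an agent with a misconfigured role simply has nothing to call.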
In This Section
- Configuration — Set up LLM, STT, and TTS providers, credentials, and defaults
- Chat & Streaming — Agent chat with tool use, conversation memory, and SSE streaming
- Voice Assistant — Voice-powered AI with STT, TTS, and Voice Activity Detection
- RAG & Knowledge — Vector embeddings, semantic search, and KB-grounded responses
- Graph Flows — Multi-step workflows with human-in-the-loop
- Roles & Tools — Agent role configuration, tool permissions, and RBAC
- Adding Tools — How to expose module functionality as agent tools
- Adding Flows — How to build new graph flows
- Deployment — Docker setup, environment variables, and security