AI Agent Service

AutoCom's AI Agent Service is a Python microservice that provides multi-step AI agent orchestration with tool use, conversation memory, RAG-powered knowledge retrieval, and graph-based workflows with human-in-the-loop approval gates.

Why a Separate Service?

The existing Laravel AI module handles simple, single-call operations well — classify text, extract an order, generate a product description. These stay in PHP.

The agent service handles complex operations where:

  • An AI needs to call multiple tools, reason about results, and decide next steps
  • Conversations need to persist across requests (memory)
  • Responses should be grounded in your knowledge base articles (RAG)
  • Multi-step workflows need human approval gates (graph flows)
  • Different tenants need different AI configurations

Architecture

Client → Laravel API → Agent Service (Python) → LLM (OpenAI/Claude/Groq/...)
              ↑               ↓
              └─ tool callbacks → ModuleApiBus → Orders/Products/Customers
                             → Qdrant (vector search for KB articles)

Key design principle: The Python service is stateless regarding tenant configuration. Laravel sends a complete TenantContext payload with every request — provider credentials, available tools, permissions, callback URLs. The Python service executes and discards. No tenant config is persisted in Python.

What Lives Where

  Laravel AI Module (unchanged)          Python Agent Service (new)
  -------------------------------------  -----------------------------------------
  Provider CRUD, credential storage      Multi-step agent chat with tools + memory
  Chat playground (single-turn)          RAG-augmented conversations
  Order extraction, classification       Graph flows (NDR, fraud, refund)
  Product description generation         Human-in-the-loop workflows
  KB article CRUD                        Semantic KB search (vector embeddings)
  LLM-powered translation (44+ langs)    Cross-conversation memory
  Usage logging (AIUsageLog)             Agent role management
  Workflow nodes (ai:chat, etc.)

Docker Services

# In docker-compose.prod.yml
qdrant:     # Vector store for RAG embeddings (port 6333 internal)
agent:      # PydanticAI agent service (port 8100 internal)

Both services run on the internal Docker network — not exposed externally. Laravel calls http://agent:8100 directly.
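The compose fragment above can be fleshed out roughly as follows. This is a sketch only: the image name, volume name, and network label are assumptions, not the project's actual configuration. The key point is that neither service declares a ports: mapping, so nothing is reachable from outside the Docker network.

```yaml
# Sketch only -- image, volume, and network names are assumptions.
services:
  qdrant:
    image: qdrant/qdrant
    networks: [internal]            # no ports: section -> not exposed externally
    volumes:
      - qdrant_data:/qdrant/storage

  agent:
    build: ./agent
    networks: [internal]
    environment:
      QDRANT_URL: http://qdrant:6333
    # Laravel reaches this service at http://agent:8100 on the internal network
```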

Core Concepts

TenantContext

Every request from Laravel carries a TenantContext — the complete tenant configuration:

{
  "tenant_id": "acme-corp",
  "tenant_name": "Acme Corp",
  "user_id": "user-uuid",
  "user_permissions": ["orders.view", "orders.create", "ai.view"],

  "providers": [{
    "id": "openai",
    "type": "openai_compatible",
    "base_url": "https://api.openai.com/v1",
    "api_key": "sk-...",
    "model": "gpt-4o"
  }],
  "default_provider_id": "openai",

  "agent": {
    "role": "support",
    "system_prompt": "You are a support agent for Acme Corp...",
    "allowed_tools": [
      {"name": "orders.get", "module": "Orders", "description": "Get order by ID"}
    ],
    "temperature": 0.3
  },

  "callback": {
    "base_url": "http://app:9000/api/v1/internal/agent",
    "token": "acme-corp:1711234567:hmac_signature"
  }
}

The Python service uses this to:

  1. Create a PydanticAI model instance with the tenant's API key
  2. Generate tools filtered by the tenant's permissions
  3. Set the system prompt for the active role
  4. Know where to call back for tool execution and data persistence

Tool Bridge

When an agent decides to call a tool (e.g., "look up order ORD-123"), the request flows back to Laravel:

Agent decides to call orders.get
  → Python POSTs to http://app:9000/api/v1/internal/agent/tools/orders.get
  → Laravel validates callback token
  → Laravel calls Bus::call('Orders', 'orders.get', $params) in tenant context
  → Result flows back to Python → agent continues reasoning

This keeps all database access, permission enforcement, and business logic in Laravel.

Agent Roles

Roles define what an agent can do — system prompt, allowed tools, temperature, and LLM provider:

  Role        Purpose             Tools                    Temperature
  ----------  ------------------  -----------------------  -----------
  support     Customer inquiries  orders.get, customers.*  0.3
  operations  Order management    orders.*                 0.1
  analytics   Business questions  orders.getStats          0.2

Tenants can customize roles and create new ones via the API.
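Conceptually, a tenant override starts from a built-in role and replaces selected fields. The sketch below illustrates that merge; the role shapes and field names are assumptions for the example, not the actual API payload:

```python
# Hypothetical built-in role definitions -- illustrative only.
BASE_ROLES = {
    "support": {
        "system_prompt": "You are a support agent...",
        "allowed_tools": ["orders.get", "customers.*"],
        "temperature": 0.3,
    },
}

def customize_role(base: str, overrides: dict) -> dict:
    """Start from a built-in role and apply tenant-specific overrides."""
    role = dict(BASE_ROLES[base])
    role.update(overrides)
    return role

# A tenant tightens the support role: lower temperature, fewer tools.
role = customize_role("support", {"temperature": 0.1,
                                  "allowed_tools": ["orders.get"]})
print(role["temperature"])  # → 0.1
```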

In This Section

  • Configuration — Set up LLM, STT, and TTS providers, credentials, and defaults
  • Chat & Streaming — Agent chat with tool use, conversation memory, and SSE streaming
  • Voice Assistant — Voice-powered AI with STT, TTS, and Voice Activity Detection
  • RAG & Knowledge — Vector embeddings, semantic search, and KB-grounded responses
  • Graph Flows — Multi-step workflows with human-in-the-loop
  • Roles & Tools — Agent role configuration, tool permissions, and RBAC
  • Adding Tools — How to expose module functionality as agent tools
  • Adding Flows — How to build new graph flows
  • Deployment — Docker setup, environment variables, and security