AI Configuration

The AI Configuration page (/ai/config) is where you set up the AI providers that power all AI features — chat, agents, order extraction, auto-responses, product descriptions, knowledge base AI, and voice transcription.

LLM Providers

LLM (Large Language Model) providers handle text-based AI operations — chat, classification, extraction, auto-responses, and agent workflows.

Adding a Provider

  1. Go to AI Operations → AI Configuration
  2. Click Add Provider
  3. Choose from presets or configure a custom provider

Available Presets

| Provider | Models | Best For |
| --- | --- | --- |
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo | General purpose, embeddings, most reliable |
| Claude (Anthropic) | claude-sonnet-4, claude-3.5-haiku | Advanced reasoning, long context |
| Groq | llama-3.3-70b, mixtral-8x7b | Ultra-fast inference, cost-effective |
| Together AI | Llama 3.3 70B, Mixtral | Open-source models, good pricing |
| Fireworks AI | Llama 3.3 70B | Fast open-source inference |
| OpenRouter | Access to 100+ models | Multi-model gateway |
| DeepSeek | deepseek-chat, deepseek-reasoner | Reasoning, very cheap |
| Mistral | mistral-large, mistral-small, codestral | European AI, code generation |
| Ollama | llama3, mistral, codellama | Local/self-hosted, free |
| Google Gemini | gemini-2.0-flash, gemini-1.5-pro | Multimodal, long context |

Custom Providers

Any OpenAI-compatible API works. Click Add Provider → Custom Provider and provide:

  • Provider ID — unique identifier (lowercase, e.g., my-provider)
  • Name — display name
  • Type — openai_compatible or anthropic
  • Base URL — API endpoint (e.g., https://api.example.com/v1)
  • Models — available model names
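Taken together, the fields above might map to a request body like the following for POST /ai/providers. This is a sketch: the exact field names are assumptions based on the bullet list, not a confirmed schema.

```json
{
  "id": "my-provider",
  "name": "My Provider",
  "type": "openai_compatible",
  "base_url": "https://api.example.com/v1",
  "models": ["my-model-large", "my-model-small"]
}
```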

Credentials

After adding a provider, expand its card and enter:

  • API Key — your provider's API key (stored encrypted)
  • Model — which model to use by default

Click Test Connection to verify before saving.

Default & Fallback

Under General Settings:

  • Default Provider — used for all AI operations unless overridden
  • Fallback Provider — automatically used when the default fails (API errors, rate limits)

This means if OpenAI goes down, your AI features continue working via the fallback provider.
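The default/fallback behavior can be pictured as a simple try-then-retry pattern. This is an illustrative sketch, not the actual implementation; the provider names and the call_llm interface are assumptions.

```python
# Sketch of default -> fallback routing: try the default provider,
# and on any provider error retry transparently on the fallback.

class ProviderError(Exception):
    """Raised when a provider call fails (API error, rate limit, timeout)."""

def call_llm(provider: str, prompt: str) -> str:
    # Placeholder for a real provider call; here the default always fails
    # so the fallback path is exercised.
    if provider == "openai":
        raise ProviderError("rate limited")
    return f"[{provider}] response to: {prompt}"

def complete(prompt: str, default: str = "openai", fallback: str = "groq") -> str:
    try:
        return call_llm(default, prompt)
    except ProviderError:
        # Default failed -> the caller never sees the error; the
        # fallback provider serves the request instead.
        return call_llm(fallback, prompt)

print(complete("Summarize this order"))
```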

Speech-to-Text Providers

STT providers handle voice transcription — converting audio to text for the voice assistant and voice-based order creation.

Available STT Presets

| Provider | Models | Pricing | Features |
| --- | --- | --- | --- |
| Voxtral (Mistral) | voxtral-mini, voxtral-small | $0.003-0.006/min | 13 languages, speaker diarization |
| Whisper (OpenAI) | whisper-1 | $0.006/min | 99 languages, most accurate |
| Whisper (Groq) | whisper-large-v3-turbo, whisper-large-v3 | Free tier available | Ultra-fast via Groq hardware |

Adding an STT Provider

  1. On the AI Configuration page, scroll to Speech-to-Text Providers
  2. Click Add STT Provider
  3. Select a preset (Voxtral, Whisper OpenAI, or Whisper Groq)
  4. Configure credentials (API key + model selection)
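Step 4 might correspond to a request like PUT /ai/stt/providers/{id}/credentials with a body along these lines. The field names and the provider id are assumptions for illustration.

```json
{
  "api_key": "gsk_...",
  "model": "whisper-large-v3-turbo"
}
```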

STT Configuration

  • Default STT Provider — used by the voice assistant and transcription endpoints
  • Fallback STT Provider — used when the default fails

Audio Limits

| Setting | Value |
| --- | --- |
| Max file size | 25 MB |
| Max duration | 5 minutes |
| Supported formats | m4a, mp3, wav, webm, ogg, flac, mp4 |
| Sample rate | 16 kHz mono (optimized for voice) |
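A client can pre-check uploads against these limits before calling the API. The following is an illustrative helper, not part of the product API; it checks size and format (duration would require decoding the audio, which is omitted here).

```python
# Pre-flight check mirroring the audio limits table above.
import os

MAX_BYTES = 25 * 1024 * 1024  # 25 MB
SUPPORTED = {"m4a", "mp3", "wav", "webm", "ogg", "flac", "mp4"}

def check_audio(path: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in SUPPORTED:
        problems.append(f"unsupported format: {ext}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes (max {MAX_BYTES})")
    return problems

print(check_audio("note.aiff", 30 * 1024 * 1024))
```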

Text-to-Speech Providers

TTS providers handle voice synthesis — converting AI text responses into spoken audio for the voice assistant.

Available TTS Presets

| Provider | Voices | Pricing | Features |
| --- | --- | --- | --- |
| Voxtral (Mistral) | 30+ voices | Usage-based | Multi-language, streaming support |
| OpenAI | alloy, echo, fable, onyx, nova, shimmer | $15/1M chars | 6 built-in voices, high quality |
| ElevenLabs | 100+ voices | $5-330/mo | Voice cloning, most natural |
| Google Cloud TTS | 400+ voices | $4-16/1M chars | WaveNet & Neural2, SSML support |

Adding a TTS Provider

  1. On the AI Configuration page, scroll to Text-to-Speech Providers
  2. Click Add TTS Provider
  3. Select a preset and configure credentials
  4. Use Browse Voices to preview and select a default voice

Voice Selection

Each TTS provider offers different voices. After adding a provider, you can:

  • Browse available voices via GET /ai/tts/voices
  • Set a default voice via PUT /ai/tts/voice/default
  • Keep per-tenant defaults — selected voices are stored per-tenant in settings

TTS Configuration

  • Default TTS Provider — used by the voice assistant for AI responses
  • Fallback TTS Provider — used when the default fails
  • Default Voice — the voice used for synthesis (provider-specific voice ID)

Synthesis Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /ai/tts/synthesize | Synthesize text to audio (returns base64) |
| POST | /ai/tts/synthesize/stream | Streaming synthesis (chunked response) |
| GET | /ai/tts/voices | List available voices for current provider |
| GET | /ai/tts/voice/default | Get current default voice setting |
| PUT | /ai/tts/voice/default | Set default voice |

Synthesis request:

POST /api/v1/ai/tts/synthesize
{
  "text": "Your order has been shipped and will arrive tomorrow.",
  "format": "wav",
  "voice_id": "alloy"
}

Response:

{
  "audio_data": "UklGRi...",
  "format": "wav",
  "duration_ms": 2340
}

Supported formats: wav (fastest), mp3, opus, flac.
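Since the synthesize endpoint returns base64-encoded audio, the client must decode it before playback. A minimal sketch, with the response dict hard-coded in place of a real HTTP call (the audio bytes here are a stand-in, not real audio):

```python
# Decode the base64 audio_data field from a synthesize response
# and write it to a playable file.
import base64

response = {
    "audio_data": base64.b64encode(b"RIFF....WAVEfmt ").decode(),  # stand-in payload
    "format": "wav",
    "duration_ms": 2340,
}

audio_bytes = base64.b64decode(response["audio_data"])
with open(f"reply.{response['format']}", "wb") as f:
    f.write(audio_bytes)
```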

Translation

The AI module includes an LLM-powered translation service that reuses your already-configured AI providers — no separate translation API keys needed. Just select which provider and model to use for translation.

How It Works

Translation uses an LLM with a low-temperature (0.1) system prompt optimized for precise, tone-preserving translation. This approach supports 44+ languages and handles e-commerce content well — preserving order numbers, URLs, brand names, and formatting.

Configuration

On the AI Configuration page, scroll to the Translation section:

  1. Translation Provider — select from your configured LLM providers
  2. Translation Model — optionally override the model (leave empty for provider default)

Or via API:

PUT /api/v1/ai/translate/config

{
  "provider_id": "openrouter",
  "model": "google/gemma-4-26b-a4b-it"
}

Recommended Models

| Model | Provider | Input/M | Output/M | Best For |
| --- | --- | --- | --- | --- |
| Gemma 4 26B A4B | OpenRouter | $0.12 | $0.40 | Best quality-to-cost ratio, 140+ languages, fast (MoE) |
| Gemma 3 27B | OpenRouter | $0.08 | $0.16 | Cheapest option, good quality, fastest |
| DeepSeek V3 | OpenRouter | $0.20 | $0.77 | Best for Chinese, high quality across all languages |
| Mistral Small 4 | OpenRouter/Mistral | $0.15 | $0.60 | Strong European languages |

Note: Gemma 4 26B A4B uses a Mixture-of-Experts architecture (only 4B params active per token), making it both fast and cost-efficient while maintaining translation quality comparable to much larger models.

Translation Endpoints

| Method | Path | Permission | Description |
| --- | --- | --- | --- |
| POST | /ai/translate | ai.view | Translate text to a target language |
| POST | /ai/translate/batch | ai.view | Translate to multiple languages at once |
| POST | /ai/translate/detect | ai.view | Detect the language of text |
| GET | /ai/translate/languages | ai.view | List 44 supported languages |
| GET | /ai/translate/config | ai.view | Get current translation config |
| PUT | /ai/translate/config | ai.configure | Update translation provider + model |

Translate Text

POST /api/v1/ai/translate

{
  "text": "Your order has been shipped and will arrive tomorrow.",
  "target": "ml",
  "source": "en"
}

Response:

{
  "data": {
    "translated_text": "നിങ്ങളുടെ ഓർഡർ അയച്ചുവിട്ടു, നാളെ എത്തും.",
    "source_language": "en",
    "target_language": "ml",
    "provider": "openrouter",
    "model": "google/gemma-4-26b-a4b-it",
    "usage": { "input_tokens": 128, "output_tokens": 32 }
  }
}

The source field is optional — if omitted, the model auto-detects the source language.

Batch Translation

Translate to multiple languages in one call (max 10):

POST /api/v1/ai/translate/batch

{
  "text": "Hello! 20% discount on all products this weekend.",
  "targets": ["es", "fr", "de", "hi", "ar"]
}

Language Detection

POST /api/v1/ai/translate/detect

{ "text": "Bonjour, comment allez-vous?" }

Response:

{
  "data": {
    "language": "fr",
    "language_name": "French",
    "confidence": 0.99
  }
}

Supported Languages

44 languages including English, Spanish, French, German, Italian, Portuguese, Russian, Japanese, Korean, Chinese (Simplified/Traditional), Arabic, Hindi, Bengali, Tamil, Telugu, Malayalam, Kannada, Marathi, Gujarati, Punjabi, Urdu, Thai, Vietnamese, Indonesian, Malay, Turkish, Polish, Ukrainian, Swedish, Danish, Norwegian, Finnish, Greek, Hebrew, Czech, Romanian, Hungarian, Bulgarian, Croatian, Slovak, Swahili, and Afrikaans.

The LLM can also translate to languages outside this list — these are just the ones with ISO 639-1 codes in the dropdown.

API Endpoints

LLM Provider Management

| Method | Path | Description |
| --- | --- | --- |
| GET | /ai/providers | List all LLM providers with status |
| POST | /ai/providers | Add a new provider |
| PUT | /ai/providers/{id} | Update provider metadata |
| DELETE | /ai/providers/{id} | Remove provider |
| POST | /ai/providers/{id}/test | Test connection |
| PUT | /ai/providers/{id}/credentials | Save API key + model |
| GET | /ai/presets | List available presets |
| GET | /ai/config | Get current settings |
| PUT | /ai/config | Update default/fallback/tracking settings |
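For example, the default/fallback pair from General Settings might be set via PUT /api/v1/ai/config with a body like this (the field names are assumed, not a confirmed schema):

```json
{
  "default_provider": "openai",
  "fallback_provider": "groq"
}
```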

STT Provider Management

| Method | Path | Description |
| --- | --- | --- |
| GET | /ai/stt/providers | List STT providers with status |
| POST | /ai/stt/providers | Add STT provider from preset |
| DELETE | /ai/stt/providers/{id} | Remove STT provider |
| PUT | /ai/stt/providers/{id}/credentials | Save STT API key + model |
| GET | /ai/stt/presets | List STT presets |
| PUT | /ai/stt/config | Update STT default/fallback |

TTS Provider Management

| Method | Path | Description |
| --- | --- | --- |
| GET | /ai/tts/providers | List TTS providers with status |
| POST | /ai/tts/providers | Add TTS provider from preset |
| DELETE | /ai/tts/providers/{id} | Remove TTS provider |
| PUT | /ai/tts/providers/{id}/credentials | Save TTS API key + model |
| GET | /ai/tts/presets | List TTS presets |
| PUT | /ai/tts/config | Update TTS default/fallback |

Which Provider Do I Need?

| Feature | Requires | Recommendation |
| --- | --- | --- |
| Agent Chat | LLM provider | OpenAI (gpt-4o) or Claude |
| Order Extraction | LLM provider | Any — works with all |
| Auto-Responses | LLM provider | Any — works with all |
| Product Descriptions | LLM provider | Claude or GPT-4o (best writing) |
| RAG Embeddings | OpenAI-compatible LLM | OpenAI (text-embedding-3-small) |
| Voice Transcription | STT provider | Groq Whisper (free + fast) or Voxtral |
| Voice Responses | TTS provider | OpenAI (consistent) or ElevenLabs (natural) |
| Translation | LLM provider | Gemma 4 26B A4B (best value) or DeepSeek V3 (quality) |
| Agent Graph Flows | LLM provider | OpenAI or Claude (best reasoning) |

Note: RAG embeddings require an OpenAI-compatible provider specifically. Anthropic (Claude) does not support embeddings. If Claude is your only LLM provider, add a second provider (even a free Groq account works) for embeddings.
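The Claude-plus-embeddings split can be sketched as a small routing rule: chat stays on the default provider, while embeddings are redirected to any OpenAI-compatible one. The provider names and the routing function are illustrative, not the actual implementation.

```python
# Route embeddings away from Anthropic, which has no embeddings API,
# while chat keeps using the configured default.

PROVIDERS = {
    "claude": {"type": "anthropic"},
    "groq": {"type": "openai_compatible"},
}

def pick_provider(operation: str, default: str = "claude") -> str:
    if operation == "embeddings" and PROVIDERS[default]["type"] == "anthropic":
        compatible = [p for p, cfg in PROVIDERS.items()
                      if cfg["type"] == "openai_compatible"]
        if not compatible:
            raise RuntimeError("add an OpenAI-compatible provider for RAG embeddings")
        return compatible[0]
    return default

print(pick_provider("chat"))        # claude
print(pick_provider("embeddings"))  # groq
```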