AI Configuration
The AI Configuration page (/ai/config) is where you set up the AI providers that power all AI features — chat, agents, order extraction, auto-responses, product descriptions, knowledge base AI, and voice transcription.
LLM Providers
LLM (Large Language Model) providers handle text-based AI operations — chat, classification, extraction, auto-responses, and agent workflows.
Adding a Provider
- Go to AI Operations → AI Configuration
- Click Add Provider
- Choose from presets or configure a custom provider
Available Presets
| Provider | Models | Best For |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo | General purpose, embeddings, most reliable |
| Claude (Anthropic) | claude-sonnet-4, claude-3.5-haiku | Advanced reasoning, long context |
| Groq | llama-3.3-70b, mixtral-8x7b | Ultra-fast inference, cost-effective |
| Together AI | Llama 3.3 70B, Mixtral | Open-source models, good pricing |
| Fireworks AI | Llama 3.3 70B | Fast open-source inference |
| OpenRouter | Access to 100+ models | Multi-model gateway |
| DeepSeek | deepseek-chat, deepseek-reasoner | Reasoning, very cheap |
| Mistral | mistral-large, mistral-small, codestral | European AI, code generation |
| Ollama | llama3, mistral, codellama | Local/self-hosted, free |
| Google Gemini | gemini-2.0-flash, gemini-1.5-pro | Multimodal, long context |
Custom Providers
Any OpenAI-compatible API works. Click Add Provider → Custom Provider and provide:
- Provider ID — unique identifier (lowercase, e.g., my-provider)
- Name — display name
- Type — openai_compatible or anthropic
- Base URL — API endpoint (e.g., https://api.example.com/v1)
- Models — available model names
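As an illustration, a custom provider registration might look like the payload below. The JSON key names are assumptions based on the UI field labels above; check the POST /ai/providers schema for the exact keys.

```python
# Hypothetical request body for POST /api/v1/ai/providers, mirroring the
# custom-provider fields listed above. Key names are illustrative.
custom_provider = {
    "provider_id": "my-provider",          # unique, lowercase
    "name": "My Provider",                 # display name
    "type": "openai_compatible",           # or "anthropic"
    "base_url": "https://api.example.com/v1",
    "models": ["my-model-large", "my-model-small"],
}
```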
Credentials
After adding a provider, expand its card and enter:
- API Key — your provider's API key (stored encrypted)
- Model — which model to use by default
Click Test Connection to verify before saving.
Default & Fallback
Under General Settings:
- Default Provider — used for all AI operations unless overridden
- Fallback Provider — automatically used when the default fails (API errors, rate limits)
If your default provider has an outage — say, OpenAI goes down — AI features continue working via the fallback provider.
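The default/fallback behavior can be sketched as a simple try-then-retry wrapper. This is an illustration of the concept, not the platform's actual implementation; the provider callables here are stand-ins.

```python
def complete_with_fallback(prompt, default, fallback):
    """Try the default provider; on failure (API error, rate limit), use the fallback."""
    try:
        return default(prompt)
    except Exception:
        if fallback is None:
            raise  # no fallback configured: surface the original error
        return fallback(prompt)

# Stand-in providers for illustration:
def flaky(prompt):
    raise RuntimeError("rate limited")

def backup(prompt):
    return f"ok: {prompt}"

print(complete_with_fallback("hello", flaky, backup))  # falls through to backup
```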
Speech-to-Text Providers
STT providers handle voice transcription — converting audio to text for the voice assistant and voice-based order creation.
Available STT Presets
| Provider | Models | Pricing | Features |
|---|---|---|---|
| Voxtral (Mistral) | voxtral-mini, voxtral-small | $0.003-0.006/min | 13 languages, speaker diarization |
| Whisper (OpenAI) | whisper-1 | $0.006/min | 99 languages, most accurate |
| Whisper (Groq) | whisper-large-v3-turbo, whisper-large-v3 | Free tier available | Ultra-fast via Groq hardware |
Adding an STT Provider
- On the AI Configuration page, scroll to Speech-to-Text Providers
- Click Add STT Provider
- Select a preset (Voxtral, Whisper OpenAI, or Whisper Groq)
- Configure credentials (API key + model selection)
STT Configuration
- Default STT Provider — used by the voice assistant and transcription endpoints
- Fallback STT Provider — used when the default fails
Audio Limits
| Setting | Value |
|---|---|
| Max file size | 25 MB |
| Max duration | 5 minutes |
| Supported formats | m4a, mp3, wav, webm, ogg, flac, mp4 |
| Sample rate | 16kHz mono (optimized for voice) |
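Checking a clip against these limits client-side, before uploading, avoids a round-trip rejection. A minimal sketch using the values from the table (the duration check assumes you already know the clip length):

```python
# Client-side validation against the audio limits above.
MAX_BYTES = 25 * 1024 * 1024   # 25 MB
MAX_SECONDS = 5 * 60           # 5 minutes
FORMATS = {"m4a", "mp3", "wav", "webm", "ogg", "flac", "mp4"}

def validate_audio(filename, size_bytes, duration_s):
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in FORMATS:
        return f"unsupported format: {ext}"
    if size_bytes > MAX_BYTES:
        return "file exceeds 25 MB"
    if duration_s > MAX_SECONDS:
        return "clip exceeds 5 minutes"
    return "ok"

print(validate_audio("note.mp3", 1_000_000, 30))   # ok
print(validate_audio("clip.avi", 1_000_000, 30))   # unsupported format: avi
```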
Text-to-Speech Providers
TTS providers handle voice synthesis — converting AI text responses into spoken audio for the voice assistant.
Available TTS Presets
| Provider | Voices | Pricing | Features |
|---|---|---|---|
| Voxtral (Mistral) | 30+ voices | Usage-based | Multi-language, streaming support |
| OpenAI | alloy, echo, fable, onyx, nova, shimmer | $15/1M chars | 6 built-in voices, high quality |
| ElevenLabs | 100+ voices | $5-330/mo | Voice cloning, most natural |
| Google Cloud TTS | 400+ voices | $4-16/1M chars | WaveNet & Neural2, SSML support |
Adding a TTS Provider
- On the AI Configuration page, scroll to Text-to-Speech Providers
- Click Add TTS Provider
- Select a preset and configure credentials
- Use Browse Voices to preview and select a default voice
Voice Selection
Each TTS provider offers different voices. After adding a provider, you can:
- Browse available voices via GET /ai/tts/voices
- Set a default voice via PUT /ai/tts/voice/default
- Voices are stored per-tenant in settings
TTS Configuration
- Default TTS Provider — used by the voice assistant for AI responses
- Fallback TTS Provider — used when the default fails
- Default Voice — the voice used for synthesis (provider-specific voice ID)
Synthesis Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /ai/tts/synthesize | Synthesize text to audio (returns base64) |
| POST | /ai/tts/synthesize/stream | Streaming synthesis (chunked response) |
| GET | /ai/tts/voices | List available voices for current provider |
| GET | /ai/tts/voice/default | Get current default voice setting |
| PUT | /ai/tts/voice/default | Set default voice |
Synthesis request:
POST /api/v1/ai/tts/synthesize
{
"text": "Your order has been shipped and will arrive tomorrow.",
"format": "wav",
"voice_id": "alloy"
}
Response:
{
"audio_data": "UklGRi...",
"format": "wav",
"duration_ms": 2340
}
Supported formats: wav (fastest), mp3, opus, flac.
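Since the synthesize endpoint returns audio as base64, the client must decode it before playback or saving. The response above truncates the payload, so this sketch uses a tiny stand-in:

```python
import base64

# A stand-in for the synthesize response shown above (audio_data is truncated
# in the example, so a fake payload is used here).
response = {
    "audio_data": base64.b64encode(b"RIFF....WAVEfmt ").decode("ascii"),
    "format": "wav",
    "duration_ms": 2340,
}

# Decode back to raw bytes; write them out with the returned format as the extension.
audio_bytes = base64.b64decode(response["audio_data"])
```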
Translation
The AI module includes an LLM-powered translation service that reuses your already-configured AI providers — no separate translation API keys needed. Just select which provider and model to use for translation.
How It Works
Translation uses an LLM with a low-temperature (0.1) system prompt optimized for precise, tone-preserving translation. This approach supports 44+ languages and handles e-commerce content well — preserving order numbers, URLs, brand names, and formatting.
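Conceptually, each translation call is an ordinary chat completion with temperature 0.1 and a system prompt that pins down the source/target languages and the preservation rules. A minimal sketch — the exact prompt wording and request shape are assumptions, not the service's actual prompt:

```python
def build_translation_request(text, target, source=None,
                              model="google/gemma-4-26b-a4b-it"):
    """Assemble a chat-completion request for translation (illustrative)."""
    src = source or "the detected source language"
    return {
        "model": model,
        "temperature": 0.1,  # low temperature for precise, tone-preserving output
        "messages": [
            {"role": "system",
             "content": (f"Translate from {src} to {target}. Preserve order "
                         "numbers, URLs, brand names, and formatting. "
                         "Output only the translation.")},
            {"role": "user", "content": text},
        ],
    }

req = build_translation_request("Your order has shipped.", "fr")
```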
Configuration
On the AI Configuration page, scroll to the Translation section:
- Translation Provider — select from your configured LLM providers
- Translation Model — optionally override the model (leave empty for provider default)
Or via API:
PUT /api/v1/ai/translate/config
{
"provider_id": "openrouter",
"model": "google/gemma-4-26b-a4b-it"
}
Recommended Models
| Model | Provider | Input/M | Output/M | Best For |
|---|---|---|---|---|
| Gemma 4 26B A4B | OpenRouter | $0.12 | $0.40 | Best quality-to-cost ratio, 140+ languages, fast (MoE) |
| Gemma 3 27B | OpenRouter | $0.08 | $0.16 | Cheapest option, good quality, fastest |
| DeepSeek V3 | OpenRouter | $0.20 | $0.77 | Best for Chinese, high quality across all languages |
| Mistral Small 4 | OpenRouter/Mistral | $0.15 | $0.60 | Strong European languages |
Note: Gemma 4 26B A4B uses a Mixture-of-Experts architecture (only 4B params active per token), making it both fast and cost-efficient while maintaining translation quality comparable to much larger models.
Translation Endpoints
| Method | Path | Permission | Description |
|---|---|---|---|
| POST | /ai/translate | ai.view | Translate text to a target language |
| POST | /ai/translate/batch | ai.view | Translate to multiple languages at once |
| POST | /ai/translate/detect | ai.view | Detect the language of text |
| GET | /ai/translate/languages | ai.view | List 44 supported languages |
| GET | /ai/translate/config | ai.view | Get current translation config |
| PUT | /ai/translate/config | ai.configure | Update translation provider + model |
Translate Text
POST /api/v1/ai/translate
{
"text": "Your order has been shipped and will arrive tomorrow.",
"target": "ml",
"source": "en"
}
Response:
{
"data": {
"translated_text": "നിങ്ങളുടെ ഓർഡർ അയച്ചുവിട്ടു, നാളെ എത്തും.",
"source_language": "en",
"target_language": "ml",
"provider": "openrouter",
"model": "google/gemma-4-26b-a4b-it",
"usage": { "input_tokens": 128, "output_tokens": 32 }
}
}
The source field is optional — if omitted, the model auto-detects the source language.
Batch Translation
Translate to multiple languages in one call (max 10):
POST /api/v1/ai/translate/batch
{
"text": "Hello! 20% discount on all products this weekend.",
"targets": ["es", "fr", "de", "hi", "ar"]
}
Language Detection
POST /api/v1/ai/translate/detect
{ "text": "Bonjour, comment allez-vous?" }
{
"data": {
"language": "fr",
"language_name": "French",
"confidence": 0.99
}
}
Supported Languages
44 languages including English, Spanish, French, German, Italian, Portuguese, Russian, Japanese, Korean, Chinese (Simplified/Traditional), Arabic, Hindi, Bengali, Tamil, Telugu, Malayalam, Kannada, Marathi, Gujarati, Punjabi, Urdu, Thai, Vietnamese, Indonesian, Malay, Turkish, Polish, Ukrainian, Swedish, Danish, Norwegian, Finnish, Greek, Hebrew, Czech, Romanian, Hungarian, Bulgarian, Croatian, Slovak, Swahili, and Afrikaans.
The LLM can also translate to languages outside this list — these are just the ones with ISO 639-1 codes in the dropdown.
API Endpoints
LLM Provider Management
| Method | Path | Description |
|---|---|---|
| GET | /ai/providers | List all LLM providers with status |
| POST | /ai/providers | Add a new provider |
| PUT | /ai/providers/{id} | Update provider metadata |
| DELETE | /ai/providers/{id} | Remove provider |
| POST | /ai/providers/{id}/test | Test connection |
| PUT | /ai/providers/{id}/credentials | Save API key + model |
| GET | /ai/presets | List available presets |
| GET | /ai/config | Get current settings |
| PUT | /ai/config | Update default/fallback/tracking settings |
STT Provider Management
| Method | Path | Description |
|---|---|---|
| GET | /ai/stt/providers | List STT providers with status |
| POST | /ai/stt/providers | Add STT provider from preset |
| DELETE | /ai/stt/providers/{id} | Remove STT provider |
| PUT | /ai/stt/providers/{id}/credentials | Save STT API key + model |
| GET | /ai/stt/presets | List STT presets |
| PUT | /ai/stt/config | Update STT default/fallback |
TTS Provider Management
| Method | Path | Description |
|---|---|---|
| GET | /ai/tts/providers | List TTS providers with status |
| POST | /ai/tts/providers | Add TTS provider from preset |
| DELETE | /ai/tts/providers/{id} | Remove TTS provider |
| PUT | /ai/tts/providers/{id}/credentials | Save TTS API key + model |
| GET | /ai/tts/presets | List TTS presets |
| PUT | /ai/tts/config | Update TTS default/fallback |
Which Provider Do I Need?
| Feature | Requires | Recommendation |
|---|---|---|
| Agent Chat | LLM provider | OpenAI (gpt-4o) or Claude |
| Order Extraction | LLM provider | Any — works with all |
| Auto-Responses | LLM provider | Any — works with all |
| Product Descriptions | LLM provider | Claude or GPT-4o (best writing) |
| RAG Embeddings | OpenAI-compatible LLM | OpenAI (text-embedding-3-small) |
| Voice Transcription | STT provider | Groq Whisper (free + fast) or Voxtral |
| Voice Responses | TTS provider | OpenAI (consistent) or ElevenLabs (natural) |
| Translation | LLM provider | Gemma 4 26B A4B (best value) or DeepSeek V3 (quality) |
| Agent Graph Flows | LLM provider | OpenAI or Claude (best reasoning) |
Note: RAG embeddings require an OpenAI-compatible provider specifically. Anthropic (Claude) does not support embeddings. If Claude is your only LLM provider, add a second provider (even a free Groq account works) for embeddings.
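The table above reduces to a simple feature-to-provider-type mapping, which can be used to check which provider slots a tenant still needs to fill. The feature keys and type labels below are illustrative, not API identifiers:

```python
# Which provider type each feature needs (from the table above; names illustrative).
REQUIRES = {
    "agent_chat": "llm",
    "order_extraction": "llm",
    "auto_responses": "llm",
    "product_descriptions": "llm",
    "rag_embeddings": "llm",   # must additionally be OpenAI-compatible (not Claude)
    "voice_transcription": "stt",
    "voice_responses": "tts",
    "translation": "llm",
    "agent_graph_flows": "llm",
}

def missing_for(feature, configured_types):
    """Return the provider type still needed for `feature`, or None if covered."""
    need = REQUIRES[feature]
    return None if need in configured_types else need

print(missing_for("voice_responses", {"llm", "stt"}))  # tts
```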