AI Configuration
The AI Configuration page (/ai/config) is where you set up the AI providers that power all AI features — chat, agents, order extraction, auto-responses, product descriptions, knowledge base AI, and voice transcription.
LLM Providers
LLM (Large Language Model) providers handle text-based AI operations — chat, classification, extraction, auto-responses, and agent workflows.
Adding a Provider
- Go to AI Operations → AI Configuration
- Click Add Provider
- Choose from presets or configure a custom provider
Available Presets
| Provider | Models | Best For |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo | General purpose, embeddings, most reliable |
| Claude (Anthropic) | claude-sonnet-4, claude-3.5-haiku | Advanced reasoning, long context |
| Groq | llama-3.3-70b, mixtral-8x7b | Ultra-fast inference, cost-effective |
| Together AI | Llama 3.3 70B, Mixtral | Open-source models, good pricing |
| Fireworks AI | Llama 3.3 70B | Fast open-source inference |
| OpenRouter | Access to 100+ models | Multi-model gateway |
| DeepSeek | deepseek-chat, deepseek-reasoner | Reasoning, very cheap |
| Mistral | mistral-large, mistral-small, codestral | European AI, code generation |
| Ollama | llama3, mistral, codellama | Local/self-hosted, free |
| Google Gemini | gemini-2.0-flash, gemini-1.5-pro | Multimodal, long context |
Custom Providers
Any OpenAI-compatible API works. Click Add Provider → Custom Provider and provide:
- Provider ID — unique identifier (lowercase, e.g., my-provider)
- Name — display name
- Type — openai_compatible or anthropic
- Base URL — API endpoint (e.g., https://api.example.com/v1)
- Models — available model names
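As an illustration, a custom provider registration might look like the payload below. The JSON key names are assumptions based on the UI field labels above; check the POST /ai/providers schema for the exact keys.

```python
# Hypothetical request body for POST /api/v1/ai/providers, mirroring the
# custom-provider fields listed above. Key names are illustrative.
custom_provider = {
    "provider_id": "my-provider",          # unique, lowercase
    "name": "My Provider",                 # display name
    "type": "openai_compatible",           # or "anthropic"
    "base_url": "https://api.example.com/v1",
    "models": ["my-model-large", "my-model-small"],
}
```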
Credentials
After adding a provider, expand its card and enter:
- API Key — your provider's API key (stored encrypted)
- Model — which model to use by default
Click Test Connection to verify before saving.
Default & Fallback
Under General Settings:
- Default Provider — used for all AI operations unless overridden
- Fallback Provider — automatically used when the default fails (API errors, rate limits)
If your default provider has an outage — say, OpenAI goes down — AI features continue working via the fallback provider.
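The default/fallback behavior can be sketched as a simple try-then-retry wrapper. This is an illustration of the concept, not the platform's actual implementation; the provider callables here are stand-ins.

```python
def complete_with_fallback(prompt, default, fallback):
    """Try the default provider; on failure (API error, rate limit), use the fallback."""
    try:
        return default(prompt)
    except Exception:
        if fallback is None:
            raise  # no fallback configured: surface the original error
        return fallback(prompt)

# Stand-in providers for illustration:
def flaky(prompt):
    raise RuntimeError("rate limited")

def backup(prompt):
    return f"ok: {prompt}"

print(complete_with_fallback("hello", flaky, backup))  # falls through to backup
```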
Speech-to-Text Providers
STT providers handle voice transcription — converting audio to text for the voice assistant and voice-based order creation.
Available STT Presets
| Provider | Models | Pricing | Features |
|---|---|---|---|
| Voxtral (Mistral) | voxtral-mini, voxtral-small | $0.003-0.006/min | 13 languages, speaker diarization |
| Whisper (OpenAI) | whisper-1 | $0.006/min | 99 languages, most accurate |
| Whisper (Groq) | whisper-large-v3-turbo, whisper-large-v3 | Free tier available | Ultra-fast via Groq hardware |
Adding an STT Provider
- On the AI Configuration page, scroll to Speech-to-Text Providers
- Click Add STT Provider
- Select a preset (Voxtral, Whisper OpenAI, or Whisper Groq)
- Configure credentials (API key + model selection)
STT Configuration
- Default STT Provider — used by the voice assistant and transcription endpoints
- Fallback STT Provider — used when the default fails
Audio Limits
| Setting | Value |
|---|---|
| Max file size | 25 MB |
| Max duration | 5 minutes |
| Supported formats | m4a, mp3, wav, webm, ogg, flac, mp4 |
| Sample rate | 16kHz mono (optimized for voice) |
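Checking a clip against these limits client-side, before uploading, avoids a round-trip rejection. A minimal sketch using the values from the table (the duration check assumes you already know the clip length):

```python
# Client-side validation against the audio limits above.
MAX_BYTES = 25 * 1024 * 1024   # 25 MB
MAX_SECONDS = 5 * 60           # 5 minutes
FORMATS = {"m4a", "mp3", "wav", "webm", "ogg", "flac", "mp4"}

def validate_audio(filename, size_bytes, duration_s):
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in FORMATS:
        return f"unsupported format: {ext}"
    if size_bytes > MAX_BYTES:
        return "file exceeds 25 MB"
    if duration_s > MAX_SECONDS:
        return "clip exceeds 5 minutes"
    return "ok"

print(validate_audio("note.mp3", 1_000_000, 30))   # ok
print(validate_audio("clip.avi", 1_000_000, 30))   # unsupported format: avi
```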
Text-to-Speech Providers
TTS providers handle voice synthesis — converting AI text responses into spoken audio for the voice assistant.
Available TTS Presets
| Provider | Voices | Pricing | Features |
|---|---|---|---|
| Voxtral (Mistral) | 30+ voices | Usage-based | Multi-language, streaming support |
| OpenAI | alloy, echo, fable, onyx, nova, shimmer | $15/1M chars | 6 built-in voices, high quality |
| ElevenLabs | 100+ voices | $5-330/mo | Voice cloning, most natural |
| Google Cloud TTS | 400+ voices | $4-16/1M chars | WaveNet & Neural2, SSML support |
Adding a TTS Provider
- On the AI Configuration page, scroll to Text-to-Speech Providers
- Click Add TTS Provider
- Select a preset and configure credentials
- Use Browse Voices to preview and select a default voice
Voice Selection
Each TTS provider offers different voices. After adding a provider, you can:
- Browse available voices via GET /ai/tts/voices
- Set a default voice via PUT /ai/tts/voice/default
- Voices are stored per-tenant in settings
TTS Configuration
- Default TTS Provider — used by the voice assistant for AI responses
- Fallback TTS Provider — used when the default fails
- Default Voice — the voice used for synthesis (provider-specific voice ID)
Synthesis Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /ai/tts/synthesize | Synthesize text to audio (returns base64) |
| POST | /ai/tts/synthesize/stream | Streaming synthesis (chunked response) |
| GET | /ai/tts/voices | List available voices for current provider |
| GET | /ai/tts/voice/default | Get current default voice setting |
| PUT | /ai/tts/voice/default | Set default voice |
Synthesis request:
POST /api/v1/ai/tts/synthesize
{
"text": "Your order has been shipped and will arrive tomorrow.",
"format": "wav",
"voice_id": "alloy"
}
Response:
{
"audio_data": "UklGRi...",
"format": "wav",
"duration_ms": 2340
}
Supported formats: wav (fastest), mp3, opus, flac.
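Since the synthesize endpoint returns audio as base64, the client must decode it before playback or saving. The response above truncates the payload, so this sketch uses a tiny stand-in:

```python
import base64

# A stand-in for the synthesize response shown above (audio_data is truncated
# in the example, so a fake payload is used here).
response = {
    "audio_data": base64.b64encode(b"RIFF....WAVEfmt ").decode("ascii"),
    "format": "wav",
    "duration_ms": 2340,
}

# Decode back to raw bytes; write them out with the returned format as the extension.
audio_bytes = base64.b64decode(response["audio_data"])
```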
Translation
The AI module includes an LLM-powered translation service that reuses your already-configured AI providers — no separate translation API keys needed. Just select which provider and model to use for translation.
How It Works
Translation uses an LLM with a low-temperature (0.1) system prompt optimized for precise, tone-preserving translation. This approach supports 44+ languages and handles e-commerce content well — preserving order numbers, URLs, brand names, and formatting.
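Conceptually, each translation call is an ordinary chat completion with temperature 0.1 and a system prompt that pins down the source/target languages and the preservation rules. A minimal sketch — the exact prompt wording and request shape are assumptions, not the service's actual prompt:

```python
def build_translation_request(text, target, source=None,
                              model="google/gemma-4-26b-a4b-it"):
    """Assemble a chat-completion request for translation (illustrative)."""
    src = source or "the detected source language"
    return {
        "model": model,
        "temperature": 0.1,  # low temperature for precise, tone-preserving output
        "messages": [
            {"role": "system",
             "content": (f"Translate from {src} to {target}. Preserve order "
                         "numbers, URLs, brand names, and formatting. "
                         "Output only the translation.")},
            {"role": "user", "content": text},
        ],
    }

req = build_translation_request("Your order has shipped.", "fr")
```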
Configuration
On the AI Configuration page, scroll to the Translation section:
- Translation Provider — select from your configured LLM providers
- Translation Model — optionally override the model (leave empty for provider default)
Or via API:
PUT /api/v1/ai/translate/config
{
"provider_id": "openrouter",
"model": "google/gemma-4-26b-a4b-it"
}
Recommended Models
| Model | Provider | Input/M | Output/M | Best For |
|---|---|---|---|---|
| Gemma 4 26B A4B | OpenRouter | $0.12 | $0.40 | Best quality-to-cost ratio, 140+ languages, fast (MoE) |
| Gemma 3 27B | OpenRouter | $0.08 | $0.16 | Cheapest option, good quality, fastest |
| DeepSeek V3 | OpenRouter | $0.20 | $0.77 | Best for Chinese, high quality across all languages |
| Mistral Small 4 | OpenRouter/Mistral | $0.15 | $0.60 | Strong European languages |
Note: Gemma 4 26B A4B uses a Mixture-of-Experts architecture (only 4B params active per token), making it both fast and cost-efficient while maintaining translation quality comparable to much larger models.
Translation Endpoints
| Method | Path | Permission | Description |
|---|---|---|---|
| POST | /ai/translate | ai.view | Translate text to a target language |
| POST | /ai/translate/batch | ai.view | Translate to multiple languages at once |
| POST | /ai/translate/detect | ai.view | Detect the language of text |
| GET | /ai/translate/languages | ai.view | List 44 supported languages |
| GET | /ai/translate/config | ai.view | Get current translation config |
| PUT | /ai/translate/config | ai.configure | Update translation provider + model |
Translate Text
POST /api/v1/ai/translate
{
"text": "Your order has been shipped and will arrive tomorrow.",
"target": "ml",
"source": "en"
}
Response:
{
"data": {
"translated_text": "നിങ്ങളുടെ ഓർഡർ അയച്ചുവിട്ടു, നാളെ എത്തും.",
"source_language": "en",
"target_language": "ml",
"provider": "openrouter",
"model": "google/gemma-4-26b-a4b-it",
"usage": { "input_tokens": 128, "output_tokens": 32 }
}
}
The source field is optional — if omitted, the model auto-detects the source language.
Batch Translation
Translate to multiple languages in one call (max 10):
POST /api/v1/ai/translate/batch
{
"text": "Hello! 20% discount on all products this weekend.",
"targets": ["es", "fr", "de", "hi", "ar"]
}
Language Detection
POST /api/v1/ai/translate/detect
{ "text": "Bonjour, comment allez-vous?" }
{
"data": {
"language": "fr",
"language_name": "French",
"confidence": 0.99
}
}
Supported Languages
44 languages including English, Spanish, French, German, Italian, Portuguese, Russian, Japanese, Korean, Chinese (Simplified/Traditional), Arabic, Hindi, Bengali, Tamil, Telugu, Malayalam, Kannada, Marathi, Gujarati, Punjabi, Urdu, Thai, Vietnamese, Indonesian, Malay, Turkish, Polish, Ukrainian, Swedish, Danish, Norwegian, Finnish, Greek, Hebrew, Czech, Romanian, Hungarian, Bulgarian, Croatian, Slovak, Swahili, and Afrikaans.
The LLM can also translate to languages outside this list — these are just the ones with ISO 639-1 codes in the dropdown.
API Endpoints
LLM Provider Management
| Method | Path | Description |
|---|---|---|
| GET | /ai/providers | List all LLM providers with status |
| POST | /ai/providers | Add a new provider |
| PUT | /ai/providers/{id} | Update provider metadata |
| DELETE | /ai/providers/{id} | Remove provider |
| POST | /ai/providers/{id}/test | Test connection |
| PUT | /ai/providers/{id}/credentials | Save API key + model |
| GET | /ai/presets | List available presets |
| GET | /ai/config | Get current settings |
| PUT | /ai/config | Update default/fallback/tracking settings |
STT Provider Management
| Method | Path | Description |
|---|---|---|
| GET | /ai/stt/providers | List STT providers with status |
| POST | /ai/stt/providers | Add STT provider from preset |
| DELETE | /ai/stt/providers/{id} | Remove STT provider |
| PUT | /ai/stt/providers/{id}/credentials | Save STT API key + model |
| GET | /ai/stt/presets | List STT presets |
| PUT | /ai/stt/config | Update STT default/fallback |
TTS Provider Management
| Method | Path | Description |
|---|---|---|
| GET | /ai/tts/providers | List TTS providers with status |
| POST | /ai/tts/providers | Add TTS provider from preset |
| DELETE | /ai/tts/providers/{id} | Remove TTS provider |
| PUT | /ai/tts/providers/{id}/credentials | Save TTS API key + model |
| GET | /ai/tts/presets | List TTS presets |
| PUT | /ai/tts/config | Update TTS default/fallback |
Which Provider Do I Need?
| Feature | Requires | Recommendation |
|---|---|---|
| Agent Chat | LLM provider | OpenAI (gpt-4o) or Claude |
| Order Extraction | LLM provider | Any — works with all |
| Auto-Responses | LLM provider | Any — works with all |
| Product Descriptions | LLM provider | Claude or GPT-4o (best writing) |
| RAG Embeddings | OpenAI-compatible LLM | OpenAI (text-embedding-3-small) |
| Voice Transcription | STT provider | Groq Whisper (free + fast) or Voxtral |
| Voice Responses | TTS provider | OpenAI (consistent) or ElevenLabs (natural) |
| Translation | LLM provider | Gemma 4 26B A4B (best value) or DeepSeek V3 (quality) |
| Agent Graph Flows | LLM provider | OpenAI or Claude (best reasoning) |
Note: RAG embeddings require an OpenAI-compatible provider specifically. Anthropic (Claude) does not support embeddings. If Claude is your only LLM provider, add a second provider (even a free Groq account works) for embeddings.
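The table above reduces to a simple feature-to-provider-type mapping, which can be used to check which provider slots a tenant still needs to fill. The feature keys and type labels below are illustrative, not API identifiers:

```python
# Which provider type each feature needs (from the table above; names illustrative).
REQUIRES = {
    "agent_chat": "llm",
    "order_extraction": "llm",
    "auto_responses": "llm",
    "product_descriptions": "llm",
    "rag_embeddings": "llm",   # must additionally be OpenAI-compatible (not Claude)
    "voice_transcription": "stt",
    "voice_responses": "tts",
    "translation": "llm",
    "agent_graph_flows": "llm",
}

def missing_for(feature, configured_types):
    """Return the provider type still needed for `feature`, or None if covered."""
    need = REQUIRES[feature]
    return None if need in configured_types else need

print(missing_for("voice_responses", {"llm", "stt"}))  # tts
```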