Configurations - Feather

What Are Configurations?

Configurations are the AI model settings that power your voice agents. They control how agents listen, think, and speak during conversations. Proper configuration is essential for creating natural, responsive, and effective voice AI experiences. Configurations include:

STT (Speech-to-Text) - How agents convert customer speech to text
TTS (Text-to-Speech) - How agents convert responses to speech
LLM (Language Models) - The AI that powers conversation understanding and generation
Voices - Specific voice characteristics for agent speech

Configuration Architecture

Default vs Override Configurations

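Agents can use configurations in two ways: by referencing predefined configurations as they are, or by overriding individual settings on top of a referenced configuration.

Default Configurations: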

{
  sttConfigId: "nova-3-stt",      // Reference to predefined STT config
  ttsConfigId: "aura-2-tts",      // Reference to predefined TTS config
  llmConfigId: "gpt-4o-mini-llm", // Reference to predefined LLM config
  voiceId: "6011b4c8-6140-4b7e-8a92-d9880de97b77"
}

Override Configurations:

{
  sttConfigId: "nova-3-stt",
  overrideSTTConfig: {
    // Custom STT settings
    language: "es-US",
    enablePunctuation: true
  },

  ttsConfigId: "aura-2-tts",
  overrideTTSConfig: {
    // Custom TTS settings
    speed: 1.1,
    pitch: 0
  },

  llmConfigId: "gpt-4o-mini-llm",
  overrideLLMConfig: {
    // Custom LLM settings
    temperature: 0.7,
    maxTokens: 500
  }
}
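One way to picture how an override relates to the config it references is a shallow merge in which override keys win over the predefined values. The sketch below rests on that assumption, which the platform's actual merge rules may not match; `resolveEffectiveConfig` and the sample `baseStt` values are hypothetical.

```javascript
// Hypothetical helper: combines a predefined config's settings with an
// override object. Assumes override values replace base values key by key
// (shallow merge); the platform's real merge behavior may differ.
function resolveEffectiveConfig(baseSettings, overrideSettings = {}) {
  return { ...baseSettings, ...overrideSettings };
}

// Illustrative base settings; real values come from the predefined config.
const baseStt = { language: "en-US", enablePunctuation: false };

const effectiveStt = resolveEffectiveConfig(baseStt, {
  language: "es-US",
  enablePunctuation: true
});

console.log(effectiveStt);
```

Because the helper returns a new object, the predefined settings stay untouched and can be reused by other agents.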

Speech-to-Text (STT) Configurations

What is STT?

STT converts customer speech into text that the LLM can understand. Accurate STT is critical for:

Understanding customer intent
Capturing key information
Reducing misunderstandings
Enabling fast responses

Available STT Configs

Get available STT configurations:

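// API_KEY is assumed to hold your Feather API key
const response = await fetch('https://prod.featherhq.com/api/v1/stt-configs', {
  headers: {
    'X-API-Key': API_KEY
  }
});

// Fail early on HTTP errors instead of parsing an error body as configs
if (!response.ok) {
  throw new Error(`Failed to list STT configs: HTTP ${response.status}`);
}

const sttConfigs = await response.json();

// Print each config's ID, name, provider, and supported languages
sttConfigs.forEach(config => {
  console.log(`${config.id}: ${config.name}`);
  console.log(`  Provider: ${config.provider}`);
  console.log(`  Languages: ${config.supportedLanguages.join(', ')}`);
});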
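Once the list is loaded, a likely next step is narrowing it to the configs that support your callers' language. The sketch below filters on the `supportedLanguages` field used in the listing above; `findSttConfigsForLanguage` and the sample data are hypothetical, and real responses may use different language code formats.

```javascript
// Hypothetical helper: keeps only the configs whose supportedLanguages
// list includes the requested language code.
function findSttConfigsForLanguage(configs, languageCode) {
  return configs.filter(config =>
    Array.isArray(config.supportedLanguages) &&
    config.supportedLanguages.includes(languageCode)
  );
}

// Sample data shaped like the listing above; the values are illustrative.
const sampleSttConfigs = [
  { id: "nova-3-stt", name: "Nova 3", provider: "deepgram", supportedLanguages: ["en-US", "es-US"] },
  { id: "whisper-stt", name: "Whisper", provider: "openai", supportedLanguages: ["en"] }
];

const spanishCapable = findSttConfigsForLanguage(sampleSttConfigs, "es-US");
console.log(spanishCapable.map(config => config.id));
```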

STT Configuration Options

Common STT configurations include:

{
  // Base config to use
  sttConfigId: "nova-3-stt",

  // Override settings
  overrideSTTConfig: {
    // Language and locale
    language: "en-US",          // Language code

    // Accuracy settings
    enablePunctuation: true,    // Add punctuation to transcript
    enableDiarization: false,   // Separate speakers
    profanityFilter: false,     // Filter profanity

    // Performance settings
    interimResults: true,       // Stream partial results
    singleUtterance: false,     // End on first pause

    // Model selection
    model: "latest",            // Use latest model version

    // Audio settings
    sampleRate: 16000,          // Audio sample rate (Hz)
    encoding: "LINEAR16"        // Audio encoding format
  }
}

Common STT Providers

Deepgram Nova 3 - High accuracy, low latency

{
  sttConfigId: "nova-3-stt",
  overrideSTTConfig: {
    language: "en-US",
    model: "nova-3",
    enablePunctuation: true
  }
}

Google Speech-to-Text - Wide language support

{
  sttConfigId: "google-stt",
  overrideSTTConfig: {
    language: "en-US",
    model: "phone_call",
    enableAutomaticPunctuation: true
  }
}

OpenAI Whisper - Excellent for noisy environments

{
  sttConfigId: "whisper-stt",
  overrideSTTConfig: {
    language: "en",
    model: "whisper-1"
  }
}
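The examples above use different language code shapes: full locale tags such as "en-US" for Deepgram and Google, and a bare language code ("en") for Whisper. A small helper can hide that difference. This is a sketch that assumes only the `whisper-stt` config expects the bare form; `languageForSttConfig` is a hypothetical name.

```javascript
// Assumption: only the Whisper config expects a bare language code ("en");
// the other examples above use full locale tags ("en-US").
function languageForSttConfig(sttConfigId, locale) {
  if (sttConfigId === "whisper-stt") {
    // Keep the part before the first hyphen: "es-US" -> "es"
    return locale.split("-")[0];
  }
  return locale;
}

console.log(languageForSttConfig("whisper-stt", "es-US"));
console.log(languageForSttConfig("nova-3-stt", "es-US"));
```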

Text-to-Speech (TTS) Configurations

What is TTS?

TTS converts agent text responses into natural-sounding speech. Good TTS creates:

Natural conversation flow
Clear, understandable speech
Appropriate pacing and emotion
Consistent voice quality

Available TTS Configs

Get available TTS configurations:

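// API_KEY is assumed to hold your Feather API key
const response = await fetch('https://prod.featherhq.com/api/v1/tts-configs', {
  headers: {
    'X-API-Key': API_KEY
  }
});

// Fail early on HTTP errors instead of parsing an error body as configs
if (!response.ok) {
  throw new Error(`Failed to list TTS configs: HTTP ${response.status}`);
}

const ttsConfigs = await response.json();

// Print each config's ID, name, provider, and quality tier
ttsConfigs.forEach(config => {
  console.log(`${config.id}: ${config.name}`);
  console.log(`  Provider: ${config.provider}`);
  console.log(`  Quality: ${config.quality}`);
});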

TTS Configuration Options

{
  // Base config to use
  ttsConfigId: "aura-2-tts",

  // Voice selection
  voiceId: "6011b4c8-6140-4b7e-8a92-d9880de97b77",

  // Override settings
  overrideTTSConfig: {
    // Speech characteristics
    speed: 1.0,              // 0.5 - 2.0 (1.0 = normal)
    pitch: 0,                // -20 to +20 (0 = normal)
    volume: 0,               // -96 to +16 dB (0 = normal)

    // Quality settings
    sampleRate: 24000,       // Audio quality (Hz)
    encoding: "LINEAR16",    // Audio encoding

    // Prosody
    emphasisLevel: "moderate",  // "strong", "moderate", "reduced"

    // Effects
    phoneFilterEnabled: true    // Simulate phone call quality
  }
}
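The ranges in the comments above can be enforced before a config is submitted. The following is a minimal sketch that clamps `speed`, `pitch`, and `volume` into those ranges; `clampTtsOverrides` is a hypothetical helper, and whether the API rejects or silently adjusts out-of-range values is not stated here.

```javascript
// Ranges taken from the comments in the options listing above.
const TTS_RANGES = {
  speed: [0.5, 2.0],
  pitch: [-20, 20],
  volume: [-96, 16]
};

// Hypothetical helper: returns a copy of the overrides with speed, pitch,
// and volume clamped into range. Other keys pass through unchanged.
function clampTtsOverrides(overrides) {
  const result = { ...overrides };
  for (const [key, [min, max]] of Object.entries(TTS_RANGES)) {
    if (typeof result[key] === "number") {
      result[key] = Math.min(max, Math.max(min, result[key]));
    }
  }
  return result;
}

const clamped = clampTtsOverrides({ speed: 3, pitch: -25, volume: 0, sampleRate: 24000 });
console.log(clamped);
```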

Common TTS Providers

Deepgram Aura 2 - Ultra-low latency

{
  ttsConfigId: "aura-2-tts",
  voiceId: "aura-helios-en",  // Warm, friendly male voice
  overrideTTSConfig: {
    speed: 1.0
  }
}

ElevenLabs - Highly natural, expressive

{
  ttsConfigId: "elevenlabs-tts",
  voiceId: "pNInz6obpgDQGcFmaJgB",  // Adam voice
  overrideTTSConfig: {
    stability: 0.5,          // Voice consistency
    similarityBoost: 0.75    // Match to original voice
  }
}

OpenAI TTS - Natural conversation

{
  ttsConfigId: "openai-tts",
  voiceId: "alloy",          // Neutral, balanced voice
  overrideTTSConfig: {
    model: "tts-1-hd",       // High-definition model
    speed: 1.0
  }
}

Language Model (LLM) Configurations

What is an LLM?

The LLM is the “brain” that:

Understands customer intent
Generates appropriate responses
Decides when to use tools
Maintains conversation context
Makes decisions during calls

Available LLM Configs

Get available LLM configurations:

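// API_KEY is assumed to hold your Feather API key
const response = await fetch('https://prod.featherhq.com/api/v1/llm-configs', {
  headers: {
    'X-API-Key': API_KEY
  }
});

// Fail early on HTTP errors instead of parsing an error body as configs
if (!response.ok) {
  throw new Error(`Failed to list LLM configs: HTTP ${response.status}`);
}

const llmConfigs = await response.json();

// Print each config's ID, name, provider, and context window size
llmConfigs.forEach(config => {
  console.log(`${config.id}: ${config.name}`);
  console.log(`  Provider: ${config.provider}`);
  console.log(`  Context Window: ${config.contextWindow} tokens`);
});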

LLM Configuration Options

{
  // Base config to use
  llmConfigId: "gpt-4o-mini-llm",

  // Override settings
  overrideLLMConfig: {
    // Creativity vs consistency
    temperature: 0.7,         // 0.0 - 2.0 (lower = more consistent)

    // Response length
    maxTokens: 500,           // Max response length

    // Sampling control
    topP: 1.0,               // Nucleus sampling (0.0 - 1.0)
    frequencyPenalty: 0.0,   // Penalize repetition (-2.0 to 2.0)
    presencePenalty: 0.0,    // Encourage topic diversity (-2.0 to 2.0)

    // Advanced
    stop: ["\n\n", "###"],   // Stop sequences
    logitBias: {}            // Token probability adjustments
  }
}
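Checking overrides against the documented ranges before saving them can catch mistakes early. The sketch below validates the numeric settings listed above; `validateLlmOverrides` is a hypothetical helper, and the rule that `maxTokens` must be a positive integer is an assumption rather than something stated in the listing.

```javascript
// Ranges taken from the comments in the options listing above.
const LLM_RANGES = {
  temperature: [0.0, 2.0],
  topP: [0.0, 1.0],
  frequencyPenalty: [-2.0, 2.0],
  presencePenalty: [-2.0, 2.0]
};

// Hypothetical helper: returns a list of human-readable problems.
// An empty list means every provided setting is within range.
function validateLlmOverrides(overrides) {
  const problems = [];
  for (const [key, [min, max]] of Object.entries(LLM_RANGES)) {
    const value = overrides[key];
    if (value === undefined) continue; // unset keys fall back to the base config
    if (!Number.isFinite(value) || value < min || value > max) {
      problems.push(`${key} must be a number between ${min} and ${max}`);
    }
  }
  // Assumption: maxTokens must be a positive integer.
  if (overrides.maxTokens !== undefined &&
      (!Number.isInteger(overrides.maxTokens) || overrides.maxTokens <= 0)) {
    problems.push("maxTokens must be a positive integer");
  }
  return problems;
}

const llmProblems = validateLlmOverrides({ temperature: 2.5, topP: 0.9, maxTokens: 500 });
console.log(llmProblems);
```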

Common LLM Models

GPT-4o - Most capable, best reasoning

{
  llmConfigId: "gpt-4o-llm",
  overrideLLMConfig: {
    temperature: 0.7,
    maxTokens: 500
  }
}

Best for: Complex conversations, nuanced understanding
Context window: 128K tokens
Speed: Moderate
Cost: Higher

GPT-4o-mini - Fast and cost-effective

{
  llmConfigId: "gpt-4o-mini-llm",
  overrideLLMConfig: {
    temperature: 0.7,
    maxTokens: 300
  }
}

Best for: Simple conversations, high volume
Context window: 128K tokens
Speed: Very fast
Cost: Lower

Claude 3.5 Sonnet - Excellent conversation quality

{
  llmConfigId: "claude-3-5-sonnet-llm",
  overrideLLMConfig: {
    temperature: 0.7,
    maxTokens: 500
  }
}

Best for: Natural dialogue, ethical reasoning
Context window: 200K tokens
Speed: Fast
Cost: Moderate

Temperature Settings Guide

// Very consistent - customer support
temperature: 0.2
// Agent gives nearly the same answer every time
// Good for: FAQs, policies, factual information

// Balanced - general use
temperature: 0.7
// Natural variation while staying on topic
// Good for: Sales, general conversations

// Creative - exploratory
temperature: 1.0
// More creative and varied responses
// Good for: Brainstorming, discovery calls
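
The three settings above can be kept in one place so that agents pick a temperature by intent rather than by number. This is a sketch only; the preset names and `llmOverrideForStyle` are hypothetical, while the values are the ones from the guide.

```javascript
// Temperature values from the guide above, keyed by hypothetical style names.
const TEMPERATURE_PRESETS = {
  consistent: 0.2, // FAQs, policies, factual information
  balanced: 0.7,   // Sales, general conversations
  creative: 1.0    // Brainstorming, discovery calls
};

// Returns an overrideLLMConfig fragment for the requested style.
function llmOverrideForStyle(style) {
  if (!Object.prototype.hasOwnProperty.call(TEMPERATURE_PRESETS, style)) {
    throw new Error(`Unknown style: ${style}`);
  }
  return { temperature: TEMPERATURE_PRESETS[style] };
}

const supportAgent = {
  llmConfigId: "gpt-4o-mini-llm",
  overrideLLMConfig: llmOverrideForStyle("consistent")
};
console.log(supportAgent);
```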

Voices

Available Voices

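Get the list of available voices:

// API_KEY is assumed to hold your Feather API key
const response = await fetch('https://prod.featherhq.com/api/v1/available-voices', {
  headers: {
    'X-API-Key': API_KEY
  }
});

// Fail early on HTTP errors instead of parsing an error body as voices
if (!response.ok) {
  throw new Error(`Failed to list voices: HTTP ${response.status}`);
}

const voices = await response.json();

// Print each voice's name, gender, language, ID, provider, and description
voices.forEach(voice => {
  console.log(`${voice.name} (${voice.gender}, ${voice.language})`);
  console.log(`  ID: ${voice.id}`);
  console.log(`  Provider: ${voice.provider}`);
  console.log(`  ${voice.description}`);
});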
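
To shortlist voices, the returned list can be filtered on the fields shown above. The following sketch matches on exact field values; `filterVoices` is a hypothetical helper, and the sample records (including the gender and language values) are illustrative assumptions, not real API output.

```javascript
// Hypothetical helper: keeps voices whose fields equal every value in criteria.
function filterVoices(voices, criteria) {
  return voices.filter(voice =>
    Object.entries(criteria).every(([field, wanted]) => voice[field] === wanted)
  );
}

// Illustrative records shaped like the listing above.
const sampleVoices = [
  { id: "aura-helios-en", name: "Helios", gender: "male", language: "en", provider: "deepgram" },
  { id: "aura-stella-en", name: "Stella", gender: "female", language: "en", provider: "deepgram" },
  { id: "alloy", name: "Alloy", gender: "neutral", language: "en", provider: "openai" }
];

const shortlist = filterVoices(sampleVoices, { provider: "deepgram", gender: "female" });
console.log(shortlist.map(voice => voice.id));
```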

Voice Selection

Choose voices based on the following factors.

Brand alignment:

Professional and authoritative
Friendly and approachable
Young and energetic
Mature and experienced

Use case:

Support: Calm, patient, helpful
Sales: Confident, enthusiastic
Scheduling: Efficient, clear
Notifications: Neutral, informative

Demographics:

Match target audience
Consider cultural preferences
Gender considerations
Age appropriateness

Voice Examples

// Professional male - tech support
{
  voiceId: "aura-helios-en",
  overrideTTSConfig: {
    speed: 0.95,  // Slightly slower for clarity
    pitch: -2      // Slightly lower for authority
  }
}

// Friendly female - sales
{
  voiceId: "aura-stella-en",
  overrideTTSConfig: {
    speed: 1.05,   // Slightly faster for energy
    pitch: 2       // Slightly higher for warmth
  }
}

// Neutral - automated notifications
{
  voiceId: "aura-luna-en",
  overrideTTSConfig: {
    speed: 1.0,
    pitch: 0
  }
}

Configuration Best Practices

STT Best Practices

Language matching - Use correct language code for your audience
Enable punctuation - Improves LLM understanding
Test in production conditions - Account for phone line quality
Monitor accuracy - Track misrecognitions in transcripts
Consider latency - Balance accuracy with response time

TTS Best Practices

Natural speech rate - 0.95-1.05 speed for most use cases
Consistent voice - Use same voice throughout conversation
Test with real users - Verify voice quality and clarity
Match brand personality - Voice should align with brand
Phone optimization - Enable phone filter for call quality

LLM Best Practices

Start conservative - Lower temperature (0.5-0.7) for consistency
Limit response length - Shorter responses for voice (200-300 tokens)
Monitor token usage - Optimize for cost and performance
Test extensively - Validate behavior across scenarios
Version control - Track configuration changes

Voice Best Practices

Listen to samples - Test voices before deployment
Consider context - Different voices for different agents
Get feedback - Ask customers about voice quality
A/B test - Compare voice performance
Match expectations - Voice should fit agent personality

Performance Optimization

Latency Optimization

Reduce response time:

{
  // Use fastest configs
  sttConfigId: "nova-3-stt",      // Deepgram - very fast
  ttsConfigId: "aura-2-tts",       // Deepgram Aura - ultra-low latency
  llmConfigId: "gpt-4o-mini-llm",  // Fast and capable

  overrideSTTConfig: {
    interimResults: true    // Stream results
  },

  overrideLLMConfig: {
    maxTokens: 200,         // Shorter responses
    temperature: 0.5        // Less variation; does not by itself reduce latency
  },

  overrideTTSConfig: {
    sampleRate: 16000       // Lower quality = faster
  }
}

Quality Optimization

Maximize quality:

{
  // Use best configs
  sttConfigId: "whisper-stt",      // Most accurate
  ttsConfigId: "elevenlabs-tts",   // Most natural
  llmConfigId: "gpt-4o-llm",       // Most capable

  overrideSTTConfig: {
    enablePunctuation: true,
    enableDiarization: true
  },

  overrideLLMConfig: {
    temperature: 0.7,
    maxTokens: 500
  },

  overrideTTSConfig: {
    sampleRate: 24000         // Higher quality
  }
}

Cost Optimization

Reduce costs:

{
  // Use economical configs
  llmConfigId: "gpt-4o-mini-llm",  // Lower cost

  overrideLLMConfig: {
    maxTokens: 200,           // Shorter responses
    temperature: 0.5          // More consistent
  }
}
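
The latency and cost profiles above can be stored as named presets and layered onto an existing agent configuration. This is a sketch under assumptions: `OPTIMIZATION_PRESETS` and `applyPreset` are hypothetical names, and merging nested override objects key by key is one possible design, not documented platform behavior.

```javascript
// Preset values copied from the latency and cost examples above.
const OPTIMIZATION_PRESETS = {
  latency: {
    sttConfigId: "nova-3-stt",
    ttsConfigId: "aura-2-tts",
    llmConfigId: "gpt-4o-mini-llm",
    overrideSTTConfig: { interimResults: true },
    overrideLLMConfig: { maxTokens: 200, temperature: 0.5 },
    overrideTTSConfig: { sampleRate: 16000 }
  },
  cost: {
    llmConfigId: "gpt-4o-mini-llm",
    overrideLLMConfig: { maxTokens: 200, temperature: 0.5 }
  }
};

const OVERRIDE_KEYS = ["overrideSTTConfig", "overrideTTSConfig", "overrideLLMConfig"];

// Hypothetical helper: returns a new config with the preset applied.
// Nested override objects are merged so agent settings such as language survive.
function applyPreset(agentConfig, goal) {
  if (!Object.prototype.hasOwnProperty.call(OPTIMIZATION_PRESETS, goal)) {
    throw new Error(`Unknown optimization goal: ${goal}`);
  }
  const preset = OPTIMIZATION_PRESETS[goal];
  const merged = { ...agentConfig, ...preset };
  for (const key of OVERRIDE_KEYS) {
    if (agentConfig[key] || preset[key]) {
      merged[key] = { ...agentConfig[key], ...preset[key] };
    }
  }
  return merged;
}

const spanishAgent = {
  sttConfigId: "whisper-stt",
  overrideSTTConfig: { language: "es-US" }
};
const fastAgent = applyPreset(spanishAgent, "latency");
console.log(fastAgent);
```

Keeping presets as data makes it straightforward to compare profiles side by side in a testing environment before changing a live agent.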

Common Use Cases

Customer Support

STT: Accurate, TTS: Calm & Clear, LLM: Helpful & Patient

Sales Outreach

STT: Fast, TTS: Energetic & Warm, LLM: Persuasive

Appointment Booking

STT: Reliable, TTS: Efficient & Professional, LLM: Task-focused

Surveys & Feedback

STT: Patient, TTS: Neutral, LLM: Question-focused

Troubleshooting

Poor Speech Recognition

Solutions:

Try a different STT provider
Enable punctuation and diarization
Check audio quality settings
Test with different accents/languages

Unnatural Speech

Solutions:

Adjust TTS speed (0.95-1.05)
Try a different voice
Reduce pitch modifications
Test with a different TTS provider

Slow Response Times

Solutions:

Use faster STT/TTS/LLM configs
Reduce maxTokens
Enable interim results
Lower audio sample rates

Inconsistent Behavior

Solutions:

Lower temperature (0.3-0.5)
Add stop sequences
Reduce topP
Use more specific prompts

Next Steps

Agents

Apply configurations to your agents

Testing Lab

Test different configuration combinations

Best Practices

Learn prompt engineering techniques

API Reference

Explore configuration APIs
