What is the Testing Lab?
The Testing Lab is Feather’s automated testing framework for voice AI agents. It allows you to create test scenarios, run them against your agents, and validate behavior before production deployment. Think of it as your agent’s quality assurance system that:
- Creates test scenarios - Define specific conversation flows to test
- Simulates conversations - Run automated tests without real phone calls
- Validates outcomes - Check if agents behave as expected
- Generates test cases - AI-powered scenario generation
- Tracks performance - Monitor test results over time
- Prevents regressions - Catch issues before they reach customers
Why Test Your Agents?
Before Production Deployment
- Verify agents handle common scenarios correctly
- Test edge cases and error conditions
- Validate tool integrations work as expected
- Ensure conversation flow is natural
- Check prompt changes don’t break functionality
Continuous Validation
- Regression testing after prompt updates
- Validate new agent versions
- Test different configurations
- Compare agent performance
- Quality assurance for agent updates
Test Scenarios
What is a Test Scenario?
A scenario defines a specific test case with the following components.
Scenario Components
Name: Descriptive test case name
Managing Test Scenarios
List All Scenarios
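A minimal sketch of listing scenarios with Python and the requests library; the base URL, endpoint path, and auth header are placeholders for illustration, not Feather’s documented API.
```python
import requests

# Placeholder base URL and key -- substitute your real values.
API_BASE = "https://api.feather.example/v1"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# Fetch every test scenario defined for your account (assumed endpoint).
response = requests.get(f"{API_BASE}/testing-lab/scenarios", headers=HEADERS)
response.raise_for_status()

for scenario in response.json():
    print(scenario.get("id"), scenario.get("name"))
```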
Create a Scenario
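A sketch of creating a scenario; the endpoint and the field names (name, instructions, expected_outcome) are assumptions chosen to mirror the components above, not Feather’s schema.
```python
import requests

API_BASE = "https://api.feather.example/v1"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# Hypothetical payload -- field names are illustrative only.
new_scenario = {
    "name": "Qualified lead books appointment",
    "instructions": "Act as a homeowner interested in the service who answers all qualifying questions.",
    "expected_outcome": "Agent books an appointment and confirms the date and time.",
}

response = requests.post(f"{API_BASE}/testing-lab/scenarios", json=new_scenario, headers=HEADERS)
response.raise_for_status()
print("Created scenario:", response.json().get("id"))
```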
Generate Scenarios with AI
Let AI create test scenarios for you; a request sketch follows this list. Generated scenarios take into account:
- Agent’s purpose and configuration
- Common use cases for your industry
- Edge cases and error handling
- Tool usage validation
- Conversation flow testing
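A sketch of requesting AI-generated scenarios; the generate endpoint and the agent_id and count fields are assumptions for illustration.
```python
import requests

API_BASE = "https://api.feather.example/v1"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# Ask the Testing Lab to generate scenarios for a given agent (assumed endpoint and fields).
response = requests.post(
    f"{API_BASE}/testing-lab/scenarios/generate",
    json={"agent_id": "agent_123", "count": 5},
    headers=HEADERS,
)
response.raise_for_status()
for scenario in response.json():
    print(scenario.get("name"))
```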
Update a Scenario
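A sketch of updating an existing scenario in place; the PATCH endpoint, scenario ID, and editable field are hypothetical.
```python
import requests

API_BASE = "https://api.feather.example/v1"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

scenario_id = "scn_123"  # hypothetical scenario ID
# Tighten the expected outcome after a prompt change (field name is illustrative).
requests.patch(
    f"{API_BASE}/testing-lab/scenarios/{scenario_id}",
    json={"expected_outcome": "Agent books an appointment and sends a confirmation SMS."},
    headers=HEADERS,
).raise_for_status()
```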
Delete a Scenario
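Deleting a scenario follows the same pattern, again with an assumed endpoint and a hypothetical ID.
```python
import requests

API_BASE = "https://api.feather.example/v1"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# Remove a scenario that is no longer relevant (assumed endpoint).
requests.delete(f"{API_BASE}/testing-lab/scenarios/scn_123", headers=HEADERS).raise_for_status()
```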
Running Tests
Execute Test Scenarios
Run scenarios against an agent:
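A sketch of starting a run and polling for completion; the runs endpoint, request fields, and status values are assumptions, not Feather’s documented API.
```python
import time
import requests

API_BASE = "https://api.feather.example/v1"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# Start a run of selected scenarios against one agent (assumed endpoint and fields).
run = requests.post(
    f"{API_BASE}/testing-lab/runs",
    json={"agent_id": "agent_123", "scenario_ids": ["scn_123", "scn_456"]},
    headers=HEADERS,
)
run.raise_for_status()
run_id = run.json()["id"]

# Poll until the run finishes (status values are illustrative).
while True:
    status = requests.get(f"{API_BASE}/testing-lab/runs/{run_id}", headers=HEADERS).json()
    if status.get("status") in ("completed", "failed"):
        break
    time.sleep(5)

print("Run finished with status:", status.get("status"))
```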
Test Results
Each run returns per-scenario results, including the automatic evaluation described below.
Test Evaluation
Automatic Evaluation
Feather automatically evaluates test results.
Pass Criteria:
- Conversation follows expected flow
- Correct disposition
- Required tools were called
- Key information was provided
- Expected outcome achieved
Scoring:
- 90-100: Excellent - Exceeded expectations
- 75-89: Good - Met all requirements
- 60-74: Acceptable - Minor issues
- Below 60: Needs improvement
Manual Review
Review failed tests:
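A sketch of pulling a run’s results and surfacing failures for review; the results endpoint and the passed and score fields are assumptions consistent with the scoring scale above.
```python
import requests

API_BASE = "https://api.feather.example/v1"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

run_id = "run_123"  # hypothetical run ID
results = requests.get(f"{API_BASE}/testing-lab/runs/{run_id}/results", headers=HEADERS)
results.raise_for_status()

# Surface anything that failed or scored below the acceptable threshold (60).
for result in results.json():
    if not result.get("passed") or result.get("score", 0) < 60:
        print(result.get("scenario_name"), "-", result.get("score"))
```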
Test Scenario Patterns
Happy Path Testing
Test ideal customer journeys in which the caller cooperates and the conversation reaches its intended outcome:
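A hypothetical happy-path scenario definition; the field names mirror the earlier creation sketch and are illustrative only.
```python
# Hypothetical happy-path scenario -- field names are illustrative only.
happy_path_scenario = {
    "name": "Cooperative customer completes booking",
    "instructions": (
        "Act as a friendly customer who answers every question, "
        "raises no objections, and accepts the first offered appointment slot."
    ),
    "expected_outcome": "Agent books the appointment and ends the call politely.",
}
```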
Edge Case Testing
Test unusual situations.
Objection Handling
Test sales objection scenarios.
Error Handling
Test system failures.
Tool Usage Validation
Test specific tool integrations:
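A hypothetical scenario that asserts a specific tool is called; the expected_tools field and the tool name are assumptions introduced for illustration.
```python
# Hypothetical tool-validation scenario -- expected_tools and the tool name are illustrative.
tool_check_scenario = {
    "name": "Agent looks up order status via CRM tool",
    "instructions": "Act as a customer asking for the status of a recent order.",
    "expected_tools": ["crm_order_lookup"],
    "expected_outcome": "Agent calls the CRM lookup tool and reads back the order status.",
}
```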
Best Practices
Scenario Design
- Be specific - Clear instructions and expected outcomes
- Test one thing - Each scenario should focus on specific behavior
- Use realistic data - Test with production-like information
- Cover edge cases - Don’t just test happy paths
- Update regularly - Evolve scenarios as agents improve
Test Coverage
Create scenarios for:
- All major conversation flows
- Each tool integration
- Common customer objections
- Error conditions
- Edge cases and exceptions
- Different customer personalities
- Various outcomes (success, transfer, decline, etc.)
Continuous Testing
- Test before deployment - Run full suite before going live
- Regression testing - Re-run tests after changes
- Version comparison - Test new vs old versions
- Monitor failures - Track which scenarios fail frequently
- Iterate on failures - Improve prompts based on test results
Test Organization
Integration with CI/CD
Automated Testing Pipeline
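A minimal sketch of a CI step that runs the suite and fails the build when any scenario fails; it reuses the hypothetical endpoints from the sketches above and is not an official integration.
```python
import sys
import time
import requests

API_BASE = "https://api.feather.example/v1"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}
AGENT_ID = "agent_123"  # hypothetical agent under test

def run_suite() -> bool:
    """Start a run for the agent and report whether every scenario passed."""
    run = requests.post(
        f"{API_BASE}/testing-lab/runs",
        json={"agent_id": AGENT_ID},  # omitting scenario_ids assumed to mean "run all"
        headers=HEADERS,
    )
    run.raise_for_status()
    run_id = run.json()["id"]

    # Poll until the run completes (status values are illustrative).
    while True:
        status = requests.get(f"{API_BASE}/testing-lab/runs/{run_id}", headers=HEADERS).json()
        if status.get("status") in ("completed", "failed"):
            break
        time.sleep(10)

    results = requests.get(f"{API_BASE}/testing-lab/runs/{run_id}/results", headers=HEADERS).json()
    return all(r.get("passed") for r in results)

if __name__ == "__main__":
    # A non-zero exit code fails the CI job.
    sys.exit(0 if run_suite() else 1)
```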
Git Hooks
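If the pipeline script above is saved as run_feather_tests.py at the repository root, a git pre-push hook can reuse it so tests run before changes are shared; the filename and layout are assumptions.
```python
#!/usr/bin/env python3
# Save as .git/hooks/pre-push and make it executable.
# Assumes run_feather_tests.py from the pipeline sketch above sits at the repo root.
import subprocess
import sys

# Block the push unless the Testing Lab suite passes.
sys.exit(subprocess.call([sys.executable, "run_feather_tests.py"]))
```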
Common Use Cases
Pre-Deployment Testing
Validate agents before production release
Regression Testing
Ensure updates don’t break existing functionality
A/B Testing
Compare different agent configurations
Quality Assurance
Maintain consistent agent performance
Tool Validation
Verify custom tool integrations work correctly
Edge Case Coverage
Test unusual scenarios and error conditions
Troubleshooting
Tests Failing Unexpectedly
Check:
- Agent version is deployed
- Tools and integrations are working
- Test scenario instructions are clear
- Expected outcomes are realistic
- Phone numbers in scenarios are valid
Inconsistent Results
Causes:
- Non-deterministic LLM behavior
- External API variability
- Timing-dependent scenarios
Solutions:
- Run tests multiple times
- Make expected outcomes more flexible
- Use appropriate LLM temperature settings
- Add retry logic for flaky tests
Slow Test Execution
Optimize:
- Run scenarios in parallel (see the sketch after this list)
- Use shorter test conversations
- Reduce number of scenarios in each run
- Cache scenario results
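One way to parallelize, sketched with Python’s concurrent.futures and the hypothetical runs endpoint from earlier; whether Feather allows concurrent runs against one agent is an assumption to verify.
```python
from concurrent.futures import ThreadPoolExecutor

import requests

API_BASE = "https://api.feather.example/v1"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

def start_run(scenario_id: str) -> str:
    """Kick off a single-scenario run and return its run ID (assumed endpoint and fields)."""
    response = requests.post(
        f"{API_BASE}/testing-lab/runs",
        json={"agent_id": "agent_123", "scenario_ids": [scenario_id]},
        headers=HEADERS,
    )
    response.raise_for_status()
    return response.json()["id"]

scenario_ids = ["scn_123", "scn_456", "scn_789"]  # hypothetical IDs

# Start several small runs concurrently instead of one large sequential run.
with ThreadPoolExecutor(max_workers=5) as pool:
    run_ids = list(pool.map(start_run, scenario_ids))

print("Started runs:", run_ids)
```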