Skip The Menu, Just Talk: Intent-Based Routing For Voice Agents
Phone trees are dying. Good riddance.
You know the drill: “Press 1 for billing, press 2 for technical support, press 3 to hear these options again.” By the time you reach option 7, you’ve forgotten what you were calling about.
Voice agents don’t need menus. They listen to what you actually say and route you where you need to go.
This is intent-based routing, and it’s changing how voice systems work.
The Problem With Menu-Based Routing
Traditional IVR (Interactive Voice Response) systems force users to categorize their own problems:
"I need help with my order"
→ Is that a new order?
→ Is it about tracking?
→ Is it a return?
→ Is it a refund?
Users don’t think in categories. They think in problems:
- “My package was supposed to arrive yesterday”
- “The item I received is damaged”
- “I was charged twice”
Menu systems make users translate problems into categories. Voice agents should do that work.
How Intent Detection Works
With OpenAI Realtime API, you’re not matching keywords—you’re understanding meaning.
Here’s a simple intent detection flow:
```mermaid
graph TD
    A[User speaks naturally] --> B[Realtime API transcribes + understands]
    B --> C{Detect intent}
    C -->|Delivery issue| D[Route to fulfillment team]
    C -->|Billing question| E[Route to payments team]
    C -->|Product question| F[Route to product specialist]
    C -->|Unclear| G[Ask clarifying question]
    D --> H[Agent has full context]
    E --> H
    F --> H
    G --> A
```
The magic is in step B. Speech-to-speech models capture:
- What the user said
- How they said it (tone, urgency)
- Why they’re calling (inferred intent)
Real-World Example: Delivery Support
User calls in upset:
```javascript
// User: "My package was supposed to arrive yesterday but it's not here"
const sessionConfig = {
  modalities: ['audio', 'text'],
  instructions: `
    You are a delivery support agent. Listen for:
    - Delivery delays or missing packages
    - Damaged items
    - Wrong items received
    - Address issues

    When you detect intent, route immediately. Don't ask "which department do you need?"

    For delivery issues:
    1. Pull recent orders
    2. Check tracking status
    3. Provide update or solution
  `,
  tools: [
    {
      type: 'function',
      name: 'check_order_status',
      description: 'Look up order status by customer ID',
      parameters: {
        type: 'object',
        properties: {
          customer_id: { type: 'string' }
        }
      }
    }
  ]
};
```
The agent detects:
- Intent: Delivery issue
- Context: Expected yesterday, hasn’t arrived
- Emotion: Frustrated (detected in prosody)
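That detection result can be carried through the pipeline as one structured object. A minimal sketch; the `DetectedIntent` dataclass and its field names are illustrative, not part of any OpenAI SDK:

```python
from dataclasses import dataclass, field

@dataclass
class DetectedIntent:
    label: str             # e.g. "delivery_issue"
    confidence: float      # 0.0-1.0
    context: dict = field(default_factory=dict)  # extracted details
    emotion: str = ""      # inferred from prosody

# What the agent might produce for the call above
result = DetectedIntent(
    label="delivery_issue",
    confidence=0.92,
    context={"expected": "yesterday", "status": "not arrived"},
    emotion="frustrated",
)
```

Keeping all three signals (what, how, why) in one object means downstream routing can react to urgency, not just category.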
Response:
```python
# check_order_status is a backend lookup helper, defined elsewhere
def handle_delivery_issue(customer_id):
    # Intent detected: delivery issue
    order_status = check_order_status(customer_id)

    if order_status['status'] == 'delayed':
        return {
            'action': 'provide_update',
            'message': f"I see your order is delayed. It's currently in {order_status['location']} "
                       "and should arrive by tomorrow evening. I can offer expedited shipping "
                       "or a partial refund—which would help?",
            'options': ['expedite', 'refund']
        }
    elif order_status['status'] == 'delivered':
        return {
            'action': 'check_delivery_location',
            'message': "The tracking shows it was delivered yesterday at 3pm. Was it left in a "
                       "specific location? Sometimes packages are placed in safe spots."
        }
```
No menu navigation. User states problem → agent infers intent → solution offered.
Multi-Intent Conversations
Real conversations aren’t single-intent. Users shift topics:
User: "I want to return this shirt, but also I was charged twice"
Two intents:
- Return request
- Billing issue
Good intent detection handles both:
```javascript
const detectMultipleIntents = (transcript) => {
  const text = transcript.toLowerCase(); // keyword matching should be case-insensitive
  const intents = [];

  // Check for return keywords + context
  if (text.includes('return') || text.includes('send back')) {
    intents.push({
      type: 'return_request',
      confidence: 0.9,
      keywords: ['return', 'shirt']
    });
  }

  // Check for billing keywords + context
  if (text.includes('charged twice') || text.includes('double charge')) {
    intents.push({
      type: 'billing_error',
      confidence: 0.95,
      keywords: ['charged twice']
    });
  }

  return intents;
};

// Agent response:
// "I can help with both. Let me start with the double charge—I'll refund that
// right now while we set up your return. Does that work?"
```
Priority matters. Billing errors are urgent. Returns can wait 30 seconds.
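That ordering can be made explicit with a small ranking step. A sketch, assuming a hypothetical `PRIORITY` table (lower rank = handle first):

```python
# Hypothetical priority ranks per intent type; lower number = more urgent
PRIORITY = {"billing_error": 0, "delivery_issue": 1, "return_request": 2}

def order_by_priority(intents):
    """Sort detected intents so the most urgent is addressed first."""
    return sorted(intents, key=lambda i: PRIORITY.get(i["type"], 99))

detected = [
    {"type": "return_request", "confidence": 0.9},
    {"type": "billing_error", "confidence": 0.95},
]
ordered = order_by_priority(detected)  # billing_error comes first
```

Unknown intent types fall to the back of the queue (rank 99) instead of raising an error, so a new category never blocks the call.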
Clarifying Ambiguous Requests
Not all intents are clear:
User: "I need to change my account"
Change what?
- Email?
- Password?
- Payment method?
- Subscription plan?
Good intent detection asks targeted questions:
```python
def clarify_ambiguous_intent(transcript, context):
    if 'change' in transcript and 'account' in transcript:
        # Check recent activity for clues
        recent_actions = get_recent_activity(context['user_id'])

        # .get() avoids a KeyError when a signal is absent
        if recent_actions.get('last_action') == 'failed_login':
            # Likely password issue
            return "Are you trying to reset your password?"
        elif recent_actions.get('payment_failed'):
            # Likely payment method
            return "Do you need to update your payment method?"
        else:
            # Generic clarification
            return "What would you like to change—your email, password, or payment info?"
```
Context narrows questions. Don’t ask about everything. Ask about likely things.
Intent Confidence & Fallbacks
Sometimes you’re not sure:
```javascript
// routeToAgent and confirmIntent are defined elsewhere
const handleUncertainIntent = (intent, confidence) => {
  if (confidence > 0.8) {
    // High confidence: act immediately
    return routeToAgent(intent);
  } else if (confidence > 0.5) {
    // Medium confidence: confirm first
    return confirmIntent(intent);
  } else {
    // Low confidence: ask open-ended question
    return "I want to make sure I help with the right thing. Can you tell me more about what you need?";
  }
};
```
Threshold matters:
- >0.8: “Let me pull up your order history” (action)
- 0.5-0.8: “It sounds like you’re asking about delivery—is that right?” (confirm)
- <0.5: “Can you tell me more?” (clarify)
Real-Time Intent Updating
Intent can change mid-conversation:
User: "I want to cancel my subscription"
Agent: "I can help with that. Can I ask why you're canceling?"
User: "It's too expensive"
Agent: "Would a 20% discount work instead?"
User: "Actually, yeah, that would help"
Intent shifted:
1. Cancel subscription → 2. Retention offer → 3. Accept discount
Your routing logic must adapt:
```python
class IntentTracker:
    def __init__(self):
        self.history = []

    def update_intent(self, new_intent):
        self.history.append(new_intent)

        # If user shifts from cancel to accepting the retention offer
        # (length guard prevents an IndexError on the first intent)
        if (len(self.history) >= 2
                and self.history[-2] == 'cancel_subscription'
                and new_intent == 'accept_discount'):
            return {
                'action': 'apply_discount',
                'route_to': 'retention_team',
                'priority': 'high'
            }
        return None  # no routing change needed
```
Track intent changes. Don’t lock users into their first statement.
Implementation: Intent Router
Here’s a production-ready intent router:
```javascript
class VoiceIntentRouter {
  constructor(openaiClient) {
    this.client = openaiClient;
    this.intentMap = {
      'delivery_issue': { team: 'fulfillment', priority: 'high' },
      'billing_error': { team: 'payments', priority: 'urgent' },
      'product_question': { team: 'support', priority: 'medium' },
      'return_request': { team: 'returns', priority: 'medium' },
      'cancel_subscription': { team: 'retention', priority: 'high' }
    };
  }

  async detectIntent(audioStream) {
    const response = await this.client.audio.transcriptions.create({
      file: audioStream,
      model: 'whisper-1',
      response_format: 'verbose_json'
    });

    const transcript = response.text;
    const intent = await this.classifyIntent(transcript);

    return {
      transcript,
      intent: intent.label,
      confidence: intent.score,
      routing: this.intentMap[intent.label]
    };
  }

  async classifyIntent(transcript) {
    const completion = await this.client.chat.completions.create({
      // json_object response format requires a JSON-mode-capable model
      model: 'gpt-4o',
      messages: [
        {
          role: 'system',
          content: `
            Classify user intent into one of these categories:
            - delivery_issue
            - billing_error
            - product_question
            - return_request
            - cancel_subscription

            Return JSON: {"label": "category", "score": 0.0-1.0}
          `
        },
        { role: 'user', content: transcript }
      ],
      response_format: { type: 'json_object' }
    });

    return JSON.parse(completion.choices[0].message.content);
  }

  routeToAgent(intent) {
    const routing = this.intentMap[intent];
    if (!routing) {
      throw new Error(`Unknown intent: ${intent}`);
    }
    return {
      team: routing.team,
      priority: routing.priority,
      transfer_message: `Connecting you to our ${routing.team} team who can help immediately.`
    };
  }
}

// Usage
const router = new VoiceIntentRouter(openaiClient);
const result = await router.detectIntent(userAudioStream);

console.log(`Detected: ${result.intent} (${result.confidence})`);

if (result.confidence > 0.7) {
  const route = router.routeToAgent(result.intent);
  console.log(`Routing to ${route.team} (${route.priority} priority)`);
}
```
Measuring Intent Accuracy
Track these metrics:
```python
class IntentMetrics:
    def __init__(self):
        self.correct = 0
        self.incorrect = 0
        self.unclear = 0

    def log_intent_result(self, predicted, actual, confidence):
        result = {
            'predicted': predicted,
            'actual': actual,
            'confidence': confidence,
            'correct': predicted == actual
        }

        if predicted == actual:
            self.correct += 1
        elif actual == 'unclear':
            self.unclear += 1
        else:
            self.incorrect += 1

        return result

    def get_accuracy(self):
        total = self.correct + self.incorrect + self.unclear
        return {
            'accuracy': self.correct / total if total > 0 else 0,
            'unclear_rate': self.unclear / total if total > 0 else 0,
            'error_rate': self.incorrect / total if total > 0 else 0
        }

# Target: >85% accuracy, <10% unclear, <5% error
```
Real metrics from production:
- Week 1: 72% accuracy, 18% unclear, 10% error
- Week 4: 87% accuracy, 9% unclear, 4% error
- Week 8: 91% accuracy, 6% unclear, 3% error
Improvement comes from:
- Better training data
- Clearer intent definitions
- More context signals
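One concrete way to find which intent definitions need sharpening is to count (predicted, actual) confusion pairs from logged results. A minimal sketch, using the same log-entry shape as `log_intent_result` above:

```python
from collections import Counter

def confusion_pairs(results):
    """Count (predicted, actual) pairs among misclassifications,
    most frequent first."""
    pairs = Counter(
        (r["predicted"], r["actual"])
        for r in results
        if r["predicted"] != r["actual"]
    )
    return pairs.most_common()

logs = [
    {"predicted": "delivery_issue", "actual": "delivery_issue"},
    {"predicted": "return_request", "actual": "billing_error"},
    {"predicted": "return_request", "actual": "billing_error"},
]
top = confusion_pairs(logs)
# Most frequent confusion: return_request predicted when billing_error was meant
```

The top pairs tell you exactly which two category definitions overlap in users' phrasing, which is where clearer definitions and more training examples pay off first.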
Why This Matters
Traditional IVR: User navigates 4-level menu → 90 seconds average
Intent-based routing: User states problem → Agent routes → 12 seconds average
Time saved per call: ~78 seconds
Multiply by thousands of daily calls:
- 1,000 calls/day: 21.6 hours saved
- 10,000 calls/day: 216 hours saved
- 100,000 calls/day: 2,166 hours saved
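The arithmetic behind those figures is simple enough to sanity-check with a one-line helper:

```python
def hours_saved(calls_per_day, seconds_saved_per_call=78):
    """Daily hours saved from faster routing."""
    return calls_per_day * seconds_saved_per_call / 3600

# 1,000 calls/day at ~78 seconds saved per call is roughly 21.7 hours
daily = hours_saved(1000)
```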
That’s time users get back. And agents handle fewer frustrated escalations.
Next Steps
- Start simple: 3-5 intent categories
- Add context: Recent orders, account status, previous calls
- Track confidence: Log predictions vs actual outcomes
- Iterate: Refine based on misclassifications
Intent-based routing isn’t magic. It’s understanding what users mean, not what they say.
And that’s what voice agents should do.