Skip The Menu, Just Talk: Intent-Based Routing For Voice Agents

Phone trees are dying. Good riddance.

You know the drill: “Press 1 for billing, press 2 for technical support, press 3 to hear these options again.” By the time you reach option 7, you’ve forgotten what you were calling about.

Voice agents don’t need menus. They listen to what you actually say and route you where you need to go.

This is intent-based routing, and it’s changing how voice systems work.

The Problem With Menu-Based Routing

Traditional IVR (Interactive Voice Response) systems force users to categorize their own problems:

"I need help with my order"
→ Is that a new order?
→ Is it about tracking?
→ Is it a return?
→ Is it a refund?

Users don’t think in categories. They think in problems:

  • “My package was supposed to arrive yesterday”
  • “The item I received is damaged”
  • “I was charged twice”

Menu systems make users translate problems into categories. Voice agents should do that work.
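That translation can be made concrete as a mapping from problem statements to internal categories; here's a minimal sketch (the utterances and category names are illustrative, not a real classifier):

```python
# Illustrative only: real systems infer the category with a model,
# but the shape of the mapping is the point: the user never sees it
PROBLEM_TO_INTENT = {
    "My package was supposed to arrive yesterday": "delivery_issue",
    "The item I received is damaged": "return_request",
    "I was charged twice": "billing_error",
}

def categorize(problem: str) -> str:
    # Fall back to "unclear" so the agent can ask a clarifying question
    return PROBLEM_TO_INTENT.get(problem, "unclear")
```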

How Intent Detection Works

With the OpenAI Realtime API, you’re not matching keywords—you’re understanding meaning.

Here’s a simple intent detection flow:

graph TD
    A[User speaks naturally] --> B[Realtime API transcribes + understands]
    B --> C{Detect intent}
    C -->|Delivery issue| D[Route to fulfillment team]
    C -->|Billing question| E[Route to payments team]
    C -->|Product question| F[Route to product specialist]
    C -->|Unclear| G[Ask clarifying question]
    D --> H[Agent has full context]
    E --> H
    F --> H
    G --> A

The magic is in step B. Speech-to-speech models capture:

  • What the user said
  • How they said it (tone, urgency)
  • Why they’re calling (inferred intent)
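As a sketch, assume an upstream speech model surfaces a transcript plus an urgency score derived from prosody; routing can then weight both. (The field names and the 0.7 threshold below are assumptions for illustration, not part of the Realtime API.)

```python
def route_with_prosody(intent: str, urgency: float) -> dict:
    """Combine an inferred intent with a prosody-derived urgency score.

    Both inputs are assumed to come from an upstream speech model;
    urgency is a 0.0-1.0 score, and the 0.7 cutoff is arbitrary.
    """
    priority = "urgent" if urgency > 0.7 else "normal"
    return {"intent": intent, "priority": priority}
```

An audibly frustrated caller with a delivery issue gets `urgent` priority; the same intent stated calmly routes at `normal`.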

Real-World Example: Delivery Support

User calls in upset:

// User: "My package was supposed to arrive yesterday but it's not here"

const sessionConfig = {
  modalities: ['audio', 'text'],
  instructions: `
You are a delivery support agent. Listen for:
- Delivery delays or missing packages
- Damaged items
- Wrong items received
- Address issues

When you detect intent, route immediately. Don't ask "which department do you need?"

For delivery issues:
1. Pull recent orders
2. Check tracking status
3. Provide update or solution
`,
  tools: [
    {
      type: 'function',
      name: 'check_order_status',
      description: 'Look up order status by customer ID',
      parameters: {
        type: 'object',
        properties: {
          customer_id: { type: 'string' }
        }
      }
    }
  ]
};

The agent detects:

  • Intent: Delivery issue
  • Context: Expected yesterday, hasn’t arrived
  • Emotion: Frustrated (detected in prosody)

Response:

def handle_delivery_issue(customer_id):
    # Intent detected: delivery issue
    order_status = check_order_status(customer_id)
    
    if order_status['status'] == 'delayed':
        return {
            'action': 'provide_update',
            'message': f"I see your order is delayed. It's currently in {order_status['location']} and should arrive by tomorrow evening. I can offer expedited shipping or a partial refund—which would help?",
            'options': ['expedite', 'refund']
        }
    elif order_status['status'] == 'delivered':
        return {
            'action': 'check_delivery_location',
            'message': "The tracking shows it was delivered yesterday at 3pm. Was it left in a specific location? Sometimes packages are placed in safe spots."
        }
    else:
        # Anything else (lost, in transit with no ETA): hand off to a human
        return {
            'action': 'escalate',
            'message': "Let me connect you with someone who can look into this right away."
        }

No menu navigation. User states problem → agent infers intent → solution offered.

Multi-Intent Conversations

Real conversations aren’t single-intent. Users shift topics:

User: "I want to return this shirt, but also I was charged twice"

Two intents:

  1. Return request
  2. Billing issue

Good intent detection handles both:

// Simplified sketch: a production system would classify with a model,
// not keyword checks (see "How Intent Detection Works" above)
const detectMultipleIntents = (transcript) => {
  const intents = [];
  
  // Check for return keywords + context
  if (transcript.includes('return') || transcript.includes('send back')) {
    intents.push({
      type: 'return_request',
      confidence: 0.9,
      keywords: ['return', 'shirt']
    });
  }
  
  // Check for billing keywords + context
  if (transcript.includes('charged twice') || transcript.includes('double charge')) {
    intents.push({
      type: 'billing_error',
      confidence: 0.95,
      keywords: ['charged twice']
    });
  }
  
  return intents;
};

// Agent response:
// "I can help with both. Let me start with the double charge—I'll refund that right now while we set up your return. Does that work?"

Priority matters. Billing errors are urgent. Returns can wait 30 seconds.
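One way to honor that ordering is to sort detected intents against a priority table before acting. A sketch, with assumed ranks mirroring the billing-before-returns example:

```python
# Lower rank = handled first; these ranks are illustrative
INTENT_RANK = {"billing_error": 0, "delivery_issue": 1, "return_request": 2}

def order_by_priority(intents):
    # Unknown intent types sort last instead of raising
    return sorted(intents, key=lambda i: INTENT_RANK.get(i["type"], 99))

detected = [
    {"type": "return_request", "confidence": 0.9},
    {"type": "billing_error", "confidence": 0.95},
]
# order_by_priority(detected) puts billing_error first
```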

Clarifying Ambiguous Requests

Not all intents are clear:

User: "I need to change my account"

Change what?

  • Email?
  • Password?
  • Payment method?
  • Subscription plan?

Good intent detection asks targeted questions:

def clarify_ambiguous_intent(transcript, context):
    if 'change' in transcript and 'account' in transcript:
        # Check recent activity for clues
        recent_actions = get_recent_activity(context['user_id'])
        
        if recent_actions.get('last_action') == 'failed_login':
            # Likely password issue
            return "Are you trying to reset your password?"
        elif recent_actions.get('payment_failed'):
            # Likely payment method
            return "Do you need to update your payment method?"
        else:
            # Generic clarification
            return "What would you like to change—your email, password, or payment info?"
    return None  # no ambiguity pattern matched; continue normal routing

Context narrows questions. Don’t ask about everything. Ask about likely things.

Intent Confidence & Fallbacks

Sometimes you’re not sure:

const handleUncertainIntent = (intent, confidence) => {
  if (confidence > 0.8) {
    // High confidence: act immediately
    return routeToAgent(intent);
  } else if (confidence > 0.5) {
    // Medium confidence: confirm first
    return confirmIntent(intent);
  } else {
    // Low confidence: ask open-ended question
    return "I want to make sure I help with the right thing. Can you tell me more about what you need?"
  }
};

Threshold matters:

  • >0.8: “Let me pull up your order history” (action)
  • 0.5-0.8: “It sounds like you’re asking about delivery—is that right?” (confirm)
  • <0.5: “Can you tell me more?” (clarify)

Real-Time Intent Updating

Intent can change mid-conversation:

User: "I want to cancel my subscription"
Agent: "I can help with that. Can I ask why you're canceling?"
User: "It's too expensive"
Agent: "Would a 20% discount work instead?"
User: "Actually, yeah, that would help"

Intent shifted:

  1. Cancel subscription → 2. Retention offer → 3. Accept discount

Your routing logic must adapt:

class IntentTracker:
    def __init__(self):
        self.history = []
    
    def update_intent(self, new_intent):
        self.history.append(new_intent)
        
        # If user shifts from cancel to accepting a retention offer
        if (len(self.history) >= 2
                and self.history[-2] == 'cancel_subscription'
                and new_intent == 'accept_discount'):
            return {
                'action': 'apply_discount',
                'route_to': 'retention_team',
                'priority': 'high'
            }
        return None  # no special transition; keep current routing

Track intent changes. Don’t lock users into their first statement.

Implementation: Intent Router

Here’s a production-ready intent router:

class VoiceIntentRouter {
  constructor(openaiClient) {
    this.client = openaiClient;
    this.intentMap = {
      'delivery_issue': { team: 'fulfillment', priority: 'high' },
      'billing_error': { team: 'payments', priority: 'urgent' },
      'product_question': { team: 'support', priority: 'medium' },
      'return_request': { team: 'returns', priority: 'medium' },
      'cancel_subscription': { team: 'retention', priority: 'high' }
    };
  }
  
  async detectIntent(audioStream) {
    const response = await this.client.audio.transcriptions.create({
      file: audioStream,
      model: 'whisper-1',
      response_format: 'verbose_json'
    });
    
    const transcript = response.text;
    const intent = await this.classifyIntent(transcript);
    
    return {
      transcript,
      intent: intent.label,
      confidence: intent.score,
      routing: this.intentMap[intent.label]
    };
  }
  
  async classifyIntent(transcript) {
    const completion = await this.client.chat.completions.create({
      model: 'gpt-4o',
      messages: [
        {
          role: 'system',
          content: `
Classify user intent into one of these categories:
- delivery_issue
- billing_error
- product_question
- return_request
- cancel_subscription

Return JSON: {"label": "category", "score": 0.0-1.0}
`
        },
        { role: 'user', content: transcript }
      ],
      response_format: { type: 'json_object' }
    });
    
    return JSON.parse(completion.choices[0].message.content);
  }
  
  routeToAgent(intent) {
    const routing = this.intentMap[intent];
    
    if (!routing) {
      throw new Error(`Unknown intent: ${intent}`);
    }
    
    return {
      team: routing.team,
      priority: routing.priority,
      transfer_message: `Connecting you to our ${routing.team} team who can help immediately.`
    };
  }
}

// Usage
const router = new VoiceIntentRouter(openaiClient);

const result = await router.detectIntent(userAudioStream);
console.log(`Detected: ${result.intent} (${result.confidence})`);

if (result.confidence > 0.7) {
  const route = router.routeToAgent(result.intent);
  console.log(`Routing to ${route.team} (${route.priority} priority)`);
}

Measuring Intent Accuracy

Track these metrics:

class IntentMetrics:
    def __init__(self):
        self.correct = 0
        self.incorrect = 0
        self.unclear = 0
    
    def log_intent_result(self, predicted, actual, confidence):
        result = {
            'predicted': predicted,
            'actual': actual,
            'confidence': confidence,
            'correct': predicted == actual
        }
        
        if predicted == actual:
            self.correct += 1
        elif actual == 'unclear':
            self.unclear += 1
        else:
            self.incorrect += 1
        
        return result
    
    def get_accuracy(self):
        total = self.correct + self.incorrect + self.unclear
        return {
            'accuracy': self.correct / total if total > 0 else 0,
            'unclear_rate': self.unclear / total if total > 0 else 0,
            'error_rate': self.incorrect / total if total > 0 else 0
        }

# Target: >85% accuracy, <10% unclear, <5% error

Real metrics from production:

  • Week 1: 72% accuracy, 18% unclear, 10% error
  • Week 4: 87% accuracy, 9% unclear, 4% error
  • Week 8: 91% accuracy, 6% unclear, 3% error

Improvement comes from:

  1. Better training data
  2. Clearer intent definitions
  3. More context signals
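Those refinements start with knowing which intents get confused with which. Logging (predicted, actual) pairs from misclassified calls makes the confusions visible; here's a sketch assuming result dicts shaped like the ones `log_intent_result` returns:

```python
from collections import Counter

def confusion_pairs(results):
    """Count (predicted, actual) pairs for misclassified calls.

    `results` is assumed to be a list of dicts with 'predicted',
    'actual', and 'correct' keys.
    """
    return Counter(
        (r["predicted"], r["actual"]) for r in results if not r["correct"]
    )
```

If `delivery_issue` keeps absorbing `return_request` calls, that pair rises to the top and tells you exactly which intent definitions to sharpen.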

Why This Matters

Traditional IVR: User navigates 4-level menu → 90 seconds average

Intent-based routing: User states problem → Agent routes → 12 seconds average

Time saved per call: ~78 seconds

Multiply by thousands of daily calls:

  • 1,000 calls/day: ~21.7 hours saved
  • 10,000 calls/day: ~217 hours saved
  • 100,000 calls/day: ~2,167 hours saved
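The arithmetic is simply seconds saved per call times daily volume, converted to hours; a quick sanity check:

```python
SECONDS_SAVED_PER_CALL = 90 - 12  # menu average minus intent-routing average

def hours_saved(calls_per_day: int) -> float:
    return calls_per_day * SECONDS_SAVED_PER_CALL / 3600

for volume in (1_000, 10_000, 100_000):
    print(f"{volume:,} calls/day: {hours_saved(volume):,.1f} hours saved")
```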

That’s time users get back. And agents handle fewer frustrated escalations.

Next Steps

  1. Start simple: 3-5 intent categories
  2. Add context: Recent orders, account status, previous calls
  3. Track confidence: Log predictions vs actual outcomes
  4. Iterate: Refine based on misclassifications

Intent-based routing isn’t magic. It’s understanding what users mean, not what they say.

And that’s what voice agents should do.
