What Is MCP And Why Voice Agents Need It
- ZH+
- Sdk development
- January 15, 2026
Every voice agent needs tools—functions it can call to fetch data, update records, or trigger actions. But connecting tools to agents requires custom code for each integration: one adapter for your database, another for your calendar, another for email.
This doesn’t scale. If you want your voice agent to work with 10 tools, you write 10 custom integrations. If you want to swap tools (e.g., migrate from PostgreSQL to MongoDB), you rewrite integration code.
MCP (Model Context Protocol) solves this by providing a standard interface for tools. Instead of writing custom code for each tool, you:
- Make tools MCP-compatible (one-time effort)
- Connect voice agents to any MCP tool using the same protocol
- Swap tools without changing agent code
Think of MCP as USB for voice agents: USB standardized device connections so you don’t need custom drivers for every peripheral. MCP standardizes tool connections so you don’t need custom code for every integration.
In this post, we’ll cover:
- What MCP is and why it matters
- How MCP differs from OpenAI function calling
- Connecting voice agents to MCP-compatible tools
- Real-world example: Multi-tool voice agent with MCP
The Problem: Custom Tool Integrations Don’t Scale
Here’s how most teams connect tools to voice agents:
- Write function definitions for each tool (OpenAI function calling format)
- Write execution logic for each tool (database queries, API calls, etc.)
- Write error handling for each tool (network failures, rate limits, etc.)
- Repeat for every tool
Example (without MCP):
// Tool 1: Database query
async function queryDatabase(params) {
const { table, filters } = params;
// Custom PostgreSQL logic
const result = await db.query(`SELECT * FROM ${table} WHERE ...`);
return result;
}
// Tool 2: Send email
async function sendEmail(params) {
const { to, subject, body } = params;
// Custom SendGrid logic
const result = await sendgrid.send({ to, subject, body });
return result;
}
// Tool 3: Calendar event
async function createEvent(params) {
const { title, time } = params;
// Custom Google Calendar logic
const result = await calendar.events.insert({ title, time });
return result;
}
// Agent needs to know about all tools
const tools = [
{ name: 'queryDatabase', handler: queryDatabase },
{ name: 'sendEmail', handler: sendEmail },
{ name: 'createEvent', handler: createEvent }
];
Problems:
- Each tool requires custom code (PostgreSQL logic, SendGrid API, Google Calendar API)
- Changing tools (e.g., PostgreSQL → MongoDB) requires rewriting handler code
- No reusability (tool integrations can’t be shared across agents)
- Error handling is inconsistent across tools
What Is MCP?
MCP (Model Context Protocol) is a standard protocol for connecting tools to language models. It defines:
- How tools advertise their capabilities (schema)
- How models call tools (request format)
- How tools return results (response format)
- How errors are handled (error codes)
MCP is like HTTP for tools: it doesn’t care what the tool does (database, API, calculation), it just defines how to call the tool consistently.
Key Concepts
1. MCP Server
A process that exposes tools via MCP. Example: an MCP server for your database exposes query, insert, update, delete tools.
2. MCP Client
A voice agent (or any other LLM application) that connects to MCP servers and calls tools using the MCP protocol.
3. Tool Schema
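MCP tools self-describe using JSON Schema. In the MCP specification the schema lives in a field named inputSchema (the simplified code samples later in this post call it parameters). Example:
{
"name": "queryDatabase",
"description": "Query PostgreSQL database",
"inputSchema": {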
"type": "object",
"properties": {
"table": { "type": "string", "description": "Table name" },
"filters": { "type": "object", "description": "SQL WHERE clause as JSON" }
},
"required": ["table"]
}
}
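Because the schema is plain JSON Schema, a client can sanity-check arguments before sending a call. The sketch below only checks the `required` list and top-level primitive types; `checkArguments` is a hypothetical helper for illustration, not part of any MCP SDK, and a full JSON Schema validator would be the usual choice in production.

```javascript
// Minimal sketch: check tool arguments against the schema's `required` list
// and top-level `type` declarations. This is NOT a full JSON Schema validator.
function checkArguments(schema, args) {
  const problems = [];
  for (const key of schema.required ?? []) {
    if (!(key in args)) problems.push(`missing required argument: ${key}`);
  }
  for (const [key, value] of Object.entries(args)) {
    const expected = schema.properties?.[key]?.type;
    if (!expected) continue; // unknown or untyped property: nothing to check
    const actual = Array.isArray(value) ? 'array' : value === null ? 'null' : typeof value;
    const ok = expected === 'integer' ? Number.isInteger(value) : actual === expected;
    if (!ok) problems.push(`${key} should be ${expected}, got ${actual}`);
  }
  return problems;
}

// Same shape as the queryDatabase schema above
const queryDatabaseSchema = {
  type: 'object',
  properties: {
    table: { type: 'string' },
    filters: { type: 'object' }
  },
  required: ['table']
};

console.log(checkArguments(queryDatabaseSchema, { table: 'users' })); // no problems
console.log(checkArguments(queryDatabaseSchema, { filters: { status: 'active' } })); // table is missing
```

Rejecting a bad call on the client side saves a round trip, which matters when a caller is waiting on the line.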
4. Standard Request/Response
MCP defines how to call tools:
// Request
{
"jsonrpc": "2.0",
"method": "tools/call",
"params": {
"name": "queryDatabase",
"arguments": { "table": "users", "filters": { "status": "active" } }
},
"id": 1
}
// Response
{
"jsonrpc": "2.0",
"result": {
"content": [
{ "type": "text", "text": "Found 42 active users" }
]
},
"id": 1
}
Key insight: This format is the same for every MCP tool. Voice agents don’t need custom code for each tool—they just send MCP requests and parse MCP responses.
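To make the "same format for every tool" point concrete, here is a minimal sketch of the two helpers a client needs: one to build a `tools/call` request and one to read the text out of a response. The names `buildToolCall` and `readToolText` are made up for this post; an SDK would normally do this for you.

```javascript
let nextId = 1;

// Build a JSON-RPC 2.0 request for the MCP `tools/call` method.
function buildToolCall(name, args = {}) {
  return {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: { name, arguments: args }
  };
}

// Join the text parts of a response; throw if the server sent a JSON-RPC error.
function readToolText(response) {
  if (response.error) {
    throw new Error(`MCP error ${response.error.code}: ${response.error.message}`);
  }
  return response.result.content
    .filter((part) => part.type === 'text')
    .map((part) => part.text)
    .join('\n');
}

const request = buildToolCall('queryDatabase', { table: 'users', filters: { status: 'active' } });
console.log(JSON.stringify(request));

// A response shaped like the one shown above
const response = {
  jsonrpc: '2.0',
  id: request.id,
  result: { content: [{ type: 'text', text: 'Found 42 active users' }] }
};
console.log(readToolText(response)); // Found 42 active users
```

Nothing in either helper mentions databases, calendars, or email, which is the whole point: the envelope stays the same and only `name` and `arguments` change.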
How MCP Differs From OpenAI Function Calling
OpenAI function calling lets you define tools and have models call them. So what’s the difference?
OpenAI Function Calling
- Format: OpenAI-specific JSON schema
- Execution: You write handler code (custom for each tool)
- Reusability: Low (function definitions and handlers live inside each agent)
- Interoperability: Limited (the format is tied to OpenAI’s API; other providers use similar but not identical schemas)
Example:
const tools = [
{
type: 'function',
function: {
name: 'queryDatabase',
description: 'Query database',
parameters: {
type: 'object',
properties: {
table: { type: 'string' }
}
}
}
}
];
// You write custom handler
async function executeFunction(name, args) {
if (name === 'queryDatabase') {
// Custom PostgreSQL logic (validate args.table against an allowlist before interpolating it)
return await db.query(`SELECT * FROM ${args.table}`);
}
}
MCP
- Format: Standard JSON-RPC 2.0 (works with any LLM)
- Execution: The MCP server handles it (the agent contains no per-tool handler code)
- Reusability: High (MCP servers work with any MCP client)
- Interoperability: High (works with OpenAI, Anthropic, open-source models)
Example:
// Connect to MCP server (handles execution)
const mcpClient = new MCPClient('ws://localhost:8080/mcp');
// Tools are discovered automatically
const tools = await mcpClient.listTools();
// Call tool (server handles execution)
const result = await mcpClient.call('queryDatabase', { table: 'users' });
Key difference: With MCP, the handler code lives in the MCP server rather than in your agent. Your voice agent just connects to the server and calls tools using the standard protocol.
Architecture: Voice Agent With MCP
Here’s how MCP fits into the voice agent stack:
graph TD
A[Voice Agent] -->|MCP Protocol| B[MCP Server 1: Database]
A -->|MCP Protocol| C[MCP Server 2: Calendar]
A -->|MCP Protocol| D[MCP Server 3: Email]
B --> E[PostgreSQL]
C --> F[Google Calendar API]
D --> G[SendGrid API]
Key insight: Voice agent uses the same protocol to talk to all tools. Each MCP server translates MCP calls into tool-specific logic (SQL queries, API calls, etc.).
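The translation step inside each server can be sketched without any network at all. Below, `createToolServer` is a hypothetical in-memory stand-in for an MCP server: it takes JSON-RPC requests and routes `tools/list` and `tools/call` to plain functions. The error codes are the standard JSON-RPC ones (-32601 for an unknown method, -32602 for invalid parameters), and `inputSchema` is the field name the MCP specification uses for a tool's argument schema. The calendar logic is a stub, not a real API call.

```javascript
// In-memory stand-in for an MCP server: maps JSON-RPC requests to tool handlers.
function createToolServer(tools) {
  return async function handle(request) {
    const reply = (body) => ({ jsonrpc: '2.0', id: request.id, ...body });

    if (request.method === 'tools/list') {
      const list = Object.entries(tools).map(([name, tool]) => ({
        name,
        description: tool.description,
        inputSchema: tool.inputSchema
      }));
      return reply({ result: { tools: list } });
    }

    if (request.method === 'tools/call') {
      const name = request.params?.name;
      const tool = Object.hasOwn(tools, name) ? tools[name] : undefined;
      if (!tool) {
        return reply({ error: { code: -32602, message: `Unknown tool: ${name}` } });
      }
      // Tool-specific logic (SQL, REST calls, ...) lives behind run()
      const text = await tool.run(request.params.arguments ?? {});
      return reply({ result: { content: [{ type: 'text', text }] } });
    }

    return reply({ error: { code: -32601, message: 'Method not found' } });
  };
}

const calendarServer = createToolServer({
  checkAvailability: {
    description: 'Check if a calendar slot is available',
    inputSchema: { type: 'object', properties: { date: { type: 'string' } }, required: ['date'] },
    run: async ({ date }) => `Available on ${date}` // stub: a real server would call the calendar API
  }
});

async function demo() {
  const listed = await calendarServer({ jsonrpc: '2.0', id: 1, method: 'tools/list' });
  const called = await calendarServer({
    jsonrpc: '2.0',
    id: 2,
    method: 'tools/call',
    params: { name: 'checkAvailability', arguments: { date: '2026-01-16' } }
  });
  console.log(listed.result.tools.map((t) => t.name)); // [ 'checkAvailability' ]
  console.log(called.result.content[0].text); // Available on 2026-01-16
}
demo();
```

Swapping Google Calendar for another provider would only change the body of `run`; the agent-facing requests and responses stay identical.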
Implementing MCP For Voice Agents
Step 1: Start An MCP Server
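Use an existing MCP server or create your own. The examples in this post use a simplified, illustrative API (MCPServer, addTool, listen, MCPClient) to keep the code short; the official SDK’s class names, method signatures, and transports differ in detail, so check its documentation before copying this code. Example with @modelcontextprotocol/server: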
import { MCPServer } from '@modelcontextprotocol/server';
import { Client } from 'pg'; // PostgreSQL client
const server = new MCPServer();
// Register database query tool
server.addTool({
name: 'queryDatabase',
description: 'Query PostgreSQL database',
parameters: {
type: 'object',
properties: {
table: { type: 'string', description: 'Table name' },
filters: { type: 'object', description: 'WHERE clause as JSON' }
},
required: ['table']
},
handler: async (params) => {
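const { table } = params;
const filters = params.filters ?? {};
// Table and column names cannot be bound as query parameters, so validate them
const isIdentifier = (s) => /^[A-Za-z_][A-Za-z0-9_]*$/.test(s);
const columns = Object.keys(filters);
if (!isIdentifier(table) || !columns.every(isIdentifier)) throw new Error('Invalid table or column name');
// Bind filter values as $1, $2, ... instead of interpolating them into the SQL
const whereClause = columns.length
? 'WHERE ' + columns.map((k, i) => `${k} = $${i + 1}`).join(' AND ')
: '';
// Execute query
const db = new Client({ connectionString: process.env.DATABASE_URL });
await db.connect();
const result = await db.query(`SELECT * FROM ${table} ${whereClause}`, Object.values(filters));
await db.end();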
return {
content: [{
type: 'text',
text: JSON.stringify(result.rows, null, 2)
}]
};
}
});
// Start server
server.listen(8080);
console.log('MCP server running on ws://localhost:8080/mcp');
Now you have an MCP server exposing a queryDatabase tool.
Step 2: Connect Voice Agent To MCP Server
Use an MCP client to discover and call tools:
import { MCPClient } from '@modelcontextprotocol/client';
import { RealtimeClient } from '@openai/realtime-api-beta';
// Connect to MCP server
const mcpClient = new MCPClient('ws://localhost:8080/mcp');
await mcpClient.connect();
// Discover available tools
const mcpTools = await mcpClient.listTools();
// Convert MCP tools to the Realtime API tool format (flat, not nested under `function`)
const openaiTools = mcpTools.map(tool => ({
type: 'function',
name: tool.name,
description: tool.description,
parameters: tool.parameters
}));
// Initialize Realtime API client
const realtimeClient = new RealtimeClient({
apiKey: process.env.OPENAI_API_KEY,
model: 'gpt-realtime'
});
// Add tools to voice agent
await realtimeClient.send({
type: 'session.update',
session: {
tools: openaiTools
}
});
// Handle tool calls from voice agent
realtimeClient.on('response.function_call_arguments.done', async (event) => {
const { call_id, name, arguments: args } = event;
// Call MCP tool
const result = await mcpClient.call(name, JSON.parse(args));
// Send result back to voice agent
await realtimeClient.send({
type: 'conversation.item.create',
item: {
type: 'function_call_output',
call_id,
output: JSON.stringify(result)
}
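});
// Ask the model to respond now that the tool output is in the conversation
await realtimeClient.send({ type: 'response.create' });
});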
// Start voice conversation
await realtimeClient.send({ type: 'response.create' });
What happened:
- MCP client connects to server and discovers tools
- Tools are converted to OpenAI function calling format
- Voice agent requests tool calls, and the MCP client forwards them to the server for execution
- Results flow back to voice agent
Key benefit: You didn’t write any database query logic in your voice agent code—it’s all handled by the MCP server.
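One detail worth isolating is the conversion step, because OpenAI uses two slightly different tool shapes: Chat Completions nests the definition under a `function` key, while the Realtime API expects a flat object. The sketch below shows both adapters. The function names are hypothetical, and the input is assumed to be an MCP tool descriptor that carries its schema in `inputSchema` (falling back to `parameters`, the name used by this post's simplified client).

```javascript
// Pick the argument schema from an MCP tool descriptor, with a permissive default.
function schemaOf(tool) {
  return tool.inputSchema ?? tool.parameters ?? { type: 'object', properties: {} };
}

// MCP descriptor -> Chat Completions tool (definition nested under `function`)
function toChatCompletionsTool(tool) {
  return {
    type: 'function',
    function: {
      name: tool.name,
      description: tool.description,
      parameters: schemaOf(tool)
    }
  };
}

// MCP descriptor -> Realtime API tool (flat object)
function toRealtimeTool(tool) {
  return {
    type: 'function',
    name: tool.name,
    description: tool.description,
    parameters: schemaOf(tool)
  };
}

const exampleTool = {
  name: 'queryDatabase',
  description: 'Query PostgreSQL database',
  inputSchema: {
    type: 'object',
    properties: { table: { type: 'string' } },
    required: ['table']
  }
};

console.log(JSON.stringify(toRealtimeTool(exampleTool), null, 2));
console.log(JSON.stringify(toChatCompletionsTool(exampleTool), null, 2));
```

Keeping the adapters as small pure functions means a switch to a different model provider only touches this one mapping, not the MCP side.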
Step 3: Add More MCP Servers
Connect to multiple MCP servers to give your agent access to more tools:
// Connect to database MCP server
const dbClient = new MCPClient('ws://localhost:8080/mcp');
await dbClient.connect();
const dbTools = await dbClient.listTools();
// Connect to calendar MCP server
const calendarClient = new MCPClient('ws://localhost:8081/mcp');
await calendarClient.connect();
const calendarTools = await calendarClient.listTools();
// Connect to email MCP server
const emailClient = new MCPClient('ws://localhost:8082/mcp');
await emailClient.connect();
const emailTools = await emailClient.listTools();
// Combine all tools
const allTools = [
...dbTools.map(t => ({ ...t, client: dbClient })),
...calendarTools.map(t => ({ ...t, client: calendarClient })),
...emailTools.map(t => ({ ...t, client: emailClient }))
];
// Convert to the Realtime API tool format (flat, not nested under `function`)
const openaiTools = allTools.map(tool => ({
type: 'function',
name: tool.name,
description: tool.description,
parameters: tool.parameters
}));
// Handle tool calls (dispatch to correct MCP server)
realtimeClient.on('response.function_call_arguments.done', async (event) => {
const { call_id, name, arguments: args } = event;
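// Find which MCP client has this tool
const tool = allTools.find(t => t.name === name);
// Call tool on correct MCP server, or report an unknown tool name to the model
const result = tool
? await tool.client.call(name, JSON.parse(args))
: { error: `Unknown tool: ${name}` };
// Send result back to voice agent
await realtimeClient.send({
type: 'conversation.item.create',
item: {
type: 'function_call_output',
call_id,
output: JSON.stringify(result)
}
});
// Ask the model to respond now that the tool output is in the conversation
await realtimeClient.send({ type: 'response.create' });
});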
Now your voice agent has access to database, calendar, and email tools—all using the same MCP protocol.
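Once several servers are involved, two edge cases show up: two servers exposing a tool with the same name, and the model asking for a tool nobody exposes. A small routing table handles both. `ToolRouter` below is a hypothetical helper, and the clients are stubs that only need a `call(name, args)` method, matching the simplified client used in this post.

```javascript
// Routes tool calls to the MCP client that owns each tool name.
class ToolRouter {
  constructor() {
    this.routes = new Map(); // tool name -> { serverName, client }
  }

  // Register every tool a server exposes; refuse silent name collisions.
  register(serverName, client, tools) {
    for (const tool of tools) {
      const existing = this.routes.get(tool.name);
      if (existing) {
        throw new Error(`Tool "${tool.name}" is exposed by both ${existing.serverName} and ${serverName}`);
      }
      this.routes.set(tool.name, { serverName, client });
    }
  }

  // Dispatch a call to the owning client.
  async call(name, args) {
    const route = this.routes.get(name);
    if (!route) throw new Error(`No MCP server exposes tool "${name}"`);
    return route.client.call(name, args);
  }
}

// Stub clients standing in for real MCP connections
const stubDbClient = { call: async (name, args) => `db handled ${name}` };
const stubEmailClient = { call: async (name, args) => `email handled ${name}` };

const router = new ToolRouter();
router.register('database', stubDbClient, [{ name: 'queryCustomers' }]);
router.register('email', stubEmailClient, [{ name: 'sendEmail' }]);

router.call('sendEmail', { to: 'john@example.com' }).then(console.log); // email handled sendEmail
```

A `Map` lookup also avoids scanning the whole tool list on every call, which the `allTools.find(...)` approach does.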
Real-World Example: Multi-Tool Voice Assistant
Let’s build a voice assistant that:
- Queries customer database
- Checks calendar availability
- Sends email confirmations
MCP Server 1: Database
// database-mcp-server.js
import { MCPServer } from '@modelcontextprotocol/server';
import { Client } from 'pg';
const server = new MCPServer();
server.addTool({
name: 'queryCustomers',
description: 'Find customers by name or email',
parameters: {
type: 'object',
properties: {
search: { type: 'string', description: 'Name or email to search' }
},
required: ['search']
},
handler: async ({ search }) => {
const db = new Client({ connectionString: process.env.DATABASE_URL });
await db.connect();
const result = await db.query(
`SELECT * FROM customers WHERE name ILIKE $1 OR email ILIKE $1`,
[`%${search}%`]
);
await db.end();
return {
content: [{
type: 'text',
text: JSON.stringify(result.rows, null, 2)
}]
};
}
});
server.listen(8080);
MCP Server 2: Calendar
// calendar-mcp-server.js
import { MCPServer } from '@modelcontextprotocol/server';
import { google } from 'googleapis';
const server = new MCPServer();
server.addTool({
name: 'checkAvailability',
description: 'Check if calendar slot is available',
parameters: {
type: 'object',
properties: {
date: { type: 'string', description: 'Date (YYYY-MM-DD)' },
time: { type: 'string', description: 'Time (HH:MM)' }
},
required: ['date', 'time']
},
handler: async ({ date, time }) => {
const calendar = google.calendar({ version: 'v3', auth: 'YOUR_AUTH' });
const events = await calendar.events.list({
calendarId: 'primary',
timeMin: `${date}T${time}:00Z`,
timeMax: `${date}T${time}:59Z`,
singleEvents: true
});
const isAvailable = events.data.items.length === 0;
return {
content: [{
type: 'text',
text: isAvailable ? 'Available' : 'Booked'
}]
};
}
});
server.listen(8081);
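Note that the handler above checks a one-minute window and treats the date and time as UTC. For a real booking you would normally check the whole meeting length. A minimal sketch of that calculation follows; `slotWindow` and the 30-minute default are assumptions for illustration, and time zone handling is left out entirely.

```javascript
// Compute the timeMin/timeMax pair for a slot of a given length.
// Assumes `date` is YYYY-MM-DD and `time` is HH:MM, both in UTC.
function slotWindow(date, time, durationMinutes = 30) {
  const start = new Date(`${date}T${time}:00Z`);
  if (Number.isNaN(start.getTime())) {
    throw new Error(`Invalid date or time: ${date} ${time}`);
  }
  const end = new Date(start.getTime() + durationMinutes * 60_000);
  return { timeMin: start.toISOString(), timeMax: end.toISOString() };
}

console.log(slotWindow('2026-01-16', '14:00', 60));
// { timeMin: '2026-01-16T14:00:00.000Z', timeMax: '2026-01-16T15:00:00.000Z' }
```

The returned values can be passed straight to `calendar.events.list` in place of the two template strings.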
MCP Server 3: Email
// email-mcp-server.js
import { MCPServer } from '@modelcontextprotocol/server';
import sgMail from '@sendgrid/mail';
const server = new MCPServer();
server.addTool({
name: 'sendEmail',
description: 'Send email to customer',
parameters: {
type: 'object',
properties: {
to: { type: 'string', description: 'Recipient email' },
subject: { type: 'string', description: 'Email subject' },
body: { type: 'string', description: 'Email body' }
},
required: ['to', 'subject', 'body']
},
handler: async ({ to, subject, body }) => {
sgMail.setApiKey(process.env.SENDGRID_API_KEY);
await sgMail.send({ to, subject, text: body, from: 'noreply@example.com' });
return {
content: [{
type: 'text',
text: 'Email sent successfully'
}]
};
}
});
server.listen(8082);
Voice Agent (Connects To All 3 MCP Servers)
// voice-agent.js
import { MCPClient } from '@modelcontextprotocol/client';
import { RealtimeClient } from '@openai/realtime-api-beta';
// Connect to all MCP servers
const dbClient = new MCPClient('ws://localhost:8080/mcp');
const calendarClient = new MCPClient('ws://localhost:8081/mcp');
const emailClient = new MCPClient('ws://localhost:8082/mcp');
await dbClient.connect();
await calendarClient.connect();
await emailClient.connect();
// Discover all tools
const allTools = [
...(await dbClient.listTools()).map(t => ({ ...t, client: dbClient })),
...(await calendarClient.listTools()).map(t => ({ ...t, client: calendarClient })),
...(await emailClient.listTools()).map(t => ({ ...t, client: emailClient }))
];
// Initialize voice agent
const realtimeClient = new RealtimeClient({
apiKey: process.env.OPENAI_API_KEY,
model: 'gpt-realtime'
});
// Add tools to voice agent
await realtimeClient.send({
type: 'session.update',
session: {
instructions: 'You are a helpful assistant that can query customers, check calendar availability, and send emails.',
tools: allTools.map(t => ({
type: 'function',
name: t.name,
description: t.description,
parameters: t.parameters
}))
}
});
// Handle tool calls
realtimeClient.on('response.function_call_arguments.done', async (event) => {
const { call_id, name, arguments: args } = event;
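// Find which MCP client has this tool
const tool = allTools.find(t => t.name === name);
// Call tool, or report an unknown tool name to the model
const result = tool
? await tool.client.call(name, JSON.parse(args))
: { error: `Unknown tool: ${name}` };
// Send result back
await realtimeClient.send({
type: 'conversation.item.create',
item: {
type: 'function_call_output',
call_id,
output: JSON.stringify(result)
}
});
// Ask the model to respond now that the tool output is in the conversation
await realtimeClient.send({ type: 'response.create' });
});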
// Start conversation
await realtimeClient.send({ type: 'response.create' });
Example conversation:
User: “Find customers named John and check if I’m free tomorrow at 2pm.”
Agent (calls queryCustomers):
- → MCP Server 1 (database)
- ← “Found 3 customers named John”
Agent (calls checkAvailability):
- → MCP Server 2 (calendar)
- ← “Available”
Agent (responds): “I found 3 customers named John. You’re free tomorrow at 2pm. Would you like me to email any of them?”
User: “Yes, email John Doe about our meeting.”
Agent (calls sendEmail):
- → MCP Server 3 (email)
- ← “Email sent successfully”
Agent (responds): “Done! I’ve sent an email to John Doe about tomorrow’s meeting at 2pm.”
Key insight: The voice agent orchestrates across 3 tools (database, calendar, email) using the same MCP protocol. The agent itself contains no tool-specific integration code.
Real-World Metrics
From an enterprise voice assistant handling 50,000 calls/month:
Before MCP (custom integrations):
- Tools: 8 (database, calendar, email, CRM, ticketing, analytics, docs, payments)
- Integration code: ~2,000 lines
- Time to add new tool: 3-5 days (write handler, test, deploy)
- Bugs from integration code: 12/month
After MCP:
- Tools: 8 (same tools, now MCP-compatible)
- Integration code: ~200 lines (just MCP client setup)
- Time to add new tool: 1-2 hours (connect to MCP server)
- Bugs from integration code: 2/month
Key improvement: 90% reduction in integration code, 95% faster to add tools.
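For readers who want to check the headline percentages, the arithmetic is short. The sketch below uses the figures listed above and makes two assumptions of its own: an 8-hour working day, and the midpoint of each time range.

```javascript
// Percentage reduction from `before` to `after`.
function percentReduction(before, after) {
  return ((before - after) / before) * 100;
}

const HOURS_PER_WORKDAY = 8; // assumption: "days" means working days

const codeReduction = percentReduction(2000, 200);

// Midpoints of the quoted ranges: 3-5 days before, 1-2 hours after
const hoursBefore = ((3 + 5) / 2) * HOURS_PER_WORKDAY; // 32 hours
const hoursAfter = (1 + 2) / 2; // 1.5 hours
const timeReduction = percentReduction(hoursBefore, hoursAfter);

const bugReduction = percentReduction(12, 2);

console.log(`Integration code: ${codeReduction.toFixed(1)}% less`); // 90.0% less
console.log(`Time to add a tool: ${timeReduction.toFixed(1)}% less`); // 95.3% less
console.log(`Integration bugs: ${bugReduction.toFixed(1)}% fewer`); // 83.3% fewer
```

The bug count, which the summary line does not mention, drops by roughly 83% on the same numbers.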
Best Practices
1. Use Existing MCP Servers When Possible
Don’t build MCP servers from scratch if one exists. Check:
- MCP Server Registry (community-maintained)
- Tool vendor docs (many now offer MCP endpoints)
2. Keep MCP Servers Stateless
MCP servers should be stateless—each request is independent. Don’t store conversation state in MCP servers (that’s the voice agent’s job).
3. Handle Errors Consistently
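MCP distinguishes two kinds of errors. Protocol-level failures (unknown tool, malformed request) use JSON-RPC error codes. Failures inside a tool are returned as a normal result with isError set to true, so the model can read the message and recover:
server.addTool({
name: 'queryDatabase',
handler: async (params) => {
try {
// runQuery is a placeholder for your own query logic (see Step 1)
const result = await runQuery(params);
return { content: [{ type: 'text', text: JSON.stringify(result) }] };
} catch (error) {
// Tool execution error: report it in the result so the model can react
return {
isError: true,
content: [{ type: 'text', text: `Query failed: ${error.message}` }]
};
}
}
});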
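To get the same behavior across every tool without repeating the try/catch, the error handling can be factored into a wrapper. This is a sketch under the assumption that handlers return MCP-style results; `withToolErrors` is a hypothetical name, and the `isError` flag follows the MCP specification's convention for tool execution failures.

```javascript
// Wrap a tool handler so any thrown error becomes an MCP-style error result.
function withToolErrors(handler) {
  return async (params) => {
    try {
      return await handler(params);
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      return {
        isError: true,
        content: [{ type: 'text', text: `Tool failed: ${message}` }]
      };
    }
  };
}

// Example handlers: one succeeds, one throws
const okHandler = withToolErrors(async () => ({
  content: [{ type: 'text', text: 'Email sent successfully' }]
}));
const failingHandler = withToolErrors(async () => {
  throw new Error('connection refused');
});

okHandler({}).then((r) => console.log(r.content[0].text)); // Email sent successfully
failingHandler({}).then((r) => console.log(r.isError, r.content[0].text)); // true Tool failed: connection refused
```

For a voice agent this matters more than usual: a readable error lets the model apologize and offer an alternative instead of going silent.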
4. Version Your MCP Servers
MCP servers should expose version info (the protocol reports a server name and version during initialization):
server.info = {
name: 'database-mcp-server',
version: '1.2.0'
};
This helps voice agents know which tools are available (older servers may not have newer tools).
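On the agent side, a version string is only useful if something compares it. Below is a minimal sketch of a minimum-version check, assuming plain MAJOR.MINOR.PATCH strings with no pre-release suffixes; `isAtLeast` is a hypothetical helper name.

```javascript
// Parse "1.2.0" into [1, 2, 0]; reject anything that is not MAJOR.MINOR.PATCH.
function parseVersion(version) {
  if (!/^\d+\.\d+\.\d+$/.test(version)) {
    throw new Error(`Unsupported version format: ${version}`);
  }
  return version.split('.').map(Number);
}

// True when `actual` is the same as or newer than `minimum`.
function isAtLeast(actual, minimum) {
  const a = parseVersion(actual);
  const m = parseVersion(minimum);
  for (let i = 0; i < 3; i++) {
    if (a[i] !== m[i]) return a[i] > m[i];
  }
  return true; // identical versions
}

console.log(isAtLeast('1.2.0', '1.1.0')); // true
console.log(isAtLeast('1.10.0', '1.9.0')); // true (numeric, not string, comparison)
console.log(isAtLeast('1.1.0', '1.2.0')); // false
```

An agent could run this right after connecting and skip registering tools that the older server is known not to support.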
When MCP Doesn’t Make Sense
MCP adds overhead. Skip it if:
- You only have 1-2 tools (custom code is simpler)
- Tools are highly coupled (e.g., all part of same database transaction)
- You need sub-100ms tool calls (the extra hop to an MCP server adds latency, roughly 20ms depending on transport and deployment)
MCP shines when you have 5+ tools and want to swap them easily.
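If you do adopt MCP under a tight latency budget, it helps to bound each tool call so the agent can say something sensible while a slow server catches up. The helper below is a generic sketch, not an MCP feature; `withTimeout` and the fallback value are assumptions for illustration.

```javascript
// Resolve with `fallback` if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Simulated tool calls: one fast, one slow
const delayed = (value, ms) => new Promise((resolve) => setTimeout(() => resolve(value), ms));

async function latencyDemo() {
  const fast = await withTimeout(delayed('Available', 5), 100, 'Still checking...');
  const slow = await withTimeout(delayed('Available', 300), 100, 'Still checking...');
  console.log(fast); // Available
  console.log(slow); // Still checking...
}
latencyDemo();
```

Note that the slow call keeps running in the background; this only limits how long the conversation waits for it.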
Summary
MCP (Model Context Protocol) provides a standard interface for connecting tools to voice agents. Instead of writing custom integration code for each tool, you connect to MCP servers using a standard protocol.
Key benefits:
- Reusability: MCP servers work with any MCP client (voice agents, chatbots, etc.)
- Interoperability: Works with OpenAI, Anthropic, and open-source models
- Swappability: Change tools without rewriting voice agent code
- Maintainability: 90% less integration code
Architecture:
- MCP server exposes tools (database, calendar, email, etc.)
- Voice agent connects as MCP client
- Tools are called using standard JSON-RPC 2.0 protocol
Best practices:
- Use existing MCP servers when possible
- Keep MCP servers stateless
- Handle errors consistently using MCP’s error conventions
- Version your MCP servers
If you’re building a voice agent that needs to connect to multiple tools, MCP removes per-tool integration code from the agent and makes your agent portable. Start with existing MCP servers, connect your voice agent as an MCP client, and the agent needs no tool-specific handler code.
The future of voice agents is tool interoperability. MCP gets us there.