
Understanding OpenAI json_schema, MCP, and the 30-Second Streaming Pause

Why omitting tool_choice when using OpenAI structured outputs with MCP causes a long streaming pause, and how to fix it.

Recently, while implementing OpenAI structured outputs using json_schema with MCP (Model Context Protocol), I ran into a serious performance issue.

The model would:

  • Start streaming normally
  • Then call mcp.list
  • Suddenly pause for about 30 seconds
  • Only then emit the next chunk
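To confirm where the stall actually happens, it helps to timestamp each streamed chunk and look at the largest gap between consecutive chunks. A minimal diagnostic sketch (the maxGapMs helper and the sample timings are illustrative, not part of any SDK):

```typescript
// Record the arrival time of each streamed chunk, then find the
// largest gap between consecutive chunks -- that gap is the pause.
function maxGapMs(arrivalsMs: number[]): number {
  let worst = 0;
  for (let i = 1; i < arrivalsMs.length; i++) {
    worst = Math.max(worst, arrivalsMs[i] - arrivalsMs[i - 1]);
  }
  return worst;
}

// In the streaming loop you would push Date.now() per chunk, e.g.:
// for await (const chunk of stream) { arrivals.push(Date.now()); ... }
// Example arrival times showing a ~30 s stall after the second chunk:
const arrivals = [0, 120, 30120, 30220];
console.log(maxGapMs(arrivals)); // 30000
```

Seeing one dominant gap (rather than uniformly slow chunks) is what points at a single expensive decision, not slow token generation.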

After investigation, the root cause turned out to be the missing tool_choice parameter. Once I set tool_choice explicitly, the pause dropped to 8-10 seconds.

What is json_schema in OpenAI?

OpenAI now allows enforcing structured outputs using:

```ts
response_format: {
  type: "json_schema",
  json_schema: { ... }
}
```

With strict mode enabled, the API guarantees output that conforms exactly to the supplied schema, so responses can be parsed in production without defensive validation.
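As a concrete sketch, here is what a full request body with a strict json_schema might look like. The schema itself (answer/confidence fields) is made up for illustration:

```typescript
// Hypothetical schema: force the model to return { answer, confidence }.
const requestBody = {
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize the report." }],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "answer",   // json_schema requires a name
      strict: true,     // strict mode enforces exact schema conformance
      schema: {
        type: "object",
        properties: {
          answer: { type: "string" },
          confidence: { type: "number" },
        },
        required: ["answer", "confidence"],
        additionalProperties: false, // required in strict mode
      },
    },
  },
};
```

Note that strict mode requires every property to be listed in required and additionalProperties to be false, which is part of what makes the output safe to parse blindly.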

What is MCP?

MCP allows models to list tools, call tools, fetch schemas, and interact with external systems.
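For context, attaching an MCP server to a request looks roughly like the sketch below (Responses API style). The label and URL are placeholders for your own server, and the exact field shape may vary by SDK version:

```typescript
// Sketch of pointing the model at an MCP server; once attached,
// the model can list and call that server's tools mid-response.
const tools = [
  {
    type: "mcp",
    server_label: "my-tools",              // placeholder label
    server_url: "https://example.com/mcp", // placeholder URL
  },
];
```

It is exactly this tool listing (mcp.list) that preceded the pause described above.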

The Real Problem: 30-Second Pause

Without an explicit tool_choice, the model appears to spend extra time deciding among the available MCP tool paths while simultaneously honoring the json_schema constraints.

```ts
tool_choice: "auto"
```

Setting this explicitly reduced the pause significantly by removing that decision overhead.
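The fix itself is a one-line addition to the request body. A sketch (model, message, and schema are placeholders carried over from the earlier example):

```typescript
const requestBody = {
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize the report." }],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "answer",
      strict: true,
      schema: {
        type: "object",
        properties: { answer: { type: "string" } },
        required: ["answer"],
        additionalProperties: false,
      },
    },
  },
  // The fix: state the tool-selection policy explicitly
  // instead of leaving it unset.
  tool_choice: "auto",
};
```

In my case nothing else changed between the slow and fast runs, so spelling out tool_choice is worth trying first whenever structured outputs and MCP tools are combined.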