Beyond MCP: The Missing Infrastructure Layer

MCP gives us standardized tool invocation—a solid foundation. But multi-step agent workflows need more: stateful orchestration, semantic discovery, policy enforcement, and context efficiency. DataGrout provides these as infrastructure, not agent complexity—from the LLM's perspective, it's still just calling MCP tools.

DataGrout AI · Agentic Infrastructure for Autonomous Systems

1. What MCP actually gets right

The Model Context Protocol is a good foundation for one thing: giving agents a standard way to discover and call tools over JSON-RPC. Servers expose capabilities, clients call them, and everyone agrees on a simple schema.

That part is genuinely useful:

  • A shared notion of “tool servers” and “clients”
  • Simple JSON-RPC messages instead of one-off integrations
  • Enough structure to build basic agent toolchains
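Concretely, the whole protocol surface boils down to a couple of JSON-RPC messages. A minimal sketch (the method names follow the MCP spec; the tool name and arguments are made up for illustration):

```typescript
// The two core MCP messages, as plain JSON-RPC 2.0 objects.
// "tools/list" and "tools/call" are real MCP methods;
// "get_weather" is a hypothetical tool for illustration.
const listRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/list",
};

const callRequest = {
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: {
    name: "get_weather",
    arguments: { city: "Berlin" },
  },
};

console.log(callRequest.method); // "tools/call"
```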

Unfortunately, once you go beyond localhost demos, MCP breaks down in predictable ways.

2. Where MCP is broken in real workloads

2.1 Authentication without an identity model

MCP now has an OAuth 2.1–based story for HTTP transports: clients can discover an auth server, get a Bearer token, and present it to MCP servers. That’s useful plumbing — but it stops at the transport boundary.

What’s missing for real agent systems is an identity and policy model for tools:

  • A standard way to say which user, agent harness, or workspace is actually behind a call
  • A way to scope access per tenant, per environment (dev/stage/prod), and per tool
  • A way to express risk semantics (“this is destructive”, “this returns PII”) and enforce policy on top of tokens

In practice, MCP still largely assumes a trusted, local setup:

  • STDIO/local transports explicitly punt to “whatever credentials are on this machine.”
  • HTTP servers see a token, but the spec doesn’t say how to map that to tenants, agents, or tool-level policy.

That's fine for a single dev on a laptop. It's not fine once your servers can read and write to Salesforce, SAP, or a data warehouse. At that point you don't just need "a token": you need a coherent identity and policy plane for tools and agents.

2.2 Static, global tool lists

Every client calls tools/list and gets the same flat list of every tool on the server.

There is no way to:

  • Filter by goal, semantics, or risk level
  • Paginate or segment large servers
  • Dynamically hide high-risk tools based on policy

In practice this means you either keep servers tiny (10–20 tools), or you throw 40+ tools at the model and hope it picks the right one.
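To make the gap concrete: the kind of policy-aware filtering described above is trivial once tools carry risk metadata, but vanilla MCP defines no such field. In the sketch below, the `risk` field and the tool names are hypothetical:

```typescript
// Client-side sketch of "hide high-risk tools": only possible if each tool
// carries a risk annotation, which the MCP spec does not define today.
interface ToolEntry {
  name: string;
  risk?: "low" | "high"; // hypothetical metadata, not part of MCP
}

const allTools: ToolEntry[] = [
  { name: "crm_read_leads", risk: "low" },
  { name: "db_drop_table", risk: "high" },
  { name: "send_email" }, // unannotated: risk unknown
];

// Conservative policy: hide anything explicitly marked high-risk.
const visible = allTools.filter((t) => t.risk !== "high");
console.log(visible.map((t) => t.name)); // ["crm_read_leads", "send_email"]
```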

2.3 No native discovery or planning

MCP deliberately stays out of questions like:

  • “Which tools can actually satisfy this goal?”
  • “How do I stitch Salesforce + Concur + SAP together?”
  • “What adapters and joins exist across my systems?”

Agents are left to guess based on tool descriptions and whatever prompt engineering you can cram into context. That is slow, token-hungry, and fragile in production.

2.4 The multi-turn token explosion

Here's what actually happens when an agent tackles a multi-step goal like “Sync last 30 days of invoices from SAP to Salesforce”:

  • Turn 1: LLM decomposes goal (~19k tokens in context)
  • Turn 2: Selects SAP tool from 40 options (~38k tokens)
  • Turn 3: Processes 200 SAP documents (~56k tokens)
  • Turn 4: Plans Salesforce sync (~89k tokens)
  • Turn 5: Picks wrong tool, retries (~127k tokens)
  • Turn 6: Maps schemas manually (~145k tokens)
  • Turn 7: Finally executes (~163k tokens total)

That's 7+ LLM calls with a context that balloons every turn, costing ~$1.50 and taking 3-5 minutes. Every turn carries the full weight of history plus 40 tool definitions the model must reason over again.
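The dollar figure checks out on a back-of-envelope basis. Each turn re-sends its full context, so total input tokens are the sum of the per-turn contexts above, roughly 637k; at an assumed rate of $2.50 per million input tokens (a hypothetical price, not a quote for any particular model) that lands near $1.50:

```typescript
// Rough cost check for the turn-by-turn context sizes listed above.
// The per-token price is an assumption for illustration only.
const contextPerTurn = [19_000, 38_000, 56_000, 89_000, 127_000, 145_000, 163_000];

// Each turn re-sends its full context, so total input tokens are the sum.
const totalTokens = contextPerTurn.reduce((sum, t) => sum + t, 0);

const assumedPricePerMillion = 2.5; // USD, hypothetical
const approxCost = (totalTokens / 1_000_000) * assumedPricePerMillion;

console.log(totalTokens);           // 637000
console.log(approxCost.toFixed(2)); // "1.59"
```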

The agent is doing work that shouldn't require an LLM: searching tool lists, validating schemas, managing workflow state. It's like using a neural network to sort an array—technically possible, but absurdly inefficient.

Our view: MCP is a solid wire format, but it's not a complete application stack. The solution isn't a fork—it's a layer: treat MCP as transport, and add identity, discovery, planning, and policy above it.

3. Our approach: treat MCP as transport, not the whole stack

We stay 100% MCP-compatible on the wire. We do not introduce new RPC methods or custom envelopes. Instead, we make MCP servers smarter and give clients a more expressive way to call them.

3.1 Identity-anchored MCP servers (OAuth 2.1 + optional mTLS)

Servers still speak normal MCP on the wire. What changes is that every call is anchored in identity and policy, not just “whoever can reach this port.”

At the transport layer, we follow the MCP spec:

  • HTTP servers act as OAuth 2.1 resource servers
  • MCP clients obtain and present Bearer tokens for each server

On top of that, we add a real identity and policy layer:

  • Each token is mapped to a principal: user, tenant, workspace, and roles
  • Each connection is evaluated against policies and scopes derived from that principal
  • Tools carry explicit metadata: side-effects (“read”, “write”, “destructive”), PII categories, environment constraints, etc.
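One way to picture the principal-to-policy mapping is the sketch below. Every field name here is illustrative, not DataGrout's published schema:

```typescript
// Illustrative shapes for the identity and policy layer described above.
// All names are our sketch, not a documented DataGrout API.
type SideEffect = "read" | "write" | "destructive";
type Env = "dev" | "stage" | "prod";

interface ToolMetadata {
  sideEffect: SideEffect;
  piiCategories: string[]; // e.g. ["email", "payment"]
  environments: Env[];     // where this tool may run at all
}

interface Principal {
  userId: string;
  tenantId: string;
  workspaceId: string;
  roles: string[];
}

// The policy question every call reduces to:
// may this principal invoke a tool with this metadata in this environment?
function allow(p: Principal, tool: ToolMetadata, env: Env): boolean {
  if (!tool.environments.includes(env)) return false;
  if (tool.sideEffect === "destructive" && !p.roles.includes("admin")) return false;
  return true;
}

const dropTable: ToolMetadata = {
  sideEffect: "destructive",
  piiCategories: [],
  environments: ["dev", "stage"],
};
const analyst: Principal = { userId: "u1", tenantId: "t1", workspaceId: "w1", roles: ["analyst"] };

console.log(allow(analyst, dropTable, "dev"));  // false: not an admin
console.log(allow(analyst, dropTable, "prod")); // false: tool not allowed in prod
```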

For machine-to-machine connections (substrate agents, gateways, on-prem connectors), we go further: DataGrout operates its own Certificate Authority, issuing short-lived X.509 certificates that enable mutual TLS. The CA signing key lives in an HSM in production; the public CA cert is available at a well-known endpoint. Agents authenticate with their certificate, not with tokens—so every request is cryptographically tied to a verified identity without passing secrets over the wire.

To a vanilla MCP client, this is just an HTTPS MCP endpoint. Under the hood, every call is tied to who is calling, which tenant they belong to, and what they’re allowed to do at the tool level.

3.2 Intelligence gateway & stateful sessions

Instead of forcing the model to reason over the entire tool list every turn, we expose a small set of gateway tools. But crucially, these tools maintain server-side context across multiple agent turns.

For example, discovery.guide maintains a working memory of:

  • Goal decomposition tree
  • Tool candidates at each step
  • User approvals and preferences
  • Partial results from each stage

The agent's context stays small (5-10k tokens) while the backend handles discovery, planning, and workflow orchestration. This is distributed cognition: the LLM focuses on intent and decision-making, while specialized systems handle search, validation, and execution.

Our gateway tools include:

  • discovery.guide — stateful workflow orchestration
  • discovery.discover — semantic tool search
  • discovery.perform — execute validated plans
  • flow.into — multi-step workflow execution
  • prism.refract — dynamic data transformation

These tools understand Semio types (a semantic type system that makes cross-system data transformations deterministic), adapters across systems, PII rules, side-effects, policies, and can access Private Connectors for on-prem data centers. The agent sees a handful of high-level tools; the gateway handles the complexity.
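On the wire this is still a single tools/call; the statefulness lives server-side. A hypothetical request shape is sketched below: the argument names, including `session_id`, are our assumption for illustration, not documented API:

```typescript
// Hypothetical shape of a stateful gateway call.
// Argument names ("goal", "session_id") are illustrative assumptions.
const guideCall = {
  name: "data-grout@1/discovery.guide@1",
  arguments: {
    goal: "sync last 30 days of invoices from SAP to Salesforce",
    // Returned by a previous guide call; lets the server rehydrate the
    // decomposition tree, tool candidates, and partial results.
    session_id: "sess_abc123",
  },
};

console.log(guideCall.name); // "data-grout@1/discovery.guide@1"
```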

3.3 Goals vs. queries

We distinguish two concepts that are both just arguments on a tool call:

  • Query: a structured search, e.g. “tools that output crm.lead” or “adapters between Salesforce and Concur”
  • Goal: a natural language destination that may require multiple steps, e.g. “refresh CRM leads and email invoices from the last 30 days”

Our gateway tools accept both. A query is “find things”. A goal is “get this done”.

3.4 Workflows become reusable skills

After an agent successfully completes a multi-step workflow, DataGrout can compile it into a reusable MCP tool. The system extracts the parameterized inputs (the "holes"), validates the plan, and generates a new tool that any agent can call.

Example: An agent works through “sync invoices from SAP to Salesforce” via several discovery turns. DataGrout notices the pattern and offers:

{
  "tool": "sync_invoices_to_crm",
  "description": "Syncs invoices between ERP and CRM systems",
  "inputSchema": {
    "start_date": "Date",
    "end_date": "Date", 
    "source_system": "string",
    "target_crm": "string"
  }
}

Future agents see this as a native tool. Under the hood, it's a validated, multi-step workflow with proven safety properties. This is JIT compilation for agent tasks: exploratory work becomes optimized primitives.
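Invoking the compiled skill is then an ordinary MCP tool call with the extracted parameters filled in. The values below are made-up examples matching the inputSchema above:

```typescript
// Calling the compiled workflow like any other MCP tool.
// Parameter values are illustrative, not real data.
const skillCall = {
  name: "sync_invoices_to_crm",
  arguments: {
    start_date: "2026-01-01",
    end_date: "2026-01-31",
    source_system: "sap",
    target_crm: "salesforce",
  },
};

console.log(Object.keys(skillCall.arguments)); // the four schema fields
```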

Compatibility note: we do not require new MCP methods. The difference is in how tools are used: instead of only calling tools/list and guessing, clients can call a planning tool with a goal or query and let the gateway do the heavy lifting.

4. Making old-school MCP clients better without changing them

You don’t have to adopt our SDK on day one. Existing MCP clients can keep working and still benefit from the intelligence layer.

4.1 Attach integrations to your server

In the DataGrout dashboard, create a server and add integrations (Salesforce, SAP S/4HANA, Concur, any MCP server, etc.). DataGrout core tools are automatically available on every server.

From the server’s point of view, the intelligence layer adds these gateway tools:

  • data-grout@1/discovery.discover@1 — semantic tool search
  • data-grout@1/discovery.plan@1 — generate a verified multi-step plan
  • data-grout@1/discovery.guide@1 — stateful interactive workflow builder
  • data-grout@1/discovery.perform@1 — execute any tool through the intelligence layer
  • data-grout@1/flow.into@1 — multi-step workflow execution
  • data-grout@1/prism.refract@1 — dynamic data transformation
  • Plus Warden, Scheduler, Math, Frame, Logic, and Inspect tool suites

4.2 Optional: hide the raw tool surface

For naive or generic MCP clients, enable a simple server setting:

  • “Use Intelligent Interface”: only discovery.discover and discovery.perform appear in tools/list

The client still calls tools/list, but now gets two focused tools instead of a firehose. The agent uses discover to find the right tool for a goal, then perform to execute it — without ever seeing the raw integration catalog.

4.3 Call our tools like any other MCP tool

If your client supports arguments on tool calls, it can immediately use goals and queries:

Example — goal-based discovery and execution:

{
  "name": "data-grout@1/discovery.discover@1",
  "arguments": {
    "goal": "sync invoices to CRM and email overdue ones",
    "limit": 5
  }
}

Example — execute a tool through the intelligence layer:

{
  "name": "data-grout@1/discovery.perform@1",
  "arguments": {
    "tool": "quickbooks@1/create_invoice@1",
    "args": { "customer_id": "123", "amount": 5000 }
  }
}

From the client’s perspective, these are normal tool calls using the existing MCP semantics. Under the hood, they trigger discovery, planning, policy checks, and execution across your entire integration fabric.

For very simple MCP clients that can’t pass rich arguments, you still win by exposing only a minimal set of gateway tools and adjusting the system prompt so the model learns: “call data-grout/execute-plan with the user request”.

5. Conduit SDK: a goal-first MCP client

For teams that want the full experience, we provide the Conduit SDK — available for Python, TypeScript, and Rust — that speaks plain MCP on the wire but exposes a goal-first API instead of raw tools/list calls.

  // TypeScript (npm install @datagrout/conduit)
  import { Client } from '@datagrout/conduit';

  const client = new Client({
    url: "https://gateway.datagrout.ai/servers/YOUR_SERVER_ID/mcp",
    auth: { bearer: "dg_your_token_here" },
  });
  await client.connect();

  // Find the right tool by goal
  const matches = await client.discover({
    goal: "sync invoices from SAP to Salesforce",
    limit: 5,
  });

  // Execute it through the intelligence layer
  const result = await client.perform(matches[0].tool.name, {
    start_date: "2026-01-01",
  });

The SDK:

  • Supports bearer token, OAuth 2.1, and mTLS authentication (with automatic bootstrap — no manual certificate management)
  • Calls discovery.discover instead of hammering tools/list
  • Exposes discover(), guide(), and perform() as first-class methods
  • Lets the model focus on "what should happen next?" instead of "which of these 40 tool names sounds right?"

Superset, not a fork: unplug the Conduit SDK and point any vanilla MCP client at the same server, and everything still works. You simply fall back to basic MCP semantics: flat tool lists and manual selection instead of goals and plans.

6. The result: orders of magnitude improvement

Same workflow, dramatically different execution:

Vanilla MCP: 7+ LLM calls, 163k tokens, $1.50, 3-5 minutes
DataGrout: 2 LLM calls, 18k tokens, $0.02, 3-5 seconds

MCP isn’t broken—it’s a solid wire format. But treating it as the entire stack leaves agents doing work they shouldn’t: searching, validating, orchestrating. DataGrout adds the infrastructure layer that lets LLMs focus on what they do best: understanding intent and making decisions.

The intelligence layer (semantic discovery, planning, policy, skills compilation) stays invisible to vanilla MCP clients, but transforms what’s possible for production agent systems.

Next: Want to know how the intelligence layer actually works? Read about the symbolic backbone that makes this possible, or explore the DataGrout Labs papers for the full formal treatment. Ready to build? Open the Library.

