Three approaches to handling large API payloads in agent workflows. The most expensive costs roughly 1,000x more than the cheapest. The formula is visible.
Raw Context: stuff the entire API response into the agent's context window.
    // Agent receives the full payload
    {
      "results": [
        {"Id": "00Q000000001", "Company": "Acme Corp", ...},
        {"Id": "00Q000000002", "Company": "Globex Inc", ...},
        ... 9,998 more records ...
        // ~7.5 MB of JSON → ~1.9M tokens
      ]
    }
Roll Your Own: the agent writes filter code across multiple turns, or you build custom hooks and scripts.
    # Turn 1: Agent inspects a sample
    # Turn 2: Writes Python filter
    leads = [
        l for l in data
        if l["State"] == "California" and parse(l["CreatedDate"]) > cutoff
    ]
    leads.sort(key=lambda x: x["LeadScore"], reverse=True)
    result = leads[:5]
    # Turn 3: Verify and format output
DataGrout: deterministic tools extract exactly what the agent needs. Zero LLM credits.
    // Single MCP tool call chain
    {
      "name": "data-grout@1/frame.filter@1",
      "arguments": {
        "cache_ref": "rc_leads_abc123",
        "where": {"State": "California"}
      }
    }
    // → frame.sort → frame.slice → frame.select
    // Result: 5 records, 6 fields = ~340 tokens
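The chain above (filter, then sort, slice, select) can be pictured locally with a few plain functions. This is only a sketch of the idea, not DataGrout's implementation: the helper names `frame_filter`, `frame_sort`, `frame_slice`, and `frame_select` are hypothetical, the sample records are invented, and the sort key `LeadScore` is borrowed from the roll-your-own snippet rather than from the tool call.

```python
def frame_filter(records, where):
    """Keep records whose fields equal every key/value pair in `where`."""
    return [r for r in records if all(r.get(k) == v for k, v in where.items())]


def frame_sort(records, by, descending=False):
    """Return a new list ordered by the field `by`."""
    return sorted(records, key=lambda r: r[by], reverse=descending)


def frame_slice(records, limit):
    """Keep only the first `limit` records."""
    return records[:limit]


def frame_select(records, fields):
    """Keep only the named fields of each record."""
    return [{f: r[f] for f in fields} for r in records]


# Invented sample data; a real payload would hold thousands of records.
leads = [
    {"Id": "00Q000000001", "Company": "Acme Corp", "State": "California", "LeadScore": 72},
    {"Id": "00Q000000002", "Company": "Globex Inc", "State": "Texas", "LeadScore": 91},
    {"Id": "00Q000000003", "Company": "Initech", "State": "California", "LeadScore": 88},
]

filtered = frame_filter(leads, {"State": "California"})
ranked = frame_sort(filtered, "LeadScore", descending=True)
top = frame_select(frame_slice(ranked, 5), ["Id", "Company", "LeadScore"])
print(top)
```

Because every step is an ordinary deterministic function, the same input always yields the same few records, and only that small result would need to reach the model.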
Measured on generated datasets using DataGrout's deterministic tool suite. Model: Claude Opus 4 at $15/1M input tokens.
| Approach | Input Tokens | Cost | Turns | Accurate | Token Savings |
|---|---|---|---|---|---|
| Raw Context | 1,880,425 | $28.22 | 1 | No | baseline |
| Roll Your Own | 3,320 | $0.1548 | 3 | Yes | 99.8% |
| DataGrout | 348 | $0.0265 | 1 | Yes | 100.0% |
| Approach | Input Tokens | Cost | Turns | Accurate | Token Savings |
|---|---|---|---|---|---|
| Raw Context | 519,475 | $7.83 | 1 | No | baseline |
| Roll Your Own | 4,300 | $0.2370 | 3 | Yes | 99.2% |
| DataGrout | 401 | $0.0585 | 1 | Yes | 99.9% |
| Approach | Input Tokens | Cost | Turns | Accurate | Token Savings |
|---|---|---|---|---|---|
| Raw Context | 350,150 | $5.30 | 1 | No | baseline |
| Roll Your Own | 6,150 | $0.3248 | 4 | Yes | 98.2% |
| DataGrout | 750 | $0.0738 | 1 | Yes | 99.8% |
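The savings percentages in the tables above can be reproduced from the Input Tokens column alone, and the stated $15/1M rate gives the input-only share of each cost. The sketch below uses the first table's token counts; the function names are illustrative, and the assumption that savings are measured in input tokens relative to the Raw Context baseline is inferred from the numbers rather than stated on the page.

```python
PRICE_PER_INPUT_TOKEN = 15 / 1_000_000  # USD, from "$15/1M input tokens"


def input_cost(tokens):
    """Input-only cost in USD at the stated rate."""
    return tokens * PRICE_PER_INPUT_TOKEN


def token_savings_pct(tokens, baseline_tokens):
    """Percentage of input tokens avoided relative to the baseline."""
    return 100 * (1 - tokens / baseline_tokens)


# Token counts copied from the first table.
first_table = {"Raw Context": 1_880_425, "Roll Your Own": 3_320, "DataGrout": 348}
baseline = first_table["Raw Context"]

for approach, tokens in first_table.items():
    cost = input_cost(tokens)
    saved = token_savings_pct(tokens, baseline)
    print(f"{approach:14s} input cost ${cost:.4f}  token savings {saved:.1f}%")
```

Run on these numbers, the savings come out at 99.8% and 100.0% (99.98% before rounding), matching the table. The input-only costs land slightly below the Cost column, which suggests that column also includes charges beyond input tokens; the page does not itemize them.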
Deterministic first. LLM second. This isn't new: a recent analysis showed that awk scripts alone can collapse 108,894 bytes of terminal noise to 37 bytes. But terminal noise is the easy case.
The hard case is structured data — API responses, database results, integration payloads. You can't awk your way through 10,000 JSON records. That's where DataGrout's tool suites come in.
DataGrout's deterministic suites handle the heavy lifting before any LLM token is spent.
Filter, sort, group, pivot, slice, join: columnar ops on record sets. Accepts cache_ref so large payloads stay server-side.
0 credits · deterministic

Pick, omit, flatten, aggregate, unique: pure JSON transformations. Shrink payloads to just the fields you need.
0 credits · deterministic

Natural-language data reshaping. "Group by customer, sum totals" generates and caches reusable Starlark transforms.
LLM-backed · skill caching

Turn raw numbers into a sparkline + text summary. The agent gets a 50-token visual instead of 500k raw values.
2 credits · visual compression

Statistical summary: mean, median, percentiles, histogram. Deterministic analysis of numeric columns.
0 credits · deterministic

Keep large results server-side entirely. Agent pages through results without ever loading the full set into context.
0 credits · server-side

5,000 free credits/month. Deterministic tools (frame.*, data.*, math.*) cost 0 credits; only the gateway base fee applies. LLM-backed tools charge 1 credit + passthrough model cost at 1.2× margin.
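As a rough picture of what the statistical summary described above hands back to an agent, the sketch below reduces a numeric column to a handful of values using only Python's standard library. It is not DataGrout's implementation: the function name `summarize`, the choice of the 10th and 90th percentiles, and the five equal-width histogram bins are all assumptions made for illustration, and it expects at least two values.

```python
import statistics


def summarize(values, bins=5):
    """Reduce a numeric column to a small, fixed-size summary."""
    ordered = sorted(values)
    lo, hi = ordered[0], ordered[-1]
    width = (hi - lo) / bins or 1  # avoid a zero width when all values are equal
    histogram = [0] * bins
    for v in ordered:
        # Clamp so the maximum value falls in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        histogram[idx] += 1
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "count": len(ordered),
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p10": cuts[9],
        "p90": cuts[89],
        "min": lo,
        "max": hi,
        "histogram": histogram,
    }


# Invented sample column: the integers 1 through 100.
scores = list(range(1, 101))
summary = summarize(scores)
print(summary)
```

The summary stays the same size whether the column holds a hundred values or half a million, which is the point of doing this step before any tokens are spent.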