The #1 Mistake Developers Make With LLM APIs (It's Still Using JSON)
If you're a developer, your world has been defined by JSON and YAML. JSON became the undisputed king of web APIs, the language of data transfer. YAML became the king of configuration, the human-readable language of DevOps, Kubernetes, and CI/CD.
For years, the format wars looked settled.
Then, Large Language Models (LLMs) changed the rules.
Suddenly, a new, critical metric appeared: token cost. Every time you send data to an LLM API, every brace {}, every quote "", and every repeated key in a big JSON array costs you money and, just as importantly, precious space in the model's context window.
Enter TOON (Token-Oriented Object Notation).
No, it's not a typo for TOML. TOON is a new, lightweight data format built for one specific, critical purpose: to be the most token-efficient way to send structured data to LLMs.
So, is TOON here to replace JSON and YAML? Or is it another tool for a very specific job? Let's find out.
🧐 What is JSON (JavaScript Object Notation)?
You already know it, but let's re-frame it for the AI era. JSON is a universal, language-independent format built on key/value pairs (objects) and ordered lists (arrays).
JSON Example
Here's a simple list of users in JSON:
JSON
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" },
    { "id": 3, "name": "Charlie", "role": "user" }
  ]
}
✅ The Good
- Universal: Every language and system can speak it.
- Robust: Its strict syntax is easy for machines to parse reliably.
- The Standard: It's the default choice for the vast majority of web APIs.
❌ The Bad (for LLMs)
- Token-Heavy: Look at the example above. The keys "id", "name", and "role" are repeated for every single object. The punctuation ({}, [], commas, and colons) adds up. In an LLM prompt, this is all "wasted" data that consumes tokens.
🎯 Best For:
Web APIs, data storage, and machine-to-machine communication outside of AI.
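To get a rough feel for the overhead described under "The Bad" above, the sketch below splits the example payload into the characters that carry values and everything else. It uses character counts as a stand-in for tokens, which is an assumption: real token counts depend on the model's tokenizer. The helper name `structural_overhead` is made up for this illustration.

```python
import json

users = {
    "users": [
        {"id": 1, "name": "Alice", "role": "admin"},
        {"id": 2, "name": "Bob", "role": "user"},
        {"id": 3, "name": "Charlie", "role": "user"},
    ]
}


def structural_overhead(payload: dict) -> tuple[int, int]:
    """Return (value_chars, total_chars) for a {"key": [flat objects]} payload.

    Characters are only a rough proxy for tokens (assumption).
    """
    # Compact JSON: no whitespace after separators.
    total = len(json.dumps(payload, separators=(",", ":")))
    # Characters that carry actual data: the values in every row.
    value_chars = sum(
        len(str(value))
        for rows in payload.values()
        for row in rows
        for value in row.values()
    )
    return value_chars, total


value_chars, total = structural_overhead(users)
print(f"{value_chars} of {total} characters are values")
print(f"{1 - value_chars / total:.0%} is keys and punctuation")
```

Even in compact form, most of the payload is structure rather than data, and the share grows with every row that repeats the same keys.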
📋 What is YAML (YAML Ain't Markup Language)?
YAML is a human-first format. It uses indentation and minimal punctuation to represent the same data structures as JSON.
YAML Example
Here's that same user list in YAML:
YAML
# A list of users
users:
  - id: 1
    name: Alice
    role: admin
  - id: 2
    name: Bob
    role: user
  - id: 3
    name: Charlie
    role: user
✅ The Good
- Human-Readable: It's clean and easy to read and write by hand.
- Comments: It supports comments, making it perfect for configuration.
❌ The Bad
- Fragile: The "YAML space" problem—where a single incorrect indentation breaks the file—is a developer meme for a reason.
- Complex Spec: The YAML spec is far larger than JSON's, so parsers are more complex and typically slower.
- Still Token-Heavy: It saves a few tokens by dropping braces and quotes, but it still repeats every key for every object, so it's not fundamentally more efficient for an LLM.
🎯 Best For:
Human-edited configuration files (e.g., Kubernetes, Docker Compose, GitHub Actions).
⚡ What is TOON (Token-Oriented Object Notation)?
This is the new one. TOON looks at the JSON example and asks: "Why are we repeating the keys?"
TOON is a hybrid format. It uses YAML-like indentation for nested objects but switches to a CSV-like tabular format for uniform arrays. This is its "power move."
TOON Example
Here is the exact same data in TOON:
Code snippet
users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,user
  3,Charlie,user
Let's break this down:
- users[3]{id,name,role}: This is the header. It declares the array (users), its explicit length ([3]), and the schema of its objects ({id,name,role}). This is done once.
- The following lines are just the values, separated by commas.
The result? We've eliminated all repeated keys and all the {} and "" punctuation from the array.
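The header-plus-rows layout is simple enough to sketch by hand. The function below is a minimal illustration, not the official TOON library: the name `encode_uniform_array` is hypothetical, it only handles flat objects that share the same keys, and it does no quoting or escaping, so values containing commas or newlines would break it. Rows are indented two spaces under the header, which is how I understand the spec's tabular form.

```python
def encode_uniform_array(name: str, rows: list[dict]) -> str:
    """Encode a list of flat, same-shaped dicts as a TOON-style table.

    Illustrative only: values are rendered with str(), so Python's True/None
    would not come out as true/null, and nothing is quoted or escaped.
    """
    if not rows:
        raise ValueError("this sketch needs at least one row to infer the schema")
    fields = list(rows[0].keys())
    for row in rows:
        if list(row.keys()) != fields:
            raise ValueError("all rows must share the same keys in the same order")
    # Header: array name, explicit length, and the field list, written once.
    header = f"{name}[{len(rows)}]{{{','.join(fields)}}}:"
    # Body: one comma-separated line of values per object.
    lines = ["  " + ",".join(str(row[field]) for field in fields) for row in rows]
    return "\n".join([header, *lines])


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
    {"id": 3, "name": "Charlie", "role": "user"},
]
print(encode_uniform_array("users", users))
```

For anything beyond a quick experiment, one of the existing TOON libraries is the safer choice, since they deal with quoting, nesting, and non-uniform data.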
✅ The Good
- Massively Token-Efficient: For its target use case—large, uniform arrays—TOON can reduce token counts by 30-60% compared to JSON. This directly saves you money and context window space.
- AI-Friendly Structure: The explicit length ([3]) and schema header ({...}) are "guardrails" that can help LLMs understand and validate the data more reliably.
- Human-Readable: It's arguably as readable as YAML, if not more so, for tabular data.
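The same guardrails are useful on the application side. As a rough sketch, the checker below compares a tabular block against what its header declares: the row count against the length, and each row's value count against the field list. The name `check_table` and the regular expression are assumptions made for this example; they only cover the simple header form shown above, with unquoted values.

```python
import re

# Matches headers of the form name[N]{a,b,c}: (simple case only).
HEADER = re.compile(r"^(?P<name>\w+)\[(?P<length>\d+)\]\{(?P<fields>[^}]*)\}:$")


def check_table(block: str) -> list[str]:
    """Return a list of problems found in a TOON-style tabular block."""
    lines = [line for line in block.splitlines() if line.strip()]
    if not lines:
        return ["empty block"]
    match = HEADER.match(lines[0].strip())
    if match is None:
        return ["header does not look like name[N]{a,b,c}:"]
    expected_rows = int(match["length"])
    fields = match["fields"].split(",")
    rows = lines[1:]
    problems = []
    # Guardrail 1: the declared length must match the number of rows.
    if len(rows) != expected_rows:
        problems.append(f"header declares {expected_rows} rows, found {len(rows)}")
    # Guardrail 2: every row must have one value per declared field.
    for number, row in enumerate(rows, start=1):
        cells = row.strip().split(",")
        if len(cells) != len(fields):
            problems.append(
                f"row {number} has {len(cells)} values, expected {len(fields)}"
            )
    return problems


truncated = "users[3]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"
print(check_table(truncated))
```

A check like this can catch data that was cut off before it reached the prompt, or a model response that dropped rows.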
❌ The Bad
- Niche: It is not a general-purpose replacement for JSON. It is a specialized tool.
- Worse for Nested Data: If your data is deeply nested and not uniform (i.e., every object is different), TOON's syntax can be less efficient than compact JSON.
- New Ecosystem: It's brand new. While Python, TypeScript, and Elixir libraries exist, it's not universally supported like JSON.
🎯 Best For:
Optimizing structured data for LLM prompts. Think RAG pipelines, AI agent function-calling, or any time you're "stuffing" database results into a prompt.
📊 Head-to-Head Comparison: TOON vs. JSON vs. YAML
🏆 The Verdict: Which One Should You Use?
This is the easiest verdict ever, because they don't really compete. They solve different problems at different layers of your stack.
- Use JSON for your APIs. It's the standard. Your frontend and backend services should still talk to each other in JSON.
- Use YAML for your configs. Your docker-compose.yml and values.yaml files aren't going anywhere.
- Use TOON as an optimization layer. When your application (that speaks JSON) needs to talk to an LLM, it should convert its data to TOON right before it builds the prompt.
The modern AI workflow looks like this:
- Your server fetches data from a database (gets JSON).
- Your server reads its instructions from a config file (reads YAML).
- Your server converts the JSON data into a TOON string.
- Your server puts that TOON string into a prompt and sends it to the LLM.
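The conversion and prompt-building steps above can be sketched as follows. This is an illustration under several assumptions: `to_prompt_block` and `build_prompt` are hypothetical names, the database response is a hard-coded JSON string, the YAML config step is left out to stay within the standard library, and the actual LLM call is omitted. The converter falls back to compact JSON whenever the data is not a uniform array of flat objects, since that is where TOON stops paying off.

```python
import json


def to_prompt_block(name: str, rows: list) -> str:
    """Pick a compact text form for `rows` before it goes into a prompt."""

    def is_simple(value) -> bool:
        # Only plain strings and numbers without commas or newlines,
        # because this sketch does no quoting or escaping.
        if isinstance(value, bool) or not isinstance(value, (str, int, float)):
            return False
        return "," not in str(value) and "\n" not in str(value)

    uniform = (
        bool(rows)
        and all(isinstance(row, dict) for row in rows)
        and all(list(row) == list(rows[0]) for row in rows)
        and all(is_simple(v) for row in rows for v in row.values())
    )
    if not uniform:
        # Fallback for nested or irregular data: compact JSON.
        return json.dumps({name: rows}, separators=(",", ":"))
    fields = list(rows[0])
    header = f"{name}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *body])


def build_prompt(question: str, data_block: str) -> str:
    return f"Answer using the data below.\n\n{data_block}\n\nQuestion: {question}"


# Pretend this JSON came back from the database layer.
db_response = (
    '[{"id": 1, "name": "Alice", "role": "admin"},'
    ' {"id": 2, "name": "Bob", "role": "user"}]'
)
rows = json.loads(db_response)

# Convert right before building the prompt; sending it is out of scope here.
prompt = build_prompt("Which users are admins?", to_prompt_block("users", rows))
print(prompt)
```

Keeping the conversion at the very edge means the rest of the application never has to know TOON exists.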
TOON isn't here to replace JSON or YAML. It's a new, specialized tool that lives alongside them to solve a problem that didn't exist five years ago: the high cost of tokens.
What do you think? Have you started using TOON in your AI pipelines? Let me know in the comments!