Value Accuracy

Validating · Emerged 2026-04-28 · 49 days old · Last reviewed 2026-04-30

Value Accuracy measures the fraction of JSON leaf values that exactly match ground truth — distinct from JSON pass rate, which only checks schema structure. The gap is large: schema compliance routinely exceeds 84%, while value accuracy peaks at 83% even on frontier models.

Interfaze and JigsawStack formalized the metric in the SOB paper on April 28, 2026, benchmarking 21 models across 5,000+ text, image, and audio records. GLM-4.7 leads text at 83.0%; audio accuracy collapses to 23.7% for every model tested — exposing that structured outputs silently hallucinate at production scale.

💡

On the SOB leaderboard, every model in the 21-model test achieves >84% JSON Pass Rate. Yet Value Accuracy (exact leaf-value match) peaks at 83.0% for text — and model size is not a predictor: open-weight Qwen3.5-35B outscores GPT-5.4 on this metric, exposing that the failure mode is extraction quality, not model scale.

Think of it as the spell-checker that passes "their" when you meant "there" — valid, but wrong.

Search Interest

peak ~5.2K/mo

updated 2026-06-14

~5.2K/mo ~2.6K/mo 0

2026-05-16 2026-05-31 2026-06-14

Term Lifecycle

Nascent

0–7 days
Emergent

8–30 days
Validating ← now

31–90 days
Rising

91–180 days
Established

180 days +

Why is it emerging now?

TL;DR

LLMs now produce valid JSON at >84% schema compliance — so the bottleneck has shifted to whether values are correct, not whether the schema parses. The SOB paper (April 28, 2026) named this gap 'Value Accuracy' and quantified it across 21 frontier models, surfacing that structured outputs silently hallucinate at rates most teams don't measure.

4 forces driving coverage — scroll →

arXiv

The Structured Output Benchmark: A Multi-Source Benchmark for Evaluating Structured Output Quality in LLMs

Best Value Accuracy reaches only 83.0% on text, 67.2% on images, 23.7% on audio — despite near-perfect schema compliance.

Apr 28, 2026

Interfaze

Introducing SOB: A Multi-Source Structured Output Benchmark for LLMs

15–30-point gap between JSON Pass Rate and Value Accuracy on every model; 'structured hallucinations invisible without field-level checks.'

Apr 29, 2026

Y Hacker News

Show HN: A new benchmark for testing LLMs for deterministic outputs

"The JSON-pass vs Value-Accuracy gap section was an eye opener — models great at schema, bad at accurate values."

Apr 29, 2026 54 points · 21 comments

JigsawStack/sob

A multi-source benchmark for evaluating structured-output quality in LLMs

Dataset, evaluation pipeline, and leaderboard code released alongside the paper.

14 stars

Outlook

6-month signal projection and commercial timeline.

Signal medium

Revenue moderate

The metric fills a real gap in production LLM evaluation, but adoption hinges on whether SOB becomes a de facto leaderboard.

Risk · Existing hallucination benchmarks may absorb this framing without adopting the specific term.

Analogs · perplexity · BLEU score · hallucination rate

Monetization timeline

now

Evaluation tooling gap

Developers ship structured-output pipelines without measuring value accuracy; audit tools are absent.
3-6mo

Leaderboard & monitoring

SOB leaderboard attracts SaaS monitoring tools that track value accuracy per endpoint.
6-12mo

Platform integration

Cloud inference APIs add Value Accuracy as a billable quality metric alongside token counts.

Competition & Opportunity for term “Value Accuracy”

Three heuristic signals derived from the tracked queries, the term's monetization cards, and its cluster neighbors. Directional, not audited.

Content Gap

10 queries tracked

Led by General (9), Explainer (1)

10 Suggest-only tails — long-tail opening

Revenue Potential

0% commercial-intent queries

2 monetization angles mapped

Mostly informational — pre-commercial

Build Difficulty

Medium

Stage: validating — incumbents warming up

1 / 10 default TLDs taken · oldest incumbent valueaccuracy.com (2023-05-11)

No cluster neighbors published yet

Heuristic · signals: tracked queries, term monetization cards, cluster neighbors

Ideas for term “Value Accuracy”

Buildable pitches — turn this term into an article, site, product, post, newsletter, video, or course. Steal any card and run with it.

Article

Value Accuracy vs JSON Pass Rate: The Hidden Gap in LLM Structured Output Evaluation

Evergreen explainer targeting devs searching for 'structured output evaluation LLM' — a query cluster with no dominant page yet. Monetize via affiliate links to eval frameworks.

Article

How to Measure Value Accuracy in Your LLM Pipeline (Without the SOB Paper)

Tutorial-style guide targeting builders implementing custom eval loops. Long-tail traffic from 'json extraction accuracy llm' searches.

Article

Best LLMs for Structured Output by Value Accuracy: 2026 Benchmark Comparison

Comparison piece citing SOB leaderboard data. Targets 'best llm structured output' and 'llm json accuracy comparison' queries.

Product

Value Accuracy Monitor: Field-Level LLM Output Auditing for Production Pipelines

SaaS tool that computes value accuracy on live inference logs against schema-defined ground truth, alerting on model drift before downstream corruption spreads.

Product

A drop-in pytest fixture that asserts value accuracy thresholds on LLM extraction tests

OSS tool targeting CI/CD workflows; lightweight monetization via cloud storage for baseline snapshots.

Newsletter

Eval Matters: Weekly structured-output accuracy benchmarks for the top 10 frontier models

Recurring benchmarking digest targeting ML engineers who run extraction pipelines. Natural upgrade path to paid leaderboard access.

Post HN / r/MachineLearning

Your LLM Passes JSON Validation. Its Values Are Still Wrong. Here's the Gap Nobody Measures.

In a 21-model benchmark, every frontier LLM achieved >84% schema compliance — yet the best value accuracy was 83.0%, and audio dropped to 23.7%.

Post LinkedIn / Newsletter

Model Size Does Not Predict Value Accuracy. Here's What Does.

Qwen3.5-35B outscored GPT-5.4 on structured-output value accuracy — a smaller open-weight model, no API fees.

Post YouTube / Tech media

I Ran the Same JSON Extraction Prompt on 10 LLMs. The Value Accuracy Results Are Embarrassing.

Valid JSON doesn't mean correct JSON — here's what I found when I checked the actual leaf values, not just the schema.

What People Search

Long-tail queries from Google Suggest + Trends. Volume and competition are heuristics — directional, not audited. Content Type comes from query shape.

Keyword

Competition

Content Type

value accuracy

Very Low

General

value accuracy professional consulting

Very Low

General

accuracy value formula

Low

General

accuracy value calculator

Low

General

accuracy value meaning

Low

Explainer

zillow value accuracy

Very Low

General

redfin value accuracy

Very Low

General

p value for accuracy

Low

General

1–8 of 10

1 / 2

Updated 2026-06-14 · sources: Google Trends, Google Suggest · Competition is heuristic

SERP of term “Value Accuracy”

What searchers see today — organic results on top, paid ads if anyone's bidding. Ad density is a real-time commercial signal.

FAQ

What is Value Accuracy?

Value Accuracy measures the fraction of JSON leaf values that exactly match ground truth — distinct from JSON pass rate, which only checks schema structure.

Why is Value Accuracy emerging now?

When did Value Accuracy emerge?

Publicly emerged around 2026-04-28 (about 49 days ago as of 2026-06-16). EarlyTerms first recorded a pipeline signal on 2026-04-30.

Related Terms

Other terms in the same space — aliases, subtypes, competitors, and neighbors to explore next.

Also mentioned

Also known as field-level accuracy
Part of structured output·LLM evaluation
Related JSON schema compliance·hallucination rate·SOB benchmark·schema compliance·exact match·structured hallucination·JSON pass rate

Sources

Primary URLs this report cites — open any to verify the claim yourself.

Domain Availability

valueaccuracy.com
valueaccuracy.ai
valueaccuracy.net
valueaccuracy.io
valueaccuracy.co
valueaccuracy.app
valueaccuracy.pro
valueaccuracy.top
valueaccuracy.org
valueaccuracy.info
valueaccuracy.xyz
valueaccuracy.run
valueaccuracy.me
value-accuracy.com
value-accuracy.ai
value-accuracy.net
value-accuracy.io
value-accuracy.co
value-accuracy.app
value-accuracy.pro
value-accuracy.top
value-accuracy.org
value-accuracy.info
value-accuracy.xyz
value-accuracy.run
value-accuracy.me

Checked via RDAP — live from your browser.

EarlyTerms Weekly

5–8 new terms every Tuesday. Research, story angles, buildable ideas — straight to your inbox.

Join the waitlist for issue #1. No spam.

Search Interest

Why is it emerging now?

Outlook

Competition & Opportunity for term “Value Accuracy”

Ideas for term “Value Accuracy”

What People Search

SERP of term “Value Accuracy”

FAQ

Related Terms

Sources

Full access is a paid feature