Benchmarks & Evals

The most accurate web search for AI

We benchmark RunoxAI against Perplexity, Google Search, Bing Search, and Brave Search on four public QA benchmarks, and we track relevance, freshness, and LLM-readiness on every release.

Try for free →

QA Benchmark Comparisons

FRAMES

Factuality, Retrieval, And reasoning MEasurement Set: 824 complex multi-hop questions.

RunoxAI ✓: 82.4%
Perplexity: 71.2%
Google Search: 68.9%

SimpleQA

Short-form factual question answering from web sources.

RunoxAI ✓: 90.1%
Bing Search: 81.4%
Perplexity: 83.7%

TriviaQA

Open-domain trivia question answering requiring precise factual retrieval.

RunoxAI ✓: 88.6%
Brave Search: 79.3%
Bing Search: 82.1%

HotpotQA

Multi-hop reasoning questions requiring information synthesis across documents.

RunoxAI ✓: 76.3%
Google Search: 64.5%
Perplexity: 69.8%

All benchmarks were run between January and March 2025. Methodology available on request.

Quality dimensions

Relevance: 94%

How semantically relevant the returned results are to the query intent, beyond simple keyword overlap.

Freshness: 89%

Percentage of results for time-sensitive queries whose content was published within the last 30 days.
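For readers who want to reproduce a freshness figure on their own result sets, the definition above can be sketched as follows. This is a minimal illustration, not RunoxAI's measurement code: the function name `freshness_rate`, the 30-day default, and the sample dates are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def freshness_rate(published_dates, now, window_days=30):
    """Share of results published within `window_days` before `now`."""
    if not published_dates:
        return 0.0
    cutoff = now - timedelta(days=window_days)
    fresh = sum(1 for d in published_dates if cutoff <= d <= now)
    return fresh / len(published_dates)


# Hypothetical publication dates for the results of one time-sensitive query.
now = datetime(2025, 3, 31, tzinfo=timezone.utc)
dates = [
    datetime(2025, 3, 20, tzinfo=timezone.utc),  # inside the 30-day window
    datetime(2025, 3, 2, tzinfo=timezone.utc),   # inside the 30-day window
    datetime(2025, 1, 15, tzinfo=timezone.utc),  # outside the window
    datetime(2024, 11, 1, tzinfo=timezone.utc),  # outside the window
]

print(f"{freshness_rate(dates, now):.0%}")  # 50%
```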

Content quality: 97%

Proportion of results that return full-text content suitable for LLM consumption rather than truncated snippets.

Latency (p50): 280 ms

Median response time in milliseconds across 10,000 production queries.
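A p50 figure like this one is the median of the observed latencies. The sketch below shows one way to collect and summarize timings for any search client; `time_call_ms` and `p50_ms` are illustrative helper names, and the sample values are made up for the example rather than taken from production data.

```python
import statistics
import time


def time_call_ms(fn, query):
    """Wall-clock duration of one call to `fn(query)`, in milliseconds."""
    start = time.perf_counter()
    fn(query)
    return (time.perf_counter() - start) * 1000.0


def p50_ms(latencies_ms):
    """Median (50th percentile) of a list of latencies."""
    return statistics.median(latencies_ms)


# Made-up sample: one slow outlier barely moves the median.
samples_ms = [210, 250, 280, 310, 900]
print(p50_ms(samples_ms))           # 280
print(statistics.mean(samples_ms))  # 390
```

The median is less sensitive to occasional slow requests than the mean, so it is worth reporting alongside a tail percentile when comparing providers.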

Uptime: 99.97%

30-day rolling API availability averaged across all endpoints.
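To put an availability percentage in concrete terms, it helps to convert it into a downtime budget for the window. The sketch below does that arithmetic and shows a simple mean across endpoints; the endpoint names and per-endpoint values are placeholders, and an unweighted mean is an assumption about how "averaged across all endpoints" is computed.

```python
import statistics

MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability_pct, window_days=30):
    """Minutes of unavailability implied by an availability percentage."""
    unavailable_fraction = 1 - availability_pct / 100
    return unavailable_fraction * window_days * MINUTES_PER_DAY


def mean_availability(per_endpoint_pct):
    """Unweighted mean availability across endpoints (an assumption)."""
    return statistics.mean(per_endpoint_pct.values())


print(f"{downtime_budget_minutes(99.97):.2f} minutes")  # 12.96 minutes

# Placeholder endpoints and values, for illustration only.
per_endpoint = {"endpoint_a": 99.99, "endpoint_b": 99.95}
print(f"{mean_availability(per_endpoint):.2f}%")  # 99.97%
```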

Run your own evals

Start with 1,000 free searches and benchmark RunoxAI against your existing search stack. No credit card required.
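A side-by-side eval can be as small as the sketch below. It scores any search backend by checking whether a gold answer appears in the returned text after light normalization. Everything here is an assumption for illustration: the scoring rule is a common simplification and not the methodology behind the numbers above, and `stub_backend_a` and `stub_backend_b` stand in for real API clients that you would supply.

```python
import re
import string


def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def contains_answer(result_text, gold_answers):
    """True if any non-empty normalized gold answer occurs in the result."""
    haystack = normalize(result_text)
    return any(normalize(g) and normalize(g) in haystack for g in gold_answers)


def accuracy(search_fn, dataset):
    """Fraction of questions whose result text contains a gold answer."""
    hits = sum(contains_answer(search_fn(q), golds) for q, golds in dataset)
    return hits / len(dataset)


# Tiny illustrative dataset: (question, acceptable answers).
dataset = [
    ("Who wrote Dune?", ["Frank Herbert"]),
    ("What is the capital of Australia?", ["Canberra"]),
]


def stub_backend_a(query):
    """Stand-in for one search client; returns canned text."""
    return {
        "Who wrote Dune?": "Dune is a 1965 novel by Frank Herbert.",
        "What is the capital of Australia?": "Canberra is the capital of Australia.",
    }.get(query, "")


def stub_backend_b(query):
    """Stand-in for a second search client; misses one answer."""
    return {
        "Who wrote Dune?": "Dune was written by Frank Herbert.",
        "What is the capital of Australia?": "Sydney is the largest city in Australia.",
    }.get(query, "")


for name, fn in [("backend_a", stub_backend_a), ("backend_b", stub_backend_b)]:
    print(f"{name}: {accuracy(fn, dataset):.1%}")
```

Replacing the stubs with functions that call each provider and return the concatenated result text is enough to run the same questions through both stacks.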