Models: 9
Dimensions: 26
Trials: 56,640
Pre-registered: osf.io/et4nf

Web Benchmark Analysis

A comprehensive analysis of 213 consumer product pages across 7 categories, measuring their Machine Likeability scores.

Executive Summary

  • 213 pages analyzed
  • 49.6 average score
  • 40 minimum score
  • 81.3 maximum score

This benchmark represents the most comprehensive analysis of Machine Likeability across real-world product pages to date. We analyzed 213 pages from leading brands across 7 major product categories, measuring their optimization across all 26 AI preference dimensions.

The results reveal a massive optimization gap in the market. The average web page scores just 49.6 out of 100 for Machine Likeability, with scores ranging from 40 to 81.3. This wide variance indicates that ML optimization is not yet standard practice, creating significant competitive advantages for early adopters.

Key Insight

The average web page scores just 49.6 out of 100 for Machine Likeability, meaning most sites are leaving significant AI recommendation potential untapped. Even leading brands such as Google (40.5), Linear (40.0), and Paula's Choice (40.0) fail to implement basic optimization signals.

Category Performance Analysis

Telecom (55.4 avg, Top Performer)

Telecom companies lead with an average score of 55.4, driven by strong bundle offerings and clear pricing structures. T-Mobile's home internet page (81.3) sets the benchmark with exceptional third-party authority signals, detailed plan comparisons, and regulatory transparency.

Electronics (51.0)

Electronics retailers average 51.0, with gaming brands like Razer (72.1) excelling through detailed product specifications and bundle options. The category succeeds with specification-heavy content.

Software (49.4)

Software companies average 49.4. Modern SaaS design prioritizes minimalism over comprehensive information display, leaving AI systems with insufficient data for recommendations.

Apparel (44.5, Bottom)

Apparel is the lowest-performing category at 44.5. DTC brands struggle to communicate value in AI-readable formats: the trust signals that work on humans don't translate into machine-readable confidence indicators.

Signal Presence Analysis

Measuring how often each of the 26 AI preference dimensions appears across the 213 analyzed pages.
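As an illustrative sketch (not the study's actual pipeline), presence rates like these can be computed from per-page boolean signal flags. The dimension names and page data below are hypothetical:

```python
# Hypothetical sketch of the presence-rate calculation: each page gets a
# boolean per dimension; the rate is the share of pages where the signal
# is present. Dimension names and page data are illustrative only.
pages = [
    {"novelty_seeking": True,  "social_proof": False, "local_preference": False},
    {"novelty_seeking": True,  "social_proof": True,  "local_preference": False},
    {"novelty_seeking": False, "social_proof": False, "local_preference": False},
    {"novelty_seeking": True,  "social_proof": False, "local_preference": True},
]

def presence_rates(pages):
    """Return, per dimension, the percentage of pages where the signal appears."""
    dims = pages[0].keys()
    return {d: round(100 * sum(p[d] for p in pages) / len(pages)) for d in dims}

print(presence_rates(pages))
# {'novelty_seeking': 75, 'social_proof': 25, 'local_preference': 25}
```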

Most Present Signals

Novelty Seeking (77%)

Most pages communicate what's new about their products. This is table stakes — brands understand the importance of positioning products as current and innovative.

Specificity Preference (69%)

Detailed specifications are common, especially in electronics and technical categories. Products with measurable attributes naturally include spec sheets.

Information Depth (66%)

Pages generally provide adequate detail. Most brands recognize that customers need information to make decisions, though depth varies by category.

Recommendation Revision (61%)

Many pages have content that could trigger recommendation changes — highlighting unique benefits or addressing specific use cases that differentiate products.

Bundle Preference (53%)

About half of pages mention bundles or packages. More common in telecom and electronics; rare in apparel and personal care.

Most Missing Signals

Ethical Concern (3%)

ESG and ethical sourcing signals are almost entirely absent. Even brands with strong ethical practices fail to communicate them in AI-readable formats.

Local Preference (6%)

Almost no pages mention local sourcing or production. This represents a massive missed opportunity for brands with local manufacturing stories.

Negative Review Weight (14%)

Very few pages address potential concerns proactively. Brands avoid mentioning limitations, missing the opportunity to build trust through transparency.

Social Proof (17%)

Despite being the #1 AI selection driver in research, only 17% of pages have visible social proof. Reviews exist but aren't prominently displayed or are hidden behind clicks.

Sustainability (17%)

Environmental messaging remains rare despite growing importance to both consumers and AI recommendation systems. Even eco-focused brands often fail to quantify impact.

The Signal Gap

54%

The average page is missing 54% of the signals AI models look for.

This represents a massive optimization opportunity. Early adopters can gain significant competitive advantages by addressing these gaps.

We measured the "target signal strength" for each dimension based on top-performing pages, then calculated how many pages reach that target. The results are sobering:

Dimension                Pages at Target   Percent   Average Gap
Bundle Preference        24 / 213          11%       47 points
Specificity Preference   18 / 213          8%        51 points
Third-Party Authority    12 / 213          6%        67 points
Sustainability           8 / 213           4%        83 points
Social Proof             0 / 213           0%        83 points
Local Preference         0 / 213           0%        94 points
Ethical Concern          0 / 213           0%        97 points
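The target-and-gap computation described above can be sketched as follows. The example scores and the target of 85 are invented for illustration, not taken from the benchmark:

```python
# Illustrative sketch of the signal-gap metric: count pages at or above the
# per-dimension target, and average the shortfall of the pages below it.
# The scores list and target value are assumptions, not study data.
def signal_gap(scores, target):
    """scores: per-page signal strengths (0-100) for one dimension."""
    n_at_target = sum(1 for s in scores if s >= target)
    shortfalls = [target - s for s in scores if s < target]
    avg_gap = round(sum(shortfalls) / len(shortfalls)) if shortfalls else 0
    return n_at_target, avg_gap

scores = [90, 40, 10, 0, 25]  # five hypothetical pages, one dimension
print(signal_gap(scores, target=85))  # (1, 66)
```

Dividing only among below-target pages keeps the gap interpretable as "how far short the lagging pages fall," rather than diluting it with pages that already hit the target.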

What This Means

Only 24 of 213 pages (11%) hit the target for Bundle Preference, the best-performing dimension. For Social Proof, zero pages hit the target, despite social proof being the #1 driver of AI recommendations in our research. Even pages with reviews don't display them prominently enough for maximum AI impact.

For Ethical Concern and Local Preference, virtually no pages even attempt these signals. This creates a blue ocean opportunity: brands that authentically communicate ethics and locality can dominate AI recommendations in those dimensions with minimal competition.

Model Consensus Analysis

The AI models we evaluated show remarkably similar scoring patterns, with per-model averages ranging from 44.6 (Perplexity) to 49.1 (Gemini). This consensus suggests the signals measured are truly universal across AI systems, not model-specific quirks.

Cross-Model Consistency

The 4.5-point spread between the most lenient (Gemini, 49.1) and strictest (Perplexity, 44.6) models is remarkably small given the diversity of architectures and training approaches. This consistency validates our research: these signals represent fundamental patterns in how AI systems evaluate product information, not artifacts of specific model implementations.

GPT-5.4, Claude, and Gemini cluster tightly around 48-49, suggesting similar training on commercial content evaluation. O3's reasoning capabilities don't significantly change its scoring (47.2), indicating that these signals work at the system level, not just for quick-response models.

Perplexity: The Strict Evaluator

Perplexity is the strictest evaluator at 44.6 average, likely due to its search-oriented training emphasizing factual grounding and source attribution. Pages that score well with Perplexity tend to have exceptional third-party authority signals and detailed specificity.

The Perplexity-Gemini spread of 4.5 points means your score should be relatively consistent regardless of which AI recommends you. If you optimize for the strictest model (Perplexity), you'll perform well across all platforms. Conversely, a page scoring 40 with Gemini will still score poorly (~36) with Perplexity — there's no gaming the system.

Strategic Implication

Because all models show similar patterns, you don't need separate optimization strategies for different AI platforms. Focus on the fundamental signals — social proof, specificity, third-party authority, sustainability — and your improvements will translate across all AI recommendation systems. This makes ML optimization more approachable: one comprehensive improvement benefits all channels.

Top 10 Performers: What Makes Them Succeed

Detailed analysis of the highest-scoring pages, examining specific signals and extracting lessons for other brands.

#1

t-mobile.com

Telecom

https://www.t-mobile.com/home-internet

81.3
ML Score

Why it succeeds:

  • Strong bundle preference signals with clear package comparisons
  • Third-party authority with visible awards and certifications
  • Clear pricing anchors and transparency
  • Excellent specificity in service details

Key Lesson:

Telecom's regulatory requirements for transparency actually help AI readability. The combination of bundle options, third-party validation, and specific service details creates a gold standard for ML likeability.

#2

razer.com

Electronics

https://www.razer.com/gaming-laptops

72.1
ML Score

Why it succeeds:

  • Exceptional product specification detail
  • Clear bundle and configuration options
  • Strong comparison framing across models
  • Technical authority signals throughout

Key Lesson:

Gaming brands excel because they naturally speak in specifications. The detailed technical data that gamers demand is exactly what AI models need to make confident recommendations.

#3

soylent.com

Food & Beverage

https://soylent.com/products/soylent-drink

70.2
ML Score

Why it succeeds:

  • Strong sustainability messaging
  • Detailed nutritional specificity
  • Clear use case and novelty positioning
  • Social proof through community engagement

Key Lesson:

Soylent succeeds by addressing multiple dimensions: sustainability appeals to value-based signals, nutrition specs hit information depth, and community presence provides social proof.

#4

github.com

Software

https://github.com/features/copilot

68.8
ML Score

Why it succeeds:

  • Strong recommendation revision potential
  • Clear use case specificity for developers
  • Third-party authority through GitHub brand
  • Detailed feature comparisons

Key Lesson:

GitHub Copilot leverages its platform authority and developer-focused specificity. Technical products benefit from detailed feature explanations.

#5

ikea.com

Home Goods

https://www.ikea.com/us/en/p/poang-armchair

68.5
ML Score

Why it succeeds:

  • Excellent product specificity (dimensions, materials)
  • Strong sustainability signals
  • Clear assembly and warranty information
  • Price anchoring with family product comparisons

Key Lesson:

IKEA's focus on practical details (measurements, materials, assembly) combined with sustainability messaging creates a comprehensive AI-optimized experience.

#6

verizon.com

Telecom

https://www.verizon.com/5g/home-internet

67.9
ML Score

Why it succeeds:

  • Clear bundle and plan comparisons
  • Strong pricing transparency
  • Third-party authority signals
  • Specific coverage and speed details

Key Lesson:

Another telecom winner. The pattern is clear: regulatory transparency + bundle options + specific technical details = high ML scores.

#7

wayfair.com

Home Goods

https://www.wayfair.com/furniture/pdp/wade-logan-sectional

66.3
ML Score

Why it succeeds:

  • Extensive product specifications
  • Strong social proof (reviews, ratings)
  • Clear comparison with similar items
  • Detailed return and warranty information

Key Lesson:

Wayfair shows how massive review volumes and detailed specs can overcome commodity product challenges. Social proof at scale works.

#8

apple.com

Electronics

https://www.apple.com/macbook-pro

65.7
ML Score

Why it succeeds:

  • Exceptional product specificity
  • Strong brand authority
  • Clear configuration options
  • Environmental sustainability messaging

Key Lesson:

Apple's ML score comes from technical precision and environmental messaging, not social proof. Brand authority can partially compensate for missing review signals.

#9

att.com

Telecom

https://www.att.com/internet/fiber

64.8
ML Score

Why it succeeds:

  • Clear plan and bundle comparisons
  • Pricing transparency
  • Specific speed and service details
  • Installation and setup information

Key Lesson:

The third telecom in the top 10 confirms the pattern. Category leaders emerge when industry norms align with AI preferences.

#10

dell.com

Electronics

https://www.dell.com/en-us/shop/dell-laptops/xps-15

63.2
ML Score

Why it succeeds:

  • Detailed technical specifications
  • Clear configuration and customization
  • Comparison across models
  • Business-focused authority signals

Key Lesson:

Dell succeeds through comprehensive technical detail and business authority positioning. B2B signals can be as powerful as B2C social proof.

Bottom 10 Performers: Understanding The Gap

Common Pattern

The bottom 10 share a consistent pattern: zero social proof, zero third-party authority, and minimal specificity. Critically, these aren't bad products — Blue Bottle Coffee, Linear, and Paula's Choice are all market leaders with loyal customers and strong brands. They simply haven't optimized for AI visibility. This demonstrates that brand strength and product quality don't automatically translate to ML scores.

#213

bluebottlecoffee.com

Food & Beverage

https://bluebottlecoffee.com/coffee

40
ML Score

Missing signals:

  • Zero social proof (no reviews or ratings visible)
  • No third-party authority signals
  • Minimal product specificity beyond origin
  • No sustainability messaging despite premium positioning
  • Missing comparison framing

Pattern:

Premium brand relying entirely on aesthetic presentation and brand name, with no AI-readable signals.

#212

linear.app

Software

https://www.linear.app/pricing

40
ML Score

Missing signals:

  • No social proof or customer testimonials
  • Missing third-party authority
  • Limited feature comparison detail
  • No use case specificity
  • Minimal information depth

Pattern:

Modern SaaS design prioritizing minimalism over AI discoverability. Clean UI, invisible to AI.

#211

paulaschoice.com

Personal Care

https://www.paulaschoice.com/skin-perfecting-bha-liquid

40
ML Score

Missing signals:

  • Reviews exist but are not prominently displayed
  • No third-party authority (despite research backing)
  • Missing sustainability or ethical signals
  • Limited comparison framing
  • Weak specificity in ingredient benefits

Pattern:

Science-backed brand failing to communicate research authority. Data exists but is not AI-accessible.

#210

allmodern.com

Home Goods

https://www.allmodern.com/furniture/pdp/sectional

40.2
ML Score

Missing signals:

  • Generic product descriptions
  • Weak social proof
  • Minimal material specificity
  • No sustainability information
  • Limited comparison options

Pattern:

Budget furniture site with commodity products lacking differentiating details.

#209

store.google.com

Electronics

https://store.google.com/product/pixel_8

40.5
ML Score

Missing signals:

  • Surprisingly weak social proof for a major brand
  • Limited third-party validation
  • Weak comparison framing
  • Minimal environmental detail
  • Missing bundle optimization

Pattern:

Even Google fails basic ML optimization. Brand confidence has led to signal complacency.

#208

allbirds.com

Apparel

https://www.allbirds.com/products/mens-wool-runners

42
ML Score

Missing signals:

  • Weak social proof presentation
  • Sustainability mentioned but not quantified
  • Minimal product specificity
  • No comparison framing
  • Missing material detail depth

Pattern:

DTC darling with a strong sustainability story not optimized for AI parsing. Marketing speaks to humans, not machines.

#207

glossier.com

Personal Care

https://www.glossier.com/products/boy-brow

42.3
ML Score

Missing signals:

  • Minimal product specifications
  • No third-party validation
  • Missing ingredient transparency
  • Weak comparison options
  • Limited use case detail

Pattern:

Instagram-native brand optimized for visual appeal, not AI comprehension. Strong community, weak signals.

#206

everlane.com

Apparel

https://www.everlane.com/products/mens-organic-cotton-tee

43.1
ML Score

Missing signals:

  • Pricing transparency present, but other signals missing
  • Weak social proof display
  • Limited material specificity despite "transparent" positioning
  • No comparison framing
  • Sustainability claims not quantified

Pattern:

Radical transparency in pricing not extended to product details. Human-trust signals not machine-readable.

#205

warbyparker.com

Apparel

https://www.warbyparker.com/eyeglasses/men/percey

43.8
ML Score

Missing signals:

  • Try-at-home program not recognized as risk mitigation
  • Weak social proof presentation
  • Limited product specificity
  • No material or construction detail
  • Missing comparison framing

Pattern:

Innovative customer experience (home try-on) not translating into AI-readable signals.

#204

huel.com

Food & Beverage

https://www.huel.com/products/huel-black-edition

44.2
ML Score

Missing signals:

  • Nutritional detail present but not contextualized
  • Weak social proof presentation
  • Limited sustainability quantification
  • Missing comparison framing
  • No third-party validation

Pattern:

Data-rich product not presenting its data in an AI-optimized format. Numbers without narrative.

The Good News

None of these issues are structural — they're all fixable with content updates. Blue Bottle could add customer testimonials and origin story details. Linear could showcase customer logos and use cases. Paula's Choice could display clinical research and dermatologist endorsements. The brands already have this information; they just need to make it AI-accessible. A page scoring 40 today could reach 65+ with strategic content additions that don't require product changes or major redesigns.

Actionable Insights: What To Do Next

1. Add Social Proof

83% of pages lack visible reviews/ratings. If you have reviews, display them prominently above the fold. Include star ratings, review counts, and specific testimonials. If you don't have reviews yet, start with customer logos, case study quotes, or social media mentions. Even basic social proof beats none.

Quick win: Add "4.8 stars from 1,247 customers" to your hero section if you have it.
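One reliable way to make that social proof machine-readable is schema.org structured data. The sketch below builds an AggregateRating JSON-LD payload in Python; the schema.org types and properties are real, but the product name and rating figures are placeholders, not benchmark data:

```python
# Build a schema.org Product + AggregateRating JSON-LD payload.
# The types/properties follow schema.org; the concrete values are placeholders.
import json

rating_jsonld = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Product",   # placeholder product name
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.8",    # mirrors the visible "4.8 stars"
        "reviewCount": "1247",   # mirrors "from 1,247 customers"
    },
}

# Embed the serialized payload in the page head inside a
# <script type="application/ld+json"> tag.
print(json.dumps(rating_jsonld, indent=2))
```

Crawlers and AI systems that parse structured data can then read the rating directly, without scraping the visual star widget.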

2. Show Third-Party Authority

Only 33% mention awards, certifications, or press. If you've won awards, earned certifications, or been featured in media, display those signals. AI models weight external validation heavily. Industry certifications, "As seen in..." press logos, and award badges all count.

Quick win: Add a trust badge section with any press mentions, certifications, or awards.

3. Communicate Bundle Options

47% miss bundle/package signals. If your product can be combined with others, make that clear. "Frequently bought together," "Complete the set," or "Bundle & save" sections all trigger bundle preference signals. Even suggesting complementary products counts.

Quick win: Add "Buy with [complementary product] and save 15%" to product pages.

4. Address Sustainability

83% have no environmental messaging. If you have any sustainability practices — recycled materials, carbon offset shipping, energy-efficient production — mention them. Quantify when possible: "Made from 80% recycled materials" beats "eco-friendly."

Quick win: Add a sustainability section even if it's basic: "Carbon-neutral shipping on all orders."

5. Include Local Signals

94% miss local production/sourcing. If you manufacture locally, source locally, or have local roots, say so explicitly. "Made in Portland, Oregon" or "Locally sourced ingredients" are powerful differentiators with almost zero competition in this dimension.

Quick win: Add location information if applicable: "Handcrafted in Brooklyn since 2018."

Ready to Optimize Your Machine Likeability?

Use our ML Score tool to analyze your product pages and get specific, actionable recommendations for improving AI discoverability.

Score Your Page Now