Study Methodology
Pre-registered APIS Price Sensitivity Study (Study 3)
Study Design
This confirmatory study investigates how AI agents respond to price premiums when recommending products. We designed 4 sub-studies to test specific hypotheses about price sensitivity, psychological pricing, and the mechanisms behind the "price cliff."
Sub-studies
Price Points Tested
| Sub-study | Price Multipliers |
|---|---|
| A: Price Sensitivity | 0.4x, 0.6x, 0.8x, 1.0x, 1.2x, 1.5x, 1.75x, 2.0x, 3.0x |
| B: Psychological Pricing | 1.0x only (5 format variations) |
| C: Cliff Mechanism | 1.0x, 1.5x, 1.75x, 2.0x, 2.5x |
| D: Reasoning Extraction | 1.5x, 1.75x, 2.0x, 3.0x |
AI Models Tested
We tested 4 leading frontier models from 3 major providers to ensure findings generalize across the AI ecosystem.
| Model | Provider |
|---|---|
| GPT-5.4 | OpenAI |
| Claude Sonnet 4.6 | Anthropic |
| Gemini 3.1 Pro | Google |
| Gemini 3.0 Flash | Google |
Judge scoring: All responses were scored by cross-family judges (for example, Claude responses were scored by GPT and Gemini models) to avoid self-preference bias. Judge agreement rate: 97%.
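The cross-family rule can be sketched as a simple lookup. This is a minimal illustration, not the study's actual script: the `MODEL_FAMILY` mapping and the `eligible_judges` helper are hypothetical names, and the "Google" family label for the Gemini models is an assumption added here.

```python
# Hypothetical mapping from tested model to provider family.
# The "Google" label is an assumption; the model names come from the study.
MODEL_FAMILY = {
    "GPT-5.4": "OpenAI",
    "Claude Sonnet 4.6": "Anthropic",
    "Gemini 3.1 Pro": "Google",
    "Gemini 3.0 Flash": "Google",
}


def eligible_judges(respondent: str) -> list[str]:
    """Return the models allowed to score `respondent`.

    A judge is eligible only if it belongs to a different family
    than the model that produced the response.
    """
    family = MODEL_FAMILY[respondent]
    return [model for model, fam in MODEL_FAMILY.items() if fam != family]


if __name__ == "__main__":
    for model in MODEL_FAMILY:
        print(model, "->", eligible_judges(model))
```

Under this rule a Gemini response is never scored by either Gemini model, which is stricter than excluding only the exact model that wrote the response.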
Products Tested
We selected 5 products across diverse categories to ensure findings aren't category-specific.
| ID | Product | Category | Anchor Price |
|---|---|---|---|
| P1 | CleanBright Ultra Laundry Detergent | Household | $14.99 |
| P2 | DermaCare Daily Facial Moisturizer | Personal Care | $28.00 |
| P3 | StrideMax Neutral Running Shoe | Footwear | $130.00 |
| P4 | NutriCore Whey Protein Powder | Supplements | $44.99 |
| P5 | HomeChef Pro 5.5Qt Air Fryer | Kitchen | $79.99 |
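To make the design concrete, the tested price for each product is its anchor price times the sub-study multiplier. The sketch below builds the Sub-study A price grid from the anchor prices above. Rounding to the nearest cent (half up) is an assumption, since the study does not state its rounding rule, and `price_grid` is a hypothetical helper name.

```python
from decimal import Decimal, ROUND_HALF_UP

# Anchor prices from the product table.
ANCHOR_PRICES = {
    "P1": Decimal("14.99"),
    "P2": Decimal("28.00"),
    "P3": Decimal("130.00"),
    "P4": Decimal("44.99"),
    "P5": Decimal("79.99"),
}

# Sub-study A multipliers from the price points table.
SUBSTUDY_A_MULTIPLIERS = [
    Decimal(m)
    for m in ("0.4", "0.6", "0.8", "1.0", "1.2", "1.5", "1.75", "2.0", "3.0")
]


def price_grid(anchors, multipliers):
    """Return {product_id: [price at each multiplier]}, rounded to cents.

    Rounding half up to the nearest cent is an assumption of this sketch.
    """
    cent = Decimal("0.01")
    return {
        pid: [(anchor * m).quantize(cent, rounding=ROUND_HALF_UP) for m in multipliers]
        for pid, anchor in anchors.items()
    }


grid = price_grid(ANCHOR_PRICES, SUBSTUDY_A_MULTIPLIERS)

if __name__ == "__main__":
    for pid, prices in grid.items():
        print(pid, [str(p) for p in prices])
```

Using `Decimal` rather than floats keeps prices such as 14.99 exact, so the rounded grid does not depend on binary floating-point error.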
Hypothesis Testing Results
We pre-registered 10 hypotheses covering price sensitivity, psychological pricing effects, and mechanisms behind the price cliff.
Piecewise model outperforms linear model
Evidence: The piecewise model fits better than the linear model in 90% of model-product cells with non-flat curves (9 of 10)
Breakpoint falls within pre-specified range
Evidence: Mean breakpoint at 1.94x, but only 33% within strict 1.25-2.0x range (most cluster at ~2.0x)
Reduced selection at very low prices
Evidence: Only 20% of cells show any floor effect; the models do not avoid suspiciously cheap options
Different models show different sensitivity
Evidence: Selection rates at 3x range from 20.8% to 59.4% (38.6pp spread, chi-square=197, p<0.0001)
.99 charm pricing has no effect
Evidence: 100% selection rate across all price formats (standard condition at 1.0x)
Price justification extends tolerance
Evidence: No significant lift from justification at 2.0x (-1.3% difference)
First-listed product gets selection advantage
Evidence: 5.5pp advantage for first position (chi-square=54.90, p<0.0001)
Price sensitivity varies by product category
Evidence: Commodities cliff at 1.2x, electronics at 1.5x
Statistical Methods
Breakpoint Detection
We used piecewise linear regression to identify the price multiplier where selection rate drops most sharply. Models were compared using AIC (Akaike Information Criterion) to confirm piecewise models outperform linear alternatives.
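The comparison can be illustrated with a small standard-library sketch. It assumes a continuous one-breakpoint ("hinge") model fitted by grid search over candidate breakpoints, and the Gaussian least-squares form of AIC; the study's actual parameterization may differ. The selection rates are invented for illustration and are not study data.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares_rss(design, y):
    """Fit y ~ design by ordinary least squares; return the residual sum of squares."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def aic(rss, n, n_params):
    """Gaussian least-squares AIC, up to an additive constant."""
    return n * math.log(rss / n) + 2 * n_params


def compare_models(x, y, candidates):
    """Compare a straight line against the best single-breakpoint hinge model."""
    n = len(x)
    rss_linear = least_squares_rss([[1.0, xi] for xi in x], y)
    best_k, best_rss = None, float("inf")
    for k in candidates:
        # Hinge term is zero below the breakpoint k and linear above it.
        rss = least_squares_rss([[1.0, xi, max(0.0, xi - k)] for xi in x], y)
        if rss < best_rss:
            best_k, best_rss = k, rss
    return {
        "breakpoint": best_k,
        "aic_linear": aic(rss_linear, n, 2),
        # Intercept, slope, slope change, and the breakpoint itself.
        "aic_piecewise": aic(best_rss, n, 4),
    }


# Sub-study A multipliers with made-up selection rates (illustrative only).
multipliers = [0.4, 0.6, 0.8, 1.0, 1.2, 1.5, 1.75, 2.0, 3.0]
selection_rate = [1.0, 0.99, 1.0, 0.98, 1.0, 0.99, 0.97, 0.85, 0.25]
# Candidate breakpoints strictly inside the tested range, in 0.05x steps.
candidates = [round(0.5 + 0.05 * i, 2) for i in range(49)]

result = compare_models(multipliers, selection_rate, candidates)

if __name__ == "__main__":
    print(result)
```

A lower AIC indicates the preferred model. Counting the breakpoint as a fourth parameter penalizes the piecewise model slightly more, so it is preferred only when the fit improves enough to offset the extra complexity.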
Model Heterogeneity Testing
A Kruskal-Wallis H-test was used to assess whether breakpoints differ across AI models. Result: no significant heterogeneity (p=0.72). The data give no evidence that breakpoints differ by model, which is consistent with the models sharing similar price thresholds.
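As a rough illustration of the test, the H statistic can be computed from pooled ranks using only the standard library. The breakpoint values below are hypothetical, not study data, and this sketch omits the tie correction that statistical packages usually apply. With four models, H is compared against the chi-square critical value for 3 degrees of freedom (7.815 at the 0.05 level).

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a dict of samples."""
    pooled = [v for sample in groups.values() for v in sample]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted, start = 0.0, 0
    for sample in groups.values():
        rank_sum = sum(ranks[start:start + len(sample)])
        weighted += rank_sum ** 2 / len(sample)
        start += len(sample)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical per-product breakpoints for each model (illustrative only).
breakpoints = {
    "GPT-5.4": [1.9, 2.0, 1.75, 2.0, 2.05],
    "Claude Sonnet 4.6": [2.0, 1.95, 1.8, 2.0, 1.9],
    "Gemini 3.1 Pro": [1.85, 2.0, 2.0, 1.9, 1.95],
    "Gemini 3.0 Flash": [2.0, 1.9, 1.8, 2.05, 1.95],
}

CHI2_CRITICAL_DF3_ALPHA05 = 7.815  # chi-square critical value, df = 3, alpha = 0.05

h_stat = kruskal_wallis_h(breakpoints)
heterogeneous = h_stat > CHI2_CRITICAL_DF3_ALPHA05

if __name__ == "__main__":
    print(f"H = {h_stat:.3f}, significant heterogeneity: {heterogeneous}")
```

An H value well below the critical value, as in this toy data, means the rank distributions of the groups are statistically indistinguishable at that level. It does not by itself prove the groups are identical.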
Confidence Intervals
Bootstrap confidence intervals (1000 resamples) were computed for all breakpoint estimates. Across models, breakpoint estimates range from 1.62x to 2.03x, with a mean of 1.94x.
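A percentile bootstrap for a mean breakpoint might look like the following sketch. The percentile method, the fixed seed, and the input values are all assumptions for illustration; the study does not specify which bootstrap variant was used.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    Each resample draws len(values) observations with replacement.
    """
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical breakpoint estimates (illustrative only, not study data).
example_breakpoints = [1.7, 1.9, 2.0, 2.0, 1.95, 2.0, 1.85, 2.0]
ci_low, ci_high = bootstrap_mean_ci(example_breakpoints)

if __name__ == "__main__":
    mean_bp = statistics.fmean(example_breakpoints)
    print(f"mean = {mean_bp:.3f}x, 95% CI = [{ci_low:.3f}x, {ci_high:.3f}x]")
```

Fixing the seed makes the interval reproducible across runs, which fits the pre-registered, scripted nature of the analysis.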
Judge Agreement
Inter-rater reliability was assessed using ICC (Intraclass Correlation Coefficient). Three judges scored each response with blinded model identifiers. Overall agreement rate: 97%, indicating excellent reliability.
Key Design Features
Config Locking
SHA-256 hashes of all config files verified before data collection began
Cross-family Judging
Responses are never judged by a model from the same family (for example, Claude responses are not judged by Claude models), to avoid self-preference bias
Position Randomization
Branded product position (first vs second) randomized across trials
Blinded Scoring
Model identifiers stripped from responses before judge evaluation
Cliff Oversampling
40 trials in the cliff region (1.75x-2.0x) versus 20 elsewhere, for greater precision where selection rates change fastest
Resumable Collection
Scripts check for existing data before making API calls, so an interrupted run can resume without repeating completed trials
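Two of these features, config locking and resumable collection, lend themselves to a short sketch. The function names, the one-JSON-file-per-trial layout, and the `call_api` callback are hypothetical; the study's own scripts may be organized differently.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def configs_unchanged(locked_hashes: dict[str, str], config_dir: Path) -> bool:
    """Check every locked config file against its recorded digest.

    `locked_hashes` maps a file name to the digest recorded at lock time.
    """
    return all(
        sha256_of_file(config_dir / name) == digest
        for name, digest in locked_hashes.items()
    )


def collect_trial(trial_id: str, out_dir: Path, call_api) -> dict:
    """Run one trial unless its result is already on disk.

    `call_api` is a caller-supplied function (hypothetical here) that takes
    the trial id and returns a JSON-serializable result.
    """
    out_path = out_dir / f"{trial_id}.json"
    if out_path.exists():
        # Resume: reuse the stored result instead of calling the API again.
        return json.loads(out_path.read_text())
    result = call_api(trial_id)
    out_path.write_text(json.dumps(result))
    return result
```

In use, a collection script would call `configs_unchanged` once at startup and abort if it returns `False`, then loop over trial ids with `collect_trial`. Rerunning the script after a crash then skips every trial whose result file already exists.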
Data Availability
This study follows open science practices. Pre-registration, analysis scripts, and anonymized data are available through OSF.
Citation
Agentonomics. (2026). APIS Price Sensitivity Study: The 2x Rule for AI Commerce. OSF Preprints. https://osf.io/2xnmu