AI Search Visibility Is Asking the Wrong Question

An empirical look at whether AI visibility actually moves purchase — and what one controlled experiment among 604 dog food buyers tells us about a budget decision the entire industry is making right now.

By Walter Carl, Ph.D., Founder and Chief Research Officer at Purchased. Eighteen years shipping consumer-insight products for Google, Walmart, Microsoft, Gap Inc., and other global brands.

Imagine a CMO at a household-name CPG brand wrestling with whether her team should be reorienting their search budget around AI visibility. Her digital lead has been to three conferences in two months. The agencies are pitching AEO and GEO services. Her board is asking what she’s doing about ChatGPT. And the practitioner community on LinkedIn is full of confident prescriptions: optimize for citation, build context moats, win the relevance signal, restructure your content for retrievability.

Her question is simpler than any of those prescriptions: if I do all of this, will it actually move purchase?

Nobody has given her a clear answer. Not because they’re being cagey — because nobody really knows.

The field is asking the wrong question

Almost everything written about AI search right now sits on the input side of the question. How do you get cited in AI Overviews? What content structures help? Which tools should you use to track visibility? What does your schema need to look like? What signals does the model weight?

These are good questions, and they’re being answered by serious people doing rigorous work. Mike King’s relevance engineering project is the most technically sophisticated framework for thinking about how language models evaluate and surface content. Duane Forrester has made the strategic case for context moats — the idea that brands need to own enough surface area in the training and retrieval corpus to be defensible. Kevin Indig, Rand Fishkin, and Eli Schwartz have all argued, in different framings, that the real game isn’t optimization at all, but whether you have a brand strong enough to be named when the model is asked — Schwartz captures the position as “SEO is a product, AEO is brand” — and Indig has gone further, running UX research on how users actually interact with AI Overviews. Lily Ray, Aleyda Solis, and the Ahrefs editorial team have made the case that AI search isn’t a discontinuity from classic search but a continuity of it — the same principles of authority, topical depth, and entity recognition that won before continue to apply. Finally, Pete Blackshaw represents a fifth school — Algorithmic Trust and Risk — arguing that brands must proactively engineer algorithmic trust and mitigate brand vulnerability rather than merely optimizing for visibility. Underneath all five, Profound, BrandRank.ai, Brandlight, and a growing tooling category are building the observability layer the field will need to track any of these prescriptions in practice.

Five schools of thought, plus an observability layer underneath. All serious. Most addressing the same input-side question: how do you become more visible in AI answers? The behavioral work is the partial exception — it asks what users actually do in front of an AI answer once it’s there, which is adjacent to our question. But it still stops short of the purchase decision itself.

What nobody has yet tested, in a controlled experimental setting with observed purchase behavior,¹ is the purchase-side question: does being visible in AI answers actually move purchase behavior — and if it does, whose purchase behavior, and under what conditions?

The entire field is operating on the assumption that AI visibility translates into purchase. The work is downstream of that assumption. The translation itself hasn’t been empirically tested.

That’s the question I set out to answer.

What we tested

We ran a 4-cell controlled experiment with 604 U.S. adults who actively purchase dog food. (The full methodology, statistical detail, and supporting tables are documented in the case study. What follows here is the argument the data supports.) Cell 1 was a control: standard organic results, no AI Overview. Cell 2 showed Google’s natural AI Overview with the brands it actually recommends. Cell 3 displaced an established brand to a below-the-fold position within the AIO by promoting a smaller brand into the top spot. Cell 4 repeated the displacement but added a paid text ad above the AI Overview for the displaced brand.

Participants made an observed purchase decision — they identified a specific product, navigated to a real retailer, and submitted a screenshot of what they would buy. The design put participants into a realistic purchase decision mindset rather than relying exclusively on abstract rating questions about their hypothetical likelihood of purchasing brands from a researcher-defined list.

The findings I’m about to walk through come from one study, in one category, with one experimental design. I’ll be specific about that single-study limitation later in this essay, because I think it matters and because I think it’s also where the field’s real problem lies. But three of the findings are uncomfortable enough to warrant attention now.

Finding 1: The AI Overview is a confirmation engine, not a discovery engine

When the AI Overview named a brand prominently, repeat purchase for that brand went up. For Purina Pro Plan — the brand with the largest loyal customer base in our sample — repeat purchase increased 6.6 percentage points (p = .030) when the AIO was present, consistent with what you’d expect from a validation effect among existing customers.

First-time purchase did the opposite. For Purina Pro Plan, first-time purchase fell from 10.0% to 4.2% when the AIO appeared — a 5.8-point decline (p = .048). Hill’s Science Diet showed the same directional pattern at a smaller magnitude that didn’t reach significance.
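
To make the size of that decline concrete, here is a minimal sketch that separates the percentage-point change from the relative change. It uses only the two rates reported above; the function names are illustrative and not part of the study's analysis.

```python
# Rates (in percent) reported in the text for Purina Pro Plan first-time purchase.
control_rate = 10.0  # no AI Overview
aio_rate = 4.2       # AI Overview present


def point_change(before: float, after: float) -> float:
    """Absolute change, in percentage points."""
    return round(after - before, 1)


def relative_change(before: float, after: float) -> float:
    """Change expressed as a percent of the starting rate."""
    return round((after - before) / before * 100, 1)


print(point_change(control_rate, aio_rate))     # -5.8
print(relative_change(control_rate, aio_rate))  # -58.0
```

The 5.8-point drop matches the figure in the text. Read as a share of the control rate, it means first-time purchase fell by more than half.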

We also tested whether the AIO could elevate smaller brands. We promoted JustFoodForDogs and PetPlate to the first position in the AI Overview, above the fold. JustFoodForDogs produced zero purchases across every cell. PetPlate showed no significant movement on any metric.

The pattern across these results is consistent: the AIO reinforces existing brand preference. It does not appear to create new preference, and it does not appear to convert unfamiliar brands into considered ones, even when the model itself surfaces them prominently.

The implication is uncomfortable. The CMO investing in AI visibility because she thinks it will deliver new-customer acquisition is, on this evidence, investing for the wrong reason. The AIO’s measurable effect in our data runs through loyalty, not discovery. Defending existing customers from defection is a real and probably underweighted use case. Acquiring new customers through AI visibility, at least in this study, did not work.

Finding 2: Visibility is binary, not ranked — and displacement below the fold is costly

Two-thirds of users in our study never expanded the AIO to see brands below the fold. When we added a paid text ad above the AIO, the expansion rate dropped to about 9% — meaning roughly 91% of users never saw anything below the visible portion of the AI Overview.

This collapses the optimization problem in a way the field hasn’t fully reckoned with. The question isn’t really where your brand ranks inside the AI Overview as a whole. The question is whether your brand is in the part people actually look at — the part above the fold. Outside that, you might as well not exist. This is consistent with independent UX research finding that users rarely scroll deep into AIOs or click cited links.²

When we displaced established brands below the fold, repeat purchase dropped 8.9 percentage points among users who didn’t expand the AIO. Under a sensitivity analysis restricted to prior purchasers of the displaced brand, that effect grew to 40.3 points (p = .010). For a brand with meaningful loyalty equity, losing above-the-fold AIO presence is not a small problem.

Finding 3: Paid search recovers displacement — but only through clicks

When we added text ads above the AIO for the displaced brand, we recovered substantial ground. Purchase intent among ad clickers climbed to 67.0% versus 49.3% in the displaced group — a 17.7-point lift (p = .002). Observed purchase rates among ad clickers reached 29.6% versus 16.6% displaced (p = .010).
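
The lifts quoted above are simple differences between the two groups' rates. A minimal sketch of that arithmetic, using only the percentages reported in this paragraph (the dictionary layout and function name are illustrative assumptions, not the study's code):

```python
# Rates (in percent) as reported in the text: ad clickers vs. the displaced group.
reported = {
    "purchase_intent": {"ad_clickers": 67.0, "displaced": 49.3},
    "observed_purchase": {"ad_clickers": 29.6, "displaced": 16.6},
}


def point_lift(rates: dict) -> float:
    """Percentage-point gap between ad clickers and the displaced group."""
    return round(rates["ad_clickers"] - rates["displaced"], 1)


lifts = {metric: point_lift(rates) for metric, rates in reported.items()}
for metric, lift in lifts.items():
    print(f"{metric}: +{lift} points")
```

This reproduces the 17.7-point lift in purchase intent and puts the observed-purchase gap at 13.0 points. It does not reproduce the p-values, which depend on group sizes documented in the case study.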

The qualifier matters. Non-clickers showed no improvement at all. Their metrics were statistically indistinguishable from the displaced group across every measure. Passive ad exposure didn’t recover anything. The ad had to earn the click.

This finding has direct budget implications. A defensive paid search strategy is viable for brands with the kind of loyalty base that produces high click-through rates on branded queries. For Purina Pro Plan, 40% of users clicked the ad. For brands without that base of loyalists actively seeking them, the recovery mechanism isn’t available — there isn’t a loyalist click to earn.

There’s also a complication worth naming. The same ads that recover performance suppress expansion behavior across the page. Users who see a text ad above the AIO rarely expand it (about 9% did in our study), so the information they act on shrinks to what’s above the fold. The toggling that would have surfaced more brands disappears. The defensive ad recovers your brand and quietly reduces visibility for everyone else, including organic results below.

The implication for budget decisions: visibility is a loyalty mechanism, not an acquisition one

What this adds up to, with appropriate caution about generalization:

The CMO asking whether AI visibility is worth investing in is asking the right question. The answer this study suggests is yes — but probably not for the reasons her agencies are giving her, and probably not in the proportion she’s being asked to fund.

AI visibility looks, on this evidence, like a loyalty defense mechanism. Losing it hurts existing customers’ willingness to repurchase. Winning it back through paid search works, but only if you have loyalists who’ll click. None of the brand-building or customer-acquisition rationale that the AEO industry tends to lead with is supported by the data we collected.

If that’s right, the budget question reframes. Spending to win AI visibility because it will acquire new customers is not, on this evidence, what the data supports. Spending to defend AI visibility because losing it will cost you loyalists is what the data supports. These are different conversations with different sizing, different KPIs, and different success criteria. Conflating them — which most of the current vendor discourse does — leads to investment that isn’t aligned with where the value appears to actually sit.

One study is not a verdict — but it’s not nothing

The obvious objection to everything I’ve just written is the most important one: this is one study, in one category, with one set of design choices. The dog food category has structural characteristics — high repeat purchase rates, established brand loyalty, a long tail of niche brands competing with a few large incumbents — that may not generalize to financial services, telecom, personal care, B2B software, automotive, or anywhere else.

That objection is correct. It’s also, I think, the wrong reason to dismiss the work.

The field’s actual problem isn’t that this study covers only one category. It’s that no other category has been studied at this level of rigor by anyone. The five schools I named at the top of this essay — all of which I respect and continue to learn from — are operating without controlled experiments testing whether their prescriptions move purchase. The measurement tooling category is selling visibility metrics whose link to purchase behavior remains untested. The brand-strength school is making theoretically strong arguments without behavioral data behind them. Even the most rigorous work in this space is, in evidentiary terms, less constrained than what we did.

That’s not a criticism — it’s a description of where the field is. I’m raising the point because the right response to a single-category study is more single-category studies, not retreat to assumption. If you’re a brand or agency telling yourself that this dog food finding doesn’t apply to your category, the test of that hypothesis is to run the study in your category. The cost of being wrong about it — investing significantly in AI visibility on the theory that it drives acquisition when it actually drives loyalty defense — scales with your media budget.

The interesting next move isn’t another opinion about AI search. It’s a second data point for your category.

An invitation to category partners

The dog food study above demonstrates what the methodology can do. It applies across categories, but designing it well requires close consultation with your team — adapting the experimental setup to your category’s structure, your competitive set, and your specific business model. That’s how the next data points get built. That’s how the field starts having a body of empirical work to draw on rather than a stack of opinions.

If you work at a brand, agency, or platform partner that wants to commission this kind of research in your category — telecom, financial services, CPG outside pet, automotive, healthcare, travel, anything — that’s a conversation I want to have. Purchased designs and runs the studies. You get findings specific to your business. I could be wrong about dog food applying to your category. I could also be right. There’s only one way to find out.

You can reach me through the contact form or on LinkedIn.

For the full methodology, statistical detail, and supporting tables, download the case study at purchased.com.

¹ Survey research has examined psychological links between AI exposure, brand trust, and stated purchase intent; see Guerra-Tamez et al., “Decoding Gen Z: AI’s influence on brand trust and purchasing behavior,” Frontiers in Artificial Intelligence (2024). The distinction here is observed purchase behavior in a controlled experiment, rather than self-reported attitudes.
² Kevin Indig and Eric Van Buskirk’s 2025 screen-recording study of users interacting with Google AI Overviews found a median ~30% scroll depth inside AIOs and citation click rates of roughly 7% on desktop. Pew Research’s 2025 passive-browsing analysis showed users clicked a cited source link in about 1% of visits to pages with an AI summary.
