AI shopping agents don't browse the way humans do. They don't click through category pages or scroll product listings. Instead, they pull structured data from feeds, APIs, and schema markup, then surface the best match to a user's query. If your product data is thin, inconsistent, or missing key attributes, your products are unlikely to make the cut. AI product feed discoverability is quickly becoming one of the highest-leverage opportunities in e-commerce, and most brands aren't paying attention to it yet.
How AI Shopping Agents Actually Discover Products
The mechanics vary by platform, but AI agents find and surface products through two primary discovery paths.
The first is feed-based discovery. Platforms like Google Merchant Center, Meta Catalog, Microsoft Shopping, and emerging AI commerce protocols give agents a structured inventory to query against. When a user asks an AI assistant for a recommendation, the agent can look up products by attribute, price, availability, and category from a curated, machine-readable source. Feed quality directly controls whether your product appears in those results.
The second is crawl-based discovery. AI agents — including Google's Gemini, Perplexity, and ChatGPT's Browse mode — crawl the web and extract product information from your product detail pages (PDPs). What they find is largely determined by your Schema.org Product markup. Without structured markup on your PDPs, agents are left to parse unstructured HTML and make inferences, which leads to incomplete or inaccurate product representations.
Most brands are weak on both. Feeds submitted years ago for Google Shopping never get updated. Schema markup was added once, then ignored. In an environment where AI agents are making purchasing recommendations at scale, outdated data is a competitive liability.
Schema.org Product Markup: The Non-Negotiable Foundation
Schema.org Product markup is the most direct signal you can give an AI crawler about your products. It's machine-readable, platform-agnostic, and sits directly on your PDPs where crawlers find it. A well-implemented Product schema tells AI agents exactly what you're selling without requiring them to interpret your page layout.
The core fields that matter most for AI discoverability are:
name: Your product title. Keep it specific and attribute-rich (brand, model, size, color where relevant). Don't optimize for aesthetics here; optimize for query matching.
description: A clear, factual description of what the product is and does. AI agents use this to match products to conversational queries. Vague copy like "our best product yet" is useless to a machine.
offers: Price, currency, availability, and URL. Stale pricing, or an "out of stock" status left in the schema after restocking, can get your products excluded from results that filter by availability.
sku / gtin / mpn: Product identifiers. The SKU is your own merchant-specific identifier, while Global Trade Item Numbers (GTINs) and Manufacturer Part Numbers (MPNs) are critical for disambiguation. When an AI agent cross-references product identity across sources, the GTIN serves as the primary key. Missing GTINs make it harder for agents to match your listing against user intent confidently.
aggregateRating: Review counts and ratings. AI agents can factor social proof into recommendations, so a product with 400 reviews at 4.7 stars is more likely to be surfaced than a comparable product with none.
image: High-resolution, clean product images. AI models generating product cards for users rely on your image URLs. Low-quality or missing images reduce the likelihood of inclusion.
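Implement this markup in JSON-LD format on every product detail page. Google reads JSON-LD from either the <head> or the <body>, so place it wherever your templates make it easiest to keep in sync with the visible page. Google's structured data documentation for Product is the definitive reference for what's supported, and the Rich Results Test shows what is actually being parsed.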
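The fields above can be sketched as a small generator that turns a product record into a JSON-LD script tag. This is a minimal sketch, not a definitive implementation: the function name, the shape of the input dictionary, and every value in the example product (including the GTIN and URLs) are hypothetical placeholders.

```python
import json


def build_product_jsonld(product: dict) -> str:
    """Render a Schema.org Product as a JSON-LD <script> tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product["name"],
        "description": product["description"],
        "image": product["images"],
        "sku": product["sku"],
        "gtin13": product["gtin13"],
        "mpn": product["mpn"],
        "brand": {"@type": "Brand", "name": product["brand"]},
        "offers": {
            "@type": "Offer",
            "url": product["url"],
            "price": f"{product['price']:.2f}",
            "priceCurrency": product["currency"],
            "availability": (
                "https://schema.org/InStock"
                if product["in_stock"]
                else "https://schema.org/OutOfStock"
            ),
        },
    }
    # Only emit aggregateRating when there are real reviews to report.
    if product.get("review_count", 0) > 0:
        data["aggregateRating"] = {
            "@type": "AggregateRating",
            "ratingValue": product["rating"],
            "reviewCount": product["review_count"],
        }
    # Escape "</" so text inside the JSON cannot close the <script> tag early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical product record; all values are placeholders.
example_product = {
    "name": "Example Brand Trail Runner 2, Men's, Size 10, Blue/White",
    "description": "Men's trail running shoe with a mesh upper and rubber outsole.",
    "images": ["https://example.com/img/trail-runner-2.jpg"],
    "sku": "TR2-M-10-BLU",
    "gtin13": "0012345678905",
    "mpn": "TR2-0001",
    "brand": "Example Brand",
    "url": "https://example.com/products/trail-runner-2",
    "price": 139.99,
    "currency": "USD",
    "in_stock": True,
    "rating": 4.7,
    "review_count": 400,
}

print(build_product_jsonld(example_product))
```

Generating the markup from the same record that drives the page and the feed is one way to keep price and availability from drifting apart across the three.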
Product Feed Optimization for AI Surfaces
Structured markup on your site handles crawl-based discovery. Feed optimization handles the platforms that query your inventory directly. The two main systems most brands need to maintain are Google Merchant Center and Meta Catalog — though the principles apply broadly to any product feed.
Google Merchant Center is the gateway to Google AI Mode, Google Shopping, and Performance Max campaigns. Google's AI surfaces increasingly pull product data from Merchant Center when users ask shopping-related questions. Feed completeness and quality directly affect whether your products are eligible for AI-driven placement.
The attributes that most often create gaps: missing GTINs for branded products, vague product type hierarchies, generic descriptions that don't reflect how users search, missing size/color/material attributes for apparel and home goods, and stale pricing that hasn't been synced with your live site. Google's product data specification is comprehensive — audit your feed against it before assuming you're fully compliant.
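A quick way to find these gaps is to scan an exported feed for blank attributes. The sketch below assumes the feed has already been loaded as a list of dictionaries keyed by Merchant Center attribute names; the list of attributes treated as expected, and the rule that apparel items also need color, size, and material, are simplifying assumptions rather than Google's actual requirements.

```python
# Attributes this sketch treats as expected on every item (an assumption).
CORE_ATTRS = (
    "id", "title", "description", "link", "image_link",
    "price", "availability", "brand", "gtin", "product_type",
)
# Extra attributes checked when product_type mentions apparel (an assumption).
APPAREL_ATTRS = ("color", "size", "material")


def audit_feed(rows):
    """Map each product id to the attributes that are missing or blank."""
    report = {}
    for row in rows:
        expected = list(CORE_ATTRS)
        if "apparel" in (row.get("product_type") or "").lower():
            expected += APPAREL_ATTRS
        missing = [a for a in expected if not str(row.get(a) or "").strip()]
        if missing:
            report[row.get("id") or "<no id>"] = missing
    return report


# Hypothetical feed rows for illustration.
feed_rows = [
    {
        "id": "SKU-1", "title": "Example Brand 10-Inch Skillet",
        "description": "Cast iron skillet, pre-seasoned.",
        "link": "https://example.com/p/sku-1",
        "image_link": "https://example.com/img/sku-1.jpg",
        "price": "39.00 USD", "availability": "in_stock",
        "brand": "Example Brand", "gtin": "0012345678905",
        "product_type": "Home > Kitchen > Cookware",
    },
    {
        "id": "SKU-2", "title": "Example Brand Trail Runner 2",
        "description": "Men's trail running shoe.",
        "link": "https://example.com/p/sku-2",
        "image_link": "https://example.com/img/sku-2.jpg",
        "price": "139.99 USD", "availability": "in_stock",
        "brand": "Example Brand", "gtin": "",
        "product_type": "Apparel > Shoes > Running",
        "color": "Blue", "size": "10",
    },
]

print(audit_feed(feed_rows))
```

A report like this is only a first pass. The diagnostics inside Merchant Center remain the authority on what is actually disapproved.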
Meta Catalog powers AI-driven product recommendations in Advantage+ Shopping Campaigns and Meta's AI ad products. The same fundamentals apply: complete product titles, accurate pricing, availability status, and high-quality images. Meta is increasingly using Catalog data to power automated creative, so thin product data compounds into poor ad performance beyond just discoverability.
Feed freshness is often overlooked. AI agents that query product availability in real time will exclude products marked as unavailable in your feed, even if they're back in stock on your site. Automated feed sync — ideally updated daily or more frequently for high-velocity inventory — is no longer optional for brands competing for AI-driven traffic.
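One way to catch stale feed data between syncs is to compare the last submitted feed against what the storefront currently reports. The sketch below assumes both sides are available as dictionaries keyed by product ID. How you obtain them (a feed export, a storefront API) is left out, and the function and field names are hypothetical.

```python
def find_stale_items(feed_items, live_items, fields=("price", "availability")):
    """Return {item_id: {field: (feed_value, live_value)}} for mismatches."""
    stale = {}
    for item_id, feed_item in feed_items.items():
        live_item = live_items.get(item_id)
        if live_item is None:
            continue  # no longer on the site; handle removals separately
        diffs = {
            f: (feed_item.get(f), live_item.get(f))
            for f in fields
            if feed_item.get(f) != live_item.get(f)
        }
        if diffs:
            stale[item_id] = diffs
    return stale


# Hypothetical snapshots: SKU-2 was restocked on the site but not in the feed.
feed_snapshot = {
    "SKU-1": {"price": "39.00 USD", "availability": "in_stock"},
    "SKU-2": {"price": "139.99 USD", "availability": "out_of_stock"},
}
live_snapshot = {
    "SKU-1": {"price": "39.00 USD", "availability": "in_stock"},
    "SKU-2": {"price": "139.99 USD", "availability": "in_stock"},
}

print(find_stale_items(feed_snapshot, live_snapshot))
```

Running a check like this on a schedule makes it easier to judge whether a daily sync is frequent enough for your inventory.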
Writing Product Titles and Descriptions That AI Agents Understand
The way humans write product copy and the way AI agents parse product data are fundamentally different. Most e-commerce copy is written for conversion — it's punchy, benefit-forward, and brand-voice driven. That approach works for human browsers. AI agents need factual, attribute-dense content they can match against specific queries.
For product titles, the structure that performs best for AI discovery is: Brand + Product Name + Key Differentiating Attributes. For a running shoe, that might be "Nike Air Zoom Pegasus 41 — Men's Road Running Shoe, Size 10, Blue/White." This title contains brand, model, gender, use case, size, and colorway — all attributes an AI might filter on when answering a query like "best Nike road running shoes for men under $150."
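The title structure above can be sketched as a small helper that assembles titles from catalog fields. The function name is hypothetical, and the 150-character default reflects Google Merchant Center's title limit; other platforms may use different limits.

```python
def build_title(brand, name, attributes, max_len=150):
    """Compose 'Brand Name - attr, attr', dropping trailing attributes to fit."""
    base = f"{brand} {name}".strip()
    attrs = [a.strip() for a in attributes if a and a.strip()]
    while attrs:
        title = f"{base} - {', '.join(attrs)}"
        if len(title) <= max_len:
            return title
        attrs.pop()  # drop the least important (last) attribute and retry
    return base[:max_len]


print(build_title(
    "Nike",
    "Air Zoom Pegasus 41",
    ["Men's Road Running Shoe", "Size 10", "Blue/White"],
))
# prints: Nike Air Zoom Pegasus 41 - Men's Road Running Shoe, Size 10, Blue/White
```

Listing attributes in priority order matters here, because the helper trims from the end when a title runs long.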
For descriptions, prioritize functional detail over marketing language. Cover what the product is, what it's designed for, who it's for, what it's made of, and any specifications that distinguish it from similar products. Treat the first 200 characters or so as the most valuable space, since that is the part most likely to survive truncation in previews and summaries. Lead with the most important attributes, not a tagline.
Avoid keyword stuffing. AI models are generally good at recognizing descriptions that read as manipulative rather than informative, and repeated keywords add nothing an agent can match on. Write for clarity, not density.
Emerging Protocols Brands Need to Know About
Beyond existing feed platforms, a new layer of open infrastructure is taking shape specifically for AI commerce. The Universal Commerce Protocol (UCP) — an open standard backed by Google, Shopify, Walmart, Target, Etsy, and Wayfair — is designed to let AI agents query product availability, pricing, and inventory in real time without relying on cached feed data.
UCP gives AI agents a direct API path to merchant inventory. Instead of reading a feed that might be 24 hours stale, an agent can query your catalog in real time and return accurate availability and pricing to the user. Brands that implement UCP early will have a structural advantage in AI-native commerce environments.
Separately, llms.txt is an emerging convention (analogous to robots.txt) that lets site owners provide AI crawlers with curated, plain-text summaries of their content and products. Adoption is early, but it's worth watching — and low-effort to implement if you want to be ahead of the curve.
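As a rough illustration of how little is involved, the sketch below generates an llms.txt file following the layout in the public llms.txt proposal: a title heading, a short blockquote summary, then sections of annotated links. The convention is still evolving, so treat the layout as an assumption. The store name, summary, and URLs are placeholders.

```python
def build_llms_txt(site_name, summary, sections):
    """Build llms.txt content from {section heading: [(title, url, note)]}."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical store content for illustration.
llms_txt = build_llms_txt(
    "Example Store",
    "Running shoes and apparel, shipped within the US.",
    {
        "Products": [
            ("Trail Runner 2",
             "https://example.com/products/trail-runner-2",
             "Men's trail running shoe, sizes 7 to 13."),
        ],
        "Policies": [
            ("Shipping and returns",
             "https://example.com/policies/shipping",
             "Delivery times, return window, and fees."),
        ],
    },
)

print(llms_txt)
```

The resulting text would be served at the site root as /llms.txt, in the same way robots.txt is.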
These protocols represent the next layer of AI discoverability infrastructure. Brands that treat them as optional extras now will be scrambling to catch up in 12–18 months.
Measuring AI-Driven Discovery in GA4
Measurement is where most brands hit a wall. AI discovery generates traffic across several different channels, and without deliberate configuration in GA4, it either gets miscategorized or disappears entirely.
Referral traffic from AI platforms. When a user clicks through to your site from a ChatGPT or Perplexity recommendation, GA4 captures the referrer. You'll see sessions from domains like chatgpt.com, perplexity.ai, or claude.ai in your referral traffic. Build a custom channel grouping in GA4 that segments these referrers under an "AI Referral" channel so you can monitor volume and conversion rates over time.
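The matching logic behind such a channel grouping can be prototyped outside GA4, for example against exported referrer data. The sketch below is an assumption-laden illustration: the domain list covers only the three referrers named above, and the channel label and function name are hypothetical.

```python
import re
from urllib.parse import urlparse

# Matches the listed domains and their subdomains (for example www.perplexity.ai).
AI_REFERRER_PATTERN = re.compile(
    r"(^|\.)(chatgpt\.com|perplexity\.ai|claude\.ai)$"
)


def classify_referrer(referrer_url: str) -> str:
    """Return 'AI Referral' for known AI referrer hosts, otherwise 'Other'."""
    host = (urlparse(referrer_url).hostname or "").lower()
    return "AI Referral" if AI_REFERRER_PATTERN.search(host) else "Other"


for url in (
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=running+shoes",
    "https://www.google.com/",
):
    print(url, "->", classify_referrer(url))
```

A similar alternation of domains can be used in a GA4 channel group condition that matches the session source by regex. Extend the list as new AI-sourced domains appear in your referral reports.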
Google AI Mode traffic. Clicks from Google's AI Mode and AI Overviews arrive as organic search sessions — they're indistinguishable in GA4 from traditional organic Google traffic without additional data from Google Search Console. GSC will increasingly surface AI-driven impression and click data as Google rolls out those metrics, so connecting GA4 to GSC is essential for any brand tracking organic performance.
Agentic order gaps. For agentic checkout flows — where the purchase completes on a third-party platform rather than your storefront — your client-side GA4 tag won't fire. You'll see the revenue in your order management system but not in GA4, creating a gap in your attribution data. Closing that gap requires server-side data: platform-level order attribution from your commerce platform, or GA4 Measurement Protocol hits sent from your backend when an order is confirmed. This is the same challenge covered in the Shopify Agentic Storefronts post — and it applies regardless of your platform.
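The Measurement Protocol option can be sketched as a backend function that builds a purchase event when an order is confirmed. This is a minimal sketch under stated assumptions: the order structure, the measurement ID, the API secret, and the client ID are placeholders, and how you derive a client ID for an order that never touched your storefront is a design decision this example does not settle.

```python
import json
import urllib.request
from urllib.parse import urlencode

MP_ENDPOINT = "https://www.google-analytics.com/mp/collect"


def build_purchase_request(measurement_id, api_secret, client_id, order):
    """Build (but do not send) a GA4 Measurement Protocol purchase request."""
    payload = {
        "client_id": client_id,
        "events": [{
            "name": "purchase",
            "params": {
                "transaction_id": order["id"],
                "currency": order["currency"],
                "value": order["total"],
                "items": [
                    {
                        "item_id": line["sku"],
                        "item_name": line["name"],
                        "price": line["price"],
                        "quantity": line["quantity"],
                    }
                    for line in order["lines"]
                ],
            },
        }],
    }
    query = urlencode({"measurement_id": measurement_id, "api_secret": api_secret})
    return urllib.request.Request(
        f"{MP_ENDPOINT}?{query}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical order and credentials; replace with real values before sending.
example_order = {
    "id": "ORDER-1001",
    "currency": "USD",
    "total": 139.99,
    "lines": [
        {"sku": "TR2-M-10-BLU", "name": "Trail Runner 2",
         "price": 139.99, "quantity": 1},
    ],
}
request = build_purchase_request(
    "G-XXXXXXXXXX", "placeholder-secret", "agentic.ORDER-1001", example_order
)
print(request.full_url)
```

Sending is a single call to urllib.request.urlopen(request). Because the collection endpoint returns a success status even for malformed events, it is worth testing payloads against the validation endpoint at /debug/mp/collect first.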
Start by establishing a baseline now. Add the custom channel grouping, connect GSC, and note the current volume of AI-sourced sessions. In 6–12 months, you'll want that historical data to quantify AI's share of your discovery funnel.
The brands that win in AI-driven commerce won't just have great products — they'll have great product data. Feed completeness, schema accuracy, and real-time availability are the new SEO. The underlying discipline is the same: give the algorithm exactly what it needs to surface you confidently.
Building Your AI Discoverability Strategy
Pulling this together into a practical roadmap comes down to four areas of work.
Audit your current state. Run your product detail pages through Google's Rich Results Test to see what structured data is present and valid. Pull your Google Merchant Center feed and check feed health — look at disapproved products, missing attributes, and data quality issues. Do the same for Meta Catalog. These audits will surface the biggest gaps quickly.
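For a catalog with many PDPs, a scripted first pass can complement the Rich Results Test. The sketch below uses only the Python standard library to pull JSON-LD out of a page's HTML and list which of the fields discussed earlier are absent. The list of expected fields is this article's, not an official requirement, and the sketch ignores markup nested under @graph.

```python
import json
from html.parser import HTMLParser

EXPECTED_FIELDS = ("name", "description", "image", "offers", "sku", "aggregateRating")


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def missing_product_fields(html):
    """Return missing fields for the first Product found, or None if absent."""
    parser = JsonLdExtractor()
    parser.feed(html)
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON-LD; worth flagging in a real audit
        for node in data if isinstance(data, list) else [data]:
            if not isinstance(node, dict):
                continue
            types = node.get("@type")
            if "Product" in (types if isinstance(types, list) else [types]):
                missing = [f for f in EXPECTED_FIELDS if f not in node]
                # Accept any GTIN variant: gtin, gtin8, gtin12, gtin13, gtin14.
                if not any(key.startswith("gtin") for key in node):
                    missing.append("gtin")
                return missing
    return None


# Hypothetical page with deliberately thin markup.
sample_html = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Product", "name": "X",'
    ' "offers": {"@type": "Offer", "price": "10.00", "priceCurrency": "USD"}}'
    "</script></head><body></body></html>"
)

print(missing_product_fields(sample_html))
```

A result of None means no Product markup was found at all, which is usually the most urgent gap to fix.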
Fix the data fundamentals. Add or update Schema.org Product markup on all PDPs. Prioritize GTINs, availability, and pricing accuracy above everything else — these are the fields most likely to exclude you from AI results when wrong. Clean up your Merchant Center feed attributes: product type hierarchy, custom labels, and descriptions. Set up automated feed refreshes so availability and pricing stay current.
Configure measurement before you need it. Set up the GA4 custom channel grouping for AI referrers now. Connect Google Search Console to your GA4 property. If you're on a major commerce platform, check whether it provides AI-channel order attribution natively — many now do. Understand which checkout flows bypass your storefront so you know where your measurement blind spots are.
Monitor and iterate. AI discoverability is a moving target. Google's AI Mode surfaces, ChatGPT's shopping integrations, and emerging protocols like UCP are all evolving quickly. Check your referral sources monthly, watch for new AI-sourced domains appearing in your data, and stay current on Merchant Center feed spec updates. The brands staying ahead aren't doing anything exotic — they're maintaining rigorous data hygiene and paying attention to where referral traffic is actually coming from.
If your GA4 implementation or product feed setup needs a structured review before building on top of it, that's worth doing first. You can't optimize what you can't measure, and you can't measure accurately without clean data collection.
Frequently Asked Questions
AI shopping agents prioritize feeds with complete, accurate structured data: precise product titles with key attributes, detailed descriptions, GTINs/MPNs, availability status, pricing, high-quality images, and rich categorization. Schema.org Product markup on your PDPs gives AI crawlers a direct machine-readable signal independent of your feed platform.
Google AI Mode draws from both Merchant Center feeds and on-page structured data. Products submitted through Google Merchant Center are eligible for AI-powered shopping surfaces, and Google's crawlers also pick up Schema.org Product markup directly from product detail pages. Having both in place maximizes coverage across AI-driven and traditional search surfaces.
In GA4, AI referral traffic typically arrives as referral sessions from domains like perplexity.ai or chatgpt.com, or as organic search from Google (which includes AI Overview and AI Mode surfaces). Use custom channel groupings to isolate known AI referrers. For agentic checkouts that bypass your storefront, you'll need server-side data — platform order attribution or GA4 Measurement Protocol hits — since client-side tags won't fire.
Is Your Product Data Ready for AI-Driven Commerce?
Get a free review of your GA4 setup, product feed health, and schema markup — and a clear plan for closing the gaps before AI becomes your primary discovery channel.