Between January 15 and March 28, 2026, we conducted a structured audit of 500 local business websites across 12 industries and 38 U.S. states. Our goal was straightforward: measure how prepared these businesses are to appear in AI-generated search results — the answers produced by ChatGPT, Gemini, Perplexity, and Google AI Overviews.
The results were more alarming than we expected. The vast majority of local businesses are not just unprepared for AI search — they are structurally invisible to it. And the gap between the businesses that are optimized and those that are not is growing wider every month.
This is the full data, the methodology behind it, and what it means for any business that depends on being found online.
Executive Summary
- 73% of the 500 audited sites have no structured data markup of any kind; only 5.4% have what we classify as comprehensive schema.
- Of the 156 businesses ranking on Google page 1 for their primary keyword, only 17 (10.9%) were cited by any AI platform.
- FAQ/Q&A content was the strongest predictor of AI citation (r=0.72), ahead of schema completeness and content freshness.
- Only 4.2% of sites had all five top predictors in place; those sites were cited at a 34% rate versus 1.8% for sites with none.
- With 45% of consumers now using AI tools to find local services, the optimization gap represents a large first-mover opportunity.
Why We Conducted This Study
The motivation for this research came from a pattern we observed in our own client work. Businesses that ranked well on Google were frequently absent from AI-generated recommendations. A plumber ranking #1 for "emergency plumber Dallas" might not appear at all when someone asked ChatGPT "Who's the best emergency plumber in Dallas?" These are fundamentally different systems with different selection criteria, and we wanted to quantify the gap.
Existing research from Search Engine Land and iPullRank/Yext established that AI assistants recommend only 1% to 11% of locations — but those studies focused on the output side (what AI recommends) rather than the input side (what businesses are doing to be recommendable). Our study fills that gap by examining the technical and content factors that correlate with AI citation.
Methodology
Sample Selection
Sites were identified through Google Maps searches for each industry in metropolitan, suburban, and rural areas across 38 states. We deliberately included a mix of page-1 Google rankers and page-2-3 businesses to avoid survivorship bias. Businesses without any website were excluded, as were franchise locations using corporate-managed sites (which would skew schema adoption rates upward).
Industry Distribution
| Industry | Sites Audited | States Represented |
|---|---|---|
| Plumbing | 52 | 32 |
| HVAC | 48 | 30 |
| Dental Practices | 45 | 28 |
| Auto Repair | 44 | 27 |
| Roofing | 43 | 26 |
| Landscaping | 42 | 25 |
| Pest Control | 40 | 24 |
| Electricians | 39 | 23 |
| Fencing | 38 | 22 |
| Painting | 37 | 21 |
| Tree Service | 37 | 20 |
| Concrete/Construction | 35 | 19 |
Audit Criteria
Each website was evaluated on 8 dimensions, scored individually, and then combined into a composite "AI-Ready Score" (0-100 scale, equally weighted):
- Structured Data Markup — Presence and completeness of Schema.org markup (LocalBusiness, Service, FAQ, Review schemas)
- Entity Clarity — NAP (Name, Address, Phone) consistency, service area definition, and "About" page quality that establishes the business as a distinct entity
- FAQ/Q&A Content — Dedicated FAQ pages, in-page FAQ sections, or content structured in question-answer format
- Content Freshness — Date of last published or updated content, blog activity frequency
- Mobile Performance — Core Web Vitals pass/fail status and mobile-friendly rendering
- AI-Formatted Content — Presence of answer capsules, concise service definitions, how-to structures, and direct-answer paragraphs
- llms.txt Presence — Whether the site hosts a /llms.txt file providing structured information for AI crawlers
- AI Citation Test — We queried ChatGPT, Gemini, and Perplexity with standardized prompts ("[service] near [city]") and recorded whether the business was cited in the response
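The composite scoring described above can be sketched in a few lines. This is an illustrative reconstruction, not the study's actual scoring code: the dimension names and the assumption that each dimension is pre-normalized to 0-100 are ours.

```python
# Sketch of the composite AI-Ready Score: eight dimension scores, each
# normalized to 0-100, averaged with equal weights. Dimension names are
# hypothetical shorthands for the eight criteria listed above.

DIMENSIONS = [
    "structured_data", "entity_clarity", "faq_content", "content_freshness",
    "mobile_performance", "ai_formatted_content", "llms_txt", "ai_citation",
]

def ai_ready_score(scores: dict) -> float:
    """Equal-weighted composite of the eight 0-100 dimension scores."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)

# A site strong on three dimensions and absent on the rest still lands
# near the bottom of the 0-100 scale, which mirrors the study's point
# that partial optimization leaves large gaps.
example = {d: 0.0 for d in DIMENSIONS}
example.update(structured_data=80, faq_content=60, mobile_performance=100)
print(round(ai_ready_score(example), 1))  # 30.0
```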
Tools Used
- Screaming Frog SEO Spider for crawling and schema detection
- Google's Rich Results Test and Schema Markup Validator for structured data verification
- PageSpeed Insights API for Core Web Vitals measurement
- Manual testing in ChatGPT (GPT-4o), Gemini Advanced, and Perplexity Pro for AI citation verification

All AI queries were conducted between March 1 and March 28, 2026, using standardized prompt templates.
Finding 1: 73% Have No Structured Data Markup
This finding aligns with broader industry data. A 2026 systematic analysis of 500 law firm websites found similarly low adoption rates for comprehensive schema implementation. The pattern holds across industries: while 92% of top-ranking pages on Google use structured data (according to aggregate SEO research), the vast majority of local businesses have not implemented even basic schema.
The gap is particularly stark when you consider what AI systems need to recommend a business. Unlike Google, which can infer relevance from backlinks and content proximity, AI language models rely heavily on explicit structured signals — schema markup, clear entity definitions, and machine-readable service descriptions. Without these signals, a business may as well not exist to an AI assistant.
Of the 27% that did have some structured data, the implementation quality varied dramatically. Many had only basic Organization schema (name and logo) without the LocalBusiness, Service, or FAQ schemas that actually drive AI citation. Only 5.4% of all 500 sites had what we would classify as "comprehensive" schema implementation (3+ schema types properly nested and validated).
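To make the "basic vs. comprehensive" distinction concrete, here is what a nested LocalBusiness implementation looks like, built as a Python dict and emitted as JSON-LD. Every name, address, and URL is a hypothetical placeholder, not data from an audited site.

```python
import json

# Illustrative JSON-LD for the "comprehensive" tier: a LocalBusiness
# subtype with address, service area, and a nested Service offer, rather
# than a bare Organization block with only a name and logo.
local_business = {
    "@context": "https://schema.org",
    "@type": "Plumber",  # a LocalBusiness subtype
    "name": "Example Plumbing Co.",
    "telephone": "+1-555-0100",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "123 Example St",
        "addressLocality": "Dallas",
        "addressRegion": "TX",
        "postalCode": "75201",
    },
    "areaServed": {"@type": "City", "name": "Dallas"},
    "makesOffer": {
        "@type": "Offer",
        "itemOffered": {
            "@type": "Service",
            "name": "Emergency Plumbing Repair",
            "serviceType": "Plumbing",
        },
    },
}

# Emit the script block a page would embed in its <head>.
print('<script type="application/ld+json">')
print(json.dumps(local_business, indent=2))
print("</script>")
```

A validator such as the Schema Markup Validator will parse this as one connected entity, which is the machine-readable signal the basic Organization-only implementations lack.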
Finding 2: The AI Visibility Gap Is Real
This is arguably the most important finding of the study. We identified 156 businesses in our sample that ranked on Google's first page for their primary service keyword in their city. Of those 156, only 17 (10.9%) were also cited by at least one AI platform when we asked the equivalent question.
The remaining 139 businesses — ranking well on Google, presumably investing in SEO — were completely absent from AI-generated recommendations. Their SEO investment is not translating to AI visibility.
This confirms what Search Engine Land reported in January 2026: AI local visibility is up to 30x harder to earn than a Google ranking. The systems use different selection criteria, different content evaluation methods, and different trust signals. A business optimized for Google's algorithm is not automatically optimized for AI citation.
Conversely, 4 businesses in our sample that ranked on Google page 2 or 3 WERE cited by AI platforms — suggesting that AI systems may weight content quality and structure more heavily than traditional ranking signals like backlinks and domain authority.
Finding 3: Schema Adoption Varies Wildly by Industry
| Industry | Any Schema | LocalBusiness | FAQ Schema | Service Schema |
|---|---|---|---|---|
| Dental Practices | 41% | 38% | 14% | 9% |
| HVAC | 31% | 27% | 11% | 8% |
| Plumbing | 29% | 24% | 8% | 6% |
| Pest Control | 26% | 22% | 9% | 6% |
| Landscaping | 24% | 19% | 7% | 5% |
| Electricians | 23% | 19% | 6% | 4% |
| Auto Repair | 22% | 18% | 5% | 4% |
| Roofing | 19% | 16% | 4% | 3% |
| Painting | 18% | 14% | 4% | 3% |
| Fencing | 16% | 13% | 3% | 2% |
| Tree Service | 14% | 11% | 3% | 2% |
| Concrete/Construction | 12% | 9% | 2% | 1% |
The pattern is clear: industries where businesses commonly hire marketing agencies (dental, HVAC, pest control) show higher schema adoption. Industries where owners typically build their own sites or use basic website builders (fencing, tree service, concrete) have the lowest rates. This suggests the gap is primarily a knowledge and resource problem, not a technical one — schema implementation is straightforward once someone knows to do it.
FAQ schema deserves special attention. Across all 500 sites, only 6.4% had FAQ schema markup. Yet our AI citation testing showed FAQ content to be the single strongest predictor of AI recommendation. This represents the largest low-hanging-fruit opportunity in the dataset: adding FAQ schema is technically simple, yet almost no one is doing it.
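Since FAQ schema is the low-hanging fruit here, a minimal generator shows how little is involved. The questions and answers below are placeholders; in practice the markup must mirror FAQ content that is actually visible on the page.

```python
import json

def faq_schema(pairs) -> str:
    """Build FAQPage JSON-LD from a list of (question, answer) tuples."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": q,
                "acceptedAnswer": {"@type": "Answer", "text": a},
            }
            for q, a in pairs
        ],
    }
    return json.dumps(data, indent=2)

# Hypothetical example pairs for a service page.
print(faq_schema([
    ("How fast can you respond to an emergency call?",
     "We typically arrive within 60 minutes anywhere in our service area."),
    ("Do you offer free estimates?",
     "Yes, all estimates are free and carry no obligation."),
]))
```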
Finding 4: Content Freshness Is a Stronger Signal Than Expected
Content freshness emerged as the third-strongest predictor of AI citation in our correlation analysis (r=0.58). This makes intuitive sense: AI models are trained on and retrieve from recent data. A website that hasn't been updated since 2023 is less likely to appear in training data refreshes and less likely to be surfaced by retrieval-augmented generation (RAG) systems that power real-time AI search.
Sites with content updated in the previous 90 days were cited 3.2x more often than stale sites, and this lift held even when we controlled for schema presence and FAQ content. This suggests that AI platforms may use content recency as an independent quality signal, similar to how Google's "freshness" algorithm update works, but potentially weighted even more heavily.
The practical implication is significant: a local business that publishes one high-quality, AI-formatted blog post per month may see measurably better AI visibility than a competitor with a technically superior but static website.
Finding 5: The 5 Strongest Predictors of AI Citation
| Rank | Factor | Correlation (r) | Citation Rate When Present |
|---|---|---|---|
| 1 | FAQ/Q&A Structured Content | 0.72 | 18.4% |
| 2 | Schema Markup Completeness | 0.64 | 14.2% |
| 3 | Content Freshness (<90 days) | 0.58 | 11.8% |
| 4 | Entity Clarity Score | 0.51 | 9.6% |
| 5 | Mobile Performance (CWV pass) | 0.34 | 7.1% |
The compounding effect is dramatic. When we looked at sites that had ALL five factors optimized (only 21 sites in our sample, or 4.2%), the AI citation rate jumped to 34%. Compare this to the 1.8% citation rate for sites with none of these factors — a 19x difference.
FAQ content's dominance as the top predictor deserves explanation. AI language models are fundamentally question-answering systems. When a user asks "Who's the best plumber in Austin?" the AI is looking for content that directly answers questions in a clear, authoritative format. Websites with dedicated FAQ sections, service-specific Q&A content, and answer-first paragraph structures give AI models exactly what they need to extract and cite.
Notably, traditional SEO factors like backlink count and domain authority did NOT appear as significant predictors of AI citation in our analysis. This reinforces Finding 2: the systems are fundamentally different.
The AI-Ready Score: Industry Rankings
| Industry | Average Score | Top Quartile | Bottom Quartile | Gap |
|---|---|---|---|---|
| Dental Practices | 38 | 67 | 14 | 53 |
| HVAC | 31 | 58 | 11 | 47 |
| Plumbing | 29 | 54 | 9 | 45 |
| Pest Control | 28 | 52 | 10 | 42 |
| Landscaping | 25 | 48 | 8 | 40 |
| Auto Repair | 24 | 45 | 8 | 37 |
| Electricians | 23 | 44 | 7 | 37 |
| Roofing | 21 | 41 | 6 | 35 |
| Painting | 19 | 37 | 5 | 32 |
| Fencing | 17 | 34 | 4 | 30 |
| Tree Service | 16 | 32 | 4 | 28 |
| Concrete/Construction | 14 | 28 | 3 | 25 |
The most striking pattern in this data is not the industry averages — it's the gap within each industry. In dental, the top-quartile performers score 67 while the bottom quartile scores 14. This means the best-prepared dental practices are nearly 5x more AI-ready than their least-prepared competitors in the same industry. The competitive advantage for early movers is enormous.
For industries at the bottom of the table (tree service, concrete, fencing), the opportunity is even more pronounced. When your entire competitive set scores between 3 and 32, even modest optimization — adding basic schema, publishing monthly FAQ content, implementing llms.txt — can vault a business into the top quartile of AI readiness for its industry.
What the Top 4% Do Differently
We examined the 21 sites that were cited by at least one AI platform to identify common patterns. Every single one had:
- Comprehensive Schema Markup — Not just basic LocalBusiness, but nested Service schemas, FAQ schemas on relevant pages, and Review/AggregateRating markup. Average of 4.3 schema types per site.
- FAQ Content on Service Pages — Not one generic FAQ page buried in the footer, but question-answer sections embedded directly on each service page. Average of 5-8 FAQs per service page.
- Content Updated Within 90 Days — Active blogs, updated service descriptions, or recently published case studies. The freshest site had published 3 days before our audit.
- Clear Entity Definitions — Unambiguous "About" pages that state exactly what the business does, where it operates, who runs it, and what makes it different. Written in third-person authoritative tone.
- Mobile-First Performance — All 21 passed Core Web Vitals. Average LCP under 2.1 seconds. No layout shift issues.
- Answer-Formatted Content — Paragraphs that begin with direct answers before providing supporting detail. Headers phrased as questions. Concise definitions at the top of each section.
What's notable about this list is what's absent: none of these factors require large budgets, advanced technical skills, or years of SEO investment. They require knowledge of what AI systems look for and deliberate implementation. The barrier is awareness, not capability.
Implications for Local Businesses
The data tells a clear story: AI search is not a future concern — it is a present reality that most local businesses are completely unprepared for. BrightLocal's 2026 survey shows 45% of consumers now use AI tools to find local services, up from 6% one year ago. That's a 650% increase in AI search usage for local discovery in a single year.
Yet our data shows only 4.2% of local businesses are positioned to capture this traffic. The math is simple: massive demand growth meeting almost zero supply-side optimization creates an extraordinary window for businesses that act now.
Based on our findings, we recommend local businesses prioritize these actions in order:
- Implement LocalBusiness + Service + FAQ schema — This is the single highest-impact technical change. It can be done in a day and immediately makes the business machine-readable.
- Add FAQ sections to every service page — 5-8 questions per page, answered in 2-3 sentences each. Write them as if answering a customer directly.
- Publish fresh content monthly — Even one blog post per month about a common customer question keeps the site "alive" to AI crawlers.
- Create a clear entity definition — Rewrite the About page to explicitly state: what you do, where you do it, how long you've done it, and what makes you different.
- Add an llms.txt file — A simple text file at your domain root that provides AI systems with structured information about your business.
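For the last item, here is what a minimal llms.txt can look like under the llmstxt.org proposal: a markdown file with an H1 name, a blockquote summary, and link sections. The business details and URLs below are hypothetical placeholders.

```python
from pathlib import Path

# Sketch of a minimal llms.txt in the markdown-flavored format of the
# llms.txt proposal. All names and URLs are illustrative placeholders.
LLMS_TXT = """\
# Example Plumbing Co.

> Family-owned plumbing company serving the Dallas metro area since 2005.
> 24/7 emergency repair, drain cleaning, and water heater installation.

## Services

- [Emergency Repair](https://example.com/emergency): 60-minute response
- [Water Heaters](https://example.com/water-heaters): install and repair

## About

- [About Us](https://example.com/about): licensing, service area, team
"""

# In production this file would be deployed at the domain root,
# e.g. https://example.com/llms.txt.
Path("llms.txt").write_text(LLMS_TXT, encoding="utf-8")
print(Path("llms.txt").read_text(encoding="utf-8").splitlines()[0])
```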
Limitations and Future Research
We acknowledge several limitations of this research:
- Temporal snapshot: AI citation testing was conducted between March 1-28, 2026. AI platforms update frequently, and results may differ if tested at a different time.
- Sample bias: Our sample includes only businesses with existing websites. Businesses with no web presence (estimated at 27% of local businesses per Census data) are excluded, which likely makes our results more optimistic than the true population.
- Correlation vs. causation: While we identify strong correlations between optimization factors and AI citation, we cannot definitively prove causation. Businesses with better schema may also have other unmeasured advantages (better reviews, longer operating history, etc.).
- Geographic scope: 38 states were represented, but not all 50. Alaska, Hawaii, and several smaller states were underrepresented due to sample availability.
- Industry scope: Our 12 industries are all home-services or local-services focused. Results may differ for retail, hospitality, or professional services.
Future research should examine: (1) longitudinal tracking of AI citation rates as more businesses optimize, (2) the impact of review volume and sentiment on AI recommendations, (3) industry-specific AI query patterns and how they differ from Google search patterns, and (4) the role of third-party citations and directory listings in AI visibility.
Frequently Asked Questions
How were the 500 websites selected?
Sites were identified through Google Maps searches for each of the 12 industries across 38 U.S. states. We included a deliberate mix of businesses ranking on Google page 1, page 2, and page 3 to avoid survivorship bias. Franchise locations using corporate-managed websites were excluded.
What AI platforms were tested?
We tested ChatGPT (GPT-4o), Gemini Advanced, and Perplexity Pro. Each platform was queried with standardized prompts in the format "[service type] near [city]" — for example, "best plumber near Austin TX" or "emergency HVAC repair near Denver."
What does "AI citation" mean in this study?
A business was counted as "AI-cited" if it was mentioned by name, linked to, or explicitly recommended in the AI platform's response to our standardized query. Indirect mentions (e.g., "check Google Maps for options") were not counted.
Is this study peer-reviewed?
No. This is an industry research study conducted by our team using publicly available tools and standardized methodology. We have published our full methodology and limitations to enable replication and critique.
How quickly can a business improve its AI-Ready Score?
Based on the factors we measured, a business could realistically move from a bottom-quartile score to a top-quartile score within 30-60 days. The highest-impact changes (schema implementation, FAQ content, entity clarity) can be completed in 1-2 weeks. Content freshness requires ongoing monthly effort.
Does Google ranking help with AI visibility at all?
Our data shows a weak positive correlation — businesses ranking on Google page 1 were slightly more likely to be AI-cited than those on page 3. But the relationship is far weaker than most people assume. 89% of page-1 rankers were NOT cited by AI. The systems evaluate content differently.
What is llms.txt and how many sites had it?
llms.txt is a text file placed at a website's root (like robots.txt) that provides AI systems with structured information about the business. Only 9% of sites in our sample had one. Industry-wide adoption is estimated at approximately 10% according to Link Building HQ's 2026 analysis.
Methodology Disclosure
This study was conducted by the Heliux Digital research team between January and March 2026. Heliux Digital is a digital marketing agency specializing in AI search optimization for local businesses. While we have made every effort to present objective findings, readers should be aware that our business model is aligned with the services this research suggests businesses need. We encourage independent replication of this study and welcome methodological feedback.
Data collection tools: Screaming Frog SEO Spider v20.1, Google PageSpeed Insights API, Schema Markup Validator, ChatGPT (GPT-4o, March 2026), Gemini Advanced (March 2026), Perplexity Pro (March 2026). Raw data is available upon request for academic or journalistic purposes.