NeverRanked · Teardown 09 · Honolulu HVAC

We predicted Claude would ignore Honolulu AC companies. Then we measured.

12-firm Honolulu/Oahu HVAC cohort, 18 hash-locked questions, 3 clean runs finalized 2026-06-11. The prediction was committed to a timestamped private record on 2026-06-09, before any HVAC data existed. Individual firms anonymized. Counts and distributions named.

The headline finding: before any HVAC data existed, NeverRanked committed a prediction to a timestamped private record that Claude would cite Honolulu AC companies’ own websites less than 5% of the time. The measurement came in at 2% across 606 Claude citations, while the four AI-answer web engines reach 38% to 66% on the identical questions (Microsoft Copilot, the fifth web-searching tool, sits at 0%). The forecast held. This is the third unrelated category where Claude’s training-data citation of local firms collapses to near-zero, after CPA firms (across three geographies) and med spas.

The prediction, on the record

Most measurement is reported after the fact, which makes it impossible to tell a finding from a story told to fit the data. This one was different. On 2026-06-09, before a single HVAC query had run, we wrote down a prediction and committed it to git. The commit timestamp predates the first measurement run. The prediction said two things:

Claude would cite AC companies’ own websites under 5% of the time. The reasoning: home services is an editorially thin local trade. Individual AC companies generate almost no national editorial coverage, so they barely exist in Claude’s training data.
The web-searching engines would reach 35% or higher on the same questions, so any collapse would be Claude-specific rather than a category nobody cites.

Both held. Claude came in at 2%. The four AI-answer web engines pooled to 47%. We also wrote down what would have falsified the prediction (Claude at 10% or higher, or web engines below 35%), so this was a real forecast with a way to be wrong, not a hedge. The committed file, with the falsification criteria and the result recorded next to the forecast, is a timestamped entry in our internal record. That is the difference between a finding and a forecast, and it is the difference between research and a marketing claim.
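The forecast and its falsification criteria can be written down as a small check. This is a minimal sketch rather than NeverRanked's actual tooling: the thresholds and measured shares come from the text above, while the class, field, and function names are hypothetical, and treating a Claude share between 5% and 10% as "inconclusive" is an assumption the text does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Forecast:
    """Thresholds as stated in the text; the field names are hypothetical."""
    claude_predicted_below: float  # forecast: Claude own-site share under 5%
    web_predicted_at_least: float  # forecast: web engines at 35% or higher
    claude_falsified_at: float     # falsified if Claude reaches 10% or higher
    web_falsified_below: float     # falsified if web engines fall below 35%


def evaluate(f: Forecast, claude_share: float, web_share: float) -> str:
    """Return 'falsified', 'held', or 'inconclusive' for measured shares."""
    if claude_share >= f.claude_falsified_at or web_share < f.web_falsified_below:
        return "falsified"
    if claude_share < f.claude_predicted_below and web_share >= f.web_predicted_at_least:
        return "held"
    # Assumption: a Claude share between 5% and 10% neither confirms nor falsifies.
    return "inconclusive"


HVAC = Forecast(0.05, 0.35, 0.10, 0.35)
result = evaluate(HVAC, claude_share=0.02, web_share=0.47)
print(result)  # held
```

Writing the falsification thresholds separately from the prediction thresholds keeps the "way to be wrong" explicit, which is the point of committing them before the first run.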

Why HVAC, and why it matters as a measurement subject

Home services is the largest agency-served local vertical there is. Whole marketing agencies do nothing but HVAC, plumbing, and roofing. The queries are pure commercial intent: buyers search by service (AC repair, AC installation, mini split), by area (Honolulu, Oahu, Waikiki), and by urgency (emergency AC repair). The field has no dominant national brand, so the AI-citation surface is genuinely contested. It is also a clean test of the training-data hypothesis: if editorially thin local trades collapse on Claude, HVAC should collapse. It did.

Methodology summary

Same 7-AI-tool methodology applied across all NeverRanked teardowns:

5 web-searching AI tools: Perplexity, ChatGPT search, Gemini (grounded with live web search), Microsoft Copilot via Bing organic results, Google AI Overviews.
2 training-data AI tools: Claude (via the Anthropic API), Gemma (open-weight, run on a model-hosting provider so the model itself is independently inspectable).

18 questions a Honolulu AC buyer would actually ask AI, locked at hash 86b2b18e... so every run compares apples to apples. 3 repetitions per question per AI tool to separate signal from noise. 3 clean usable runs, 6,218 citations total, which clears the internal pattern-readiness rule of 3 usable runs. The 12-firm cohort was built from the citations themselves: a first run with no cohort registered, then a cohort-coverage scan that surfaced every AC company the AI tools cited 45 or more times across the runs. Directories (Angi, HomeAdvisor) and national retailers (Home Depot) were excluded by rule, so the cohort is AC companies only.
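One way to picture a hash-locked question set is sketched below. The actual hashing scheme is not documented in this teardown, so SHA-256 over a canonical JSON encoding is an assumption, the two sample questions are invented placeholders, and the function names are hypothetical.

```python
import hashlib
import json


def question_set_hash(questions: list[str]) -> str:
    """Digest of the question list; order and wording both affect the result."""
    # Assumption: canonical form is a compact JSON array of trimmed strings.
    canonical = json.dumps(
        [q.strip() for q in questions],
        ensure_ascii=False,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_lock(questions: list[str], locked_digest: str) -> bool:
    """True only if the questions are byte-for-byte the locked set."""
    return question_set_hash(questions) == locked_digest


# Placeholder questions, not the real 18-question set.
questions = [
    "Who does emergency AC repair in Honolulu?",
    "Best mini split installer on Oahu?",
]
locked = question_set_hash(questions)
edited = questions[:-1] + ["Best mini split installer in Waikiki?"]

print(verify_lock(questions, locked))  # True
print(verify_lock(edited, locked))     # False
```

Any change to a question, or to the order of questions, produces a different digest, so a run can refuse to start when the set no longer matches the locked value.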

Full documented methodology at /methodology/.
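The cohort-coverage scan described in the summary amounts to a count-and-threshold pass over cited domains. The sketch below assumes citations have already been reduced to a flat list of domains. The 45-citation threshold and the excluded categories come from the text, while the function name, the exact exclusion list, and the `.example` firm domains are illustrative assumptions.

```python
from collections import Counter

# Illustrative exclusion list for directories and national retailers.
EXCLUDED_DOMAINS = {"angi.com", "homeadvisor.com", "homedepot.com"}
MIN_CITATIONS = 45  # threshold stated in the methodology summary


def build_cohort(cited_domains, excluded=EXCLUDED_DOMAINS, min_citations=MIN_CITATIONS):
    """Return non-excluded domains cited at least min_citations times, most cited first."""
    counts = Counter(d.lower() for d in cited_domains)
    qualifying = [
        d for d, n in counts.items()
        if n >= min_citations and d not in excluded
    ]
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(qualifying, key=lambda d: (-counts[d], d))


# Invented sample data, not measured citations.
sample = (
    ["firm-a.example"] * 50
    + ["firm-b.example"] * 44
    + ["angi.com"] * 200
)
cohort = build_cohort(sample)
print(cohort)  # ['firm-a.example']
```

In the sample, the directory is dropped by rule despite its volume, and the firm with 44 citations falls just under the threshold.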

Source-type distribution (cohort-wide)

Across all 12 AC companies and all 7 AI tools, AI pulled answers from these source types:

Source type	% of mentions	Count
Independent web (third-party content)	53%	3,274
Competitor (AC-company-owned websites)	38%	2,352
Review directories (Angi and similar)	7%	411
Wikipedia	1%	67
Reddit, YouTube, social, forum (combined)	2%	114

The 38% AC-company-owned share is pooled across all 7 tools. The training-data engines pull it down on net, because Claude’s 2% outweighs Gemma’s 63%. On the five web-searching engines alone, own-site share is 40%, and across the four AI-answer engines (excluding Copilot, which sits at 0%) it is 47%. Review directories are only 7% of mentions, so this is a firm-heavy field: AI mostly chooses between the AC companies’ own sites and editorial third-party content, not aggregators.
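The percentages in the source-type table above follow directly from the counts. A short check, using only the published counts (the variable and function names are hypothetical):

```python
# Counts copied from the source-type table.
SOURCE_COUNTS = {
    "Independent web": 3274,
    "AC-company-owned websites": 2352,
    "Review directories": 411,
    "Wikipedia": 67,
    "Reddit, YouTube, social, forum": 114,
}


def percent_shares(counts):
    """Return the total and each source type's share rounded to a whole percent."""
    total = sum(counts.values())
    return total, {name: round(100 * n / total) for name, n in counts.items()}


total, shares = percent_shares(SOURCE_COUNTS)
print(total)   # 6218
print(shares)  # whole-percent share per source type
```

The counts sum to the 6,218 citations reported for the three runs. Because each share is rounded independently, the rounded percentages add up to 101 rather than 100.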

Per-AI-tool breakdown

AI tool	AC-company-owned share	Third-party share	Total mentions
ChatGPT search	66%	34%	935
Gemma (training data)	63%	38%	432
Gemini grounded	45%	53%	1,674
Google AI Overviews	45%	43%	302
Perplexity	38%	59%	1,477
Claude (training data)	2%	98%	606
Microsoft Copilot (Bing)	0%	91%	792

ChatGPT search cites AC companies’ own sites two-thirds of the time, the strongest web engine here. Gemma, a training-data engine, sits high at 63%. Then the bottom two rows are the blind spots: Claude at 2% and Microsoft Copilot at 0%. Note the split inside the training-data engines: Claude at 2% and Gemma at 63% is a 61-point gap on the exact same firms, the widest in the entire NeverRanked dataset.
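The pooled figures quoted in this teardown (38% across all 7 tools, 40% across the five web-searching tools, 47% across the four AI-answer engines) are mention-weighted averages, not simple averages of the per-tool percentages. The sketch below reproduces them from the per-tool table above. Because it starts from rounded percentages rather than raw counts, it is an approximation, and the names used are hypothetical.

```python
# (own-site share, total mentions) copied from the per-AI-tool table.
TOOLS = {
    "ChatGPT search": (0.66, 935),
    "Gemma": (0.63, 432),
    "Gemini grounded": (0.45, 1674),
    "Google AI Overviews": (0.45, 302),
    "Perplexity": (0.38, 1477),
    "Claude": (0.02, 606),
    "Microsoft Copilot": (0.00, 792),
}
TRAINING_DATA = {"Claude", "Gemma"}


def pooled_share(names):
    """Mention-weighted own-site share across the named tools."""
    own = sum(TOOLS[n][0] * TOOLS[n][1] for n in names)
    total = sum(TOOLS[n][1] for n in names)
    return own / total


all_tools = set(TOOLS)
web_five = all_tools - TRAINING_DATA
web_four = web_five - {"Microsoft Copilot"}

print(f"all 7 tools: {pooled_share(all_tools):.0%}")         # 38%
print(f"5 web-searching tools: {pooled_share(web_five):.0%}")  # 40%
print(f"4 AI-answer engines: {pooled_share(web_four):.0%}")    # 47%

# Gap between the two training-data engines, in percentage points.
gap = round(100 * (TOOLS["Gemma"][0] - TOOLS["Claude"][0]))
print(gap)  # 61
```

Weighting by mentions matters here: Gemini and Perplexity together account for roughly half of all mentions, so they move the pooled number far more than Google AI Overviews does.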

The Claude collapse, now confirmed across three industries

Claude answers from training data, not live search. At 2% own-site share for Honolulu AC companies, it has effectively no memory of these firms. We measured the identical collapse for Hawaii CPA firms (Claude 1%, geo-invariant across three markets) and Honolulu med spas (Claude 2%). Accounting, aesthetics, home services: three unrelated industries, the same near-zero result. The collapse is not a quirk of one category. It is what happens to local-service businesses that generate little national editorial coverage. They barely exist in Claude’s training data, so Claude barely cites them. And because we forecast this one before measuring it, the pattern is no longer three after-the-fact observations. It is a rule that made a risky prediction, and the data agreed. For an AC company, the practical read is that Claude is a category-wide blind spot no single firm closes on a short timescale. The addressable surface is the web-searching engines, where firms’ own sites already earn citations, plus Gemma (63%), which is plausibly reachable through sustained editorial presence over the training cycle.

The Microsoft Copilot first-mover opening

Microsoft Copilot answers using Bing’s organic search results. Across 792 mentions for Honolulu AC questions, zero went to any of the 12 firms’ own websites. The same cohort-wide gap shows up in every category we measure, because for these queries the current Bing top results are dominated by directories and editorial content, not individual firms’ sites. The opening exists for every AC company simultaneously: whichever Honolulu firm ranks first in Bing organic for the common queries (AC repair Honolulu, AC installation Honolulu, emergency AC repair) is positioned to own the Copilot answer while every competitor is still invisible there. We name the condition. Whether changing it closes the citation is a measurement question we keep answering month over month.

Top recurring AC companies (anonymized)

The 5 AC companies AI cited most often across the 18 questions and 7 tools, by total mentions across the 3 runs:

AC company (anonymized)	Total mentions	Runs cited in
Firm A	404	3/3
Firm B	268	3/3
Firm C	222	3/3
Firm D	206	3/3
Firm E	198	3/3

The top 5 appeared in all 3 measurement runs, a consistency signal rather than run-to-run noise. The remaining 7 firms in the cohort have meaningful but lower mention counts. The shape matches other local-service categories: a small top tier AI knows well, a longer tail it cites less often.

What this teardown does and does not prove

What it supports:

The 12-firm cohort is the set of Honolulu AC companies AI tools actually cite for buyer-shaped questions, surfaced from the citations rather than picked in advance.
The four AI-answer web engines cite AC-company-owned sites at 38% to 66%, with ChatGPT search highest.
The Claude 2% collapse was pre-registered and confirmed, and it matches the CPA and med spa results, supporting a category-level (editorially thin local-service) reading rather than a one-off.
The Microsoft Copilot 0% own-site share is cohort-wide and consistent across all 3 runs.

What it does not yet support:

That AI behavior on these questions stays stable over months. Models refresh training data and search indices on schedules outside our control. Re-measurement is the only honest answer.
That changing an AC company’s site or its third-party presence would cause AI to cite differently. We measured what AI cites. Causation requires pre-registered experiments against named firms with control for confounds. Different scope.
That the Microsoft Copilot first-mover opening is actually closable. Competition for Bing organic rankings is a separate problem that this measurement does not cover.

Why this is anonymized

None of the 12 AC companies in this cohort are paying NeverRanked customers. The non-customer anonymization rule applies: counts, distributions, and per-AI-tool numbers are public. Individual firm names are not. The pattern is what is informative on a public surface. An AC company that becomes a customer gets a 1:1 deliverable that names every firm in the cohort, names the queries it is missing on, and ranks the closable conditions. That deliverable is private to the customer.


Pre-registration: the prediction (Claude under 5%, web engines 35% or higher) was committed to a timestamped private record on 2026-06-09, before any HVAC run existed. The commit predates the first measurement. Falsification criteria and the recorded result are committed next to the forecast in that record.

Measurement window: 3 clean usable runs finalized 2026-06-11, 6,218 citations. Figures generated from the aggregate tooling (teardown-data.mjs) and drift-monitored. The internal pattern-readiness rule of 3 usable runs is cleared. Refresh cadence is monthly or on customer request.

Substantiation: question set locked by hash 86b2b18e..., documented method at /methodology/, dated runs on the /claims/ ledger, named AI tools on named dates. Gemma is open-weight, so the model itself is independently inspectable.

Anonymization: the 12-firm cohort is kept anonymized at the firm level per the non-customer rule. Counts, distributions, and category-level source surfaces are public. Individual firm names are not.