Published 2026-05-21 · Honolulu

We killed our own thesis.

NeverRanked sold a product that did not work. This is the full story, written by the person who sold it.

The thesis

Until May 2026, NeverRanked sold a small JavaScript snippet. You pasted it once on your website. We claimed it made the structured-data signals on your pages legible to AI answer engines (ChatGPT, Perplexity, Google AI Overviews, Claude) and that being legible was the thing that earned citations in their answers.

It was a clean pitch. It had a clear mechanism. It had a customer who could be referenced by name. It had paying agencies considering reseller tiers. The framing was: SEO is the wrong dial to turn for AI search, AEO is the right one, and we ship the AEO infrastructure faster than any agency can build it in-house.

The thesis was wrong. We can prove it, because we tested it.

The kill test

Before testing, we wrote down what would count as a pass and what would count as a fail. This matters. The single most common way founders fool themselves is moving the goalposts after the data lands. So the criterion was locked, in writing, in a file we hash-stamped:

Hypothesis: shipping our snippet on neverranked.com drives
            AI citation share for AEO-category queries on
            Perplexity within seven days.

Test: 18 hash-locked queries, three repetitions each, on
      the Perplexity API. Domain: neverranked.com.
      Snippet deployed and verified rendering.

Pass:   >= 30% of slots cite neverranked.com.
Fail:   < 10% of slots cite neverranked.com.
Gray:   10-30% triggers a follow-up before any decision.

The pre-registration file was hash-stamped before the test ran. Hash: 87bd4abefb331bc9. That hash is in the file, and the runner printed the same hash on the day the test ran. The file is available on request, so you can read the inputs without trusting the outputs.
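For readers unfamiliar with hash-stamping, here is a minimal sketch of the idea in Python. It is an illustration, not the actual runner: the choice of SHA-256, the 16-character truncation, and the names stamp and verify are all assumptions, and the sample text only stands in for the real pre-registration file.

```python
import hashlib


def stamp(text: str, length: int = 16) -> str:
    """Return the first `length` hex characters of the SHA-256 digest.

    Assumption: SHA-256 truncated to 16 characters. The post does not say
    which algorithm produced the published hash.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


def verify(text: str, expected: str) -> bool:
    """Check that the text still matches a previously recorded stamp."""
    return stamp(text, len(expected)) == expected


# Stand-in for the pre-registration file, not its real contents.
prereg = "Pass: >= 30% of slots\nFail: < 10% of slots\nGray: 10-30%\n"
locked = stamp(prereg)

# Recomputing over the unchanged text reproduces the stamp.
print(verify(prereg, locked))

# Editing the criterion after the fact changes the digest.
moved_goalposts = prereg.replace(">= 30%", ">= 0%")
print(verify(moved_goalposts, locked))
```

The point of the design is that anyone holding the file can recompute the stamp. If a threshold is edited after the data lands, the recomputed value no longer matches the one recorded before the run.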

We ran the test on 2026-05-19. The result was zero.

Not "below the pass bar." Not "in the gray zone." Not "weak signal that needs more data." Zero of 409 citation slots cited neverranked.com. The snippet was on the page. The crawlers had been to the page. The structured data was rendered in the DOM. None of it produced a single citation.
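The decision rule is simple enough to write down as code. The sketch below applies the thresholds from the pre-registration block to the observed counts. The function name classify is hypothetical, and treating exactly 30% as a pass is an assumption based on the ">= 30%" wording.

```python
def classify(cited: int, total: int, pass_pct: int = 30, fail_pct: int = 10) -> str:
    """Map a citation count onto the pre-registered pass / fail / gray bands.

    Integer arithmetic avoids floating-point edge cases at the boundaries.
    Assumption: a share of exactly 30% counts as a pass (">= 30%").
    """
    if total <= 0:
        raise ValueError("no citation slots captured; the run is unusable")
    if cited * 100 >= pass_pct * total:
        return "pass"
    if cited * 100 < fail_pct * total:
        return "fail"
    return "gray"


# The observed result: zero of 409 slots.
print(classify(0, 409))  # fail
```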

Why it failed

The failure had a single technical cause we should have tested first. The LLM crawlers that AI engines use to gather pages for answer-grounding (GPTBot, ClaudeBot, PerplexityBot) do not execute JavaScript. They fetch the raw HTML. Our snippet ran in the browser, after the page loaded, to inject JSON-LD structured data into the DOM. That injection was invisible to every crawler that mattered. The structured data we shipped, the central mechanism the entire product was built on, was present only for human visitors and search engines that already had it indexed via Google.
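The mechanism can be checked without a browser. The sketch below uses only Python's standard library to parse raw HTML the way a non-rendering fetcher sees it, and looks for JSON-LD script tags. The two sample pages and the names JsonLdFinder and find_jsonld are hypothetical. This is not the NeverRanked snippet or any crawler's real code.

```python
import json
from html.parser import HTMLParser


class JsonLdFinder(HTMLParser):
    """Collect the bodies of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False


def find_jsonld(raw_html: str) -> list:
    """Return the parsed JSON-LD objects present in the raw HTML."""
    finder = JsonLdFinder()
    finder.feed(raw_html)
    return [json.loads(block) for block in finder.blocks if block.strip()]


# Hypothetical page: structured data is written into the HTML itself.
STATIC_PAGE = """<html><head>
<script type="application/ld+json">{"@type": "Organization", "name": "Example"}</script>
</head><body><h1>Example</h1></body></html>"""

# Hypothetical page: a script would inject the same data at runtime.
INJECTED_PAGE = """<html><head>
<script>
  const s = document.createElement("script");
  s.type = "application/ld+json";
  s.textContent = JSON.stringify({"@type": "Organization", "name": "Example"});
  document.head.appendChild(s);
</script>
</head><body><h1>Example</h1></body></html>"""

print(len(find_jsonld(STATIC_PAGE)))    # 1
print(len(find_jsonld(INJECTED_PAGE)))  # 0
```

The second page would show the structured data in a browser's DOM, but a fetcher that never runs the script sees only the JavaScript source. A check of this kind against the raw response is the test that would have caught the problem early.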

If we had tested this single assumption six months earlier, the product would never have been built. The discipline failure was not technical. It was that we let conviction outrun the test.

What the product was actually doing

The named customer reference we had been using publicly was Hawaii Theatre Center. The story we told about HTC was that their NeverRanked AEO score moved from 45 to 95 in ten days after we deployed the snippet, and that Perplexity cited them on 14 of 19 tracked queries the same week. Both numbers were real measurements. The causal link between them and our work was not.

The 45-to-95 score was a measurement of structured-data presence in the DOM, which the snippet did move. The 14-of-19 Perplexity citation result was authority-driven, pre-existing behavior that we observed but did not cause. Both are useful diagnostics. Neither one was evidence of the product working the way we claimed.

The diagnostic work for HTC was real, valuable, and the customer kept it. We surfaced an expired Charity Navigator profile that had not been updated since 2023, a BBB profile last touched in 1999, a missing presence on Bing Business Profile, and misconfigured authority backlinks, and we collaborated with their team on meta description rewrites. None of that required the snippet. All of it is the kind of gap a standard SEO scan misses and a forensic measurement engagement surfaces.

What we did when the test came back zero

Stopped selling the product within 24 hours.
Retired the public surfaces that promoted it: pitch pages, blog posts, case studies, entity profiles, state-of-AEO reports, schema standards, and the agency reseller program. All were replaced with honest placeholders. The site no longer contradicts the retraction.
Updated llms.txt so AI engines crawling NeverRanked get the current state, not the prior framing.
Locked the production code paths that would have generated content built on the dead thesis (the cold outreach generator, the preview generator, the audit deliverable) until each one could be rewritten against substantiated claims.
Rebuilt the company around the measurement layer that does work. Measurement is what the snippet was supposed to inform. The snippet was the wrong instrument. Measurement, done honestly, is the right instrument.

What NeverRanked is now

A research engagement that measures what AI answer engines actually cite for a category. The deliverable is a forensic memo plus a prepped punch list. The customer's team or their agency executes the work. We do not. That separation is structural.

We watch seven AI surfaces in repeated runs against a frozen baseline. Five citation-grade engines that search the live web (Perplexity, ChatGPT search, Gemini grounded, Microsoft Copilot via Bing, Google AI Overviews). Two model-knowledge engines that answer from training data alone (Claude, Gemma). The two layers measure different failure modes a brand can have.

Pricing is $4,500 kickoff per category, one time. $1,500 per month per category, ongoing. Per category, not per client. There is no SaaS dashboard.

Why publish this

The honest reason is that our buyers can read. Anyone evaluating NeverRanked seriously can read the documented method at /methodology/, check the hash-locked question sets, and review the dated runs on the claims ledger. We can either own the retraction in our own voice, or wait for someone else to surface it. Owning it is better business and more honest.

The harder reason is that the category is full of vendors making the same claim we just retracted. Every AEO tool on the market promises some version of "ship our structured data and AI cites you." Some of them may eventually prove it. Most of them have not run the test. We do not believe a vendor in this space should be trusted on the citation-causation claim without a pre-registered, hash-locked, reproducible kill test against their own domain. We ran ours. We failed. We stopped selling. The honest move now is to say so.

What we changed about how we work

The discipline failure that produced the snippet product was: build first, never validate the bet. The reverse pattern is now structurally enforced. Three things in particular:

1. Pre-registration before measurement

Any claim that could become a product gets a pre-registration document first. Hypothesis, criterion, pass/fail thresholds, threats to validity. The file is written and hash-stamped before the test runs. Moving the goalposts is impossible because the goalposts are in the file.

2. The grader is fail-closed

Any prospect-facing artifact (cold email, preview page, research memo, dashboard report) is graded by a separate model against a canonical fact list before it ships. If an artifact mentions something we cannot substantiate, the grader rejects it and the send is held. The canonical fact list lives in the codebase. Anyone can read it.
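"Fail-closed" means the default answer is hold, and only an explicit clean result releases the send. A minimal sketch of that gate follows, under loud assumptions: the names Verdict, gate, and toy_grader are hypothetical, the toy grader does exact line matching where the real one uses a separate model, and the facts listed are taken from this post only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Verdict:
    ship: bool
    reason: str


def gate(artifact: str, grader: Callable[[str], list]) -> Verdict:
    """Release an artifact only when the grader returns no unsupported claims.

    Any grader error holds the send, which is what makes the gate fail-closed.
    """
    try:
        unsupported = grader(artifact)
    except Exception as exc:
        return Verdict(False, f"held: grader error ({exc})")
    if unsupported:
        return Verdict(False, f"held: unsubstantiated claims {unsupported}")
    return Verdict(True, "all claims matched the canonical fact list")


def toy_grader(facts: frozenset) -> Callable[[str], list]:
    """Stand-in grader (assumption): one claim per line, exact match only."""
    def grade(artifact: str) -> list:
        claims = [line.strip() for line in artifact.splitlines() if line.strip()]
        return [claim for claim in claims if claim not in facts]
    return grade


def broken_grader(artifact: str) -> list:
    """Simulates the grading model being unreachable."""
    raise TimeoutError("model unavailable")


FACTS = frozenset({
    "Pricing is $4,500 kickoff per category",
    "The kill test ran on 2026-05-19",
})
grade = toy_grader(FACTS)

print(gate("Pricing is $4,500 kickoff per category", grade).ship)  # True
print(gate("The snippet earns AI citations", grade).ship)          # False
print(gate("Pricing is $4,500 kickoff per category", broken_grader).ship)  # False
```

The third call is the important one. A fail-open gate would ship when the grader is down. A fail-closed gate holds even an accurate artifact until it can be checked.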

3. Pattern-readiness has a numeric bar

The cross-category dataset our engagements feed is governed by the internal pattern-readiness rule: claim a pattern in a category only when three or more usable runs in that category agree. A run with zero successful API captures does not count, even if its log is the right length. The discipline is enforced by the catalog tool, not by us remembering.
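The rule can be stated as a small function. This is a sketch under assumptions, not the catalog tool itself: the Run fields, the function name pattern_ready, and the reading of "agree" as "at least three usable runs report the same finding" are all hypothetical, and the sample runs are made up.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Run:
    category: str
    successful_captures: int  # API captures that actually returned data
    finding: str              # label for the pattern the run observed


def pattern_ready(runs: list, category: str, min_agreeing: int = 3) -> Optional[str]:
    """Return the agreed finding for a category, or None if the bar is not met.

    A run with zero successful captures is not usable and never counts,
    regardless of how long its log is.
    """
    usable = [
        r for r in runs
        if r.category == category and r.successful_captures > 0
    ]
    if not usable:
        return None
    finding, count = Counter(r.finding for r in usable).most_common(1)[0]
    return finding if count >= min_agreeing else None


# Made-up runs for illustration only.
runs = [
    Run("category-a", 40, "finding-x"),
    Run("category-a", 38, "finding-x"),
    Run("category-a", 41, "finding-x"),
    Run("category-b", 39, "finding-y"),
    Run("category-b", 42, "finding-y"),
    Run("category-b", 0, "finding-y"),  # zero captures: does not count
]

print(pattern_ready(runs, "category-a"))  # finding-x
print(pattern_ready(runs, "category-b"))  # None
```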

The honest current state

Zero current paying customers under the new product. The demand for honest research-only measurement is being tested, not assumed. Validation requires actual paid commitments, not warm replies.
One named customer reference (Hawaii Theatre Center), used only as a capability example: the kinds of gaps a forensic measurement engagement surfaces that a standard SEO scan misses. The 45-to-95 score lift and 14-of-19 Perplexity citation claims are retracted.
The measurement infrastructure is real and live. As of this publication, the aggregate covers ten dated measurements: nine Hawaii business categories plus a cross-geo Austin CPA run, each measured across all seven AI surfaces, with three usable runs per category clearing the internal pattern-readiness rule. Public teardowns at /teardowns/ document the findings, and the dated runs are listed on the claims ledger.
The cold outreach pipeline is paused until the rewritten generator and grader are deployed and verified against the new positioning.
Pricing is locked, not provisional. $4,500 kickoff, $1,500 a month, per category.

If you want to dig in

The documented method: /methodology/, including how the hash-locked question sets are built and run.
The dated runs and the hashes that lock each question set: the claims ledger.
The published teardowns: /teardowns/.
The pre-registration file, the canonical fact list, and the per-run logs are available on request.
To talk: Lance@hi.neverranked.com.
