AI crawler access · robots.txt · llms.txt

Can AI actually read your site? Find out in 2 seconds.

Answer engines like ChatGPT Search, Perplexity and Google AI Overviews fetch your pages to cite them, unless your robots.txt quietly blocks them. aicrawlcheck tests your robots.txt against the major AI crawlers, validates your llms.txt and structured data, and shows what to fix.


What it checks

AI crawler access

Allow/deny for GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, PerplexityBot, Google-Extended, CCBot, Bytespider and more — with a correct robots.txt matcher.
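As an illustration of what a careful matcher has to get right, here is a minimal Python sketch in the spirit of RFC 9309: pick the group for the crawler's token (falling back to `*`), then let the longest matching rule win, with Allow winning ties. The function names `parse_groups` and `is_allowed` are hypothetical, and this is not aicrawlcheck's actual implementation; it skips details such as percent-encoding normalization.

```python
import re


def parse_groups(robots_txt):
    """Return {lowercased user-agent token: [(kind, path), ...]}."""
    groups = {}
    current_agents = []
    last_was_agent = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not last_was_agent:
                current_agents = []  # a new group starts here
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
            last_was_agent = True
        elif field in ("allow", "disallow"):
            last_was_agent = False
            for agent in current_agents:
                groups[agent].append((field, value))
    return groups


def _pattern_to_regex(path):
    # '*' matches any run of characters; a trailing '$' anchors the end.
    anchored = path.endswith("$")
    body = path[:-1] if anchored else path
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(robots_txt, user_agent, url_path):
    groups = parse_groups(robots_txt)
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    best_len, allowed = -1, True  # no matching rule means allowed
    for kind, path in rules:
        if not path:  # an empty rule value matches nothing
            continue
        if _pattern_to_regex(path).match(url_path):
            if len(path) > best_len or (len(path) == best_len and kind == "allow"):
                best_len, allowed = len(path), kind == "allow"
    return allowed


robots = """\
User-agent: *
Disallow: /

User-agent: GPTBot
Allow: /blog/
Disallow: /
"""
print(is_allowed(robots, "GPTBot", "/blog/post"))  # True: Allow /blog/ is the longest match
print(is_allowed(robots, "ClaudeBot", "/blog/post"))  # False: falls back to the * group
```

The second call shows the failure mode this page warns about: a blanket rule in the `*` group blocks every crawler that has no group of its own.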

llms.txt

Validates a /llms.txt (title, summary, link sections) — the emerging map that helps AI engines find your key pages.

On-page AI readiness

JSON-LD structured data, FAQ schema, a single H1, meta description, server-rendered content depth, and a freshness signal.
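A rough Python sketch of three of these signals (single H1, meta description, JSON-LD and FAQ schema) using only the standard library. The class `OnPageSignals` and function `audit` are hypothetical names for illustration, not the tool's code, and content depth and freshness are left out of this sketch.

```python
import json
from html.parser import HTMLParser


class OnPageSignals(HTMLParser):
    """Collects H1 count, meta description and JSON-LD @type values."""

    def __init__(self):
        super().__init__()
        self.h1_count = 0
        self.meta_description = None
        self.json_ld_types = []
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h1":
            self.h1_count += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = (attrs.get("content") or "").strip()
        elif tag == "script" and (attrs.get("type") or "").lower() == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                doc = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # malformed JSON-LD is ignored in this sketch
            for item in doc if isinstance(doc, list) else [doc]:
                if isinstance(item, dict) and "@type" in item:
                    self.json_ld_types.append(item["@type"])


def audit(html):
    parser = OnPageSignals()
    parser.feed(html)
    parser.close()
    return {
        "single_h1": parser.h1_count == 1,
        "has_meta_description": bool(parser.meta_description),
        "has_json_ld": bool(parser.json_ld_types),
        "has_faq_schema": "FAQPage" in parser.json_ld_types,
    }


sample = """<html><head>
<meta name="description" content="Check AI crawler access.">
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": []}</script>
</head><body><h1>Title</h1></body></html>"""
print(audit(sample))
```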

Why it matters

The most common way sites vanish from AI answers is a robots.txt that blocks the engines’ user-agents — often via a copy-pasted “block AI bots” snippet. aicrawlcheck catches that, with an open, published methodology and no black-box score.

Frequently asked questions

Is aicrawlcheck free?

Yes — free, no account, no sign-up. Enter a URL and get an instant report. We never store the URLs you check.

What exactly does it check?

Whether the major AI crawlers (GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, PerplexityBot, Google-Extended, CCBot, Bytespider and more) are allowed or blocked in your robots.txt; whether you publish a valid llms.txt; and on-page AI-readiness signals — JSON-LD structured data, FAQ schema, a single H1, meta description, content depth and freshness.

Why does AI-crawler access matter?

Answer engines like ChatGPT Search, Perplexity and Google AI Overviews fetch your pages to cite them. If your robots.txt blocks their user-agents (often by accident, via a blanket Disallow or an over-eager “block AI bots” snippet), you silently disappear from AI answers. This tool catches that.

Should I block AI crawlers or allow them?

It depends on your goal. To be cited in AI answers, allow the answer-engine fetchers (OAI-SearchBot, PerplexityBot, ChatGPT-User, Google-Extended). Training-only scrapers (CCBot, Bytespider, Applebot-Extended) are a separate policy choice — blocking them is legitimate and the tool treats it as a choice, not an error.
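One way to express that split is sketched below in Python: a robots.txt policy held as a string and checked with the standard library's `urllib.robotparser`. The policy text, the `access_table` helper and the example.com URL are illustrative assumptions, not a recommendation from the tool. The standard-library parser is used for brevity and its matching rules differ in places from RFC 9309.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: allow answer-engine fetchers, block training-only scrapers.
POLICY = """\
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: PerplexityBot
User-agent: Google-Extended
Allow: /

User-agent: CCBot
User-agent: Bytespider
User-agent: Applebot-Extended
Disallow: /
"""


def access_table(robots_txt, agents, url="https://example.com/"):
    """Map each user-agent token to True (allowed) or False (blocked)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


agents = ["OAI-SearchBot", "PerplexityBot", "CCBot", "Bytespider", "ClaudeBot"]
for agent, allowed in access_table(POLICY, agents).items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

ClaudeBot is not named in this policy and there is no `*` group, so it is allowed by default.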

What is llms.txt?

An emerging convention (llmstxt.org): a Markdown file at /llms.txt that gives AI engines a curated, clean map of your most important pages. aicrawlcheck checks it exists and follows the format (a title, a summary blockquote, and sections of links).
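The three format elements named above can be checked with a few lines of Python. This is a simplified sketch: `validate_llms_txt` is a hypothetical name, the exact rules aicrawlcheck applies are the ones in its Methodology, and this version simply treats the title, the blockquote and at least one link section as expected.

```python
import re

# A Markdown list item that starts with a link: "- [name](url)".
LINK = re.compile(r"^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)")


def validate_llms_txt(text):
    """Return a list of human-readable problems (empty if none found)."""
    lines = [line.rstrip() for line in text.splitlines()]
    non_blank = [line for line in lines if line.strip()]
    problems = []

    if not non_blank or not non_blank[0].startswith("# "):
        problems.append("first non-blank line should be an H1 title ('# Name')")
    if not any(line.startswith(">") for line in non_blank):
        problems.append("no summary blockquote ('> ...') found")

    sections = {}  # section heading -> number of links found under it
    current = None
    for line in lines:
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = 0
        elif current is not None and LINK.match(line):
            sections[current] += 1

    if not sections:
        problems.append("no '## ' link sections found")
    for name, link_count in sections.items():
        if link_count == 0:
            problems.append(f"section '{name}' contains no Markdown links")
    return problems


good = """\
# Example Site

> Short summary of what the site offers.

## Docs

- [Getting started](https://example.com/start.md): first steps
- [API](https://example.com/api.md)
"""
print(validate_llms_txt(good))  # []
```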

Is the score a real measurement?

There is no black-box “0–100 AI visibility” score here. We report concrete pass/warning/issue facts for each check, every rule is published in the Methodology section, and you get a downloadable result. We do not claim to predict whether a specific engine will cite you; that requires live monitoring, which is out of scope.
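To make "facts, not a score" concrete, here is a hypothetical Python shape for such a result. The `Finding` fields, the `build_report` function and the JSON layout are assumptions for illustration; the actual download format may differ.

```python
import json
from dataclasses import dataclass, asdict

STATUSES = ("pass", "warning", "issue")


@dataclass
class Finding:
    check: str   # what was examined, e.g. "robots.txt: GPTBot"
    status: str  # one of STATUSES
    detail: str  # the observed fact, stated plainly


def build_report(url, findings):
    """Serialize findings to JSON with per-status counts and no aggregate score."""
    for finding in findings:
        if finding.status not in STATUSES:
            raise ValueError(f"unknown status: {finding.status!r}")
    summary = {s: sum(f.status == s for f in findings) for s in STATUSES}
    return json.dumps(
        {"url": url, "summary": summary, "findings": [asdict(f) for f in findings]},
        indent=2,
    )


print(build_report("https://example.com/", [
    Finding("robots.txt: GPTBot", "issue", "blocked by 'Disallow: /' in the * group"),
    Finding("llms.txt", "pass", "found with title, summary and one link section"),
]))
```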

Does it render JavaScript?

No. It reads the server-rendered HTML, as most AI crawlers do, since they generally do not execute JavaScript. If your key content only appears after client-side rendering, the tool will warn you, because that content is likely invisible to AI crawlers too.
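A simple way to approximate this check is to count the words visible in the raw HTML without running any scripts. The Python sketch below does that with the standard library. The names `VisibleText` and `looks_client_rendered` are hypothetical, and the 50-word threshold is an arbitrary assumption rather than the tool's published rule.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects words outside <script>, <style> and <template> elements."""

    SKIP = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def server_rendered_word_count(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return len(parser.words)


def looks_client_rendered(html, min_words=50):
    # min_words is an assumed threshold for this sketch.
    return server_rendered_word_count(html) < min_words


spa_shell = '<html><body><div id="root"></div><script>render("app")</script></body></html>'
print(looks_client_rendered(spa_shell))  # True: no text outside the script
```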

Is my data safe? Any SSRF concerns?

The audit runs on Cloudflare and only fetches public http(s) URLs; requests to private, loopback, link-local and cloud-metadata addresses are blocked, redirects are re-validated, and responses are size- and time-capped. We keep no logs of what you check.
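The address filtering described above can be sketched in Python with the standard `ipaddress` module. This is an illustration under assumptions, not the service's code: `is_blocked_ip` and `is_url_fetchable` are hypothetical names, and a production guard would also connect to the validated IP to avoid DNS rebinding, run every redirect target through the same check, and enforce the size and time caps.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_blocked_ip(address):
    """True for private, loopback, link-local and other non-public addresses."""
    ip = ipaddress.ip_address(address.split("%")[0])  # drop any IPv6 zone id
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped  # treat ::ffff:a.b.c.d like a.b.c.d
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local  # includes 169.254.169.254 (cloud metadata)
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def is_url_fetchable(url, resolve=socket.getaddrinfo):
    """Allow only http(s) URLs whose every resolved address is public."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        port = parts.port or (443 if parts.scheme == "https" else 80)
        infos = resolve(parts.hostname, port, proto=socket.IPPROTO_TCP)
    except (OSError, ValueError):
        return False  # unresolvable host or malformed port
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and not any(is_blocked_ip(a) for a in addresses)


print(is_blocked_ip("169.254.169.254"))  # True
print(is_blocked_ip("127.0.0.1"))  # True
```

The `resolve` parameter exists so the DNS lookup can be replaced in tests; by default it calls `socket.getaddrinfo`.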
