# pseudobash — full LLM-readable corpus

This file concatenates the canonical, citeable pages of pseudobash in a single plain-text document so language models can ingest the whole thing in one fetch. URLs cited below are relative; resolve them against the request host.

================================================================================
SOURCE: /
TITLE: pseudobash — Be the source ChatGPT cites
================================================================================

pseudobash is the retrieval surface AI answer engines cite.

The problem: AI Overviews answer the question; your site never gets the visit. Pew Research (2025) found a ~47% relative drop in click-throughs when Google shows an AI Overview. Many modern websites are JavaScript-heavy SPAs that AI crawlers cannot read.

The fix: expose a deterministic, plain-text interface — `/shell` — that answer engines can query the way an agent queries a filesystem. Every response carries a `Link: <url>; rel="canonical"` header, a `Last-Modified` timestamp, and a pre-formatted `X-Pseudobash-Cite` header so the engine can attribute the answer back to your URL. Drop one tag, get cited.

================================================================================
SOURCE: /shell.md
TITLE: shell.md — pseudobash AI retrieval contract
================================================================================

(See the full contract at /shell.md. Summary follows.)

- Endpoint: POST /shell with a plain-text command body.
- Discovery: GET /shell.md for the contract.
- Supported commands: cd, ls, cat, cite, grep, find, head, tail, wc, tree, pwd, man, --help.
- Pipelines: single `|` operator supported.
- Sessions: server returns `X-Session-Id`; echo it back to keep cwd state.
- Throttling: when no session id is present, throttling keys on sha256(ip):sha256(user-agent). Standard `X-RateLimit-*` headers and `Retry-After` on 429.
- Citation: every successful response includes `Link`, `Last-Modified`, and `X-Pseudobash-Cite: title="..."; url="..."; last-modified="..."`.

Allowlisted crawlers (paste these into your robots.txt): GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-User, PerplexityBot, Perplexity-User, Google-Extended, Bytespider, Applebot-Extended.

================================================================================
SOURCE: /audit
TITLE: Is AI citing you, or your competitor?
================================================================================

Free scan: paste a domain (or visit /audit/ directly) and pseudobash fetches that homepage four times — once as each major AI answer-engine crawler — plus reads /robots.txt, /shell.md, and /llms.txt.

Output:
- A 0–100 "how citeable you are to AI" score.
- Per-crawler: HTTP status, content length, and whether the page needs JS to render (in which case the bot sees an empty shell).
- Retrieval contract: which of /shell.md, /llms.txt, /robots.txt are present.
- robots.txt allowlist analysis per crawler.

Results are server-side cached for 45 minutes per host. Anonymous scans are heavily rate-limited. Set `Authorization: Bearer <token>` to bypass.

================================================================================
SOURCE: /llms.txt
TITLE: llms.txt index
================================================================================

See /llms.txt for the structured table of contents this corpus expands on.
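The anonymous throttling key described in the /shell.md contract above (sha256(ip):sha256(user-agent)) can be sketched in a few lines of shell. This is a minimal sketch, not the server's implementation; the IP and user-agent values are illustrative, and `sha256sum` (GNU coreutils) is assumed to be available.

```shell
# Build the anonymous throttling key: sha256(ip):sha256(user-agent).
throttle_key() {
  ip_hash=$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)
  ua_hash=$(printf '%s' "$2" | sha256sum | cut -d' ' -f1)
  printf '%s:%s\n' "$ip_hash" "$ua_hash"
}

# Illustrative inputs; any client that sends no X-Session-Id is keyed this way.
throttle_key "203.0.113.7" "Mozilla/5.0 (compatible; ExampleBot/1.0)"
```

Hashing both parts keeps the rate-limit bucket stable per client without the server storing raw IPs or user-agent strings.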
================================================================================
SOURCE: /blog/show-up-in-chatgpt-results
TITLE: How to show up in ChatGPT results (2026 guide)
LAST-MODIFIED: 2026-04-18
================================================================================

Short answer: ChatGPT cites pages that are (1) crawlable by OAI-SearchBot and GPTBot, (2) answer the user's exact question in the first paragraph, and (3) expose a clean, citation-ready surface — a canonical URL, a Last-Modified date, and ideally a hand-curated llms.txt or markdown endpoint.

Which crawlers does ChatGPT use?
- OAI-SearchBot — indexes pages for citation inside ChatGPT's search/browse feature. The one you most need to allow.
- ChatGPT-User — fetches a single page on demand when a ChatGPT user clicks a citation or asks the model to browse live.
- GPTBot — crawls pages that may improve future OpenAI models. Affects training, not retrieval.

What does ChatGPT actually fetch? The crawlers behave like a fast, headless reader: raw HTML, robots.txt, sitemap.xml, llms.txt, ai.txt. They do not run most JavaScript. If your homepage is client-rendered, the bot sees an empty shell.

Quick test: curl -A "OAI-SearchBot/1.0" https://your-site.com/ | wc -c. A few hundred bytes means ChatGPT cannot read the page.

The 4 things to ship this week:
1. Allow the bots in robots.txt with explicit User-agent blocks for OAI-SearchBot, ChatGPT-User, and (optionally) GPTBot.
2. Lead with the answer in the first 200 words.
3. Make the page easy to attribute (canonical URL, Last-Modified, JSON-LD Article + FAQPage).
4. Publish an llms.txt — a curated map of your most citeable URLs.

Verify by filtering analytics for referrer hosts chatgpt.com and chat.openai.com, watching access logs for ChatGPT-User and OAI-SearchBot hits, and manually querying ChatGPT for your top 10 questions.
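The quick test above can be wrapped in a small helper that turns the byte count into a verdict. A sketch under stated assumptions: the 1,024-byte threshold is illustrative, not an official cutoff, and the helper name is hypothetical.

```shell
# Classify the byte count returned by:
#   curl -sA "OAI-SearchBot/1.0" https://your-site.com/ | wc -c
# A tiny response usually means a client-rendered shell with no readable content.
classify_fetch() {
  if [ "$1" -lt 1024 ]; then   # threshold is an illustrative assumption
    echo "likely-empty"
  else
    echo "readable"
  fi
}

classify_fetch 300     # a few hundred bytes: the bot sees an empty shell
classify_fetch 48210   # tens of kilobytes: the bot sees real content
```

Run the curl once per crawler user-agent string to see whether a WAF treats them differently.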
Common reasons for invisibility: JS-only rendering, Cloudflare/WAF blocking unknown UAs, the answer buried below 1000 words, no stable URL, not present in Bing's index.

================================================================================
SOURCE: /blog/traffic-from-ai-agents
TITLE: How to get traffic from AI agents (ChatGPT, Perplexity, Claude, Gemini)
LAST-MODIFIED: 2026-04-18
================================================================================

Short answer: Treat each AI agent as its own channel. Allow its crawler in robots.txt, expose a machine-friendly surface, and instrument referrals by the agent's referrer host.

AI agents send traffic in two flows:
- Citation clicks (humans): a user reads an answer and clicks a citation. Your analytics see the referrer host (chatgpt.com, perplexity.ai, claude.ai, gemini.google.com, copilot.microsoft.com).
- Agentic fetches (bots): the agent itself fetches pages, identified by user-agent strings like ChatGPT-User, Claude-User, Perplexity-User.

Per-agent reference table (citation referrer host / indexing crawler / on-demand fetcher / recommended utm_source):
- ChatGPT: chatgpt.com, chat.openai.com / OAI-SearchBot / ChatGPT-User / utm_source=chatgpt
- Perplexity: perplexity.ai / PerplexityBot / Perplexity-User / utm_source=perplexity
- Claude: claude.ai / ClaudeBot / Claude-User / utm_source=claude
- Gemini: gemini.google.com / Google-Extended / (uses Googlebot) / utm_source=gemini
- Copilot: copilot.microsoft.com / Bingbot (Bing index) / (via Bing) / utm_source=copilot

Dashboard recipe: create one analytics segment "AI citation traffic" matching all five referrer hosts, plus one segment per agent, plus a server-log dashboard counting hits per AI user-agent grouped by URL.

Anecdotal conversion: AI citation traffic tends to convert higher per visit than generic Google organic — visitors arrive pre-qualified — but at much lower volume, roughly 1–10% of Google organic by sessions in 2026 for most sites.
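The per-agent reference table above can be turned into a small lookup, e.g. for tagging redirects or enriching server logs. A minimal sketch; the function name is hypothetical.

```shell
# Map an AI answer-engine referrer host to the recommended utm_source tag.
ai_utm_source() {
  case "$1" in
    chatgpt.com|chat.openai.com) echo "utm_source=chatgpt" ;;
    perplexity.ai)               echo "utm_source=perplexity" ;;
    claude.ai)                   echo "utm_source=claude" ;;
    gemini.google.com)           echo "utm_source=gemini" ;;
    copilot.microsoft.com)       echo "utm_source=copilot" ;;
    *)                           echo "" ;;  # not an AI citation referrer
  esac
}

ai_utm_source perplexity.ai
```

Feeding every request's Referer host through a mapping like this is one way to build the "AI citation traffic" segment server-side instead of in the analytics UI.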
================================================================================
SOURCE: /blog/aeo-vs-seo
TITLE: AEO vs SEO: what Answer Engine Optimization actually changes
LAST-MODIFIED: 2026-04-18
================================================================================

Short answer: AEO optimizes for being the source an answer engine quotes; SEO optimizes for the SERP listing a user clicks. SEO success is a click; AEO success is a citation. Same technical fundamentals, very different content shape. The unit of AEO work is the passage, not the page.

Key differences:
- Goal: SEO ranks the page; AEO gets the page cited inside the answer.
- Unit: SEO = page; AEO = paragraph/list/table.
- Success metric: SEO = click; AEO = citation, then optionally click.
- Title role: the SEO title is the SERP headline; the AEO title is disambiguation while the H1 + first paragraph do the work.
- Backlinks: strong SEO signal; weaker for AEO (freshness, attribution, clarity matter more).
- Word count: SEO often rewards depth; AEO rewards a short, direct answer in the first 200 words.

The passage-level mindset: every section should answer one question completely in 60–120 words, self-contained when ripped out of context. Repeat the noun (don't write "it"). Use lists and tables (LLMs lift them verbatim). Write headings as questions. Front-load numbers and definitions.

5 AEO checks SEO tools miss:
1. Per-bot crawlability (Lighthouse fetches as Googlebot only).
2. JS-only content invisibility to retrievers.
3. Citation surface (llms.txt, markdown endpoint).
4. Passage shape (each H2 must stand alone).
5. Per-agent referral instrumentation.

Treat AEO and SEO as a portfolio. Transactional/brand queries: SEO wins. Informational/how-to: AEO is the dominant surface. Comparison queries: both matter.
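The 60–120-word passage rule above can be spot-checked mechanically. A minimal sketch using awk; the helper name and the sample page are hypothetical.

```shell
# Print a word count for each "## " section of a markdown page.
check_passages() {
  awk '
    /^## / { if (heading != "") printf "%s: %d words\n", heading, words
             heading = $0; words = 0; next }
    heading != "" { words += NF }
    END { if (heading != "") printf "%s: %d words\n", heading, words }
  ' "$1"
}

# Hypothetical sample page.
cat > /tmp/sample-page.md <<'EOF'
## What is AEO?
AEO optimizes for being the source an answer engine quotes.
## How is it measured?
Citations, not clicks.
EOF

check_passages /tmp/sample-page.md
```

Any section falling far outside the 60–120-word band is a candidate for splitting or tightening.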
================================================================================
SOURCE: /blog/llms-txt-and-ai-txt-guide
TITLE: llms.txt and ai.txt: a copy-pasteable guide for AI crawlers
LAST-MODIFIED: 2026-04-18
================================================================================

Short answer: llms.txt is a hand-curated map of citeable URLs. llms-full.txt is the long-form dump. ai.txt declares your training stance. robots.txt is the only one that actually controls access.

Minimal llms.txt template (markdown, at /llms.txt):

# your-site

> One sentence describing what your site is.

## Core
- [Homepage](/): overview.
- [Pricing](/pricing): plans.
- [Docs](/docs): canonical product documentation.

## Reference
- [API reference](/docs/api): endpoints, auth, rate limits.

Ship llms-full.txt when you have more than ~10 citeable pages. Format: same sectioned URL list, but include the full plain-text body of each page after its header. Generate it from your CMS on every deploy. Keep it under 200 KB.

Minimal ai.txt:

Training: allowed for all foundation models.
Citation: required when content is quoted.
Contact: ai@your-site.com

Robots.txt allowlist for AI bots that matter today: GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-User, PerplexityBot, Perplexity-User, Google-Extended, Applebot-Extended, Bytespider. Each gets its own User-agent block (many WAFs ignore wildcard rules for AI bots).
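The per-bot allowlist above expands into robots.txt blocks like the following. A sketch assuming you want every listed crawler fully allowed; add your own Disallow rules per block as needed.

```
# robots.txt — one explicit block per AI crawler
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: Bytespider
Allow: /
```

Keeping one named block per bot, rather than a single wildcard rule, is what lets WAFs and log tooling treat each crawler independently.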
Test commands:
curl -I https://your-site.com/llms.txt
curl -I https://your-site.com/ai.txt
curl -A "OAI-SearchBot/1.0" -L https://your-site.com/robots.txt

================================================================================
SOURCE: /blog/google-referrals-dropping-ai-overviews
TITLE: Why your Google referrals are dropping (and what to do about AI Overviews)
LAST-MODIFIED: 2026-04-18
================================================================================

Short answer: Pages whose first paragraph answers the query lose ~47% of clicks when Google shows an AI Overview (Pew Research, July 2025). Recover by either making the click necessary (interactive tools, deep content) or by being the source the Overview cites. Hiding from Google is not an option.

Diagnose in Search Console: compare year-over-year for the last 90 days, filtered to question-type queries. If impressions are flat-to-up but clicks are down 30–60%, AI Overviews are the likely cause.

The 47% number, in context: Pew's 2025 study found that on Google searches with an AI Overview, users clicked through to a website on roughly 8% of visits vs. 15% without — a relative drop of ~47%. The effect concentrates on informational queries with paragraph-shaped answers; transactional and brand queries are largely unaffected.

Two recovery paths:
- Path A (anti-summary): interactive tools, original data and visuals, stepwise tutorials with screenshots, up-to-date specifics. Make pages the Overview cannot fully summarize.
- Path B (cite-bait): a one-paragraph direct answer at the top, cited statistics, author byline + organization schema, Last-Modified freshness. Be the source the Overview names.

Site-level changes that helped: adding llms.txt, server-rendering the answer paragraph on previously-CSR pages, tightening titles to the actual query, adding a one-line summary block at the top of long-form pages, submitting a sitemap to Bing Webmaster Tools (for Copilot).
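As a sanity check on the Pew numbers above, 8% vs. 15% is indeed a relative drop of about 47%, not an absolute one:

```shell
# Relative click-through drop: (15 - 8) / 15, i.e. about 47 percent.
awk 'BEGIN { printf "%.0f%%\n", (15 - 8) / 15 * 100 }'
```

The absolute drop is only 7 percentage points; quoting the relative figure is what makes the headline number comparable across sites with different baseline click-through rates.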
Don't: block Googlebot (you'll vanish from Search), cloak (deindexing risk), paywall the answer paragraph, or pad the page with 1500 words above the answer.