# pseudobash — full LLM-readable corpus

This file concatenates the canonical, citeable pages of pseudobash in a single plain-text document so language models can ingest the whole thing in one fetch. URLs cited below are relative; resolve them against the request host.

================================================================================
SOURCE: /
TITLE: pseudobash — Be the source ChatGPT cites
================================================================================

pseudobash is the retrieval surface AI answer engines cite.

The problem: AI Overviews answer the question; your site never gets the visit. Pew Research (2025) found a ~47% relative drop in click-throughs when Google shows an AI Overview. Many modern websites are JavaScript-heavy SPAs that AI crawlers cannot read.

The fix: expose a deterministic, plain-text interface — `/shell` — that answer engines can query the way an agent queries a filesystem. Every response carries a `Link: <url>; rel="canonical"` header, a `Last-Modified` timestamp, and a pre-formatted `X-Pseudobash-Cite` header so the engine can attribute the answer back to your URL. Drop one tag, get cited.

================================================================================
SOURCE: /shell.md
TITLE: shell.md — pseudobash AI retrieval contract
================================================================================

(See the full contract at /shell.md. Summary follows.)

- Endpoint: POST /shell with a plain-text command body.
- Discovery: GET /shell.md for the contract.
- Supported commands: cd, ls, cat, cite, grep, find, head, tail, wc, tree, pwd, man, --help.
- Pipelines: single `|` operator supported.
- Sessions: server returns `X-Session-Id`; echo it back to keep cwd state.
- Throttling: when no session id is present, throttling keys on sha256(ip):sha256(user-agent). Standard `X-RateLimit-*` headers and `Retry-After` on 429.
- Citation: every successful response includes `Link`, `Last-Modified`, and `X-Pseudobash-Cite: title="..."; url="..."; last-modified="..."`.

Allowlisted crawlers (paste these into your robots.txt): GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-User, PerplexityBot, Perplexity-User, Google-Extended, Bytespider, Applebot-Extended.

================================================================================
SOURCE: /audit
TITLE: Is AI citing you, or your competitor?
================================================================================

Free scan: paste a domain (or visit /audit/ directly) and pseudobash fetches that homepage four times — once as each major AI answer-engine crawler — plus reads /robots.txt, /shell.md, and /llms.txt.

Output:
- A 0–100 "how citeable you are to AI" score.
- Per-crawler: HTTP status, content length, and whether the page needs JS to render (in which case the bot sees an empty shell).
- Retrieval contract: which of /shell.md, /llms.txt, /robots.txt are present.
- robots.txt allowlist analysis per crawler.

Results are server-side cached for 45 minutes per host. Anonymous scans are heavily rate-limited. Set `Authorization: Bearer <token>` to bypass.

================================================================================
SOURCE: /llms.txt
TITLE: llms.txt index
================================================================================

See /llms.txt for the structured table of contents this corpus expands on.
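The anonymous throttling key described in the /shell.md contract above (sha256(ip):sha256(user-agent)) can be sketched in a few lines of shell. This is a minimal sketch, not the server's implementation; the IP and user-agent values are illustrative, and `sha256sum` (GNU coreutils) is assumed to be available.

```shell
# Build the anonymous throttling key: sha256(ip):sha256(user-agent).
throttle_key() {
  ip_hash=$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)
  ua_hash=$(printf '%s' "$2" | sha256sum | cut -d' ' -f1)
  printf '%s:%s\n' "$ip_hash" "$ua_hash"
}

# Illustrative inputs; any client that sends no X-Session-Id is keyed this way.
throttle_key "203.0.113.7" "Mozilla/5.0 (compatible; ExampleBot/1.0)"
```

Hashing both parts keeps the rate-limit bucket stable per client without the server storing raw IPs or user-agent strings.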
================================================================================
SOURCE: /blog/show-up-in-chatgpt-results
TITLE: How to show up in ChatGPT results (2026 guide)
LAST-MODIFIED: 2026-04-18
================================================================================

Short answer: ChatGPT cites pages that are (1) crawlable by OAI-SearchBot and GPTBot, (2) answer the user's exact question in the first paragraph, and (3) expose a clean, citation-ready surface — a canonical URL, a Last-Modified date, and ideally a hand-curated llms.txt or markdown endpoint.

Which crawlers does ChatGPT use?
- OAI-SearchBot — indexes pages for citation inside ChatGPT's search/browse feature. The one you most need to allow.
- ChatGPT-User — fetches a single page on demand when a ChatGPT user clicks a citation or asks the model to browse live.
- GPTBot — crawls pages that may improve future OpenAI models. Affects training, not retrieval.

What does ChatGPT actually fetch? The crawlers behave like a fast, headless reader: raw HTML, robots.txt, sitemap.xml, llms.txt, ai.txt. They do not run most JavaScript. If your homepage is client-rendered, the bot sees an empty shell.

Quick test: curl -A "OAI-SearchBot/1.0" https://your-site.com/ | wc -c. A few hundred bytes means ChatGPT cannot read the page.

The 4 things to ship this week:
1. Allow the bots in robots.txt with explicit User-agent blocks for OAI-SearchBot, ChatGPT-User, and (optionally) GPTBot.
2. Lead with the answer in the first 200 words.
3. Make the page easy to attribute (canonical URL, Last-Modified, JSON-LD Article + FAQPage).
4. Publish an llms.txt — a curated map of your most citeable URLs.

Verify by filtering analytics for referrer hosts chatgpt.com and chat.openai.com, watching access logs for ChatGPT-User and OAI-SearchBot hits, and manually querying ChatGPT for your top 10 questions.
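The quick test above can be wrapped in a small helper that turns the byte count into a verdict. A sketch under stated assumptions: the 1,024-byte threshold is illustrative, not an official cutoff, and the helper name is hypothetical.

```shell
# Classify the byte count returned by:
#   curl -sA "OAI-SearchBot/1.0" https://your-site.com/ | wc -c
# A tiny response usually means a client-rendered shell with no readable content.
classify_fetch() {
  if [ "$1" -lt 1024 ]; then   # threshold is an illustrative assumption
    echo "likely-empty"
  else
    echo "readable"
  fi
}

classify_fetch 300     # a few hundred bytes: the bot sees an empty shell
classify_fetch 48210   # tens of kilobytes: the bot sees real content
```

Run the curl once per crawler user-agent string to see whether a WAF treats them differently.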
Common reasons for invisibility: JS-only rendering, Cloudflare/WAF blocking unknown UAs, the answer buried below 1000 words, no stable URL, not present in Bing's index.

================================================================================
SOURCE: /blog/traffic-from-ai-agents
TITLE: How to get traffic from AI agents (ChatGPT, Perplexity, Claude, Gemini)
LAST-MODIFIED: 2026-04-18
================================================================================

Short answer: Treat each AI agent as its own channel. Allow its crawler in robots.txt, expose a machine-friendly surface, and instrument referrals by the agent's referrer host.

AI agents send traffic in two flows:
- Citation clicks (humans): a user reads an answer and clicks a citation. Your analytics see the referrer host (chatgpt.com, perplexity.ai, claude.ai, gemini.google.com, copilot.microsoft.com).
- Agentic fetches (bots): the agent itself fetches pages, identified by user-agent strings like ChatGPT-User, Claude-User, Perplexity-User.

Per-agent reference table (citation referrer host / indexing crawler / on-demand fetcher / recommended utm_source):
- ChatGPT: chatgpt.com, chat.openai.com / OAI-SearchBot / ChatGPT-User / utm_source=chatgpt
- Perplexity: perplexity.ai / PerplexityBot / Perplexity-User / utm_source=perplexity
- Claude: claude.ai / ClaudeBot / Claude-User / utm_source=claude
- Gemini: gemini.google.com / Google-Extended / (uses Googlebot) / utm_source=gemini
- Copilot: copilot.microsoft.com / Bingbot (Bing index) / (via Bing) / utm_source=copilot

Dashboard recipe: create one analytics segment "AI citation traffic" matching all five referrer hosts, plus one segment per agent, plus a server-log dashboard counting hits per AI user-agent grouped by URL.

Anecdotal conversion: AI citation traffic tends to convert higher per visit than generic Google organic — visitors arrive pre-qualified — but at much lower volume, roughly 1–10% of Google organic by sessions in 2026 for most sites.
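The per-agent reference table above can be turned into a small lookup, e.g. for tagging redirects or enriching server logs. A minimal sketch; the function name is hypothetical.

```shell
# Map an AI answer-engine referrer host to the recommended utm_source tag.
ai_utm_source() {
  case "$1" in
    chatgpt.com|chat.openai.com) echo "utm_source=chatgpt" ;;
    perplexity.ai)               echo "utm_source=perplexity" ;;
    claude.ai)                   echo "utm_source=claude" ;;
    gemini.google.com)           echo "utm_source=gemini" ;;
    copilot.microsoft.com)       echo "utm_source=copilot" ;;
    *)                           echo "" ;;  # not an AI citation referrer
  esac
}

ai_utm_source perplexity.ai
```

Feeding every request's Referer host through a mapping like this is one way to build the "AI citation traffic" segment server-side instead of in the analytics UI.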
================================================================================
SOURCE: /blog/aeo-vs-seo
TITLE: AEO vs SEO: what Answer Engine Optimization actually changes
LAST-MODIFIED: 2026-04-18
================================================================================

Short answer: AEO optimizes for being the source an answer engine quotes; SEO optimizes for the SERP listing a user clicks. SEO success is a click; AEO success is a citation. Same technical fundamentals, very different content shape. The unit of AEO work is the passage, not the page.

Key differences:
- Goal: SEO ranks the page; AEO gets the page cited inside the answer.
- Unit: SEO = page; AEO = paragraph/list/table.
- Success metric: SEO = click; AEO = citation, then optionally click.
- Title role: the SEO title is the SERP headline; the AEO title is disambiguation while the H1 + first paragraph do the work.
- Backlinks: strong SEO signal; weaker for AEO (freshness, attribution, clarity matter more).
- Word count: SEO often rewards depth; AEO rewards a short, direct answer in the first 200 words.

The passage-level mindset: every section should answer one question completely in 60–120 words, self-contained when ripped out of context. Repeat the noun (don't write "it"). Use lists and tables (LLMs lift them verbatim). Write headings as questions. Front-load numbers and definitions.

5 AEO checks SEO tools miss:
1. Per-bot crawlability (Lighthouse fetches as Googlebot only).
2. JS-only content invisibility to retrievers.
3. Citation surface (llms.txt, markdown endpoint).
4. Passage shape (each H2 must stand alone).
5. Per-agent referral instrumentation.

Treat AEO and SEO as a portfolio. Transactional/brand queries: SEO wins. Informational/how-to: AEO is the dominant surface. Comparison queries: both matter.
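The 60–120-word passage rule above can be spot-checked mechanically. A minimal sketch using awk; the helper name and the sample page are hypothetical.

```shell
# Print a word count for each "## " section of a markdown page.
check_passages() {
  awk '
    /^## / { if (heading != "") printf "%s: %d words\n", heading, words
             heading = $0; words = 0; next }
    heading != "" { words += NF }
    END { if (heading != "") printf "%s: %d words\n", heading, words }
  ' "$1"
}

# Hypothetical sample page.
cat > /tmp/sample-page.md <<'EOF'
## What is AEO?
AEO optimizes for being the source an answer engine quotes.
## How is it measured?
Citations, not clicks.
EOF

check_passages /tmp/sample-page.md
```

Any section falling far outside the 60–120-word band is a candidate for splitting or tightening.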
================================================================================
SOURCE: /blog/llms-txt-and-ai-txt-guide
TITLE: llms.txt and ai.txt: a copy-pasteable guide for AI crawlers
LAST-MODIFIED: 2026-04-18
================================================================================

Short answer: llms.txt is a hand-curated map of citeable URLs. llms-full.txt is the long-form dump. ai.txt declares your training stance. robots.txt is the only one that actually controls access.

Minimal llms.txt template (markdown, at /llms.txt):

# your-site

> One sentence describing what your site is.

## Core
- [Homepage](/): overview.
- [Pricing](/pricing): plans.
- [Docs](/docs): canonical product documentation.

## Reference
- [API reference](/docs/api): endpoints, auth, rate limits.

Ship llms-full.txt when you have more than ~10 citeable pages. Format: same sectioned URL list, but include the full plain-text body of each page after its header. Generate it from your CMS on every deploy. Keep it under 200 KB.

Minimal ai.txt:

Training: allowed for all foundation models.
Citation: required when content is quoted.
Contact: ai@your-site.com

Robots.txt allowlist for AI bots that matter today: GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-User, PerplexityBot, Perplexity-User, Google-Extended, Applebot-Extended, Bytespider. Each gets its own User-agent block (many WAFs ignore wildcard rules for AI bots).
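The per-bot allowlist above expands into robots.txt blocks like the following. A sketch assuming you want every listed crawler fully allowed; add your own Disallow rules per block as needed.

```
# robots.txt — one explicit block per AI crawler
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: Bytespider
Allow: /
```

Keeping one named block per bot, rather than a single wildcard rule, is what lets WAFs and log tooling treat each crawler independently.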
Test commands:
curl -I https://your-site.com/llms.txt
curl -I https://your-site.com/ai.txt
curl -A "OAI-SearchBot/1.0" -L https://your-site.com/robots.txt

================================================================================
SOURCE: /blog/google-referrals-dropping-ai-overviews
TITLE: Why your Google referrals are dropping (and what to do about AI Overviews)
LAST-MODIFIED: 2026-04-18
================================================================================

Short answer: Pages whose first paragraph answers the query lose ~47% of clicks when Google shows an AI Overview (Pew Research, July 2025). Recover by either making the click necessary (interactive tools, deep content) or by being the source the Overview cites. Hiding from Google is not an option.

Diagnose in Search Console: compare year-over-year for the last 90 days, filtered to question-type queries. If impressions are flat-to-up but clicks are down 30–60%, AI Overviews are the likely cause.

The 47% number, in context: Pew's 2025 study found that on Google searches with an AI Overview, users clicked through to a website on roughly 8% of visits vs. 15% without — a relative drop of ~47%. The effect concentrates on informational queries with paragraph-shaped answers; transactional and brand queries are largely unaffected.

Two recovery paths:
- Path A (anti-summary): interactive tools, original data and visuals, stepwise tutorials with screenshots, up-to-date specifics. Make pages the Overview cannot fully summarize.
- Path B (cite-bait): a one-paragraph direct answer at the top, cited statistics, author byline + organization schema, Last-Modified freshness. Be the source the Overview names.

Site-level changes that helped: adding llms.txt, server-rendering the answer paragraph on previously-CSR pages, tightening titles to the actual query, adding a one-line summary block at the top of long-form pages, submitting a sitemap to Bing Webmaster Tools (for Copilot).
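As a sanity check on the Pew numbers above, 8% vs. 15% is indeed a relative drop of about 47%, not an absolute one:

```shell
# Relative click-through drop: (15 - 8) / 15, i.e. about 47 percent.
awk 'BEGIN { printf "%.0f%%\n", (15 - 8) / 15 * 100 }'
```

The absolute drop is only 7 percentage points; quoting the relative figure is what makes the headline number comparable across sites with different baseline click-through rates.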
Don't: block Googlebot (you'll vanish from Search), cloak (deindexing risk), paywall the answer paragraph, or pad the page with 1500 words above the answer.