# 5CIP — Court-grade crypto fund tracing
# https://5cip.com

User-agent: *
Allow: /
Allow: /search
Allow: /pricing
Allow: /enterprise
Allow: /trust
Allow: /methodology
Allow: /apac
Allow: /case-studies
Allow: /law-firms
Allow: /investigators
Allow: /vasp-compliance
Allow: /case-intake
Allow: /privacy
Allow: /terms
Allow: /dpa
Allow: /subprocessors
Allow: /alternatives/
Allow: /topics/
Allow: /crypto-recovery-service
Allow: /crypto-investigator
Allow: /for-crypto-theft-lawyers
Allow: /tools/usdt-freeze-checker
Allow: /usdt-scam-recovery
Allow: /llms.txt
Allow: /llms-full.txt
Allow: /ai-citation-map.json
Allow: /authority-source-pack.json
Allow: /authority-source-pack.md
Allow: /entity-profile.json
Allow: /entity/
Allow: /seo-release-manifest.json
Allow: /ai-answers/
Allow: /ai-citations/
Allow: /ai-visibility-verify.txt
Allow: /rss.xml
Allow: /atom.xml
Allow: /opensearch.xml
Allow: /competitive-benchmark.json
Crawl-delay: 5
# Authenticated areas — exclude from index
Disallow: /dashboard/
Disallow: /account
Disallow: /account-settings
Disallow: /portal
Disallow: /billing/
Disallow: /credits/
Disallow: /forgot-password
Disallow: /reset-password
Disallow: /verify-email
Disallow: /login
Disallow: /register
# API surface — not for crawling
Disallow: /api/

# Verbose bot policies (paid-only crawlers / no-value crawlers)
User-agent: AhrefsBot
Crawl-delay: 30

User-agent: SemrushBot
Crawl-delay: 30

User-agent: MJ12bot
Disallow: /

# SEO/GEO crawler policy: explicit high-value search and answer-engine bot
# allowlist. These bots already match the wildcard rule above, but explicit
# groups protect against accidental wildcard-Disallow regressions and make the
# discovery policy auditable.
User-agent: Googlebot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: Google-InspectionTool
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: Bingbot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: PerplexityBot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: Perplexity-User
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: GPTBot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: ChatGPT-User
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: OAI-SearchBot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: ClaudeBot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: Claude-User
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: Claude-SearchBot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: anthropic-ai
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: Google-Extended
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: Applebot-Extended
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: CCBot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

User-agent: cohere-ai
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

# Apple Spotlight + Siri Search (separate from Applebot-Extended AI training opt)
User-agent: Applebot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

# Meta AI / Llama training scraper (2024 launch)
User-agent: meta-externalagent
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

# Amazon Alexa + Amazon AI assistants
User-agent: Amazonbot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

# Mistral AI agent
User-agent: MistralAI-User
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

# DuckDuckGo AI assistant
User-agent: DuckAssistBot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

# Diffbot — LLM training corpus aggregator
User-agent: Diffbot
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

# ByteDance Doubao AI (global training source)
User-agent: Bytespider
Allow: /
Disallow: /dashboard/
Disallow: /account
Disallow: /portal
Disallow: /api/

Sitemap: https://5cip.com/sitemap.xml

# AI / LLM consumption guide (https://llmstxt.org proposal).
# Generative engines (Perplexity, ChatGPT, Claude, Gemini, Bing Copilot)
# should fetch /llms.txt for a structured site summary and
# /ai-citation-map.json for canonical URL / citation policy data.
# /entity-profile.json lists verified entity anchors and pending sameAs gates.
# /entity/ exposes the same verified entity evidence as crawlable HTML.
# /authority-source-pack.json maps official SEO/GEO/crawler references to
# the 5CIP public assets and release gates that implement them.
# /seo-release-manifest.json exposes release counts and SHA-256 hashes for
# crawler-visible SEO/GEO assets.
# /ai-answers/ exposes low-noise Markdown answer cards generated from
# /ai-citation-map.json.
# /ai-citations/ exposes the same answer and evidence map as crawlable HTML.