# Indian Flavors Calgary - robots.txt

# ----------------------------------------------------------------------
# Default rules for all crawlers
# ----------------------------------------------------------------------
User-agent: *
Allow: /

# Block crawler-only / system paths
Disallow: /config/
Disallow: /search/
Disallow: /account/
Disallow: /api/
Allow: /api/ui-extensions/
Disallow: /static/

# Block low-value query-string variants (filters, archives, format switches)
Disallow: /*?*author=*
Disallow: /*?*tag=*
Disallow: /*?*month=*
Disallow: /*?*view=*
Disallow: /*?*format=*

# Keep llms.txt out of search indexes (consumed by AI agents only)
Disallow: /llms.txt

# ----------------------------------------------------------------------
# AI / LLM training crawlers
# ----------------------------------------------------------------------
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: CCBot
User-agent: anthropic-ai
User-agent: Claude-Web
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: FacebookBot
User-agent: cohere-ai
User-agent: PerplexityBot
Allow: /
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html

# ----------------------------------------------------------------------
# Google Ads crawlers (allow full access for ad quality scoring)
# ----------------------------------------------------------------------
User-agent: AdsBot-Google
User-agent: AdsBot-Google-Mobile
User-agent: AdsBot-Google-Mobile-Apps
Allow: /

# ----------------------------------------------------------------------
# Heavy / aggressive crawlers - throttle
# ----------------------------------------------------------------------
User-agent: Baiduspider
Crawl-delay: 10

# ----------------------------------------------------------------------
# Sitemap
# ----------------------------------------------------------------------
Sitemap: https://www.indianflavorsyyc.ca/sitemap.xml