{"id":323840,"date":"2026-06-09T22:59:49","date_gmt":"2026-06-09T22:59:49","guid":{"rendered":"https:\/\/wordpress.org\/plugins\/crawlsignal-ai-crawler-insights\/"},"modified":"2026-06-09T22:59:26","modified_gmt":"2026-06-09T22:59:26","slug":"puffersights-ai-crawler-insights","status":"publish","type":"plugin","link":"https:\/\/wordpress.org\/plugins\/puffersights-ai-crawler-insights\/","author":7870694,"comment_status":"closed","ping_status":"closed","template":"","meta":{"version":"0.1.0","stable_tag":"0.1.0","tested":"7.0","requires":"6.5","requires_php":"7.4","requires_plugins":null,"header_name":"PufferSights - AI Crawler Insights","header_author":"Senol Sahin","header_description":"Monitor 100+ known AI crawler and AI agent user agents with local analytics, AI referrals, llms.txt, and robots.txt policy tools.","assets_banners_color":"","last_updated":"2026-06-09 22:59:26","external_support_url":"","external_repository_url":"","donate_link":"","header_plugin_uri":"","header_author_uri":"","rating":0,"author_block_rating":0,"active_installs":0,"downloads":32,"num_ratings":0,"support_threads":0,"support_threads_resolved":0,"author_block_count":0,"sections":["description","installation","faq","changelog"],"tags":{"0.1.0":{"tag":"0.1.0","author":"senols","date":"2026-06-09 22:59:26"}},"upgrade_notice":[],"ratings":[],"assets_icons":[],"assets_banners":[],"assets_blueprints":{},"all_blocks":[],"tagged_versions":["0.1.0"],"block_files":[],"assets_screenshots":[],"screenshots":[]},"plugin_section":[],"plugin_tags":[2353,232,4866,12928,244604],"plugin_category":[36],"plugin_contributors":[192179],"plugin_business_model":[],"class_list":["post-323840","plugin","type-plugin","status-publish","hentry","plugin_tags-ai","plugin_tags-analytics","plugin_tags-bots","plugin_tags-crawlers","plugin_tags-llms-txt","plugin_category-analytics","plugin_contributors-senols","plugin_committers-senols"],"banners":[],"icons":{"svg":false,"icon":"https:\/\/s.w.org\/plugins\/geopattern-icon\/puffersights-ai-crawler-insights.svg","icon_2x":false,"generated":true},"screenshots":[],"raw_content":"<!--section=description-->\n<p>PufferSights monitors 100+ known AI crawler and AI agent user agents, hashes IP addresses, groups traffic by bot, provider, crawl purpose, content type, and response status, tracks human referrals from AI surfaces, and can publish a dynamic llms.txt content map for public site content.<\/p>\n\n<p>The dashboard summarizes:<\/p>\n\n<ul>\n<li>HTTP traffic by bot.<\/li>\n<li>Crawl purpose.<\/li>\n<li>Content type.<\/li>\n<li>Response status.<\/li>\n<li>AI referrals and crawl-to-refer ratio.<\/li>\n<li>Top crawled content.<\/li>\n<li>Tracked agent count.<\/li>\n<li>Dynamic llms.txt content map.<\/li>\n<li>robots.txt audit and policy snippets.<\/li>\n<\/ul>\n\n<p>The crawler registry is based on current public operator documentation and industry references for OpenAI, Anthropic, Perplexity, Google, Apple, Common Crawl, Meta, ByteDance, Microsoft, Amazon, and related AI crawler operators.<\/p>\n\n<p>The plugin does not contact any external service. All analytics data is stored in your own WordPress database.<\/p>\n\n<p>robots.txt publishing is off by default. The plugin can generate and optionally publish policies for:<\/p>\n\n<ul>\n<li>Monitor only.<\/li>\n<li>Block training crawlers.<\/li>\n<li>Allow AI search\/user-action bots while blocking training crawlers.<\/li>\n<li>Block all known AI bots.<\/li>\n<\/ul>\n\n<p>llms.txt publishing is on by default and can be disabled in the PufferSights settings. The generated <code>\/llms.txt<\/code> file lists selected published public pages and posts in Markdown so AI assistants can find the site's main public content more easily.<\/p>\n\n<h3>Important Notes<\/h3>\n\n<p>User-agent detection is not bot verification. User agents can be spoofed. Raw IP addresses are not stored; the plugin stores a salted hash for rough uniqueness.<\/p>\n\n<p>robots.txt is voluntary. Use a WAF, CDN, or server-level controls when technical enforcement is required.<\/p>\n\n<p>Google-Extended and Applebot-Extended are robots.txt control tokens rather than normal request user agents, so they appear in robots.txt audits and policy snippets but usually do not appear in request logs.<\/p>\n\n<p>llms.txt is a content map, not an access-control policy. It does not replace robots.txt and does not force AI systems to use or cite your content.<\/p>\n\n<h3>Privacy<\/h3>\n\n<p>PufferSights stores local analytics for public, logged-out requests only. It does not track wp-admin pages, logged-in users, AJAX requests, or WP-Cron requests.<\/p>\n\n<p>The plugin stores:<\/p>\n\n<ul>\n<li>Request time and date.<\/li>\n<li>Event type, such as AI crawler request or AI referral.<\/li>\n<li>HTTP method.<\/li>\n<li>Request path without query string.<\/li>\n<li>HTTP response status.<\/li>\n<li>MIME\/content group.<\/li>\n<li>Matched crawler or AI referral provider.<\/li>\n<li>User-agent string and user-agent hash.<\/li>\n<li>Salted one-way hash of the request IP address.<\/li>\n<li>Referrer origin only, such as <code>https:\/\/chatgpt.com<\/code>, without referrer path or query string.<\/li>\n<\/ul>\n\n<p>The plugin does not store raw IP addresses, cookies, browser local storage, or complete referrer URLs. It does not send analytics, telemetry, crawler records, or site data to third-party services.<\/p>\n\n<p>If llms.txt publishing is enabled, the plugin serves a Markdown overview of selected published public posts and pages at <code>\/llms.txt<\/code>. Drafts, private posts, and password-protected posts are not included.<\/p>\n\n<p>Administrators can disable tracking, clear captured events, and configure retention from the PufferSights admin page. The default retention period is 90 days. On uninstall, the plugin removes its custom analytics table, saved options, and scheduled cleanup hook.<\/p>\n\n<p>The plugin also adds suggested disclosure text to WordPress' Privacy Policy Guide.<\/p>\n\n<!--section=installation-->\n<ol>\n<li>Upload the <code>puffersights-ai-crawler-insights<\/code> folder to <code>wp-content\/plugins<\/code>.<\/li>\n<li>Activate PufferSights - AI Crawler Insights.<\/li>\n<li>Open the PufferSights menu in wp-admin.<\/li>\n<\/ol>\n\n<!--section=faq-->\n<dl>\n<dt id=\"are%20detected%20ai%20crawlers%20verified%3F\"><h3>Are detected AI crawlers verified?<\/h3><\/dt>\n<dd><p>No. The plugin matches user-agent strings. A request can claim to be GPTBot, ClaudeBot, Googlebot, or another crawler without being verified. Treat the dashboard as user-agent analytics unless you add server, CDN, or WAF verification.<\/p><\/dd>\n<dt id=\"does%20the%20plugin%20block%20ai%20crawlers%3F\"><h3>Does the plugin block AI crawlers?<\/h3><\/dt>\n<dd><p>Not by default. robots.txt publishing is off by default. If enabled, the plugin can append crawler-specific robots.txt rules, but robots.txt is voluntary and does not technically enforce access.<\/p><\/dd>\n<dt id=\"what%20is%20llms.txt%3F\"><h3>What is llms.txt?<\/h3><\/dt>\n<dd><p>llms.txt is a proposed Markdown convention for giving AI assistants a concise map of important public site content. PufferSights can serve a dynamic <code>\/llms.txt<\/code> file with selected published pages and posts. It does not expose drafts, private posts, or password-protected posts.<\/p><\/dd>\n<dt id=\"does%20the%20plugin%20send%20data%20to%20third%20parties%3F\"><h3>Does the plugin send data to third parties?<\/h3><\/dt>\n<dd><p>No. The plugin does not use external analytics APIs and does not send analytics, telemetry, crawler records, or site data to any third party.<\/p><\/dd>\n<dt id=\"what%20happens%20when%20i%20delete%20the%20plugin%3F\"><h3>What happens when I delete the plugin?<\/h3><\/dt>\n<dd><p>The <code>uninstall.php<\/code> cleanup removes the plugin options, scheduled cleanup hook, and custom analytics table.<\/p><\/dd>\n\n<\/dl>\n\n<!--section=changelog-->\n<h4>0.1.0<\/h4>\n\n<p>Initial release.<\/p>","raw_excerpt":"Monitor 100+ known AI crawler and AI agent user agents with local analytics, AI referrals, llms.txt, and opt-in robots.txt policy tools.","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/wordpress.org\/plugins\/wp-json\/wp\/v2\/plugin\/323840","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wordpress.org\/plugins\/wp-json\/wp\/v2\/plugin"}],"about":[{"href":"https:\/\/wordpress.org\/plugins\/wp-json\/wp\/v2\/types\/plugin"}],"replies":[{"embeddable":true,"href":"https:\/\/wordpress.org\/plugins\/wp-json\/wp\/v2\/comments?post=323840"}],"author":[{"embeddable":true,"href":"https:\/\/wordpress.org\/plugins\/wp-json\/wporg\/v1\/users\/senols"}],"wp:attachment":[{"href":"https:\/\/wordpress.org\/plugins\/wp-json\/wp\/v2\/media?parent=323840"}],"wp:term":[{"taxonomy":"plugin_section","embeddable":true,"href":"https:\/\/wordpress.org\/plugins\/wp-json\/wp\/v2\/plugin_section?post=323840"},{"taxonomy":"plugin_tags","embeddable":true,"href":"https:\/\/wordpress.org\/plugins\/wp-json\/wp\/v2\/plugin_tags?post=323840"},{"taxonomy":"plugin_category","embeddable":true,"href":"https:\/\/wordpress.org\/plugins\/wp-json\/wp\/v2\/plugin_category?post=323840"},{"taxonomy":"plugin_contributors","embeddable":true,"href":"https:\/\/wordpress.org\/plugins\/wp-json\/wp\/v2\/plugin_contributors?post=323840"},{"taxonomy":"plugin_business_model","embeddable":true,"href":"https:\/\/wordpress.org\/plugins\/wp-json\/wp\/v2\/plugin_business_model?post=323840"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}