Date of Slack thread: 7/24/24
Anonymous: Hi y'all. Can you point me to guidance on how we can ensure our Statsig experiment variations are not crawled, indexed, or cached by Googlebot and other search crawlers?
Tarandeep Singh (Statsig): Robots.txt or meta tags wouldn't be an option since our variations won't be on separate URLs.
Anonymous: Hi Tarandeep, we have been looking more closely at bots and how they affect Statsig data; your feedback is helpful. I would recommend setting up a targeting gate for your experiments that filters out bots. You can do this by creating a gate called "No Bots" and adding a rule that fails known bots by browser name (see screenshot). If you need a list, our internal data says these are the top 20 self-identified bots by browser_name; the top 4 account for 75% of the traffic we see across Statsig customers: "Googlebot", "AdsBot-Google", "Applebot", "FacebookBot", "bingbot", "PetalBot", "AhrefsBot", "YandexRenderResourcesBot", "BitSightBot", "YandexBot", "Storebot", "com/bot", "net/bot", "pingbot", "adsbot", "PingdomBot", "SmarshBot", "VirusTotalBot", "UOrgTestingBot", "Monsidobot". Good luck.
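To illustrate the rule described above (not the Statsig SDK or console itself), here is a minimal sketch of the "fail known bots by browser name" logic as plain Python. The function name and the case-insensitive matching are assumptions for the example; the bot list is the one given in the thread.

```python
# Sketch of the "No Bots" gate rule: fail (return False) when the
# detected browser name matches a known self-identified bot.
# List taken from the thread above; matching is case-insensitive
# here as an assumption, since bot names vary in casing.

KNOWN_BOTS = {
    "Googlebot", "AdsBot-Google", "Applebot", "FacebookBot", "bingbot",
    "PetalBot", "AhrefsBot", "YandexRenderResourcesBot", "BitSightBot",
    "YandexBot", "Storebot", "com/bot", "net/bot", "pingbot", "adsbot",
    "PingdomBot", "SmarshBot", "VirusTotalBot", "UOrgTestingBot",
    "Monsidobot",
}

_KNOWN_BOTS_LOWER = {name.lower() for name in KNOWN_BOTS}

def passes_no_bots_gate(browser_name: str) -> bool:
    """Return True for regular users, False for known bots."""
    return browser_name.lower() not in _KNOWN_BOTS_LOWER

print(passes_no_bots_gate("Googlebot"))  # False: bot fails the gate
print(passes_no_bots_gate("Chrome"))     # True: real browser passes
```

In practice you would build this as a feature gate rule in the Statsig console and target your experiments at users who pass it, so bot traffic never enters the experiment's analysis.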
Anonymous: Thanks. I will look at this.