NEW LIST: The most active crawlers and bots on the web
Allowing web crawlers to scan your site is a necessity if you want your web pages to appear in Google, Bing or other search results.
At the same time, excessive traffic from non-human visitors can be costly: it consumes bandwidth, undermines website stability, and can even cause outages. To help you understand the web crawlers, bots and spiders visiting your site, we have compiled a list of the 20 most active bots, including their User-Agent strings.
Detect all web crawlers and spiders
A web bot or robot is a piece of software that runs automated tasks which can be completed much more quickly or cheaply than if carried out manually by a human. Web crawlers, which automatically scan online content, are so deeply ingrained in the online world that you may be unaware of the amount of web traffic these “machines” generate. Google Analytics doesn't report on bots and crawlers by default, so you may not be able to see the full share of non-human traffic.
DeviceAtlas is a device detection API and a repository of web-enabled device profiles that works by parsing User-Agent strings. Every visitor to a website sends a User-Agent (UA) string, including non-human visitors, so DeviceAtlas can provide you with a full report on the amount of bot and crawler traffic to your website.
To create the list of the most active web crawlers, we analyzed traffic to thousands of DeviceAtlas-powered websites between January and March 2016 (Q1 2016). The following list of common web crawlers and robots is for reference and comparison only. Your server logs may show a different picture of non-human traffic depending on your audience profile, geographical location, and other factors.
To help you identify these bots in your server logs, we also included User-Agent strings.
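As a rough illustration of how User-Agent strings can be used to spot these bots, the sketch below checks a UA string for tokens that appear in the User-Agent strings listed in this article. The `BOT_TOKENS` table and the `identify_bot` helper are hypothetical names introduced for this example, and plain substring matching is a simplifying assumption; it is not how DeviceAtlas itself works.

```python
# Tokens are taken from the User-Agent strings listed in this article.
# Substring matching is an assumption made for illustration only.
BOT_TOKENS = [
    ("adsbot-google", "AdsBot Google"),
    ("googlebot", "Googlebot"),
    ("bingbot", "Bingbot"),
    ("bingpreview", "Bingbot"),
    ("mj12bot", "MJ12bot"),
    ("yahoo! slurp", "Yahoo! Slurp"),
    ("simplepie", "SimplePie"),
    ("sitelockspider", "SiteLockSpider"),
    ("okhttp", "OkHttp"),
    ("curl/", "Curl"),
    ("ips-agent", "Ips Agent"),
    ("blexbot", "BLEXBot"),
    ("yandexbot", "Yandex Bot"),
    ("scoutjet", "ScoutJet"),
]


def identify_bot(user_agent):
    """Return the bot name for a User-Agent string, or None if no token matches."""
    ua = user_agent.lower()  # compare case-insensitively
    for token, name in BOT_TOKENS:
        if token in ua:
            return name
    return None


if __name__ == "__main__":
    print(identify_bot("Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"))
```

Because any client can send any User-Agent string, a match here only tells you what the visitor claims to be.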
The most active bot was a search engine – but not Google
We identified all web crawlers and bots that appeared in our User-Agent-based statistics. The Majestic-12 bot (MJ12bot) was the most active, generating more traffic than any other bot, including Google's and Bing's. Majestic is a community-driven project that aims to build a search engine on top of a distributed web crawler, the Majestic-12 bot that appears in our stats.
The following table lists the top 20 bots by share of bot traffic, along with some basic information about their purpose and their User-Agent strings.
Share of bot traffic | Bot | Purpose | Type | User-Agent string |
68.5% | MJ12bot | Search engine | Desktop bot | Mozilla/5.0 (compatible; MJ12bot/v1.4.5; http://www.majestic12.co.uk/bot.php?+) |
16.8% | Googlebot | Search engine | Desktop bot | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) |
3.2% | Googlebot | Search engine | Mobile bot | Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12F70 Safari/600.1.4 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) |
2.4% | Bingbot | Search engine | Desktop bot | Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) |
1.4% | SimplePie | RSS | Desktop bot | SimplePie/1.3.1 (Feed Parser; http://simplepie.org; Allow like Gecko) Build/20121030175911 |
0.7% | Bingbot | Search engine | Mobile bot | Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53 BingPreview/1.0b |
0.6% | Yahoo! Slurp | Search engine | Desktop bot | Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp) |
0.4% | Bingbot | Search engine | Mobile bot | Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53 (compatible; bingbot/2.0; http://www.bing.com/bingbot.htm) |
0.4% | Googlebot Mobile | Search engine | Mobile bot | SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html) |
0.4% | Googlebot Mobile | Search engine | Mobile bot | DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html) |
0.3% | Bingbot | Search engine | Mobile bot | Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) |
0.3% | AdsBot Google Mobile | Search engine | Mobile bot | Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html) |
0.2% | SiteLockSpider | Security | Desktop bot | SiteLockSpider [en] (WinNT; I ;Nav) |
0.2% | OkHttp | Various purposes | Desktop bot | okhttp/2.5.0 |
0.2% | Curl | Various purposes | Desktop bot | curl/7.35.0 |
0.1% | Ips Agent | Market research | Mobile bot | Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:14.0; ips-agent) Gecko/20100101 Firefox/14.0.1 |
0.1% | Googlebot | Search engine | Desktop bot | Googlebot-Image/1.0 |
0.1% | BLEXBot | Market research | Desktop bot | Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/) |
0.1% | Yandex Bot | Search engine | Desktop bot | Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots) |
0.1% | ScoutJet | Search engine | Desktop bot | Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/) |
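Since your own server logs may show a different picture, you could compute comparable shares yourself. The following is a minimal sketch, assuming an access log in combined log format, where the User-Agent is the last double-quoted field on each line. The sample log lines, the `BOT_MARKERS` tuple and the `bot_shares` function are all hypothetical and exist only for this illustration.

```python
import re
from collections import Counter

# Hypothetical log lines (combined log format); not real traffic data.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Mar/2016:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.5; http://www.majestic12.co.uk/bot.php?+)"',
    '203.0.113.5 - - [10/Mar/2016:10:00:02 +0000] "GET /a HTTP/1.1" 200 300 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.5; http://www.majestic12.co.uk/bot.php?+)"',
    '203.0.113.5 - - [10/Mar/2016:10:00:04 +0000] "GET /b HTTP/1.1" 200 300 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.5; http://www.majestic12.co.uk/bot.php?+)"',
    '198.51.100.7 - - [10/Mar/2016:10:01:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.9 - - [10/Mar/2016:10:02:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36"',
]

# Matches the last double-quoted field on a line (the User-Agent).
UA_FIELD = re.compile(r'"([^"]*)"\s*$')

# A few lowercase tokens from the User-Agent strings in the table above.
BOT_MARKERS = ("mj12bot", "googlebot", "bingbot")


def bot_shares(lines):
    """Return each bot's percentage of all bot requests found in the log lines."""
    counts = Counter()
    for line in lines:
        match = UA_FIELD.search(line)
        if match is None:
            continue  # skip lines without a quoted User-Agent field
        ua = match.group(1).lower()
        for marker in BOT_MARKERS:
            if marker in ua:
                counts[marker] += 1
                break
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    print(bot_shares(SAMPLE_LOG))
```

Requests whose User-Agent matches none of the markers, such as the browser line in the sample, are left out of the total, so the percentages describe bot traffic only.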
Read more about how DeviceAtlas can detect bot traffic here.