Googlebot Behavior When Major Changes Are Made to a Site

The question was about a site that had started allowing Googlebot to crawl pages that had previously been blocked. Googlebot crawl activity, as shown in the Google Search Console crawl stats, spiked sharply one day, doubled the next day, and tripled on the third.

However, Googlebot wasn’t only hitting the “new to Googlebot” pages; it appeared to be crawling old cache pages as well. In the last Google Webmaster Office Hours, John Mueller discussed in detail how Googlebot handles crawling when it recognizes major changes to a site.

I don’t think that would be – so I’m not exactly sure where you mean or what that looks like, but it doesn’t sound like that would be something like an effect that you would see from Google’s crawling.  Because usually what would happen in a case like this is if we recognize that there are significant changes on the website, we’ll just try to crawl the known URLs a little bit faster.

So we have a bunch of URLs that we already know from your website, and we basically decide, “oh, we want to make sure that our index is as fresh as possible. Therefore, we’ll take this list of URLs and crawl them as quickly as we can.” It kind of depends on your server, what we think your server can take. But we’ll try to get through that a little bit faster than normal.

So that’s particularly the case where we find significant changes across a site with regards to maybe structured data or with the URL choices, with rel canonical, redirects, those kinds of things. So maybe that’s something that was triggered from your side, depending on what you changed.

But in general, that would be visible like normal Googlebot requests.  We would just go through our list of URLs, and we just check them all.  It wouldn’t be the case that you would see something like a Chrome cache or something.  So I suspect what you are seeing there is either an effect that Analytics picked up kind of incorrectly.
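Since Mueller says this kind of recrawl would show up as normal Googlebot requests, one way to check a suspected spike is to count Googlebot hits per day in the server's own access log rather than relying on an analytics tool. The following is a minimal sketch, not anything Mueller described: it assumes the log uses the common Apache/Nginx "combined" format, and the function name `googlebot_hits_per_day` is made up for illustration. It matches on the user agent string only, which can be spoofed, so the counts are a rough indicator at best.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# host ident user [time] "METHOD path PROTO" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits_per_day(lines):
    """Count requests per day whose user agent mentions Googlebot.

    Keys are the date part of the log timestamp, e.g. "10/Oct/2016".
    Lines that do not match the assumed format are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue
        if "Googlebot" not in match.group("agent"):
            continue
        # Timestamp looks like "10/Oct/2016:13:55:36 +0000"; keep the date only.
        day = match.group("time").split(":", 1)[0]
        counts[day] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical file name; point this at your own access log.
    with open("access.log", encoding="utf-8", errors="replace") as handle:
        for day, hits in googlebot_hits_per_day(handle).items():
            print(day, hits)
```

If the daily counts here do not show the same jump as the other report, that would be consistent with Mueller's suspicion that the spike is a reporting artifact rather than real Googlebot crawling.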

Mueller also mentioned that it is possible the increased crawling and the changes are completely unrelated. SEOs are familiar with Googlebot activity suddenly spiking on a site even when nothing has been changed, or at least nothing that would be deemed significant.

Maybe, or maybe it’s totally unrelated.  Maybe it’s just Googlebot was crawling and this other random thing was happening at the same time, and it looks like they’re related but actually, they are two completely separate things.

Of course, if the site is low quality, it can still take quite some time for Googlebot to recrawl those low-quality pages, sometimes six months or more before changes to all those pages are reflected in the search results.

It is worth noting that increased crawling does not mean increased indexing or rankings. If you are concerned that Googlebot is hitting your site too hard, especially when you are expecting a heavy crawl, you can throttle the amount it crawls. That said, Googlebot is usually pretty good at recognizing when it might be crawling harder than the site can support.
