Google Indexes and Ranks JavaScript Pages in Two Waves Days Apart

Tom Greenaway from Google spoke at Google I/O yesterday about how Google handles JavaScript pages, and he mentioned something significant: Googlebot treats the indexing and ranking of JavaScript pages quite differently from pages that don't rely on JavaScript rendering.

More specifically, when Googlebot crawls JavaScript sites, it does so in waves, and this has an impact on the indexing and ranking of websites, especially new sites and new pages. It can take days for Google to fully process a page that utilizes JavaScript. Yes, days.

Googlebot includes its own renderer, which runs when it encounters pages with JavaScript. But rendering pages at the scale of the web requires a lot of time and computational resources, and make no mistake, this is a serious challenge for search crawlers, Googlebot included.

As Greenaway put it, the important truth about Google search is that, currently, the rendering of JavaScript-powered websites in Google search is deferred until Googlebot has the resources available to process that content.

Now you might be thinking: what does that really mean in practice?

In reality, Googlebot’s process looks a bit different. It crawls a page, fetches the server-side rendered content, and runs some initial indexing on that document. But rendering JavaScript-powered web pages takes processing power and memory, and while Googlebot is very powerful, it doesn’t have infinite resources.

So if the page has JavaScript in it, the rendering is deferred until Googlebot has the resources ready to render the client-side content, and only then is the content indexed further. That means Googlebot might index a page before rendering is complete, and the final render can arrive several days later. When that final render does arrive, Googlebot performs another wave of indexing on the client-side rendered content.
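One way to see what that first, pre-render wave has to work with is to compare a page's raw HTML (before any JavaScript runs) against the content you expect to rank for. Here is a minimal sketch in Python using only the standard library; the sample page and its content are hypothetical, standing in for a client-side rendered article:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text from raw HTML, skipping script/style bodies."""
    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def first_wave_text(raw_html: str) -> str:
    """Approximate the text available to an indexer before JavaScript executes."""
    parser = TextExtractor()
    parser.feed(raw_html)
    return " ".join(parser.chunks)

# Hypothetical client-side rendered page: the article body only exists
# after the script runs, so the raw HTML contains the title but no body.
raw_html = """
<html><head><title>Breaking: Example Story</title></head>
<body>
  <div id="app"></div>
  <script>document.getElementById('app').textContent = 'Full article text here';</script>
</body></html>
"""

text = first_wave_text(raw_html)
print("Title visible before rendering:", "Example Story" in text)      # True
print("Body visible before rendering:", "Full article text" in text)   # False
```

On a page like this, the first wave can only index the title; the article body doesn't exist until the deferred render, which is exactly the gap Greenaway described.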

And this effectively means that if your site relies heavily on client-side JavaScript for rendering, your content can be tripped up at indexing time by the nature of this two-phase process. Ultimately, because Googlebot runs two waves of indexing across web content, it's possible some details might be missed.

For any site with heavy JavaScript, this can have a huge impact, especially if the pages are news oriented, where content can become outdated in the days between the two waves of indexing. And if the initial view Googlebot sees doesn't have enough content for indexing and ranking, those pages may not see any traffic for days, until Googlebot's second wave crawls the page and renders the full JavaScript.

Google's John Mueller later stressed that new content “might take a while to be indexed.”

While designers love JavaScript, from an SEO point of view it can definitely cause headaches, and this multi-day delay can cause even more issues. The delay means those pages may not get traffic from Google search, depending on what gets indexed in Googlebot's initial pass, and with enough pages affected, this can add up to a real loss of potential traffic and revenue.

