An Example of Why You SHOULD NOT Let Google Crawl Your JavaScript Code

In my article, titled Advanced SEO for Affiliate Marketing Links, and the follow-up article, titled Hey, Matt Cutts, I’m using JavaScript to hide links from Google, cool?, I discuss an SEO strategy that some people consider to be gray hat or even black hat SEO. The basic concept involves using JavaScript functions to create links, placing the JavaScript code in an external file, and then using robots.txt to block googlebot from accessing that file. The end result: only your users can see your JavaScript links; Google sees plain text.
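To make the concept concrete, here is a minimal sketch of what such an external file might look like. It is an illustration under stated assumptions, not the code from those articles: the file path, the placeholder id, the URL, and the names LINK_TARGETS, linkTarget, and activateLinks are all made up for this example.

```javascript
// Hypothetical external file, e.g. /js/links.js (the path is an assumption).
// The matching robots.txt rule would be along the lines of:
//   User-agent: Googlebot
//   Disallow: /js/

// Maps the id of a plain-text placeholder to the URL it should link to.
// Both the id and the URL are invented examples.
var LINK_TARGETS = {
    "aff-widget": "https://example.com/widget?ref=123"
};

// Look up the destination for a placeholder id; returns null if there is none.
function linkTarget(id, targets) {
    return Object.prototype.hasOwnProperty.call(targets, id) ? targets[id] : null;
}

// Replace each placeholder element with an <a> that carries the same text.
function activateLinks(doc, targets) {
    Object.keys(targets).forEach(function (id) {
        var placeholder = doc.getElementById(id);
        var url = linkTarget(id, targets);
        if (!placeholder || !url) {
            return; // nothing to convert on this page
        }
        var link = doc.createElement("a");
        link.setAttribute("href", url);
        link.appendChild(doc.createTextNode(placeholder.textContent));
        placeholder.parentNode.replaceChild(link, placeholder);
    });
}

// Only run in a browser, and only once the DOM is available.
if (typeof document !== "undefined") {
    if (document.readyState === "loading") {
        document.addEventListener("DOMContentLoaded", function () {
            activateLinks(document, LINK_TARGETS);
        });
    } else {
        activateLinks(document, LINK_TARGETS);
    }
}
```

In the page itself, the placeholder would be ordinary markup such as a span with id "aff-widget" wrapping the anchor text. A visitor whose browser runs the script gets a clickable link; a crawler that is disallowed from fetching the file sees only the text.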

In those articles, I discussed how to use this technique in a way that improves the user experience and prevents the passing of PageRank through paid links (as is required by the Google Webmaster Guidelines). One of the things I heard from the naysayers of this technique was that I shouldn’t block googlebot from viewing my JavaScript code, because Google is smart enough to “figure things out” for itself.

Also, in the past I’ve asked Matt Cutts if there’s any reason why I shouldn’t Disallow googlebot from crawling external JavaScript files. In his response, he advises people NOT to block Google and says the cost in bandwidth required to serve JavaScript files to Google is insignificant.

The following example shows that both arguments (Google understands JavaScript and it doesn’t cost you anything) are flawed and confirms my recommendation to Disallow googlebot from reading your JavaScript code (regardless of what the code actually does).

The code example below is from Google Instant Previews Experiment #01 – When is the Screen Captured? It is not from an external .js file; it is defined in the page’s <head> section. In other words, this is one of the few times I let Google see some JavaScript code, and you can see for yourself just how well Google has figured it out.

function showImage(int) {
    int = ((int < 10) ? "0" + int : int);
    var parentID = "update" + int;
    var updatePs = document.getElementById("updates").getElementsByTagName("p");
    var image = document.createElement('img');
    var imgID = "image" + int;
    var imgURI = "/img/google-instant-preview-" + int + ".png";
    var imgALT = "Google Instant Preview #" + int;

    for (var i = 0; i < updatePs.length; i++) {
        var imgObj = updatePs[i].getElementsByTagName("img");
        if (imgObj[0]!=null){
            imgObj[0].parentNode.removeChild(imgObj[0]);
        }
    }

    image.setAttribute("id", imgID);
    image.setAttribute("class", "preview-image");
    image.setAttribute("src", imgURI);
    image.setAttribute("width", "302");
    image.setAttribute("height", "585");
    document.getElementById(parentID).appendChild(image);
}

The script lets me easily update that post by adding images to the rollover, using simple CSS classes and ids. But that doesn’t matter; what matters is that Google has pulled an arbitrary string from the code and is treating it like a link URL.

In other words, Google isn’t curiously testing the string to see if it’s a URL. No, Google is boldly declaring: This is definitely a URL, and I’m definitely counting it as a link, and therefore you definitely have a broken link on this page.
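How Google actually processes scripts is not something I can see, so the following is purely an illustration of how this kind of false positive could arise, not a description of Google's pipeline. It assumes a deliberately naive scanner (the function name extractPathLikeStrings is hypothetical) that treats any quoted string starting with a slash as a URL path, and runs it against one line of the script above.

```javascript
// Naive scanner: collects every quoted string literal that begins with "/",
// on the (wrong) assumption that such a string is always a complete URL path.
function extractPathLikeStrings(source) {
    var pattern = /(["'])(\/[^"'\s]*)\1/g;
    var found = [];
    var match;
    while ((match = pattern.exec(source)) !== null) {
        found.push(match[2]); // the text between the quotes
    }
    return found;
}

// One line from showImage(): the real URL is only complete after concatenation.
var snippet = 'var imgURI = "/img/google-instant-preview-" + int + ".png";';

console.log(extractPathLikeStrings(snippet));
// → [ '/img/google-instant-preview-' ]
```

The scanner returns only the first fragment of the concatenation, which is not a real resource on its own. Anything that requests that fragment as a URL would get an error back, even though the page has no broken link for a real visitor.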


Bottom line: Google sucks at understanding JavaScript, and there’s a real possibility that its reckless misinterpretation of your script will end up causing damage to your website’s rankings, its crawl rate, and/or its depth of indexation.
