Robots.txt Broke and My Site Tanked

Our site had ranked #1 in Google for “kansas city internet marketing” for years. A few days ago I noticed it had dropped off completely, with only sporadic internal pages ranking at #6 and beyond. Needless to say, I was unhappy. I spent several days trying to figure out the reason for this sudden MASSIVE drop in ranking, investigating everything from a black-hat attack to a spam penalty against my site. I found nothing. What really bothered me was that our homepage had NO cache in Google! That is often the signature of a spam penalty, but with other pages on my site still ranking it didn’t have the same “fingerprint” of one, and I don’t spam anything anyway. I remember saying to myself, “I don’t know what’s going on. Time to freak out.”

The above image shows page crawl rate. You can see the day the robots.txt started having a problem. My site hung on for another week before being removed from its #1 ranking.

After checking everything I could think of, I went back to Google Webmaster Tools to see what Google was saying about my site. Unlike the last time I looked, Google now said EVERY page was unavailable! How could that be when my site was obviously live? I checked the site on my iPhone and on my 3G-enabled laptop, and everything worked properly from outside my network. I then checked the server headers on every single page on my site, and after receiving 200 OK for every HTML file I was still flummoxed. I checked other files like images and PDFs, and they all returned 200 OK as well. Lastly I checked the robots.txt file from my 3G laptop and, lo and behold, it returned nothing. Absolutely nothing. Just a blank page. It resolved fine on my local network, so I knew something was getting in the way of my robots.txt file on its way out to the web. It could be only one thing – the firewall.
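If you want to run the same kind of spot-check yourself, here is a minimal sketch of the idea in Python. The domain and page list are placeholders, not my actual URLs; the point is that a status-code check alone can look perfectly healthy while the robots.txt body comes back empty, so you have to look at the response body too.

    import urllib.error
    import urllib.request

    # Hypothetical URLs standing in for the real site.
    PAGES = [
        "https://www.example.com/",
        "https://www.example.com/services.html",
        "https://www.example.com/robots.txt",
    ]

    for url in PAGES:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                body = resp.read()
                # A 200 OK with a zero-byte body is the case that bit me:
                # the status code alone looked fine.
                print(f"{url} -> {resp.getcode()} ({len(body)} bytes)")
        except urllib.error.URLError as err:
            print(f"{url} -> request failed: {err}")

Run it from outside your own network (tethered phone, different connection) so you see what the rest of the web, and Googlebot, actually sees.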

The problem with an empty response for the robots.txt is that apparently it’s not really an error to Google. It looks like they treat it as a “disallow all” statement. Crazy. You’d think they’d treat it just like they would if there was no robots.txt file and go about their business indexing the site. They don’t.
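To make the distinction concrete, here is a small illustration using Python’s standard robots.txt parser. This is not how Googlebot is implemented; it just shows how far apart “no rules at all” and “Disallow: /” are to a crawler, which is roughly the gap between what I expected and what apparently happened.

    from urllib.robotparser import RobotFileParser

    def can_crawl(robots_txt: str, url: str = "https://www.example.com/") -> bool:
        # Parse a robots.txt body and ask whether Googlebot may fetch the URL.
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        return parser.can_fetch("Googlebot", url)

    print(can_crawl(""))                            # no rules at all -> True, crawling allowed
    print(can_crawl("User-agent: *\nDisallow: /"))  # explicit disallow all -> False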

UPDATE
The problem was that the router/firewall I use received a software “update” that day which caused a problem with one of its filters. It seems the filter was scanning text files in transit but not putting them back together correctly, which is why the robots.txt was going out empty. Somebody somewhere made a small mistake in their code and it was disastrous for us.

One thought on “Robots.txt Broke and My Site Tanked”

  1. Our site came back – but sadly not to #1. It’s now at #2 and will require a lot of work to bump back up to #1. Glad to have it fixed though.
