Sneaky Referral Spammers

Before the Thanksgiving break I was looking through all the stupidly annoying referral spam we get, and found one that looked curious. It looked like it said Google, but the G was the same height as the lowercase “o” in Google. Weird. It had been coming in since Nov 8th and peaked on Nov 30th with 356 “referrals” that day. I decided to try the URL, but my firewall blocked it. I tried manually allowing it through my firewall, but then my antivirus interfered. I decided to stop there and investigate further.

It turns out that the tiny “G” in that URL is not a regular letter G at all. It’s a separate Latin-script character, the “small capital G”. Some sneaky you-know-what registered that URL using a mix of ordinary English letters and lookalike Latin-script characters. “Wait… what? You can register domains with non-English characters?” you might be wondering. Yep, and it’s nice if you speak a language with its own characters, such as Japanese (shown below), but it opens the door to abuse. Unfortunately, it’s possible to pair non-English characters with English characters, which lets people impersonate well-known sites.

I looked through the rest of the referral spam for our site and found another URL with a crazy character in it. It’s supposedly Lifehacker, but it’s not. Check out the two screenshots below. Pay attention to the tiny “G” in Google and the tiny “K” in Lifehacker.

Note the small capital G. That’s the Latin “small capital G” character.

Note that funky not-quite-tall “K”. This is the Latin letter “Kra”.


An example of a non-English character domain. This one is Japanese.
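
For the curious, here is a minimal sketch of what happens to a name like that behind the scenes. Domain names with special characters get converted to a plain-ASCII form that starts with “xn--”. The hostname below is made up for illustration (it uses the reserved “.example” ending), and Python's built-in “idna” codec follows the older IDNA 2003 rules, so treat this as a rough demo rather than exactly what a registrar or browser does.

```python
# Hypothetical hostname for illustration: "\u0262" is the small capital G,
# and ".example" is a reserved ending, so this is not a real site.
lookalike = "\u0262oogle.example"

# Python's built-in "idna" codec converts each non-ASCII label of the
# hostname into its ASCII ("xn--...") form.
ascii_form = lookalike.encode("idna").decode("ascii")
print(ascii_form)

# Decoding the ASCII form gives back the same Unicode hostname.
round_trip = ascii_form.encode("ascii").decode("idna")
print(round_trip == lookalike)
```

If a referrer in a raw log starts with “xn--”, it is one of these encoded names, and decoding it as shown is a quick way to see which characters it really contains.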

If you’re on a Windows machine, open Character Map and take a look at all the different letters that could be slipped into a URL to make it pass for a legitimate one. Below are screenshots of the little G and weird K that I found. There are so many non-English characters that look like English characters that spammers could register just about anything! They could pretend to be Nіke, Yɑhoo, ВOA, СNN, ҒORD, etc. (Yes, every one of those names has a non-English impostor character in it.)

This is the Latin small capital G that a spammer is using to pretend to be Google.

This is the Latin small letter “Kra” that a spammer is using to pretend to be Lifehacker.
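
If you're not on Windows, or just don't feel like squinting at Character Map, a few lines of Python can do the same lookup. This is only a sketch: the function name and the two sample strings are made up for illustration, built from the two characters in the screenshots rather than copied from the actual spam URLs.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    found = []
    for ch in text:
        if ord(ch) > 127:
            name = unicodedata.name(ch, "UNKNOWN")
            found.append((ch, f"U+{ord(ch):04X}", name))
    return found


# Hypothetical lookalike labels: \u0262 is the little G, \u0138 is the weird K.
samples = ["\u0262oogle", "lifehac\u0138er"]

for label in samples:
    for _, code_point, name in describe_non_ascii(label):
        print(code_point, name)
```

Plain English letters produce no output at all, so anything this prints is a character worth a second look.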

There was also a recent burst of referral traffic supposedly from a Reddit page. As a webmaster and active Redditor, I was a bit worried when I saw the bump. Reddit referrals can be either very good or really, really bad for a site. After some investigating, it turned out that a spammer had posted a link to their site on Reddit, then dumped a barrage of referral spam pointing at that post, hoping people would visit it and click their link so they could collect pageviews and ad revenue. That’s pretty desperate… making a post, referral-spamming, and then hoping that the people who are savvy enough to notice referral spam in Analytics are not savvy enough to know not to click the spam link? Whoever has this referral spam botnet isn’t the brightest bulb in the box. They’re going after some of the web’s savviest people, webmasters, with a very complicated ploy. Sheesh.

The moral of the story is to be careful. Pay very close attention, because spammers are getting more and more sophisticated in order to get past equally sophisticated countermeasures. If a character in a URL doesn’t copy and paste properly, it’s probably one of these non-standard characters, and the link is designed to lure you in and infect your computer with malware.
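
Along the same lines, here is a rough sketch of how you could scan a list of referrer hostnames for this trick instead of eyeballing each one. The hostnames and the function name are hypothetical, and the check is deliberately blunt: it flags anything with a non-ASCII character or an “xn--” label, which will also catch perfectly legitimate international domains like the Japanese one above. Treat the result as a shortlist to review by hand, not a verdict.

```python
def is_suspicious_host(host):
    """Flag hostnames with non-ASCII characters or encoded ("xn--") labels."""
    if not host.isascii():
        return True
    return any(label.lower().startswith("xn--") for label in host.split("."))


# Hypothetical referrer hostnames, as they might appear in an analytics export.
referrers = [
    "www.reddit.com",
    "\u0262oogle.example",          # small capital G instead of a normal G
    "lifehac\u0138er.example",      # "kra" instead of a normal k
    "xn--placeholder.example",      # placeholder standing in for an encoded name
    "news.example",
]

flagged = [host for host in referrers if is_suspicious_host(host)]
print(len(flagged), "of", len(referrers), "referrers flagged for review")
```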

Referral spammers are the worst. Seriously, screw you guys. Leave my analytics alone.

**Update**

I got another new client recently and this little gem was in their referrals. Sheesh… Thanks for cluttering up the data.