Advanced link cloaking techniques

by Hamlet Batista | June 06, 2007 | 3 Comments

The interesting discussion between Rand and Jeremy had me thinking about some of the things affiliates do to protect their links. I am talking about link cloaking — the art of hiding links.

We can hide links from our potential customer (in the case of affiliate links), and we can hide them from the search engines as well (as in the case of reciprocal links, paid links, etc.).

While I think cloaking affiliate links to prevent others from stealing your commissions is useful, I am not encouraging you to use the techniques I am about to explain. I certainly think it is very important to understand link cloaking in order to protect yourself when you are buying products, services or links.

When I am reading a product endorsement, I usually mouse over the link to see if it is an affiliate link. Why? I don’t mind the blogger making a commission; but if I see he or she is trying to hide it via redirects, JavaScript, etc., I don’t perceive it as an endorsement. I feel it is a concealed ad. When I see <aff>, an editor’s note, etc., I feel I can trust the endorsement.

Another interesting technique is cloaking links from the search engines. The idea is to make your link partners believe you endorse them while telling the search engines that you don’t. Again, I am not endorsing this practice.

Cloaking links from potential customers

Some of the techniques I’ve seen are:

Script redirects – the use of a simple script that takes a key (e.g., merchant=eBay), pulls the affiliate link from a database or an in-line dictionary (a hard-coded lookup table), and sends the visitor’s browser an HTTP 302 or 301 (temporary or permanent) redirect to the merchant site.
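To make the mechanics concrete, here is a minimal sketch of such a redirect script using Node, Express and TypeScript. The /go route, the merchant keys, the affiliate URLs and the tracking codes are all invented for illustration:

```typescript
// Hypothetical redirect script (Node + Express). Route, merchant keys,
// affiliate URLs and tracking codes are invented for illustration.
import express from "express";

const app = express();

// In-line dictionary mapping a merchant key to its affiliate link.
const affiliateLinks: Record<string, string> = {
  ebay: "https://www.example-merchant.com/?affid=AFF12345",
  acme: "https://www.acme.example/?ref=AFF12345",
};

// /go?merchant=ebay looks up the affiliate link and answers with a 302,
// so the tracking code never appears on the referring page.
app.get("/go", (req, res) => {
  const merchant = String(req.query.merchant ?? "").toLowerCase();
  const target = affiliateLinks[merchant];
  if (!target) {
    res.status(404).send("Unknown merchant");
    return;
  }
  res.redirect(302, target); // or 301 for a permanent redirect
});

app.listen(3000);
```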

Meta refreshes – the use of blank HTML pages with the meta refresh tag and the affiliate tracking code embedded. This is very popular.
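For reference, this is roughly what such a blank page looks like, held here in a TypeScript string constant. The merchant URL and tracking code are made up:

```typescript
// Hypothetical "blank page" used for a meta refresh redirect.
// The merchant URL and tracking code are made up for illustration.
const metaRefreshPage = `<!DOCTYPE html>
<html>
  <head>
    <!-- 0-second refresh sends the browser straight to the affiliate URL -->
    <meta http-equiv="refresh" content="0; url=https://www.example-merchant.com/?affid=AFF12345" />
    <title>Redirecting…</title>
  </head>
  <body></body>
</html>`;

export default metaRefreshPage;
```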

In-line JavaScript – the use of JavaScript to capture the mouseover and right-click events on the target link, in order to make the status bar display the link without the tracking code. I feel this one is very deceptive.
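A hedged, browser-side sketch of the idea in TypeScript: the anchor shows a clean URL on hover (and therefore in the status bar), and the tracking URL is only swapped in at the moment of the click. The data-aff attribute and the URLs are hypothetical:

```typescript
// Hypothetical browser-side sketch. The anchor carries a clean-looking href,
// so hovering reveals no tracking code; the affiliate URL is swapped in only
// when the visitor actually clicks. Attribute name and URLs are invented.
document.querySelectorAll<HTMLAnchorElement>("a[data-aff]").forEach((link) => {
  const cleanHref = link.href;            // e.g. https://www.example-merchant.com/
  const trackedHref = link.dataset.aff!;  // e.g. https://www.example-merchant.com/?affid=AFF12345

  // Older scripts also overwrote window.status on mouseover; modern browsers ignore it.
  link.addEventListener("mousedown", () => {
    link.href = trackedHref; // swap just before navigation
  });

  link.addEventListener("mouseleave", () => {
    link.href = cleanHref;   // restore so the status bar keeps showing the clean URL
  });
});
```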

Encoding URLs – the use of HTML character entities or URL encoding to obfuscate the tracking links or tracking codes from your visitors. This works because browsers understand the encoding, but humans cannot read it without some work.
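A small TypeScript sketch of both flavors of encoding, with an invented URL and tracking code:

```typescript
// Hypothetical sketch of obfuscating a tracking code with encoding.
// The URL and tracking code are made up for illustration.
const baseUrl = "https://www.example-merchant.com/?affid=";
const trackingCode = "AFF12345";

// Percent-encode the tracking code so it is not obvious in the raw source;
// the browser decodes it transparently when the link is followed.
const percentEncoded =
  baseUrl +
  trackingCode
    .split("")
    .map((ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0"))
    .join("");

// HTML character entities can obfuscate the whole URL inside an href attribute,
// because the HTML parser decodes entities before the URL is parsed.
const entityEncoded = (baseUrl + trackingCode)
  .split("")
  .map((ch) => `&#${ch.charCodeAt(0)};`)
  .join("");

console.log(percentEncoded); // ...?affid=%41%46%46%31%32%33%34%35
console.log(entityEncoded);  // &#104;&#116;&#116;&#112;&#115;...
```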

JavaScript + image links – This is really advanced, and I haven’t seen it used much. The idea is to use JavaScript to capture the onclick event and have the code pull a transparent image before transferring control to the new page. The trick is that the URL of the transparent image is in reality a tracking script that receives the tracking code as a parameter or as part of the URL.
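Here is a hypothetical browser-side sketch in TypeScript. The tracker URL, the CSS class and the parameters are invented; the point is only to show the click being intercepted, the “image” request firing, and control then passing to the clean link:

```typescript
// Hypothetical sketch of the JavaScript + image trick. The clean link is what
// visitors see; on click, a transparent "image" is requested whose URL is
// really a tracking script, and only then does the browser move on.
document.querySelectorAll<HTMLAnchorElement>("a.partner-link").forEach((link) => {
  link.addEventListener("click", (event) => {
    event.preventDefault();

    // The tracking script receives the code as a query parameter and
    // returns a 1x1 transparent GIF, so nothing visible happens.
    const pixel = new Image();
    pixel.src =
      "https://tracker.example.com/t.gif?aff=AFF12345&to=" +
      encodeURIComponent(link.href);

    // Give the pixel request a moment to fire, then follow the clean link.
    window.setTimeout(() => {
      window.location.href = link.href;
    }, 150);
  });
});
```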

Cloaking links from the search engines

These are some of the techniques I’ve seen:

Use of the rel="nofollow" anchor attribute. I would not say this is technically cloaking, but the result is the same: search engines (Google, Yahoo! and Live) will not ‘respect’ those links.

Use of the nofollow and/or noindex meta tags. There is a slight difference between nofollow on an anchor tag and nofollow in the meta robots tag. On the robots meta tag it means “do not follow any of the links on this page”; on the anchor tag it tells the search engine “do not consider this link for your scoring” (this link is not a vote/endorsement).

Crawler user agent check. This consists of detecting the search engine crawler by its HTTP User-Agent header and either hiding the link or presenting the crawler a version of the link with rel="nofollow". Normal visitors will not see this.
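A minimal sketch of this check with Node, Express and TypeScript; the crawler pattern, partner URL and route are assumptions for illustration:

```typescript
// Hypothetical User-Agent check. Requests that look like a crawler get the
// paid link with rel="nofollow"; regular visitors get the plain link.
import express from "express";

const app = express();

const crawlerPattern = /googlebot|slurp|msnbot|bingbot/i;

app.get("/resources", (req, res) => {
  const userAgent = req.get("user-agent") ?? "";
  const isCrawler = crawlerPattern.test(userAgent);

  const partnerLink = isCrawler
    ? `<a href="https://partner.example.com/" rel="nofollow">Our partner</a>`
    : `<a href="https://partner.example.com/">Our partner</a>`;

  res.send(`<html><body><p>Recommended sites:</p>${partnerLink}</body></html>`);
});

app.listen(3000);
```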

Crawler IP check. Black hat SEOs keep lists of search engine crawler IP addresses to make cloaking more effective. Search engine crawlers normally announce themselves via the User-Agent header, but they don’t when they are running cloaking-detection checks; keeping a record of crawler IPs helps detect them anyway.

Once the crawler is detected, the same technique I just mentioned is used to hide the target links.
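Combining the two checks might look roughly like this (Node, Express, TypeScript); the IP addresses below are documentation placeholders, not real crawler IPs:

```typescript
// Hypothetical sketch combining the User-Agent check with a list of known
// crawler IP addresses. Pattern, IPs and partner URL are placeholders.
import express from "express";

const app = express();

const crawlerPattern = /googlebot|slurp|msnbot|bingbot/i;
const knownCrawlerIps = new Set(["192.0.2.10", "198.51.100.25"]); // placeholders

function isCrawler(req: express.Request): boolean {
  const userAgent = req.get("user-agent") ?? "";
  // Treat the request as a crawler if either the User-Agent matches or the
  // request comes from an IP previously recorded as a crawler.
  return crawlerPattern.test(userAgent) || knownCrawlerIps.has(req.ip ?? "");
}

app.get("/resources", (req, res) => {
  const partnerLink = isCrawler(req)
    ? `<a href="https://partner.example.com/" rel="nofollow">Our partner</a>`
    : `<a href="https://partner.example.com/">Our partner</a>`;
  res.send(`<html><body>${partnerLink}</body></html>`);
});

app.listen(3000);
```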

Robots.txt disallow. Disallowing search engine crawlers access to specific sections of your website (i.e., link partner pages) via robots.txt is another way to effectively hide those links from the search engines.
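As a sketch, a robots.txt along these lines would keep compliant crawlers out of a hypothetical /partners/ section:

```
# Hypothetical example: hide the link-partner section from crawlers
User-agent: *
Disallow: /partners/
Disallow: /links/
```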

The use of the robots-nocontent class. This is a relatively new addition (only Yahoo! supports it at the moment). With this class attribute you can tell the Yahoo! crawler that you don’t want it to index portions of a page; hiding link sections this way is another form of link cloaking.

Robots.txt disallow + crawler IP check. I haven’t seen this being used, but it’s technically possible. The idea is to present the search engines a different version of your robots.txt file than you present to users. The version you present to the search engines disallows the sections of your site where the links you want to hide are. You detect the search engine robot either by the user agent or by a list of known robot IP addresses. Note that you can prevent the search crawler from caching the robots.txt file, making detection virtually impossible.
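A sketch of how such a two-faced robots.txt could be served (Node, Express, TypeScript). Detection here uses the same hypothetical User-Agent pattern and placeholder IP list as above, and a Cache-Control header discourages caching of the file:

```typescript
// Hypothetical sketch of serving two different robots.txt files. Crawlers get
// a version that blocks the link-partner section; everyone else gets a
// permissive one. Paths, pattern and IPs are placeholders.
import express from "express";

const app = express();

const crawlerPattern = /googlebot|slurp|msnbot|bingbot/i;
const knownCrawlerIps = new Set(["192.0.2.10"]); // placeholder

app.get("/robots.txt", (req, res) => {
  const userAgent = req.get("user-agent") ?? "";
  const crawler = crawlerPattern.test(userAgent) || knownCrawlerIps.has(req.ip ?? "");

  res.type("text/plain");
  // Discouraging caching makes it harder to compare the two versions later.
  res.set("Cache-Control", "no-store");

  if (crawler) {
    res.send("User-agent: *\nDisallow: /partners/\n");
  } else {
    res.send("User-agent: *\nDisallow:\n");
  }
});

app.listen(3000);
```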

Now, as I said before, I am not exposing these techniques to promote them. On the contrary, here’s why it’s important to detect them.

The best way to detect cloaked links is to look closely at the HTML source and the robots.txt file, and especially at the cached versions of those files. If you are buying or trading links for organic purposes (assuming you don’t get reported as spam by your competitors), don’t buy or trade links that use any of these techniques, or that prevent the search engines from caching the robots.txt file or the page in question.
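One quick first-pass check is to fetch the page (or robots.txt) twice, once with a browser User-Agent and once with a crawler-like one, and compare the responses. A rough TypeScript sketch, assuming Node 18+ for the built-in fetch and an invented target URL; keep in mind that sites detecting crawlers by IP will pass this test even when they are cloaking:

```typescript
// Hypothetical detection sketch: fetch the same URL as a browser and as a
// crawler-like User-Agent, then compare. A difference between the two
// responses is a strong hint that the page is cloaked.
const url = process.argv[2] ?? "https://www.example.com/resources";

async function fetchAs(userAgent: string): Promise<string> {
  const res = await fetch(url, { headers: { "User-Agent": userAgent } });
  return res.text();
}

async function main(): Promise<void> {
  const asBrowser = await fetchAs(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
  );
  const asCrawler = await fetchAs(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
  );

  if (asBrowser === asCrawler) {
    console.log("Both versions are identical; no cloaking detected at this level.");
  } else {
    console.log("The crawler and browser versions differ; inspect both sources");
    console.log("(and the cached copy in the search engine) before buying the link.");
  }
}

main().catch((err) => console.error(err));
```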

Hamlet Batista

Chief Executive Officer

Hamlet Batista is CEO and founder of RankSense, an agile SEO platform for online retailers and manufacturers. He holds US patents on innovative SEO technologies, started doing SEO as a successful affiliate marketer back in 2002, and believes great SEO results should not take 6 months.
