Legitimate cloaking — real world example and php source code

by Hamlet Batista | June 26, 2007 | 5 Comments

It's been a while since I posted some juicy source code. This time, I am going to explain the infamous black hat technique known as cloaking with some basic PHP code.

While most people think of cloaking as evil (asking for search engines to penalize your site), there are circumstances where it is perfectly legitimate and reasonable to use it.

From Google quality guidelines:

Make pages for users, not for search engines. Don't deceive your users or present different content to search engines than you display to users, which is commonly referred to as "cloaking."

What is cloaking?
It is the use of some clever, dynamic code to present different content to search engines than which is presented to users. Black hats use this to present optimized (keyword stuffed) content to the search engine spiders and sales/affiliate pages to users. Using it this way is potentially very risky; as, there are ways for search engine quality engineers to identify this easily, once reported.

Yesterday, Shoemoney reported about his experience in typing some technical questions in Google and finding links to the answers in ExpertsExchange. The interesting thing is that part of the answer is visible on the SERPs (search engine result pages), but once you land on the website you are presented with a login/subscription screen. I am sure you have probably experienced something similar with the New York Times Online, and some of the other news subscription services sites as well. They provide the real content to the search bots (in order to get the search referrals), and a subscription screen to the user. These are legitimate ways where cloaking can be used. Note that they are not trying to manipulate rankings,they are simply trying to increase their sign-ups.

The clever Jeremy figured it out by using the Google cache. He did not have to register with Experts Exchange, and received access to the full content 🙂

That is exactly how your competitors and the search quality engineers can tell you are cloaking. In order to avoid this, you only need to use this meta tag on the cloaked pages:

<meta name="robots" content="nocache">

This tells  search engines to remove the evidence that you are cloaking.

Now to the best part.

How can you implement cloaking in your pages?

Depending on the detection method, you can cloak using two techniques: detecting user agent or detecting robot IP address.

Detecting robot user agent. The dynamic code checks the HTTP_USER_AGENT that is passed from the web server. If the user agent matches a known robot, it displays the content to be cloaked, the page intended for the user otherwise.

Detecting search engine robot IP. The dynamic code checks the HTTP_REMOTE_ADDR that is passed from the web server. If the IP address matches a known robot, it reveals the content to be cloaked or the page intended for the user. You can compile the user agent and IP lists by studying your log files (look for hits to robots.txt), or you can use the lists compiled by other webmasters:
http://www.user-agents.org/ List of Search Engine Robots' User Agents
http://iplists.com/ List of Search Engine Robots' IP addresses

Here is the PHP source code. Enjoy!

cloaking.zip

Hamlet Batista

Chief Executive Officer

Hamlet Batista is CEO and founder of RankSense, an agile SEO platform for online retailers and manufacturers. He holds US patents on innovative SEO technologies, started doing SEO as a successful affiliate marketer back in 2002, and believes great SEO results should not take 6 months

5

REPLIES

Try our SEO automation tool for free!

RankSense automatically creates search snippets using advanced natural language generation. Get your free trial today.

OUR BLOG

Latest news and tactics

What do you do when you’re losing organic traffic and you don’t know why?

Getting Started with NLP and Python for SEO [Webinar]

Custom Python scripts are much more customizable than Excel spreadsheets.  This is good news for SEOs — this can lead to optimization opportunities and low-hanging fruit.  One way you can use Python to uncover these opportunities is by pairing it with natural language processing. This way, you can match how your audience searches with your...

READ POST
Making it easier to implement SEO changes on your website

Changes to the RankSense SEO rules interface

As we continue to improve the RankSense app for Cloudflare, we are always working to make the app more intuitive and easy to use. I'm pleased to share that we have made significant changes to our SEO rules interface in the settings tab of our app. It is now easier to publish multiple rules sheets and to see which changes have not yet been published to production.

READ POST

How to Find Content Gaps at Scale: Atrapalo vs Skyscanner

For the following Ranksense Webinar, we were joined by Antoine Eripret, who works at Liligo as an SEO lead. Liligo.com is a travel search engine which instantly searches all available flight, bus and train prices on an exhaustive number of travel sites such as online travel agencies, major and low-cost airlines and tour-operators. In this...

READ POST

Exciting News!
seoClarity acquires RankSense

X