It's been a while since I posted some juicy source code. This time, I am going to explain the infamous black hat technique known as cloaking with some basic PHP code.
While most people think of cloaking as evil (an invitation for search engines to penalize your site), there are circumstances where it is perfectly legitimate and reasonable to use it.
Google's webmaster guidelines state: Make pages for users, not for search engines. Don't deceive your users or present different content to search engines than you display to users, which is commonly referred to as "cloaking."
What is cloaking?
It is the use of some clever, dynamic code to present different content to search engines than what is presented to users. Black hats use this to present optimized (keyword-stuffed) content to the search engine spiders and sales/affiliate pages to users. Using it this way is potentially very risky, as there are ways for search engine quality engineers to identify it easily once it is reported.
Yesterday, Shoemoney wrote about typing some technical questions into Google and finding links to the answers on Experts Exchange. The interesting thing is that part of the answer is visible on the SERPs (search engine result pages), but once you land on the website you are presented with a login/subscription screen. You have probably experienced something similar with the New York Times Online and some of the other news subscription sites as well. They provide the real content to the search bots (in order to get the search referrals), and a subscription screen to the user. These are legitimate uses of cloaking. Note that they are not trying to manipulate rankings; they are simply trying to increase their sign-ups.
The clever Jeremy figured it out by using the Google cache. He did not have to register with Experts Exchange, and received access to the full content 🙂
That is exactly how your competitors and the search quality engineers can tell you are cloaking. To avoid this, you only need to use this meta tag on the cloaked pages:
<meta name="robots" content="noarchive">
This tells search engines not to keep a cached copy of the page, which removes the evidence that you are cloaking.
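As a minimal sketch of how that tag might be emitted only on cloaked pages, assuming a hypothetical helper named robots_meta_tag() and a flag your own code sets (neither comes from any library), and using the noarchive value that Google documents for suppressing the cached copy:

```php
<?php
// Hypothetical helper: returns the robots meta tag for cloaked pages
// and an empty string for every other page.
function robots_meta_tag(bool $isCloakedPage): string
{
    return $isCloakedPage
        ? '<meta name="robots" content="noarchive">'
        : '';
}
```

You would echo the result inside the page's head section, for example echo robots_meta_tag(true); on a cloaked page.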
Now to the best part.
How can you implement cloaking in your pages?
Depending on the detection method, you can cloak using two techniques: detecting user agent or detecting robot IP address.
Detecting robot user agent. The dynamic code checks the HTTP_USER_AGENT that is passed from the web server. If the user agent matches a known robot, it displays the content to be cloaked; otherwise, it displays the page intended for the user.
Detecting search engine robot IP. The dynamic code checks the REMOTE_ADDR that is passed from the web server. If the IP address matches a known robot, it displays the content to be cloaked; otherwise, it displays the page intended for the user. You can compile the user agent and IP lists by studying your log files (look for hits to robots.txt), or you can use the lists compiled by other webmasters:
http://www.user-agents.org/ List of Search Engine Robots' User Agents
http://iplists.com/ List of Search Engine Robots' IP addresses
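The two checks described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the short token list, and the IP addresses (placeholders from the reserved documentation range) are all made up for the example, and a real setup would load the lists from your logs or from the compiled lists above.

```php
<?php
// Sample data only: a few well-known robot user agent tokens and
// placeholder IPs from the 192.0.2.0/24 documentation range.
const ROBOT_AGENT_TOKENS = ['googlebot', 'slurp', 'msnbot'];
const ROBOT_IPS = ['192.0.2.10', '192.0.2.11'];

// True if the user agent contains any listed token (case-insensitive).
function is_robot_user_agent(string $userAgent, array $tokens = ROBOT_AGENT_TOKENS): bool
{
    foreach ($tokens as $token) {
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }
    return false;
}

// True if the remote address exactly matches a listed robot IP.
function is_robot_ip(string $remoteAddr, array $ips = ROBOT_IPS): bool
{
    return in_array($remoteAddr, $ips, true);
}

// Decides which page to serve from the request's server variables.
// Returns 'robot' if either check matches, 'user' otherwise.
function choose_page(array $server): string
{
    $agent = $server['HTTP_USER_AGENT'] ?? '';
    $ip    = $server['REMOTE_ADDR'] ?? '';

    return (is_robot_user_agent($agent) || is_robot_ip($ip)) ? 'robot' : 'user';
}
```

In a page script you would call choose_page($_SERVER) and include whichever template matches the result. The user agent check is the easier of the two to fool, since anyone can send a robot's user agent string, which is why the IP check is shown alongside it.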
Here is the PHP source code. Enjoy!