How to Act Like an SEO Expert: Four mistakes to avoid when performing SEO experiments

by Hamlet Batista | March 07, 2008 | 10 Comments

In yesterday’s post I explained my creative process for uncovering new and interesting search marketing ideas. In this post I want to focus on the other critical element toward becoming an expert: endless experimentation. Of course testing must be done carefully to avoid arriving at the wrong conclusions, which will bring us to another of my favorite topics: human error.
As I like to do, let me explain my process with an actual example.
Last month there was an interesting post on SEOmoz about session IDs and HTTP cookies. In the post, Rand asserted that search engines don’t support cookies, and that cookies are therefore another way to control robot access to a site. Very clever; I don’t know why I didn’t think of that first! 🙂
Well, in the comments, King questioned the validity of the original assumption that search engines don’t accept cookies. Here is what he had to say:

I’m not sure its [sic] really true that search engines (Google at least) don’t accept cookies. I recently (well 6 months ago) created a site that checks for cookies before allowing customers access to the shopping cart. If cookies are disabled it sends the user to a[n] info page on the topic Google indexed the actual shopping cart page perfectly well, they totally bypassed the “cookie info” page, and never indexed that at all. Cookie checking was done entirely via PHP code.

For a while I had assumed that Google does not support cookies, but the truth is that search engines are constantly being improved and have evolved over the years. For instance, years ago search engine crawlers did not follow links embedded in JavaScript, but recent experiments have shown that Google, at least, does follow the less intricate ones.
So, this was a perfect candidate for a simple experiment. Let’s confirm whether search engines accept cookies or not. As best I can, I like to follow the scientific method.
The observation
In order to determine whether or not search engines accept cookies, I configured my web server to append cookie information to my visitor log file. If you use Apache as your web server, this is how you do it. Under your website configuration, change your log format to include HTTP cookie information, like this:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\""
The reason I chose logs for my observation is that search engines do not execute the JavaScript tags commonly used by web-based analytics packages. I need to see the behavior of the robots on the site, so my logs are the most logical option. An alternative would be to use a packet sniffer such as tcpdump, but sniffers spit out far more information than I need, and parsing web server logs with regular expressions is very simple and straightforward. There is no need to complicate things.
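To make that concrete, here is a minimal sketch of what such a log check could look like in Python. It assumes the combined log format with the %{Cookie}i field appended as configured above; the log file name and the list of crawler user-agent substrings are illustrative, not part of the original setup.

import re

# Combined log format with "%{Cookie}i" appended, as configured above.
# Fields: host, identity, user, time, request, status, bytes, referer, user agent, cookie.
LOG_RE = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) '
    r'"([^"]*)" "([^"]*)" "([^"]*)"$'
)

# Substrings that identify the major crawlers in the User-Agent field (illustrative).
BOTS = ("Googlebot", "Yahoo! Slurp", "msnbot")

def check_bot_cookies(path):
    """Print one line per crawler hit, noting whether a Cookie header was sent back."""
    with open(path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = LOG_RE.match(line.strip())
            if not match:
                continue
            host, _, _, when, request, *_, agent, cookie = match.groups()
            if any(bot in agent for bot in BOTS):
                sent = "no cookie" if cookie in ("-", "") else "cookie: " + cookie
                print(when, host, request, "->", sent)

check_bot_cookies("access.log")  # hypothetical path to the log file

Running something like this against the log prints one line per crawler visit, showing "no cookie" whenever the last field was logged as "-".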
First, I check the log for regular user visits to the site (especially the pages that return HTTP cookies) and confirm that the cookies are being logged when the user accepts (and returns) them.
195.62.206.192 – – [01/Mar/2008:03:03:04 -0500] “GET /2007/11/13/game-plan-what-marketers-can-learn-from-strategy-games/ HTTP/1.1” 200 61477 “http://hamletbatista.com/2007/11/13/game-plan-what-marketers-can-learn-from-strategy-games/” “Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; MediaCenter PC 5.0; .NET CLR 3.0.04506; InfoPath.2)” “__utma=205505417.902185886.1204356928.1204356928.1204356928.1; __utmb=205505417; __utmc=205505417; __utmz=205505417.1204356928.1.1.utmccn=(organic)|utmcsr=google|utmctr=link%3Awww.mutinydesign.co.uk|utmcmd=organic; fbbb_=1343817243.1.1204357560473; subscribe_checkbox_88ce75a961c252a943f6a63bd04c8d5d=unchecked; comment_author_88ce75a961c252a943f6a63bd04c8d5d=Webeternity+web+design; comment_author_email_88ce75a961c252a943f6a63bd04c8d5d=goodsite%40webeternity.co.uk; comment_author_url_88ce75a961c252a943f6a63bd04c8d5d=http%3A%2F%2Fwww.webeternity.co.uk”
Here most of the cookies are from Google Analytics (the ones that start with __utm, such as __utma and __utmz).
Side note: Here I confirmed my suspicion. My loyal reader David Hopkins is responsible for the large amount of manual comment spam I’m receiving lately. Apparently his competitors want to rank top ten in Google for “web design” too 🙂
If search engines support cookies, they should send them back with subsequent requests, and the server will log them in the entry for that visit.
Now, let’s see what happened when each of the top search engines visited the site.
Google – no cookies logged

74.52.123.218 – – [04/Mar/2008:00:20:56 -0500] “GET /2007/11/13/game-plan-what-marketers-can-learn-from-strategy-games/ HTTP/1.1” 200 61477 “-” “Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)” “-”
Yahoo – no cookies logged
74.6.28.203 – – [01/Mar/2008:19:18:36 -0500] “GET /2007/11/13/game-plan-what-marketers-can-learn-from-strategy-games/ HTTP/1.1” 200 61477 “-” “Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)” “-”
MSN/Live – no cookies logged
65.55.209.101 – – [02/Mar/2008:06:22:15 -0500] “GET /2007/11/13/game-plan-what-marketers-can-learn-from-strategy-games/ HTTP/1.1” 200 61477 “-” “msnbot/1.0 (+http://search.msn.com/msnbot.htm)” “-”
As you can see from the log, none of the top search engines returned the cookies and hence the web server didn’t log them.
Formulation of the hypothesis
From direct observation I can conclude that, as of this moment, top search engines do not support HTTP cookies.
Use of the hypothesis to predict the existence of other phenomena
I can therefore predict that it’s possible to use cookies to control or modify the access of robots to my website. A lot of creative things can be done using this technique.
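To illustrate the idea (not the exact setup from the SEOmoz post, nor King’s PHP check), here is a minimal sketch in Python using the standard library’s WSGI server. The gated section prefix and the cookie name are made up for illustration: visitors whose browsers return the test cookie get through to the gated section, while crawlers, which apparently never send cookies back, stay on the landing page.

from http.cookies import SimpleCookie
from wsgiref.simple_server import make_server

GATED_PREFIX = "/members"     # hypothetical section to keep robots out of
COOKIE_NAME = "cookies_ok"    # hypothetical test cookie

def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))

    if path.startswith(GATED_PREFIX) and COOKIE_NAME not in cookies:
        # No cookie came back: set one so browsers can retry, while crawlers
        # (which do not return cookies) never get past this landing page.
        start_response("200 OK", [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Set-Cookie", COOKIE_NAME + "=1; Path=/"),
        ])
        return [b'<p>Cookies are required. <a href="/members">Continue</a></p>']

    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [("Content for " + path).encode("utf-8")]

make_server("", 8000, app).serve_forever()

The same check could just as easily be done in PHP or at the web server level; the point is simply that anything behind the cookie check stays invisible to clients that do not return cookies.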
Performance of experimental tests of the predictions by several independent experimenters
Here is my call to you to duplicate these tests on your site and report back whether you get the same results. This is the step that most experimenters miss. You need to share your findings with your peers and the exact procedure you used to arrive at your conclusions, and invite them to test as well and see if they arrive at the same results. “But why do we need to do this?” you might ask. It’s because of human error.
Incorporating Human Error
We are imperfect and we make mistakes. I first learned this lesson years ago in a physics class I took in high school. The teacher taught us to repeat each measurement several times, average the results, and use the lowest and highest values as limits. The idea is that we must face the fact that we won’t know the exact value, but we can determine a pretty accurate range. The concept of human error in that class was so interesting that I have never forgotten about it (as you can see :-)).
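As a tiny sketch with made-up readings (not data from that class), the procedure boils down to this:

readings = [9.6, 9.8, 10.1, 9.7, 9.9]   # hypothetical repeated measurements

estimate = sum(readings) / len(readings)      # best estimate: the average
lower, upper = min(readings), max(readings)   # lowest and highest values as limits

print("estimate =", round(estimate, 2), "range =", (lower, upper))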
Human error is not limited to just taking measurements. There are many psychological issues as well. Here are four common mistakes that I regularly make and that I see others making when they come up with new SEO theories:

1. Bias. Many times when you are testing a theory you already want it to be true or you want it to be false. It is very hard to start experimenting without some prejudice about what you expect the outcome to be. At the same time, it is just as easy to ignore evidence that runs contrary to your desired outcome. Sometimes you want to believe so badly that you ignore what the data is actually telling you. This is particularly true when you are testing just to prove a point or to prove somebody else wrong.

2. Failing to estimate the errors in the experiment. As I explained above, there will always be a margin of error, and we need to account for it or risk losing the entire value of the experiment.

3. Failing to repeat the experiment under different scenarios and circumstances. I have to admit that this is one mistake I make too often. I guess I am too lazy to repeat my experiments, but nonetheless I know very well the importance of being able to repeat and confirm the conclusions. It is particularly important that the test be duplicated by your peers who will hopefully have different biases than you.

4. Identifying symptoms as diseases. One of the disadvantages that we (SEOs) have is that search engines are black boxes and we don’t know for sure what is going on inside them. We can see the search results and study patterns to arrive at conclusions, but many of the observations we make are mistakenly labeled as penalties, for example. It’s easy to jump headlong in the wrong direction. I like to draw this parallel: imagine telling your doctor that you have a headache and he returns with, “Oh, you must have a brain tumor.” There are probably thousands of diseases that share a headache as a symptom, and only further tests are going to get at the right diagnosis. The same happens when you observe search results or search engine robot behavior. There are probably hundreds of reasons why a particular result has changed, including something as simple as a random search engine glitch. To be on the safe side, I simply ask myself a common-sense question: “What would be the purpose of the search engine doing this? Does this help them do a better job, or am I misinterpreting the results?”

Experimenting and testing theories is what separates experts from the pretenders. You need to be highly skeptical of any new concept unless you can see solid proof or you can test it yourself. I’ve witnessed many interesting ideas and concepts unearthed, not because of deep research or deduction, but by observation and trial and error. It’s important to have a receptive mind. Try to avoid these mistakes when you are doing your SEO experiments and I am sure you will become a stellar SEO expert in no time!

Hamlet Batista

Chief Executive Officer

Hamlet Batista is CEO and founder of RankSense, an agile SEO platform for online retailers and manufacturers. He holds US patents on innovative SEO technologies, started doing SEO as a successful affiliate marketer back in 2002, and believes great SEO results should not take 6 months.
