There has been a heated debate on Sphinn about a controversial post by Rand Fishkin of SEOmoz. There is a lot to learn from that discussion, but instead of focusing on the debate, I want to talk about something that keeps coming up: Google's Sitelinks.
Google doesn't provide a lot of information, but this is what they say about the matter:
- Sitelinks are presented if they are found to be somehow useful.
- A site’s structure allows Google to find good Sitelinks.
- The process of selection, creation and presentation of Sitelinks is fully automated.
Let's forget the technical details for the moment and focus on what Google's purpose is here: they want to save users some clicks by pointing them to the right page directly in the search results. Sitelinks appear only for the first result, and only for sites with meaningful traffic. (Google uses the toolbar data of visitor frequency to make this determination.)
I decided to dig deeper: study the sources, try some examples of my own, and draw my own conclusions. I'd definitely like to have Sitelinks when people search for my blog, and I'm sure many of my readers here would like the same. Here’s what I learned…
The Sitelink Debate
Michael VanDeMar hints that Google's Sitelinks are an indication of a site's authority in Google's eyes. Danny Sullivan is not too convinced and points to some curious examples. Essentially, he notes that some sites that would be considered 'bad' by Google, such as ThreadWatch and TextLinkBrokers, have Sitelinks. Conversely, he shows that Volkswagen's website, which is certainly an authority for VW terms, does not show Sitelinks when you search for 'vw' or 'volkswagen.'
There are some weak points in Danny's logic. For example, when you search for “Volkswagen of America,” the Sitelinks do appear. Also, after reading Google's help pages about Sitelinks, it is clear that they are generated algorithmically, not manually, so it doesn't really matter if Google engineers don't like a particular website such as ThreadWatch. I bet they don't like Aaron Wall's either, and it has Sitelinks. Computers are clearly making the decisions.
Bill Slawski found a patent that sheds even more light on how Sitelinks are chosen. I highly recommend you read the whole post on his site. For my purposes here, I am just quoting part of his conclusion:
It’s interesting, but not terribly surprising, that so much of the generation of these additional links are based upon user-behavior based information. The patent does note that it is only the top result they are showing these additional links for, so to have lists like this appear, it’s helpful to rank pretty well.
Beyond being number one, the first step in getting Google to show additional links from your site may be to get lots of traffic to your pages. It’s hard to tell how much is enough, but it has to be enough for them to think that this will be a good user experience for searchers to list those pages.
The second may be to have a core group of pages that tend to get visited more than other pages of the site – the only reason to list pages like this is if you are helping make it easier for searchers find what they may be looking for.
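To make Bill's two steps concrete, here is a minimal Python sketch of how a selection rule driven by visit counts could work. It is only an illustration: the function name `pick_sitelinks`, the thresholds, and the sample numbers are all hypothetical, and nothing here is taken from Google or from the patent, which does not disclose actual values.

```python
from typing import Dict, List


def pick_sitelinks(
    page_visits: Dict[str, int],
    min_site_visits: int = 10_000,  # hypothetical "enough traffic" threshold
    min_share: float = 0.05,        # hypothetical share of visits a page needs
    max_links: int = 4,             # hypothetical cap on links shown
) -> List[str]:
    """Return internal pages (homepage excluded) that might qualify as Sitelinks.

    page_visits maps an internal URL path to its number of visits.
    """
    total = sum(page_visits.values())

    # Step 1: the site as a whole needs enough traffic.
    if total == 0 or total < min_site_visits:
        return []

    # Step 2: keep only the core group of pages that attract a
    # meaningful share of the site's visits.
    popular = [
        url for url, visits in page_visits.items()
        if visits / total >= min_share
    ]

    # List the most visited pages first.
    popular.sort(key=lambda url: page_visits[url], reverse=True)
    return popular[:max_links]


if __name__ == "__main__":
    # Made-up numbers for an imaginary site.
    visits = {
        "/about": 6000,
        "/services": 9000,
        "/blog": 4000,
        "/privacy": 100,
        "/contact": 3000,
    }
    print(pick_sitelinks(visits))
```

With these made-up numbers, the rarely visited `/privacy` page is dropped and the remaining pages are listed from most to least visited. A site whose total visits fall below the threshold gets no links at all, no matter how its pages compare with each other.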
Fun with Sitelinks
Now for the interesting part. Let's see how popular SEO blogs are doing in terms of Sitelinks. Rand managed to gather analytics data last year from a sizeable group. Assuming that most of the sites have grown in similar proportion, I performed obvious searches to see if there were sites that do not have navigational links. I found it interesting that Matt Cutts, Pronet Advertising, Search Engine Journal, and several others don't have Sitelinks. Those sites got more traffic (at least last year) than Jim Boykin's, which does have Sitelinks.
Let’s face it, Sitelinks take up valuable screen real estate that we all want. Having Sitelinks is like having the first 5 or 6 positions in the SERP (search engine result page). You won't get a higher click-through rate than that. I don't know about you, but after considering all this, it’s hard to argue that Sitelinks are not a clear indication of a site's quality and authority.
But that’s not the whole story. After carefully examining those same SEO sites, it became clear to me that there is a reason why they don't have Sitelinks. They don't have site navigation that encourages people to click through to other pages in the site. I'm sure most people visiting Matt's or Pronet's blogs are mostly interested in the posts; other pages simply don't get enough visits for Google to consider them useful enough to include in search results.
I think that Sitelinks are a good signal of the authority and quality of a site, but not necessarily the inverse. The fact that a site doesn't have Sitelinks does not mean that it is inferior or unimportant. If you want to get Sitelinks for your blog, the first step, as Bill recommends, is to increase the traffic to it; the second is to make sure you have a set of links that your visitors will consider going to as soon as they land.
This makes me think that I should probably move my recommended reading block above the fold 😉
Geoff
August 9, 2007 at 3:11 am
I guess I never thought about sitelinks before. You make a VERY good point, however, about the power they have in offering so many spots in your site right at the top. Viva la nav!
Michael VanDeMar
August 9, 2007 at 6:57 am
Hamlet, you make this claim: <blockquote>(Google uses the toolbar data of visitor frequency to make this determination.)</blockquote> Can you cite any sources on that one? Thanks.
Hamlet Batista
August 9, 2007 at 8:29 am
Michael - Thanks for your comment. My source is the patent. I noticed that the link to the patent was broken, sorry. It is fixed now. Here is your answer:
<blockquote>A method, comprising: receiving a search query from a user; generating search results based on the search query; identifying a plurality of web pages associated with at least one of the search results <strong>based on a quality factor associated with the plurality of web pages</strong>, wherein the plurality of web pages and a web page associated with the at least one search result comprise web pages in a same web site; and providing the search results and a plurality of links associated with the plurality of web pages to the user.</blockquote>
What is this quality factor?
<blockquote>4. The method of claim 1, wherein <strong>the quality factor is based on a number of times each of the plurality of web pages has been accessed</strong>.
5. The method of claim 4, wherein the providing a plurality of links includes: providing the plurality of links <strong>in an order where a link to a more frequently accessed one of the plurality of web pages is listed before a link to a less frequently accessed one of the plurality of web pages</strong>.
6. The method of claim 1, wherein the quality factor is based on an amount of time a plurality of users have viewed each of the plurality of web pages.
7. The method of claim 1, wherein the quality factor is based on a number of web pages with a link pointing to one of the plurality of web pages.
8. The method of claim 1, wherein the quality factor is based on a score associated with how closely the search query matches information contained on each of the plurality of web pages.
9. The method of claim 1, wherein the quality factor is based on items purchased via each of the plurality of web pages.
10. The method of claim 1, wherein the quality factor is based on the user's prior behavior with respect to the plurality of web pages.</blockquote>
How can they tell how many times a particular page has been accessed, how long a user stays on a page, etc.?
<blockquote>[0042] Processing may begin by log processing system 125 receiving data via network 140 (act 610). For example, front end 310 may receive data when clients 110 access various web sites. <strong>In one implementation, assume that users have downloaded/installed a toolbar on their respective clients 110 that facilitates web searches on a search engine, such as search engine system 135.</strong> In this case, <strong>the toolbar may include software code that instructs a client 110 to send hypertext transfer protocol (HTTP) requests to server 120 for each web page that client 110 accesses. FE 310 may use the information in the HTTP request to identify the particular web page and web site associated with the web page that client 110 has accessed. Alternatively, FE 310 may receive similar data when clients 110 click on links provided by search engine system 135. In addition, the information received from clients 110 may enable FE 310 to identify other information associated with web site accesses, such as an amount of time a client 110 accesses a particular web page, whether client 110 scrolled through the particular web page, whether a purchase was made via the particular web page, etc.</strong></blockquote>
I hope this helps.
Mutiny Design
August 9, 2007 at 1:24 pm
You make a good point about the sitelinks giving you a good dose of SERP real estate, although since sitelinks generally come up for a company's name it probably doesn't make much difference. Obviously, as something that is determined by an algorithm, the quality of the site in Google's eyes is going to be a bit of a loose factor. Throwing up a few here that will almost definitely be seen as low quality sites but have sitelinks:
- wank.net - youtube for porn, including the sitelink "Skank Gets Her Ass F*cked By" - PR4
- crackzplanet.com - PR4
- <a href="http://www.warez.com" rel="nofollow">www.warez.com</a> - PR6
- <a href="http://www.hightimes.com" rel="nofollow">www.hightimes.com</a> - PR5
All of the above have good traffic and the visitors probably spend a lot of time on the site, but then Dean Edwards doesn't have them with a PR8 and undoubtedly a low bounce rate and high traffic. Messy URLs don't seem to cause any problems either: al-jazera
So, a bit of a mystery.
Hamlet Batista
August 12, 2007 at 5:07 am
Mutiny - As I said, computers are not smart enough to tell what is good/ethical vs what isn't. That is one of the drawbacks of full automation. <blockquote> although since sitelinks generally come up for a company’s name it probably doesn’t make much difference.</blockquote> It is good because searchers will automatically assume the site/brand can be trusted.
Heather Paquinas
August 12, 2007 at 9:23 am
Those sites are important in their respective neighborhoods though, right? Regarding jim boykin having sitelinks vs matt cutts, pronetadvertising, and sel having none, perhaps jim boykin reaches more newbie-nontechnical readers (eg people typing in jim boykin into search engines) than those other sites; although a flip side of that is: whoever clicks the links "Microsoft Land" or "Google Land" while on SEL? They aren't that useful/aren't a destination. matt cutts blog is the only destination on his site as well. Maybe it's both, a correlation of volume of searches combined with toolbar clicks. Will we ever know with certainty? Probably not.
Hamlet Batista
August 13, 2007 at 5:17 am
Heather - I think the reason those sites don't have sitelinks is that they don't have navigation (matt cutts) or the navigation on their sites is too weak (pronet) and doesn't encourage clicking. SEL does have sitelinks; search for 'search engine land'.
bestoptimized
August 28, 2007 at 3:56 am
I agree - having sitelinks doesn't make you an authority. People are currently using them as a sign of authority especially with directories.
Hamlet Batista
August 28, 2007 at 2:11 pm
bestoptimized - thanks for your comment. Authority or not, I'd say they are really nice to have. Don't you think?
Florchakh
August 10, 2007 at 4:33 am
It seems the third paragraph of the quote included in your post is a fairy tale; the other ones are just probable... What does Matt Cutts say about sitelinks? Has anybody ever asked him about this issue?
Hamlet Batista
August 12, 2007 at 5:09 am
Bart - It would seem so until you actually read the patent.
Mutiny Design
August 14, 2007 at 12:46 am
One common trait I have noticed among sites with sitelinks is that they have incoming links spread throughout their site, not just on the homepage.