My loyal reader Jez asks a very interesting question. I am sure others in the same situation are wondering about it too.
<blockquote>Finally, I am in the process of creating multiple sites around a similar theme. I have unique content for all sites, and will host on different servers in Europe and the US; however, the whois for each domain will show my name (the company I used does not allow me to hide this info). Is the common whois likely to make much difference when I begin cross linking the sites?</blockquote>
Cross linking (or reciprocal linking) on a small scale (maybe 10 to 15 sites maximum) should not be a major concern. I've seen many sites do it, and they rank for highly competitive phrases. Most of their link juice comes from non-cross-linked sites, though.
When you try to do this on a massive scale, things start to get interesting. I know this from experience.
Back in 2003 and 2004, I managed to get a couple of my sites ranking on Google for "Viagra" and most of its variations. That is one of the most competitive industries, because you make really good money as an affiliate. I got those rankings exclusively through link exchanges. Being a developer, I created scripts to 'borrow' links from my competitors' link directories and later trade links with my sites. When I hit the 5,000-link mark, my sites got banned and I dropped in all my rankings. Back then, Google was not as sophisticated as it is now.
Later, I carefully studied competitors that were doing a more advanced type of cross linking. They created large networks of sites that they owned, with complex interlinking structures to boost the rank of a few of their sites for highly competitive terms. Pair.com was a common web host for these networks, as it provided IP addresses in different class C blocks.
That worked well for a while, until Google became a domain registrar. It is illegal to use fake domain registration information, and with access to domain ownership data Google could more easily identify complex cross linking. I think they became a registrar for that sole purpose. I don't see them selling domains in the future. They haven't yet. Have they?
Making your cross-linked domains' registration private won't help much either. I think registrars have access to the real information anyway, but even if I am wrong, it would look suspicious for all of your site's inbound links to come from privately registered domains.
There are far more complex cross linking schemes in which a few owners cooperate to create massive collections of websites with well-planned link boosting structures. The funny thing is that search engine researchers have already identified most of them. Check the paper "Link Spam Alliances"; it is a very interesting read. So, if you want to cross link on a massive scale, you had better have a very intricate linking plan to avoid detection.
Tom
June 20, 2007 at 6:08 am
Hi Hamlet, Great post, and that PDF you link to looks fascinating (I knew my maths degree would come in handy at some point in my career!), although I haven't had time to read the whole thing yet. I'd agree with your overall point about crosslinking on a mass scale; however, on a smaller scale, between say 5 and 10 sites on relevant topics, I would recommend a certain amount of crosslinking to help all of them improve their rankings, particularly if you manage the anchor text intelligently. I don't think you need to worry about the registrant information until you start linking on a larger scale.
Hamlet Batista
June 20, 2007 at 1:15 pm
Tom, Thanks for your comment. Now I know where to look for help with the math :-) It is remarkable how much information we can dig out of research papers.
Jez
June 20, 2007 at 2:20 pm
Thanks for answering my question regarding whois.... I was thinking of creating something similar to the smaller networks illustrated in that document. It still does not get to the core of how you separate a spam network from an organic one. It relies on the assumption that those patterns are only used by spammers... that they do not occur organically. If sites with these patterns are penalized by search engines, then spammers will move toward something more organic.... eventually they will emulate very closely the "genuine" structure of the internet. This will mean they lose part of their advantage (an optimised structure). The spammers will counter this by building more and more spam sites that look organic....
Hamlet Batista
June 20, 2007 at 9:47 pm
Jez, You are right. I think it is going to be a never-ending battle. The real challenge will come if/when Google makes personalized search the default. That could potentially change the whole picture.
Jez
June 21, 2007 at 1:58 pm
I do not know much about personalised search, but I assume Google will make assumptions about what sites users are likely to like based on the behaviour of other users.... If this is the case, then spammers will run multiple IP addresses and bots to create associations between popular phrases and their sites.... better still, infect other users' machines with bots... as is done by DDoS attackers... I read an article some time back where the author suggested that Google would be damaged by social media sites... at the time I thought that was rubbish, as those sites are so easily gamed, quickly falling under the control of syndicates.... like DMOZ before them... anything that relies on people / human behaviour as opposed to algorithms will inevitably become corrupt and lose credibility... IMO
Hamlet Batista
June 21, 2007 at 5:41 pm
Jez, I have to agree with what you are saying.
Matt Wardman
June 22, 2007 at 10:48 am
You say a network of 10-15 sites should be OK in search engines, while large networks are questionable. What about large blog networks? For example, the likes of B5Media have a cross-network blogroll with around 200 sites. Why are they not banned? Are blogs different, or is B5Media different? Cheers
Hamlet Batista
June 22, 2007 at 11:19 am
Matt, Thanks for your comment. Please note that my advice relates to deceptive practices. The only reason, IMHO, Google would ban a site or network is if there is something deceptive or manipulative. Creating a bunch of sites hosted on different servers for the sole purpose of boosting the rankings of one or a few sites is one typical case. In the case of B5Media, they have many blogs of exceptional quality, and they are obviously doing it for the combined traffic. They are not trying to hide it. This is common practice among blog or site conglomerates. If all the blogs share the same whois information or class C IP block, the worst that could happen is that Google devalues those links. They would still do great thanks to all the external links they get.
Johny
June 21, 2007 at 9:02 am
Very nice post, I had this question in the back of my head for such a looooong time!
Jez
June 21, 2007 at 2:06 pm
Sorry, a couple more questions... Given that whois gives away the fact that my domains are linked, do you think there is any benefit in using different servers? What I really need to know is whether I would be better off creating multiple sites or putting all my content into one big site... When I said cross linking I did not just mean reciprocal linking; I was mainly going to create one-way links into my main site.... reciprocal links are relatively easy to obtain from other site owners, one-ways are not.... Running multiple sites also allows you to triangulate links, getting something closer to a one-way link by trading links off other sites....
Hamlet Batista
June 21, 2007 at 5:52 pm
Jez, Don't worry, I enjoy our conversations. If this were 2003 or 2004, I would go with that strategy. I used to own a large network of "thin content" sites and found it increasingly difficult to keep the rankings over the years. My recommendation is to create a single site and build useful content. You can hire really cheap content writers if you look. Build content for your own site and for publishing on other related sites. That is the best advice I can give you. Doing advanced black hat stuff requires a great deal of planning and resources. You will also need a lot of throwaway domains so you can move on when the rankings drop. I got tired of playing that game back in 2005. I built an affiliate network instead, which I still run. The income is far more stable than search engine rankings.
Jez
June 22, 2007 at 12:43 pm
Thanks for the reply. I wish I had asked this sooner; I could have spent the money on buying links instead of domains! I have already been speaking to some copywriters who are very competitively priced and sound as though they will do a decent job... it was this content I was not sure how best to use. I think it will come down to quality... if it is good enough I will put it on my existing site, if not I will dump it on another domain.... As for black hat stuff, it interests me a lot, but I think it takes years to put the tools together... a lot of those guys are writing their own applications in order to mass deploy faster than they are banned... it's complex stuff... That said, I have a Markov-based site which is still in the SEs after 6 weeks, getting 40 unique IPs per day.
Hamlet Batista
June 22, 2007 at 4:24 pm
<blockquote>I think it will come down to quality… if it is good enough I will put it on my existing site, if not I will dump it on another domain….</blockquote>
The best way to get links to your site is to do the opposite: publish your best content on other popular sites and leave a link back to yours.
<blockquote>As for black hat stuff, it interests me a lot, but I think it takes years to put the tools together… a lot of those guys are writing their own applications in order to mass deploy faster than they are banned… it's complex stuff…</blockquote>
Depending on your niche, going black hat may be your only option. For example, the PPC (Porn, Pills, Casino) niche is so competitive that I don't even think it is possible to keep your rankings with white hat techniques. If you know what you need, it shouldn't take you more than a couple of months to build it. Just look in the right <a href="http://www.elance.com" rel="nofollow">place</a>.
<blockquote>That said, I have a Markov-based site which is still in the SEs after 6 weeks, getting 40 unique IPs per day.</blockquote>
I am not sure what you are talking about.
Jez
June 22, 2007 at 10:42 pm
That's really interesting advice about putting the best content on other sites... it makes sense to do that... but I had not thought of it... I don't touch the three P's, so I have options... I think I have used incorrect terminology... it is really grey hat I was talking about.... mass deployment of auto-generated sites which scrape and rewrite content.... Markov chains are used in text generators: <a href="http://en.wikipedia.org/wiki/Markov_chain" rel="nofollow">http://en.wikipedia.org/wiki/Markov_chain</a> They generate human-readable, but ultimately meaningless, text... it can be good enough to fool a reader for the first few sentences... Typically you seed the generator with a combination of relevant text / articles (used in the body) and long-tail keywords (used for the title, H1 and injection into the body at a specified density). If you do it properly, Google indexes the pages no problem. I think Google must rely on users' spam reports to get rid of Markov sites.
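For illustration, here is a minimal sketch of the word-level Markov text generator Jez is describing. It is purely a toy, not his actual tool; the seed file name is a made-up assumption.

<pre><code>
<?php
// Toy word-level Markov text generator (order 2), for illustration only.
// Builds a transition table from seed text, then emits a stream of words.

function buildChain(string $seed, int $order = 2): array {
    $words = preg_split('/\s+/', trim($seed));
    $chain = [];
    for ($i = 0; $i < count($words) - $order; $i++) {
        // State = $order consecutive words; value = list of observed next words.
        $state = implode(' ', array_slice($words, $i, $order));
        $chain[$state][] = $words[$i + $order];
    }
    return $chain;
}

function generate(array $chain, int $length = 150): string {
    $state  = array_rand($chain);        // random starting state
    $output = explode(' ', $state);
    for ($i = 0; $i < $length; $i++) {
        if (!isset($chain[$state])) {
            break;                       // dead end: no known successor
        }
        $candidates = $chain[$state];
        $output[]   = $candidates[array_rand($candidates)];
        // Slide the two-word window forward to form the next state (assumes order 2).
        $state = implode(' ', array_slice($output, -2));
    }
    return implode(' ', $output);
}

// 'seed-articles.txt' is a hypothetical file of scraped, topically relevant text.
$chain = buildChain(file_get_contents('seed-articles.txt'));
echo generate($chain);
</code></pre>

The keyword injection Jez mentions (title, H1, and a specified density in the body) would happen on top of this raw generation step.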
Hamlet Batista
June 23, 2007 at 7:55 am
Jez, That sounds very interesting. I think it is definitely black hat :-) I was familiar with the concept of Markov chains, but not with their use in content generators/scrapers. If you want to do that, you have to do it now. It is going to be increasingly difficult for those types of sites to survive. Why? If users feel a site is crap, they will hit the back button and search again. This tells Google that the search was not successful. Google is far more complex than we think. Please check this discussion at <a href="http://www.seomoz.org/blog/relevance-feedback" rel="nofollow">SEOmoz</a>.
Matt Wardman
June 22, 2007 at 11:52 am
Thanks for your reply. I think that I agree. Matt
Aaron Hall
January 9, 2009 at 6:10 am
Here is an idea that builds upon your idea. Create file called CROSSLINK.PHP. Have this file run in the footer of every page you own. CROSSLINK.PHP checks the URL of the page it was run on, compares it to a MySQL database that has a list of every URL you own connected to a random link to 3-10 other URLs you own, and displays the appropriate 3-10 links to other URLs you own. In other words, CROSSLINK.PHP displays 3-10 unique links on every page you own---which means you have 3-10 incoming links to every page you own. These are deeplinks. Applying this on sites hosted on different IP addresses makes it even more powerful. I use this method and it works great. Also, there is nothing wrong with linking to your own pages, so there is nothing blackhat about this method.