Web Browsers Most Susceptible to Browser Fingerprinting
Many believe browser fingerprinting is a silver bullet that can single-handedly and anonymously identify users. Sadly or luckily (depending upon your perspective), browser fingerprints are far from unique. But as I’ll explore in this post, there are ways to make browser fingerprints more unique and more useful, and there are also ways for users to hide amongst common fingerprints or at least make themselves difficult to track.
This post tests which browsers are most susceptible to unique browser fingerprints and which configurations you can hide behind.
First, let’s re-visit what a browser fingerprint is
Panopticlick and a number of other sites have discovered that by taking all of the information about a user’s browser (such as the web browser itself, operating system, installed plugins, screen resolution, bits per pixel, system fonts and other elements about their configuration), it is possible to generate a unique ID for a vast majority of browsers accessing their site. The idea behind it is that it can uniquely identify a browser (without the use of cookies) throughout the internet - whether or not a user clears their cookies or even has them enabled.
Browser fingerprints, as I captured them, contain the following information, which is then stored as an MD5 hash in Google Analytics (alongside no other identifying data except country, city and ISP/service name):
- User agent
- Screen resolution & depth
- Timezone offset
- Local & session storage
- Browser plugins (Skype, MS Office, Adobe software, Steam etc)
Here is what my browser fingerprint roughly consists of:
“Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31###1080x1920x32###-600###true###true###Shockwave Flash::Shockwave Flash 11.7 r700::application/x-shockwave-flash~swf,application/futuresplash~spl;Chrome Remote Desktop Viewer::This plugin allows you to securely access other computers that have been shared with you. To use this plugin you must first install the Chrome Remote Desktop webapp.::application/vnd.chromium.remoting-viewer~;Native Client::::application/x-nacl~nexe;Chrome PDF Viewer::::application/pdf~pdf,application/x-google-chrome-print-preview-pdf~pdf;Google Update::Google Update::application/x-vnd.google.update3webcontrol.3~,application/x-vnd.google.oneclickctrl.9~;Intel® Identity Protection Technology::Intel web components for Intel® Identity Protection Technology::application/x-vnd-intel-webapi-ipt-2.1.42~;Intel® Identity Protection Technology::Intel web components updater - Installs and updates the Intel web components::application/x-vnd-intel-webapi-updater~-2-0;Microsoft Office 2013::The plugin allows you to have a better experience with Microsoft SharePoint::application/x-sharepoint~,application/x-sharepoint-uc~”
Under an MD5 hash, it could resemble the following:
3c674d0f6ccce277b37370bc8ca6bxxx
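As a rough sketch of how a string like the one above could be assembled and hashed, here is a minimal Python example. The separators ("###", "::", ";", "~") are inferred from the sample string; the function names and parameters are hypothetical and are not taken from the capture script itself, which runs in the browser.

```python
import hashlib

# Separators inferred from the example string above; the function and
# parameter names are illustrative, not the actual capture script.
FIELD_SEP = "###"
PLUGIN_SEP = ";"


def build_fingerprint_string(user_agent, width, height, depth,
                             tz_offset, local_storage, session_storage,
                             plugins):
    """Join the captured browser properties into one raw string."""
    plugin_parts = []
    for name, description, mime_types in plugins:
        # Each MIME type is written as "type~suffix", comma-separated.
        mimes = ",".join(f"{mime}~{suffix}" for mime, suffix in mime_types)
        plugin_parts.append(f"{name}::{description}::{mimes}")
    fields = [
        user_agent,
        f"{width}x{height}x{depth}",
        str(tz_offset),
        str(local_storage).lower(),    # "true" / "false", as in the example
        str(session_storage).lower(),
        PLUGIN_SEP.join(plugin_parts),
    ]
    return FIELD_SEP.join(fields)


def fingerprint_hash(raw):
    """Return the 32-character hexadecimal MD5 digest of the raw string."""
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


raw = build_fingerprint_string(
    "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.31 "
    "(KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31",
    1080, 1920, 32, -600, True, True,
    [("Native Client", "", [("application/x-nacl", "nexe")])],
)
print(raw)
print(fingerprint_hash(raw))
```

Only one plugin is passed in here to keep the example short; a real capture would include every plugin the browser reports.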
I then store it in Google Analytics as a custom variable and then I mash it together with IP information available in Google Analytics (hostname, city and country). So, given the following fingerprints, I count all three as being unique:
3c674d0f6ccce277b37370bc8ca6b878 (Brisbane, Australia, ISP: Telstra)
3c674d0f6ccce277b37370bc8ca6b878 (Brisbane, Australia, ISP: ACME Internet)
3c674d0f6ccce277b37370bc8ca6b878 (Gold Coast, Australia, ISP: ACME Internet)
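The counting rule can be sketched in a few lines of Python. The three rows mirror the example above; the tuple layout (hash, city, country, ISP) is just an assumption about how the exported data might be arranged.

```python
# Each visit is (fingerprint hash, city, country, ISP), mirroring the
# three example rows above.
visits = [
    ("3c674d0f6ccce277b37370bc8ca6b878", "Brisbane", "Australia", "Telstra"),
    ("3c674d0f6ccce277b37370bc8ca6b878", "Brisbane", "Australia", "ACME Internet"),
    ("3c674d0f6ccce277b37370bc8ca6b878", "Gold Coast", "Australia", "ACME Internet"),
]

# Counting on the hash alone collapses all three rows into one browser.
unique_hashes = {fingerprint for fingerprint, _, _, _ in visits}

# Counting on hash + location + ISP treats each row as a separate browser.
unique_combined = set(visits)

print(len(unique_hashes))    # 1
print(len(unique_combined))  # 3
```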
Of course, it’s possible that this is one visitor travelling between access points and cities (particularly for mobile users), but let’s save that for another day. With the methodology out of the way, you now know to take the data with a healthy dose of salt.
But it’s not all that unique or reliable…
First, certain configurations are simply too common to pin down accurately (e.g. iPhone). Second, so many factors can alter your browser fingerprint that it’s simply infeasible to rely on it alone to reliably and uniquely identify visitors to your site. For instance, the following things can alter your browser fingerprint:
- Updating your browser to the latest version
- Installing new plugins
- Using multiple browsers (sorry porn browsers, incognito/private mode is not a different browser)
- Changing your screen resolution (or using a laptop to plug and play monitors all day)
- Enabling and disabling cookies or other browser functionality
- Disabling JavaScript
- Switching timezones (including daylight savings)
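To see why even a small change matters when the fingerprint is stored as a hash, consider this minimal sketch. The two strings are shortened, illustrative fingerprints that differ only in the Chrome version number; `md5_hex` is a hypothetical helper name.

```python
import hashlib


def md5_hex(text):
    """Return the hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Shortened, illustrative fingerprint strings: only the Chrome version differs.
before_update = "Chrome/26.0.1410.64###1080x1920x32###-600###true###true"
after_update = "Chrome/27.0.1453.93###1080x1920x32###-600###true###true"

hash_before = md5_hex(before_update)
hash_after = md5_hex(after_update)

print(hash_before)
print(hash_after)
print(hash_before == hash_after)  # False: the two hashes no longer match
```

Because a hash gives no indication of how similar its inputs were, the browser looks like a brand new visitor after any of the changes listed above.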
Every time Chrome updates, your fingerprint will change
Just look at how often Google releases new versions of Chrome! Thanks to Andre Mafei for this excellent visualization of Chrome updates:
Browser uniqueness
Browser fingerprints are surprisingly unique for desktop browsers. Fingerprints for mobile devices, particularly those on iOS (thanks to Apple’s iron grip), are not very unique. Don’t get too giddy, Apple users: as you will soon see, this doesn’t matter much at all:
Look what happens when you mash it together with IP information…
Boom! Unfortunately (or fortunately for that matter), mashing fingerprints together with geo IP and network information available in GA (network domain, city, country) paints a very different picture:
All browsers that were once considered relatively anonymous are now >99% uniquely identifiable.
Operating system uniqueness
No surprises here that you have virtual anonymity when using an iOS based device.
Now let’s add a dash of IP related information into the mix
Muahaha… Once more, the moment you pass in IP information, the playing field is leveled:
One application: Plausible solution for cookie-less tracking
One of the more prominent web analytics vendors, Adobe, has introduced cookieless identification of unique browsers, which it achieves through a combination of browser fingerprints and a user’s IP address. I’m not certain how they perform this, but I would imagine they use something similar. Even if it’s not 100% accurate across sessions, it should give marketers a clear picture 90% of the time. And this is OK - remember, no web analytics tool is ever going to be completely accurate.
If you have a high proportion of visitors who do not allow cookies (a 2009 study revealed that roughly 5% of users do not allow first party cookies), then you may also have a use for this.
I would also be surprised if Google and other big ad networks WERE NOT testing or using fingerprinting to track visitors throughout the web.
Future exploration of fingerprinting
There are a number of things which I would really like to explore further in the realm of browser fingerprinting, even though I have essentially sidelined it as a useless dimension.
Using data in GA to develop your own fingerprints
I haven’t tried, but it may be possible to use GA data to generate fingerprints. E.g. combine all the browser configuration information and count the unique instances.
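A minimal sketch of that idea, assuming the report has been exported as rows of browser configuration values. The dimension names and sample rows below are hypothetical stand-ins, not a real GA export format.

```python
from collections import Counter

# Hypothetical dimension names; a real export would use whatever
# configuration dimensions the analytics tool provides.
DIMENSIONS = ("browser", "browser_version", "os", "screen_resolution",
              "screen_colors", "flash_version", "language")

# Made-up sample rows, one per visit, in the same order as DIMENSIONS.
raw_rows = [
    ("Chrome", "26.0.1410.64", "Windows", "1920x1080", "32-bit", "11.7 r700", "en-us"),
    ("Chrome", "26.0.1410.64", "Windows", "1920x1080", "32-bit", "11.7 r700", "en-us"),
    ("Safari", "6.0", "iOS", "320x480", "32-bit", "(not set)", "en-au"),
    ("Firefox", "20.0", "Windows", "1366x768", "24-bit", "11.6 r602", "en-us"),
]
rows = [dict(zip(DIMENSIONS, values)) for values in raw_rows]


def configuration_counts(rows, dimensions=DIMENSIONS):
    """Count how many visits share each combination of dimension values."""
    return Counter(tuple(row[d] for d in dimensions) for row in rows)


counts = configuration_counts(rows)
seen_once = sum(1 for n in counts.values() if n == 1)

print(len(counts))  # 3 distinct configurations
print(seen_once)    # 2 configurations seen on exactly one visit
```

A fingerprint built this way would be coarser than one captured in the browser, since plugin lists and timezone offsets are not among the dimensions assumed here.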
Tracking it alongside Google Analytics visitor IDs
This would allow you to see if visitors are clearing cookies and what proportion of them do this. Alternatively it may provide insight into how unique fingerprints actually are.
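One way this comparison might look, sketched in Python. The (fingerprint, visitor ID) pairs are made up, and the variable names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (fingerprint, visitor ID) pairs, one per visit.
visits = [
    ("fp_a", "visitor_1"),
    ("fp_a", "visitor_2"),  # same fingerprint, different cookie ID
    ("fp_b", "visitor_3"),
    ("fp_b", "visitor_3"),
    ("fp_c", "visitor_4"),
]

# Collect the distinct visitor IDs seen for each fingerprint.
ids_per_fingerprint = defaultdict(set)
for fingerprint, visitor_id in visits:
    ids_per_fingerprint[fingerprint].add(visitor_id)

# Fingerprints tied to more than one visitor ID.
multi_id = {fp for fp, ids in ids_per_fingerprint.items() if len(ids) > 1}
share = len(multi_id) / len(ids_per_fingerprint)

print(sorted(multi_id))  # ['fp_a']
print(round(share, 2))   # 0.33
```

A fingerprint with several visitor IDs can be read either way: one visitor who cleared cookies, or several visitors sharing a common configuration. The data alone cannot tell the two apart.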
Tracking IP address and Browser Fingerprint together
Even more granular than network domain in GA is the specific IP address. Unfortunately even though this is more granular, it’s susceptible to dynamic IPs which will add to the volatility of your fingerprints.
Adding Adobe Flash into the mix
As Panopticlick data shows, one of the most uniquely identifying bits of information - fonts installed - can be accessed through Adobe Flash. This will greatly improve the uniqueness of fingerprints.
How users can avoid being uniquely fingerprinted
- Use a common browser configuration (e.g. get the most popular phone in your region and use the default browser)
- Avoid using Adobe Flash (Flash can read the list of installed system fonts, one of the most uniquely identifying sources of information)
- Use a different IP address regularly (e.g. proxies)
- Update your browser and other software regularly
- Disable JavaScript (to avoid any probing scripts)
- Fake your user agent (personally I like to browse sites with IE 5 on Windows 3.1 just for shits and giggles)
How browser fingerprinting can be augmented
- Never use it alone - always tie it back to a user’s cookie, IP address, customer ID or all of the above
- Rather than taking a hash of the user agent and a host of other identifying features, collect all the pieces of identifying information individually and account for small updates here and there.
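The second suggestion could be sketched as follows: store each component separately and treat two captures as the same browser when most components agree. The field names and the threshold of four matching fields are arbitrary assumptions for illustration, not tested values.

```python
def matching_fields(stored, current):
    """Return the names of fields whose values are identical in both captures."""
    return {name for name in stored if stored[name] == current.get(name)}


def is_probable_match(stored, current, min_matching=4):
    """Treat two captures as the same browser if enough fields agree.

    The default threshold is an arbitrary assumption.
    """
    return len(matching_fields(stored, current)) >= min_matching


stored = {
    "user_agent": "Chrome/26.0.1410.64",
    "screen": "1080x1920x32",
    "timezone_offset": -600,
    "storage": "true,true",
    "plugins": "Shockwave Flash;Native Client;Chrome PDF Viewer",
}

# Same browser after an update: only the user agent differs.
after_update = dict(stored, user_agent="Chrome/27.0.1453.93")

# A different machine: user agent, screen and timezone all differ.
other_browser = dict(stored, user_agent="Firefox/20.0",
                     screen="1366x768x24", timezone_offset=-480)

print(is_probable_match(stored, after_update))   # True  (4 of 5 fields match)
print(is_probable_match(stored, other_browser))  # False (2 of 5 fields match)
```

Unlike a single hash, this keeps a browser recognisable across a version bump, at the cost of storing more data and choosing a threshold.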
Anyway, I hope you found this analysis of browser fingerprinting useful. Obviously there’s a lot more to it than what I have covered above, but it may help answer some questions explored on sites like Reddit, GitHub and Stack Overflow.
I’m confused how the IP address is going to help you uniquely identify someone on an ongoing basis.
Common ways to access the internet:
* residential
* mobile
* wifi
* business
* …
The first three in the list above don’t use fixed IP addresses in the majority of cases. That means the fingerprint you’d generate for those users today is going to change tomorrow when their IP address changes. In fact they are going to be a different user when using their mobile at home via their wifi, in the car at traffic lights via 3G and different again via wifi through an open network for convenience at the office.
I haven’t put this to the test yet, but I thought the idea put forward by Daniel Miessler to use a 301 redirect to track users was very clever. Again, it isn’t foolproof, as not every user agent will honour the caching headers, but I liked the idea.
Am I missing something blatantly obvious in the IP address thing? Forgive me if I am; it could very well be staring me in the face and I just can’t see it.
Thanks so much for your incredibly detailed post, Alistair. You make a great point: there are so many ways to access the internet, and an IP will never uniquely identify someone. DHCP and NAT are perfect examples of why you shouldn’t trust IPs to link back to individuals. Unfortunately this remains one of the limitations of this method.
I didn’t exactly look at IP address in GA - this is no longer available in the filters and I’m not tracking it. Rather, I used the geo location and ISP name fields in GA to home in on a particular browser. This gets around the DHCP stuff to a certain extent, but it introduces other issues.
I.e. two similar browsers on the same ISP in a similar geolocation.
I love the sound of Daniel Miessler’s cookieless approach. I wonder how long a browser caches a redirect for, or whether the behaviour is consistent across browsers. Definitely worth giving this a shot!
Wow, that’s seriously an epic post. So how long does it take you to set up something like this on someone’s website?
Cheers, David.
Fingerprints can be captured pretty easily - I wrote about it here. With that you could probably set it up within 15 minutes: http://www.optimisationbeacon.com/analytics/using-brower-fingerprinting-in-google-analytics/
In terms of making them useful however, that’s the difficult part. And unfortunately, it’s not that easy to answer.