Organizations are constantly collecting our data as we browse the internet. In some cases, users consent to handing over this information, such as when they choose to fill in an online form. In other cases, it happens without the user’s agreement. The large-scale collection and analysis of personal information is in fact the core business of many companies, including service providers, content providers and other third parties, each of which uses it for commercial purposes.
Businesses collect this data because it allows them to target ads at particular customer profiles or demographics, to apply price discrimination, or to infer details about customers such as their financial standing, age or health. Governments might also use this data for surveillance purposes, and identity thieves attempt to access it as well.
However, the majority of users don’t want to be tracked in this way, nor to share their personal data with corporations without prior consent. Some preliminary defense practices have been developed, such as using ad blockers and clearing the browser cache. However, as tracking techniques are constantly evolving and becoming more sophisticated, these methods simply aren’t enough to defend against all threats. Indeed, only a very small number of internet users are fully aware of the sheer number of web tracking mechanisms that gather data about them as they browse. So, let’s take a deeper look into this growing practice. What are the different web tracking mechanisms? And how can users defend against them?
Web Tracking Mechanisms
The results of collaborative research on web tracking, conducted by Talaia researchers and UPC BarcelonaTech, have been published in the Proceedings of the IEEE. The comprehensive survey presents more than 25 different web tracking methods, which can be divided into 5 main groups: session-only, storage-based, cache-based, fingerprinting and other methods. Here’s an overview of the different approaches being used, and what you can do to protect yourself and your personal information in the face of these mechanisms:
Flash cookies are used by Adobe Flash to store data on a user’s computer. They can be up to 100 KB in size and, since they are stored in a directory on disk (as .sol or .sor files), they don’t expire and are harder for the average user to remove. Additionally, because Flash plugins share the same directory regardless of browser, they allow users to be tracked across all browsers.
How can I defend against it?
To minimize tracking via cookies, users can disallow HTTP cookies when visiting a new site (by simply pressing “no” when the website asks for consent). To avoid tracking via Flash cookies, users can block Flash from placing data in their browser storage, or disallow Flash cookies altogether in the browser or Flash manager settings.
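Because Flash cookies live as ordinary files on disk, a user can also look for them directly. The following is a minimal sketch, assuming the user already knows which directory Flash uses on their system (the location differs between operating systems and is deliberately not hard-coded here). The function name `find_flash_cookies` is hypothetical and only illustrates the idea of scanning for .sol and .sor files.

```python
from pathlib import Path

# File extensions the article associates with Flash cookies.
FLASH_COOKIE_SUFFIXES = {".sol", ".sor"}


def find_flash_cookies(root):
    """Return a sorted list of files under `root` that look like Flash cookies.

    `root` is a directory chosen by the caller; this sketch does not try to
    guess where Flash stores its data on a given operating system.
    """
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")  # walk the directory tree recursively
        if path.is_file() and path.suffix.lower() in FLASH_COOKIE_SUFFIXES
    )
```

Calling `find_flash_cookies("/some/flash/storage/dir")` with the appropriate directory would list the stored files, which the user could then review and delete.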
Clickjacking is a method of tricking users into clicking a link or other sensitive website element, which can lead to their data being compromised. It can take many forms, most commonly disguising the attacker’s link among other site elements, or using fake cursors which trick users into thinking that their cursor is in a different location on the page. Clickjacking can compromise users’ anonymity and lead to the theft of email addresses and other private data, the seizure of PayPal credentials, or spying on a user through a webcam.
Because it disguises links, input controls and other website elements, clickjacking is difficult for users to spot.
How can I defend against it?
Cooperation via CAPTCHA
Other techniques rely on the unwitting cooperation of users. One of them involves creating an image which poses as a CAPTCHA, in which users type a code into a field to confirm they’re not a robot. However, links to website destinations can be concealed within the words, letters or symbols of the CAPTCHA. Given that visited and unvisited links are shown in different colors by the browser, the symbols that users see, and therefore type into the field, disclose which destinations they’ve previously visited.
Visible words and characters represent links to websites that the user hasn’t visited. Blank spaces are hidden links that the user has visited before. By typing the words and characters they can see into the box, users reveal part of their browsing history to attackers.
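The logic of this trick can be simulated in a few lines. The sketch below is purely illustrative: it does not render anything in a browser, the URLs and symbols are made up, and it assumes each symbol in the fake CAPTCHA is unique. The function names `render_captcha` and `infer_visited` are hypothetical.

```python
def render_captcha(symbol_links, visited):
    """Simulate what the user sees in the fake CAPTCHA.

    Each symbol is secretly a link. Symbols whose link was visited are styled
    to be invisible, so they appear as blanks.
    """
    return "".join(
        " " if url in visited else symbol for symbol, url in symbol_links
    )


def infer_visited(symbol_links, typed):
    """Attacker side: every symbol missing from the typed answer was hidden,
    so its link must have been visited. Assumes symbols are unique."""
    return [url for symbol, url in symbol_links if symbol not in typed]


# Made-up example: four symbols, each backed by a hidden link.
links = [
    ("K", "https://bank.example"),
    ("7", "https://news.example"),
    ("Q", "https://shop.example"),
    ("3", "https://mail.example"),
]
visited = {"https://news.example", "https://mail.example"}

shown = render_captcha(links, visited)   # "K Q " (two symbols are hidden)
typed = shown.replace(" ", "")           # the user types only what is visible
leaked = infer_visited(links, typed)     # the two visited URLs
```

The user never clicks anything suspicious; simply transcribing the visible symbols is enough to leak which of the probed sites they have visited.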
How can I defend against it?
A fingerprint is a unique identifier of a device, operating system, browser version or browser instance. It is made up of values that a web service can obtain when a user browses a website, and fingerprinting gathers these values via a variety of methods. It permits tracking across multiple websites without storing any cookies, and it is completely invisible to users.
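To make the idea concrete, here is a minimal sketch of how a set of observed values could be combined into a single identifier. It assumes the values have already been collected; the attribute names are hypothetical, and hashing with SHA-256 is just one plausible way to derive an identifier, not a description of any particular tracker.

```python
import hashlib
import json


def fingerprint(attributes):
    """Combine observed attribute values into one stable identifier.

    The attributes are serialized in a canonical form (sorted keys, no extra
    whitespace) so that the same values always give the same fingerprint,
    regardless of the order in which they were collected.
    """
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical values a site might observe for one visitor.
visitor = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "timezone": "Europe/Madrid",
    "fonts": ["Arial", "Helvetica", "Times New Roman"],
}
visitor_id = fingerprint(visitor)
```

Nothing is stored on the user’s machine: as long as the same values are observed on the next visit, the same identifier can be recomputed on the server side.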
Device fingerprinting is a method which allows companies to identify a computing device - be it a laptop, tablet, PC or mobile device. This is particularly relevant when customers are using multiple devices to connect to the internet.
Location and network fingerprinting involves determining the global network and IP-based geographical location of any user - one of the most straightforward elements of a fingerprint that can be determined using the headers of incoming HTTP requests. By using network tools, the service is able to identify the name of the domain and the user’s Internet Service Provider.
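As a rough illustration, the sketch below takes the client address of a request and looks it up in a prefix table. Everything here is an assumption made for the example: the table is hypothetical and uses reserved documentation address ranges, whereas a real service would rely on a geolocation or WHOIS database. `X-Forwarded-For` is a commonly used proxy header, and trusting it blindly is a simplification.

```python
import ipaddress

# Hypothetical lookup table: network prefix -> (provider, region).
# The prefixes are reserved documentation ranges, not real assignments.
NETWORK_TABLE = {
    "192.0.2.0/24": ("ExampleNet ISP", "Region A"),
    "198.51.100.0/24": ("Sample Telecom", "Region B"),
}


def client_ip(headers, remote_addr):
    """Pick the client address: the first X-Forwarded-For entry if a proxy
    supplied one, otherwise the address of the connection itself."""
    forwarded = headers.get("X-Forwarded-For")
    if forwarded:
        return forwarded.split(",")[0].strip()
    return remote_addr


def locate(ip_string):
    """Return (provider, region) for the address, or None if unknown."""
    address = ipaddress.ip_address(ip_string)
    for prefix, info in NETWORK_TABLE.items():
        if address in ipaddress.ip_network(prefix):
            return info
    return None
```

For example, `locate(client_ip({"X-Forwarded-For": "198.51.100.7"}, "10.0.0.1"))` would resolve to the second entry of the table, without the user having provided any information beyond the request itself.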
Browser fingerprinting, such as via CSS and HTML5 fingerprinting, allows sites to recognize the family and version of web browser being used.
Cross-browser fingerprinting is more powerful and harder to evade than browser fingerprinting, since it lets trackers follow you even if you use multiple browsers. The underlying technology is based on code that instructs browsers to perform a variety of tasks that draw on operating-system and hardware resources such as graphics cards, CPU cores, installed fonts and audio cards. Since these resources differ from one computer to another, users can be identified easily.
OS Instance Fingerprinting
Passive Client Fingerprinting
Passive client fingerprinting collects attributes from clients or servers as they connect over the network. For example, TCP properties, TLS capabilities or characteristics of the HTTP implementation can be collected from the transport, session or application layers. Based on these attributes, the operating system, system uptime and browser type can be identified. The three most popular passive fingerprinting techniques are TCP/IP fingerprinting, TLS fingerprinting and HTTP fingerprinting.
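A very simplified flavor of TCP/IP fingerprinting can be shown with a single field, the IP time-to-live (TTL). The sketch below assumes the commonly cited default initial TTL values (64 for Linux and macOS, 128 for Windows, 255 for some network equipment). These defaults are configurable, and real tools combine many more fields, so this is only a heuristic illustration with hypothetical function names.

```python
# Commonly cited default initial TTLs. Treat these as a heuristic only:
# the values can be changed by administrators and vary between versions.
COMMON_INITIAL_TTLS = {
    64: "Linux, macOS or another Unix-like system",
    128: "Windows",
    255: "network equipment or another system using the maximum TTL",
}


def guess_initial_ttl(observed_ttl):
    """Round the observed TTL up to the nearest common initial value.

    Each router on the path decrements the TTL by one, so the value seen by
    the server is the initial TTL minus the number of hops.
    """
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    for initial in sorted(COMMON_INITIAL_TTLS):
        if observed_ttl <= initial:
            return initial


def guess_os_family(observed_ttl):
    """Map an observed TTL to a rough operating-system guess."""
    return COMMON_INITIAL_TTLS[guess_initial_ttl(observed_ttl)]
```

A packet arriving with a TTL of 116, for instance, would be rounded up to 128 and attributed to Windows. The client sends nothing extra for this to work, which is what makes the technique passive.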
Canvas fingerprinting is a commonly used tracking technique which relies on using the browser canvas API to draw invisible graphics. Since each browser instance will draw the graphics in a unique way, this identifies the browser being used.
How can I defend against it?
The Future Of Web Tracking
In the coming years, web tracking mechanisms will continue to evolve and users will be exposed to new types of data privacy threats. In response, several projects and initiatives are emerging to defend users against web tracking. A good example is the Data Transparency Lab (DTL), a community of technology and industry experts, researchers and policymakers working to advance online personal data transparency through scientific research and design. It is backed by big tech companies like Mozilla and Telefónica. Another interesting example is the Electronic Frontier Foundation (EFF), a nonprofit organization that defends digital privacy, free speech and innovation. One of its research projects, Panopticlick, is particularly useful for internet users: the website includes a free tool that lets you check how good your defense against web tracking really is. In addition, to improve users’ privacy defenses, the EFF has developed the browser add-on Privacy Badger, which stops advertisers and other third-party trackers from secretly tracking the user.
Over time, methods of web tracking have evolved continuously to become more and more invasive. Whilst in the past it was simple for users to prevent tracking, for example by removing HTTP cookies if they didn’t want them to be stored, recent technologies are significantly more difficult to avoid. Indeed, many users have very limited awareness that they are being tracked online, which is particularly dangerous given the potentially harmful nature of modern web tracking. Thankfully, there are a number of tips outlined in this article that users can follow to better protect themselves and their personal data online.