Privacy, Web Browsing and Media

Contents

  1. Introduction and General Issues
  2. What is Web Browsing and what is Web content?
  3. What is Multimedia Content?
  4. What is Web tracking?
  5. Main Privacy Settings with the Web Browser Firefox
  6. What is Digital Rights Management (DRM)?
  7. Automatic Start of Software on your Device
  8. The Brave Concept for Web Browsing and Ads Business Model
  9. Anonymous Web Browsing Using the Tor Network

Introduction and General Issues

Web Browsing is the activity of "going over the internet" to visit Web sites or Web Apps, such as social networks, video platforms, gaming sites, e-commerce sites, etc.

When you browse the internet, you leave many traces behind you, which allow servers to learn a lot about you. The servers can then exchange this information among themselves to build a very detailed profile of who you are, your tastes and habits, your friends, family and enemies... See this page to get an idea of what is going on on the server side, and of the privacy and regulatory issues related to digital user tracking.

This page describes what kind of information Web sites can collect, and how you can prevent some of the privacy problems by adopting a few steps and habits which cost little compared with the discomfort you might feel about giving away all this data.

The content and design of Web Applications are particularly sensitive, as you are liable to stumble upon a potentially malicious site very casually through regular browsing activities.

What is Web Browsing and what is Web content?

Web Browsing is the activity of visiting Web sites or so-called Web applications (that is, using services on the World Wide Web, which is often loosely called the internet), using dedicated software called a Web Browser, such as Chrome, Firefox or Brave. Some content, the source of which is on one or several Web server(s), is fetched, possibly processed dynamically, and displayed in the Web Browser.

The content on the World Wide Web

  • Is very often discovered using a search engine. See this page to see how to choose and set up your favourite search engine in a Web browser such as Firefox.
  • Is preferably specified using standard technologies, that is, content, data, and languages which are agreed upon by an international consortium, typically the World Wide Web Consortium. The purpose of this standardization is
    1. Make sure that Web Browsers developed by any party can interpret (parse) the content correctly and without ambiguity.
    2. Ensure that properly developed Web sites and Web Applications can be safe, and follow transparent and open guidelines and rules.
    3. Standard Web technologies which run programs on the device of the users (i.e. on the client side software for Web Apps) generally have to be Open Source, since their code is sent to the user's browser. This openness affects intellectual property and might reveal security flaws to potential attackers; preventing such flaws is the responsibility of Web developers. Note that the most important steps to protect Web sites from massive attacks are on the server side, the source code of which is not visible from the client anyway (as the two communicate through networking protocols). However, those open technologies also allow an independent expert to evaluate the privacy policy of the software, at least to identify the data being collected. Open source code also makes it possible to uncover some potentially malicious design features in the software.
  • Can sometimes be processed by a closed proprietary plugin, that is, an extension of the Web browser's functionality, which allows a company to make Web technologies closed, non-standard, and sometimes less protective of privacy. (See this page to understand the difference between free software and proprietary software and why you should care.)

Finally, Web content can be aggregated, which means that content from different sources can be combined and displayed on a single Web page. In particular, Web pages can contain Web advertising content such as Web banners, which can come from a source other than that of the main content in which a human user might be interested.
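To make the idea of aggregated content more concrete, here is a minimal sketch in Python which lists the hosts, other than the page's own, from which a page embeds scripts, images or frames. The sample page and every host name in it are made up for illustration, and the comparison is deliberately naive: it treats any different host name, including a subdomain of the same site, as a third party.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ResourceCollector(HTMLParser):
    """Collect the URLs of embedded resources (scripts, images, frames)."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "img", "iframe"):
            for name, value in attrs:
                if name == "src" and value:
                    self.urls.append(value)


def third_party_hosts(page_url, html):
    """Return the sorted host names of embedded resources not served by the page's own host."""
    page_host = urlparse(page_url).hostname
    collector = ResourceCollector()
    collector.feed(html)
    hosts = set()
    for url in collector.urls:
        host = urlparse(url).hostname  # None for relative URLs (same site)
        if host and host != page_host:
            hosts.add(host)
    return sorted(hosts)


# Hypothetical page: all host names below are made up for illustration.
SAMPLE_HTML = """
<html><body>
  <img src="/images/logo.png">
  <script src="https://ads.example.net/banner.js"></script>
  <iframe src="https://social.example.org/embed/123"></iframe>
  <img src="https://news.example.com/photo.jpg">
</body></html>
"""

if __name__ == "__main__":
    print(third_party_hosts("https://news.example.com/article", SAMPLE_HTML))
```

In this made-up page, the advertising script and the embedded social media frame come from other hosts than the article itself, so each of those hosts is contacted by the browser when the page is displayed.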

What is Multimedia Content?

We call Multimedia Content on the internet such content as audio (for example music or radio programmes), video (for example movies or TV news programmes), graphical animations, 3D animations, video games, etc. (or any combination of these).

There are some standard ways to put multimedia content on the Web, such as dedicated HTML tags for videos, graphical canvas, and 3D programming with WebGL.

There are also a number of proprietary plugins for multimedia over the Web. The choice between a proprietary plugin and an open and standard technology lies with the Web Developers who design the Web site or App.

A proprietary plugin might be chosen for convenience, or for obfuscation in relation to property, but it can also hide privacy policies, which might be legal, but which the user can consider to be a major issue. See also the section about Digital Rights Management for other ways to protect copyrighted digital objects.

What is Web tracking?

The visitors to a Web site can be tracked, in the sense that the activity of the human user or of her device (e.g. a given computer, smartphone, etc.) is identified by the Web Server and memorized. As the different Web Servers are organized into networks and can exchange information, a given user's activity can be followed from site to site to construct a complete profile of the user, her habits, her tastes (e.g. political, consuming...), as well as her friends' and enemies' profiles. See this page to get an idea of what is going on on the server side, and of the privacy and regulatory issues related to digital user tracking.

Web tracking can be achieved by organizations, governments or businesses by different routes:

  • A user creating an account with a site will provide personal information (e.g. payment details, phone number and shipping address on an e-commerce site). The user might ask the site to "remember me" by checking a box, which makes it possible to access one's personal data later without going through password authentication. This feature is generally implemented using cookies, which are pieces of data left in the user's Web Browser that allow the site to identify that same user automatically at the next visit.
  • Some of the user's metadata, such as the IP address, the hardware address, the kind of Web browser or the Operating System (OS) used, or even the screen size, etc. can be combined into fingerprints, which can often identify an individual device, and hence its human user, almost uniquely.
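As a rough illustration of the fingerprinting idea described above, the following Python sketch combines a few attributes into a single short identifier by hashing them. The attribute names and values are hypothetical, and hashing with SHA-256 is only one simple way to do it; real fingerprinting scripts typically rely on many more signals than these.

```python
import hashlib


def fingerprint(attributes):
    """Hash a set of browser/device attributes into a short identifier."""
    # Sort the keys so the same attributes always give the same result.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values, for illustration only.
visitor_a = {
    "ip": "203.0.113.7",
    "user_agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64) Firefox",
    "os": "Linux",
    "screen": "1920x1080",
}
visitor_b = dict(visitor_a, screen="1366x768")  # same setup, different screen

if __name__ == "__main__":
    print(fingerprint(visitor_a))
    print(fingerprint(visitor_b))
```

The same attributes always give the same identifier, without any cookie being stored, while changing a single attribute (here the screen size) gives a different one. This is why a combination of individually harmless details can be enough to recognize a returning visitor.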

Main Privacy Settings with the Web Browser Firefox

Here is an overview of the privacy settings for the Web browser Firefox:

  • Disabling trackers will cause some pages, which enforce tracking as mandatory, to have their content blocked (see examples below).
  • Third-party cookies relate to aggregated content, in which the different providers of the content (for example advertisers or embedded social media posts) set their own cookies, without any explicit mention of that happening.
  • The explicit "Do not track" signal that you might send does not technically prevent a Web site from disregarding your wishes and implementing tracking.
  • Your Web Browser (here, for example, Firefox) can store your passwords for the different sites, so that you don't have to remember all of them. However, that might allow somebody who can physically access your computer while you're logged in to retrieve your passwords or access your sensitive data on the Web. If you think it necessary to protect yourself from that, you can set a "master password", which is a single password that protects the access to all of your stored Web passwords. See this section about the basics for data protection and passwords management.
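The point about "Do not track" in the list above can be sketched in Python. The signal is just an HTTP request header, `DNT: 1`, which the browser adds to its requests. The server-side function below is purely hypothetical (its name and its `honours_dnt` parameter are assumptions for this illustration); it only shows that the decision stays entirely with the site.

```python
from urllib.request import Request

# The "Do Not Track" signal is an ordinary request header.
headers = {"DNT": "1"}

# Build (but do not send) a request carrying that header.
# Note: urllib normalizes the stored header name to "Dnt".
request = Request("https://www.example.com/", headers=headers)


def site_tracks_visitor(request_headers, honours_dnt):
    """Hypothetical server-side policy.

    The header only expresses a wish: tracking is skipped only if the
    site itself chooses to honour it.
    """
    wants_no_tracking = request_headers.get("DNT") == "1"
    return not (wants_no_tracking and honours_dnt)


if __name__ == "__main__":
    print(request.get_header("Dnt"))
    print(site_tracks_visitor(headers, honours_dnt=True))
    print(site_tracks_visitor(headers, honours_dnt=False))
```

Nothing in the request itself enforces the user's choice, which is why blocking trackers and third-party cookies in the browser is a more dependable protection than the signal alone.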

Figure 1. Setup of the privacy options in Firefox's preferences.

Note that the choice of a search engine also strongly affects privacy. Indeed, the search engines you use can determine your fingerprints, know about the things you're looking for, the sites you visit after a search, etc. We remind you that you can find out how to choose your favourite search engines on this page.

Social networks and media also know a lot about you because you give them a lot of personal information, as well as information about your friends, the people you cite or refer to (as well as their own profiles), the type of content you cite, etc.

What is Digital Rights Management (DRM)?

Digital Rights Management (DRM for short) is a family of techniques intended to protect the rights of legitimate owners of a copyrighted digital asset. That can be software, music, movies or films, documents, etc. DRM can involve encryption, digital signatures in the software, and may involve specific hardware engineering to control the devices on which the copyrighted work can be accessed, copied or used.

A distinction must be made between enforcement of copyright and the remuneration of creators, as the creators' revenue can be very different from (and is generally much smaller than) the overall revenue generated by their work.

As can be seen by going through the Wikipedia page for DRM, there is a wide range of techniques for each kind of digital asset, with a number of controversies. The main issues are:

  • The users can be unable to make fair use (see the US legal definition of fair use) of the asset they purchased, such as copying it onto different devices for different personal use cases, making backups, or continuing to use it after buying a new computer or re-installing the OS, etc.
  • Incompatibility of DRM systems between providers, making it impossible to keep using purchased material after changing provider. It seems, for instance, that the policy of Amazon is that all intellectual creations such as e-books (generally readable only with Kindle), purchased MP3s, etc. can only be used with hardware registered with Amazon, which implies that all the corresponding devices can be tracked as belonging to the same Amazon account.
    Note that the artistic and intellectual creations are generally tagged in those systems to create metadata specifying features of the files, such as a target social group, a social or political message, etc. This metadata can then be used by machine learning algorithms (so-called Artificial Intelligence techniques) for the purpose of targeted advertising or of suggesting other targeted contents to the user. See also this page for general issues about dependency on a software provider.
  • Possible obsolescence of the purchased assets through a change of the terms of use, or sometimes even discontinuation of the service.

Automatic Start of Software on your Device

By adjusting the settings in Firefox as in Figure 1, you can disable (block) some of the automatic and unnoticeable running of software on your device (within your Web Browser) which can affect privacy. You can then choose to enable those programs to run explicitly on a case by case basis, as they try to launch while you are browsing. Here are a couple of examples:

a) Enabling tracking explicitly for a site
b) Running plugins on a case by case basis

Figure 2. Allowing client side software to launch manually on a case by case basis.

In Figure 2.a, a Web site offers two possibilities: either free access with tracking, or creating an account (and therefore entering personal data and allowing tracking). The Web site also provides some political content, and is owned by the head and founder of a famous big tech company, which also provides other media such as video streaming, as well as cloud services and AI capabilities. We might suspect that those services are interconnected on the server side to nudge users towards certain behaviours, be it about consumption (e.g. e-commerce habits) or about politics. See also this page about the way your data is handled on the server side and regulatory issues related to privacy.

In Figure 2.b, you can manually enable a plugin for a video with audio content to be displayed. As we noted, the use of this plugin is a choice not to use standard open technology, possibly combined with Digital Rights Management (DRM) techniques to protect the property of the content's owners. This choice can be guided by a will to hide privacy policies. In this instance, the company is also known for a business model involving AI, and the plugin is based on a technology that features specific proprietary privacy management techniques, again as a choice not to use an open standard.

Note that American companies can consider the data to be subject to US regulations about privacy, which in particular make a difference (in the US constitution) between the privacy of US citizens and that of non-US citizens, even when the servers storing the data are not located in the US, by virtue (so to speak) of the CLOUD Act. Beyond that, American ethnocentrism, as reflected in the CLOUD Act and countless other attitudes (conscious or not...) of Americans, can lead technology companies to handle the data by standards corresponding to American culture.

From that point of view, it should be noted that, in the US, broadly accepted interpretations of the First Amendment of the US constitution hold that it applies to political advertisement, which is used routinely in US election campaigns. This freedom for money to speak, currently guaranteed in the US, can appear outrageous or shocking from abroad. It seems to imply that, by accepting to be tracked for the purpose of advertising by the site in Figure 2.a, and by browsing on the site of Figure 2.b, your data collected by the proprietary plugin in Figure 2.b could be legally used to send you targeted political messages (under the assumption that some chain of partnerships relates the two companies).

Note that in the example of Figure 2.a, checking the box is requested only because of the settings chosen in Firefox as in Figure 1. Otherwise, you could browse through the site of Figure 2.a transparently without noticing anything unusual (and, in fact, it is not unusual...). See also this page about the way your data is handled on the server side and regulatory issues related to privacy.

The Brave Concept for Web Browsing and Ads Business Model

A recently introduced Web Browser called Brave proposes a novel way to manage the Web Advertising Business. I tried the Web Browser, and it looks just great. The installation process for Ubuntu is very simple, even though it requires copying and pasting a few commands into the console (in the start menu of the LXDE Desktop Environment, go to System Tools and then LXTerminal to open the console). I might adopt it as my default browser pretty soon...

The CEO of Brave, and co-founder of the Basic Attention Token concept, is Brendan Eich, one of the pioneers of open Web technologies, who has been involved in many game-changing endeavours to preserve a free, pluralist and diverse internet. He has, for instance, created the JavaScript language, which is now the standard client-side Web programming language: it is standardized as ECMAScript by Ecma International, and the Web APIs specified by the World Wide Web Consortium (W3C) are defined with JavaScript bindings. If you don't understand exactly what that means, believe me, this is no small achievement!
See also my own lecture about client side Web programming with JavaScript.

The concept and business model aim at fixing all the annoyances and issues mentioned above, thus improving security for users, reducing energy costs, and fixing privacy problems, through an innovative global blockchain technology solution. The ideas are set out in a compelling way by the creators themselves:

Anonymous Web Browsing Using the Tor Network

If you really need stealth on the Web and the closest thing to untrackable browsing, you can use the Tor network through the Tor Browser. Note that this technique is not meant for everyday use, but only for specific purposes which require stealth, such as, for example, whistleblowing.

To understand the concept, beyond the tracking techniques mentioned above, you would need some networking notions such as routing in the Internet protocol suite, as well as the Transport Layer Security protocol. See for example my own lecture about network administration.