Tuesday, September 8, 2015

Seminar 4 Position Statement: Wifak Gueddana, LSE

My user, my data! When my IP was sold ten thousand times

To say that the digital economy has taken over the global economy can only mean that the Internet has grown into a large, global market in which people are situated on a one-way spectrum ranging from Internet service providers, or ISPs (producers), to users (consumers).

On that basis, the logic that connects what ISPs do (services), how they assess and rate their performance (rhetorics of supply), and the marginal value they add is increasingly determined by users’ data (the market resource). The scale of this digital data exploitation (collection and processing) is also bound to become more invasive of people’s rights and privacy. Finally, while many of the industry practices and technological advances in this field remain understudied and obscure, the decisions taken today will shape this industry and our future.

To simplify, let’s start by agreeing that most ISPs are, at the same time, advertising companies. I define advertising companies as entities whose core business involves, among other things, surveilling the activity of their services’ users in order to increase the visibility of third parties and help them sell their products. For example, Google was first known to users as a web search company; now we know that Google is in reality an ad company. Like social media companies such as Facebook, Google processes the data of its services’ users and sells it. From this perspective, how such companies influence standards and technological advances in the performance (or quality) of social networks and search algorithms is now conditioned by their internal strategy as advertising agents. This was not necessarily the case before.

IP addresses, browser details and individuals’ browsing histories (clicks on links) have been stored by ISPs and websites since the beginning of the web; nothing is new there. This kind of data was kept in cookie files and stored on users’ computers so as not to overload the servers of websites and ISPs. A user needs this kind of information for her personal use, in case she wants to go back to previous search results or recall some information. Websites also value this kind of data because it speeds up their performance. The same logic applies to companies producing search and recommendation algorithms. They have designed search solutions that query users’ recorded information because doing so speeds up the search or improves the relevance of its results for the user. This is to say that much of the users’ data collected before was for the benefit of the users, i.e. to provide them with options and to improve the design of services and features.

A turn is marked when this information is no longer stored only on one’s hard disk but also on the servers of websites and ISPs. Why? First, storage capacity is now very cheap. Second, the nature of this information (typed-in personal data), its volume (number of users), and the potential for connectivity (one key type for many datasets) are also very alluring. More importantly, users’ information is increasingly recorded by the servers because it is continuously processed. Most ISPs have now established standards and algorithms to mine, filter, visualise, re-order and categorise their users’ data for sale to third parties, whether advertising companies or companies in other industries. For example, tracking cookies, clicks and search histories are common surveillance technologies that are now used to compile long-term records of individuals’ browsing behaviour. These datasets are mined and filtered by theme, industry and keyword, then sold accordingly as an unnamed category, that is, without disclosing the personal identity of the owner. Thus, all the rows in a dataset represent IPs, browsers, or some sort of site IDs referring to users who have searched for, shopped for, or clicked on, let’s say, a ‘travel’ product. These are put together in a category named ‘travel’ and valued at a certain price. ISPs that track names or emails may today be considered to be acting illegally, and their activity restricted.
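To make the categorisation step concrete, here is a minimal sketch in Python. It assumes a hypothetical click log of (site ID, keyword) pairs and a hypothetical keyword-to-category mapping; neither reflects any particular company’s actual pipeline. It only illustrates how rows keyed by an identifier, rather than a name or email, can be grouped into a sellable category such as ‘travel’.

```python
from collections import defaultdict

# Hypothetical click log: each row is (site_id, keyword).
# The rows carry an identifier only, no name or email.
CLICKS = [
    ("u-101", "flights"),
    ("u-102", "laptop"),
    ("u-101", "hotel"),
    ("u-103", "hotel"),
]

# Hypothetical mapping from keywords to categories.
CATEGORY_OF = {
    "flights": "travel",
    "hotel": "travel",
    "laptop": "electronics",
}


def build_segments(clicks, category_of):
    """Group identifiers by the category of the keywords they clicked on."""
    segments = defaultdict(set)
    for site_id, keyword in clicks:
        category = category_of.get(keyword)
        if category is not None:  # skip keywords with no known category
            segments[category].add(site_id)
    # Sort the identifiers so the result is stable and easy to inspect.
    return {category: sorted(ids) for category, ids in segments.items()}


segments = build_segments(CLICKS, CATEGORY_OF)
```

Here `segments["travel"]` holds the identifiers of everyone who clicked on a travel-related keyword. Although no name appears anywhere, the same identifier can recur across many such categories, which is what makes the data combinable.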

While we are undoubtedly dealing with a techno-cultural condition of increased and normalized tracking, the law has so far acted as a firewall, trying to keep up with industry standards and technological advances. In 2011, European and US laws prompted companies that use tracking cookies to take action towards obtaining the ‘informed consent’ of users. There will probably be a similar law on search records, or on all of the individuals’ records that ISPs collect to feed customization and recommendation algorithms. While current advances in digital law bring increased public awareness and protection for users, it is somewhat too late to stop the massive scale of digital data exploitation and its possible negative repercussions for users’ freedom.

Indeed, to derive sensible information about individuals and groups from huge datasets that do not necessarily talk to each other, aggregation procedures and reductionism can be expected to reach unprecedented scales. In other words, big data analytics amplifies the scope and extent of statistical analysis and the probabilistic laws of large numbers. Such a situation means that people will become increasingly less unique and more quantifiable. They will also slowly lose control over their data, which will continue to circulate unbounded between datasets. Data are regularly and systematically re-purposed to be sold. Their transactional value, as defined by their potential for re-combinability and circulation, outweighs their semantics or utility. Such a situation can lead to cases where the transactional value of data about someone or something is more significant than the ethical value of that someone or something. In a society where the question of ethics is underscored in political discourse, I think that social science academics and scholars have a responsibility to bring forward a revised framework for digital ethics.

