Data, Platforms, and Bias

Johannes Lenhard & Alexandrine Royer

Max Planck – Cambridge Centre for Ethics, Economy and Social Change
Cambridge University

April 15, 2021

As digital capitalism, and with it data, algorithms, artificial intelligence (AI), and platforms, becomes increasingly dominant, anthropological interest in the topic is slowly growing too. So far, most of the (critical) engagement with the data-driven digital economy around us has come out of journalism, media studies, and sociology; much of the debate has centred on what some call the ‘ethics’ of AI and data, or digital capitalist practices more generally, mostly with a view to big tech companies.

In this reading list, we focus on contributions which are either anthropological in nature or are based on ethnography. Ethnography is so far an underused and underestimated part of the toolkit for opening up what Frank Pasquale calls the ‘black box society’ produced by data and AI. As Kathleen Richardson comments, ‘to make artificial intelligence is to reproduce what is essentially us, an odd form of self-reproduction.’ How can anthropology and its methods help us to better comprehend how platforms, the data economy, or more specific outgrowths of it, such as new types of (gig) labour, work? What kind of critiques can anthropology help formulate and productively bring forward?

The world has shifted even further online with the outbreak of COVID-19 and the accompanying lockdowns, constant Zoom meetings, and working from home. Much of our connectivity and our social relations are being enabled, mediated, and managed (some would say surveilled) by platforms and data companies. While some aspects of life and work might return IRL (in real life), much of it will remain online and data-fied.

How did we even get here? 

The history of (the social sciences of) data and platforms

Data is not exactly a new phenomenon; certainly not in its form as a possible ‘weapon of oppression’. In his How We Became Our Data, Colin Koopman goes back to the 1900s to explain how, with the advent of the birth certificate, people became systematically defined by specifically formatted kinds and types of data for the first time. In the decades that followed, data was manipulated, biased, and used to marginalise people, as in the racial redlining practices among US city planners that Koopman describes (see our review). What has changed over the last decades, with the digitalisation of increasing parts of our everyday lives since the advent of the (personal) computer and the (connected) internet, is the scope of data’s influence.

Platforms include the advertising-driven models of Facebook or Google, but also the cloud platforms most of our data is stored on, and what Nick Srnicek calls ‘lean’ platforms (including Uber’s taxi app or Airbnb’s apartment-renting solution). These platforms organise much of the world around us and are usually powered by data – they collect data from us which can either turn into (advertising) revenue directly or convert into various other currencies (such as targeted, personalised sales on Amazon). Data itself has become a new form of capital, according to Ivana Bartoletti, that runs through and is organised by platforms. Platforms and their algorithms have the capacity to turn our everyday life – from online purchases to digital newspaper consumption and financial transactions – into fine-grained data profiles. These in turn are commodifiable, sellable, and usable in various forms of capitalist economising.

Additional readings (the basics):

Boellstorff, Tom, and Bill Maurer. (2015). Data — Now Bigger and Better! Chicago: University of Chicago Press.
A short volume of essays relating contemporary discussions of big data to classical works and concepts within anthropological theory such as Mauss, Malinowski, Levi-Strauss, and others.

Coeckelbergh, Mark. (2020). AI Ethics. Cambridge, Massachusetts: MIT Press.
What even is AI? What might an ‘ethics’ of it look like? The best pocket-sized critical introduction to the topic and very helpful to connect the dots.

Forsythe, Diane E. (2001). Studying Those Who Study Us: An Anthropologist in the World of Artificial Intelligence. Stanford: Stanford University Press.
A collection of pioneering essays on artificial intelligence from an anthropological perspective with a particular focus on the roles of gender and power in computer engineering. 

Guyer, Jane. (2016). Legacies, Logics, Logistics: Essays in the Anthropology of the Platform Economy. Chicago: University of Chicago Press.
A historical and comparative analysis of the composite architecture of West African and Western economies; Guyer offers a fresh conceptualisation of the platform economy as a structure entangled in local experiences, logics, and logistics.

Srnicek, Nick. (2016). Platform Capitalism. Cambridge, UK: Polity.
The most accessible and most critical primer on different kinds of platforms – from advertising-driven ones to cloud-based ones and lean platforms; a must-read.

Data as knowledge and control 

In The Age of Surveillance Capitalism, Shoshana Zuboff paints a picture of a world not only organised by digital platforms collecting and ordering data from their customers but also controlled by them. Zuboff’s account (while at times jargon-laden) delivers rich examples and strong case studies of how data and platforms turn into apparatuses of governmentality, from facial recognition software to social ranking scores. Jathan Sadowski’s recent volume Too Smart extends this line of analysis to wearables, smart cities, and smart homes. From data extraction to data-fuelled control, he warns against simply buying into the convenience that new technologies promise.

Anthropologists have applied this lens to scrutinise what work looks like in the digital platform economy. Three recent ethnographies illuminate different kinds of contemporary blue- and white-collar work mediated by data and platforms (read our combined review of them here). Ilana Gershon’s Down and Out in the New Economy focuses on white-collar office workers, while Alexandrea Ravenelle’s Hustle and Gig and Alex Rosenblat’s Uberland are both concerned with the more precarious gig workers. Gershon goes into detail on how college-educated young professionals struggle to find work and fit into a societal narrative of ‘the self as business.’ How do you turn yourself into the right kind of entrepreneur using LinkedIn and other (linguistic) genres of self-branding? Rosenblat focuses exclusively on the ride-hailing service Uber in the US, and Ravenelle studied gig workers on four platforms, including TaskRabbit and the now-defunct Kitchensurfing. They extend the same observations to low-paid, on-demand work, but the promise of the self as entrepreneur is only one half of the narrative. The other half reveals a reality of data-enabled control which the platforms exert over both workers and work itself.

What anthropology contributes with these and other detailed ethnographies is the view from the workers’ own perspective: how are they struggling (or enjoying the promised flexibility and freedom)? How do the platform’s practices, such as regulating surge pricing, affect them in their everyday work? What might their work-arounds look like? More generally, how are data, platforms, and algorithms embedded in sociality?

Algorithmic bias 

As early as her 1993 article, Diane Forsythe observed how data engineers, who were predominantly white, middle-class, Euro-American men, designed systems that reflected their interests and perspectives. Forsythe was among the first to leave an anthropological imprint on the study of human-machine interactions, noting the gendered biases in the coded structure of AI. Despite this early contribution, it is scholars from other fields, from media studies to mathematics, who have taken the lead in unravelling the consequences of our growing reliance on big data models and automated decision-making. As the input data within AI systems are tainted with society’s racial and gendered biases, algorithmic outputs will automate the status quo.

In Weapons of Math Destruction, Cathy O’Neil carefully uncovers how big data and algorithms can lead to decisions that place minorities, people of colour, and low-income individuals at a disadvantage, thereby reinforcing discrimination and amplifying pre-existing inequalities in the distribution of socioeconomic resources. Algorithms responsible for high-stakes decisions in insurance, education, and policing operate in ways that are opaque, unregulated, and largely unknown to the public, and hence challenging to contest. Following O’Neil, scholars have expanded their attention towards the relationship between algorithmic data, societal structures, and social justice efforts. In Algorithms of Oppression, Safiya Umoja Noble details how search engines are biased in their presentation of and engagement with racialised groups, especially women of colour, due to their quasi-monopoly and the corporate interests driving results pages.

The systems that propagate bias also have biases baked into their very design and coding. Joy Buolamwini documented the inability of AI facial recognition systems relying on computer vision to correctly identify people of colour, particularly women of colour, an issue that traces back to the history of colour film and its optimisation for lighter skin tones. Tech developers and Euro-American societies writ large privilege whiteness and heteropatriarchal norms in the construction of intelligent machines. In their recent article, ‘The Whiteness of AI,’ Stephen Cave and Kanta Dihal underscore how AI technologies are typically portrayed as feminised and ‘White’ or Anglo-Saxon in appearance and speech.

Biases within AI systems reveal the wider systemic issues and historical disenfranchisement behind technological design and implementation. In Race After Technology, Ruha Benjamin refers to coded inequality as the ‘New Jim Code,’ given the range of discriminatory designs that ‘explicitly work to amplify hierarchies’ and ‘a number that aim to fix racial bias but end up doing the opposite.’ Anthropologists, in tracing the human imprint behind algorithmic systems from data gathering to deployment, can help uncover the assumptions and biases that undergird such systems and reveal how such algorithmic biases can be manifested beyond the North American and European context.