What does Facebook's #tenyearchallenge tell us about public awareness of data and algorithms?

Helen Kennedy reflects on the recent #tenyearchallenge trend. Looking at responses to the challenge, she considers what they tell us about public understanding of data and the companies that utilise it. Drawing on qualitative and survey evidence on levels of public awareness, she finds that what the public actually knows about data remains unclear.


Facebook, Instagram and other platforms recently set their users a 10-year challenge: post your first ever Facebook photo alongside one of you today. Whilst some users were quick to comply, others responded in unexpected ways. Some reactions were funny: Jennifer Aniston in 2009 is Iggy Pop in 2019. Some had a political message: an attractive underpass in 2009 is run down and occupied by homeless people in 2019; vibrant cities in Iraq, Libya, Yemen and Syria in 2009 are warzones in 2019. Some aimed to tell it like it is, such as this fictitious conversation at Facebook HQ:

@facebook: "Sir. Our facial recognition algorithms are becoming less accurate due to aging of users.鈥

Zuckerberg: "Tell @BuzzFeed to create a viral trend which gets people to post both their very first and current profile pics side by side. Adjust algorithms accordingly".

Or, as tech writer Kate O'Neill posted on Twitter:

Me 10 years ago: probably would have played along with the profile picture aging meme going around on Facebook and Instagram

Me now: ponders how all this data could be mined to train facial recognition algorithms on age progression and age recognition.

These suggestions of foul play have in turn spawned retorts pointing out that such data is already available to Facebook in all of the profile pictures and other photos we have already posted on the platform. To which O'Neill replies: this messy existing dataset has been superseded by the captioned new posts, which are far more helpful to Facebook's aim of honing its facial recognition software. O'Neill concludes that while this mission of Facebook's may not be cause for concern, it would be a good thing if we were all a bit more aware, and a bit more critical, of what happens to our data.
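To make O'Neill's point about paired, captioned posts a little more concrete, here is a minimal, purely illustrative sketch in Python. Everything in it is hypothetical (the class, fields and function are invented for this post, not anything Facebook is known to use); it simply shows why a "then vs. now" post with years in the caption is a tidier training example for age-progression models than an unstructured pile of old photos.

```python
# Purely illustrative: why captioned "then vs. now" posts make cleaner
# training pairs than an unstructured photo archive. All names and fields
# here are hypothetical, invented for this sketch.
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MemePost:
    user_id: str
    photo_then: bytes          # the "first ever" profile picture
    photo_now: bytes           # the current profile picture
    year_then: Optional[int]   # often stated in the caption, e.g. 2009
    year_now: Optional[int]    # e.g. 2019


def to_training_pair(post: MemePost) -> Optional[Tuple[bytes, bytes, int]]:
    """Turn one captioned meme post into a labelled (then, now, age gap) pair.

    A loose archive of old photos lacks exactly this: the same face at two
    known points in time, with the gap between them self-labelled by the user.
    """
    if post.year_then is None or post.year_now is None:
        return None  # without both dates the pair is far less useful
    return post.photo_then, post.photo_now, post.year_now - post.year_then
```

The design point, not any particular code, is what matters here: a dataset of such pairs comes pre-labelled by the very people it depicts, which is exactly the structure that age-progression training would want.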

Critical, political or just plain mocking, responses to the challenge could be seen as evidence of the very thing that O'Neill says we need more of: public awareness of algorithms, data mining and analytics. High-profile coverage of stories such as the Facebook/Cambridge Analytica scandal may have shifted the ground in terms of what people know about how their data is used. So are responses to the #tenyearchallenge a sign of growing public awareness of data and algorithms? The answer is simple: we don't know. Robust evidence about whether and how data analytics and related technologies like AI are understood by non-experts is seriously lacking.

Attitudes and understanding

A number of national and international surveys and polls have been carried out, but most of these focus on attitudes and perceptions rather than knowledge and understanding. Some draw conclusions that do not appear to be backed up by their own data, others ask leading questions, and findings across surveys are inconsistent. For instance, the independent think tank doteveryone's 2018 digital attitudes survey found that a majority of respondents knew that personal information is used to target advertising (70%), but fewer realised their data could also be sold to other companies (56%) or could determine the prices they are charged (21%). A minority inaccurately believed that invasive forms of data collection take place: 7% believed their phone conversations are collected and 5% believed their eye movements when looking at the screen are collected. Doteveryone concludes that there is a link between trust and understanding, stating that 'Without this understanding people are unable to make informed choices about how they use technologies. And without understanding it is likely that distrust of technologies may grow', but it's not clear that they have the data to back up this claim.

An Ipsos MORI Global Trends survey found that 83% of UK respondents were unsure what information companies had on them. Trust in Personal Data: A UK Review by Digital Catapult (2015) notes that 96% of respondents to its survey claimed to understand the term 'personal data', but that when it came to describing it, less than two thirds (64%) chose the correct definition. Moreover, 65% of those surveyed 'are unsure whether data is being shared without their consent'. The report concludes that the study highlights an increase in data knowledge, although there is no temporal comparison in the survey on which to base this claim.

Qualitative research paints a more nuanced picture. Taina Bucher's study of Facebook users' engagements with the platform's algorithms identified both greater knowledge than participants themselves acknowledged and a range of playful interactions with the algorithms. Terje Colbjørnsen's study of how people talk on social media about algorithmic moments on Spotify, Netflix and Amazon found similar knowledge and playfulness, with shrewd statements such as "go home algorithm your'e drunk", users describing algorithms as like a 'smug older brother', or wishing they were a little 'gayer'. Likewise, my own research in Post, Mine, Repeat and 'The Feeling of Numbers' with Rosemary Lucy Hill (2017) suggests that knowledge and experiences of data are more complex, diverse and nuanced than simple statistics suggest.

Why does it matter?

Knowing whether people understand the mining of their personal data is important for many reasons. First, as suggested above, there is a relationship between understanding and trust, the holy grail of data research. Elsewhere, I've argued that understanding what people feel about what companies do with their data is as important as understanding what they know, because hopes, fears, misconceptions, and aspirations play an important role in shaping attitudes to data mining. But we also need to know what people know. We need to move beyond a hunch that awareness is growing, as #tenyearchallenge responses suggest, to more concrete knowledge of what is known and what is not, where the gaps in knowledge and understanding are, and how to fill them. Only then can we start to envisage a participatory, data-driven society of the kind that many of us would like to see.

The second reason this question matters relates to new initiatives that aim to influence uses of data and AI, and their governance, like the government's Centre for Data Ethics and Innovation (CDEI) and the independent Ada Lovelace Institute (Ada). Such initiatives claim that understanding public views and how data affect people will be at the heart of what they do, but this is only possible if such understanding actually exists. To ensure data works 'for people and society' (Ada's mission) and is 'a force for good' (a CDEI aim), we need better evidence of what people really know about what happens to their data.