Cambridge Analytica Explained: Big Data and Elections



Frederike Kaltheuner.

Recently, the data mining firm Cambridge Analytica has been at the centre of a heated debate about the use of profiling and micro-targeting in political elections. We’ve written this analysis to explain what it all means, and the consequences of becoming predictable to companies and political campaigns.
What does Cambridge Analytica actually do?

Political campaigns rely on data operations for a number of decisions: where to hold rallies, which states to focus on, and how to communicate with supporters, undecided voters and non-supporters. Essentially, companies like Cambridge Analytica do two things: profile individuals, and use these profiles to personalise political messaging.

What some reporting on Cambridge Analytica fails to mention is that profiling itself is a widespread practice. Data brokers and online marketers all collect or obtain data about individuals (your browsing history, your location data, who your friends are, how frequently you charge your battery, etc.), and then use these data to infer additional, unknown information about you (what you’re going to buy next, how likely you are to be female, the chances of you being conservative, your current emotional state, how reliable you are, whether you are heterosexual, etc.).

Cambridge Analytica markets (!) itself as unique and innovative because it claims to predict not just users’ interests or future behaviour, but also their psychometric profiles (even though the company later denied having used psychographics in the Trump campaign, and people who have requested a copy of their data from the company have not seen psychographic scores). Psychometrics is a field of psychology devoted to measuring personality traits, aptitudes, and abilities. Inferring psychometric profiles means learning information about an individual that previously could only be learned through the results of specifically designed tests and questionnaires: how neurotic you are, how open you are to new experiences, or whether you are conscientious.

That sounds sinister (and it is), but again, psychometric predictions are a pretty common practice. Researchers have predicted personality from Instagram photos, Twitter profiles and phone-based metrics. IBM offers a tool that infers personality from unstructured text (such as Tweets, emails, your blog). The start-up Crystal Knows gives customers access to personality reports of their contacts from Google or social media and offers real-time suggestions for how to personalise emails or messages.
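To make the mechanics less abstract, here is a deliberately crude sketch in Python of this kind of inference. It is not IBM’s, Crystal Knows’ or Cambridge Analytica’s actual method: the word list, weights and scoring function are invented for illustration, and real systems are trained on far more data. The point is only that a handful of observed signals can be turned into a numerical claim about your personality, whether or not that claim is fair or accurate.

    import math
    import re

    # Hypothetical word weights: terms assumed (purely for this example) to
    # correlate positively or negatively with "openness to experience".
    OPENNESS_WEIGHTS = {
        "art": 0.4, "travel": 0.3, "imagine": 0.5, "curious": 0.6,
        "routine": -0.4, "tradition": -0.3, "always": -0.1,
    }

    def openness_score(text):
        """Return a crude openness estimate between 0 and 1 from word frequencies."""
        words = re.findall(r"[a-z']+", text.lower())
        if not words:
            return 0.5  # no signal at all: fall back to a neutral guess
        raw = sum(OPENNESS_WEIGHTS.get(w, 0.0) for w in words) / len(words)
        # Squash the average weight into the 0-1 range with a logistic function.
        return 1 / (1 + math.exp(-10 * raw))

    print(openness_score("I love to travel and imagine new kinds of art"))
    print(openness_score("I stick to my usual routine and tradition, always"))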

From a technical perspective, it doesn’t matter whether you predict gender, interests, political opinions or personality: the point is that you are using some data (your keystroke speed, your browsing history, your location) to learn additional, unknown information (your sexual orientation, your interests, and so on).
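As a minimal sketch of that general recipe (using entirely synthetic data and invented feature names, not anyone’s real model or dataset): a classifier is trained on people whose attribute happens to be known, and then assigns a probability to everyone who never disclosed it.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Entirely synthetic "behavioural" features per person, e.g. share of
    # late-night browsing, news-site visits per week, average typing speed.
    X_known = rng.normal(size=(500, 3))
    # A made-up hidden rule generates the attribute we pretend is "known"
    # for these 500 people (say, because they disclosed it somewhere).
    y_known = X_known @ np.array([1.2, -0.8, 0.5]) + rng.normal(scale=0.5, size=500) > 0

    # Train on the people whose attribute is known...
    model = LogisticRegression().fit(X_known, y_known)

    # ...then assign a probability to people who never disclosed it.
    X_unknown = rng.normal(size=(5, 3))
    print(model.predict_proba(X_unknown)[:, 1].round(2))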
[Image: IBM Watson personality prediction]
This is terrifying! So everything can be predicted?

Well, yes, but also not quite. Profiling feels creepy (and it is), because it allows anybody with access to enough personal data to learn highly intimate details about you, most of which you never decided to disclose in the first place. This is worth repeating: someone can use your data to find out whether you are gay, even though you’ve never shared this information. Now here’s where it gets tricky: this derived information is often uncannily accurate (which makes profiling a privacy nightmare), but because they are only predictions, these inferences also sometimes get it wrong. Also, a lot of things are inherently subjective: who defines what counts as reliable or suspicious in the first place?

Think about the targeted ads you see online: how often do they misjudge your interests, or even your entire identity? From the perspective of an advertiser this is not a problem, as long as enough people click on ads. For you and me, and every single one of us, systematic misclassifications can have real-life consequences.

Even worse, profiling and similar techniques are increasingly used not just to classify and understand people, but also to make decisions that have far-reaching consequences, from credit to housing, welfare and employment. Intelligent CCTV software automatically flags “suspicious behaviour”, intelligence agencies predict internet users’ citizenship to decide whether they are foreign (fair game) or domestic (usually not fair game), and the judicial system claims to be able to predict future criminals.

As someone once said: it’s Orwell when it’s accurate and Kafka when it’s not.
So profiling is widespread. But did Cambridge Analytica influence the Brexit vote and the US election?

This is my favourite question because the answer is so simple: this is very unlikely.

It’s one thing to profile people, and another to say that because of that profiling you are able to effectively change behaviour on a mass scale. Cambridge Analytica clearly does the former, but only claims (!) to succeed in the latter. Even before the company was in the news, their methods raised a lot of eyebrows amongst experts on data-driven campaigning, with one consultant claiming that “everyone universally agrees that their sales operation is better than their fulfilment product”.

The idea that a single company influenced an entire election is also difficult to maintain because every single candidate used some form of profiling and micro-targeting to persuade voters, including Hillary Clinton and Trump’s competitors in the primaries. Not every campaign used personality profiles, but that doesn’t make their profiling any less invasive or creepy!

Evangelicals use data mining to identify unregistered Christians and get out the vote through the non-profit United In Purpose. The organisation profiles individuals and then uses a scoring system to measure how seriously they take their faith.

As early as 2008, the Obama campaign employed a data operation to assign every voter in the country a pair of scores that predicted how likely they were to cast a ballot, and whether they supported him. The campaign was so confident in its predictions that the Obama consultant Ken Strasma has been quoted as boasting: “[w]e knew who … people were going to vote for before they decided.” Before Cambridge Analytica worked for Trump, the company supported Ted Cruz, who described his data operation as “very much the Obama model — a data-driven, grassroots-driven campaign”. By the time Trump hired Cambridge Analytica in 2016, Clinton employed more than 60 mathematicians and analysts.
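As a simplified illustration of the “pair of scores” idea (not the Obama campaign’s actual code; the voter IDs, scores and thresholds are all invented), here is how two predicted probabilities per voter might be turned into contact decisions:

    from dataclasses import dataclass

    @dataclass
    class Voter:
        voter_id: str
        p_turnout: float   # predicted probability of casting a ballot
        p_support: float   # predicted probability of supporting the candidate

    def contact_strategy(voter):
        """A crude targeting rule of the kind such data operations automate."""
        if voter.p_turnout < 0.2:
            return "ignore"                      # unlikely to vote at all
        if voter.p_support > 0.8:
            return "get-out-the-vote reminder"   # on side, just needs to show up
        if 0.4 <= voter.p_support <= 0.6:
            return "persuasion ad"               # likely voter, genuinely undecided
        return "no contact"

    voters = [
        Voter("A-001", p_turnout=0.9, p_support=0.5),
        Voter("A-002", p_turnout=0.1, p_support=0.9),
        Voter("A-003", p_turnout=0.8, p_support=0.95),
    ]
    for v in voters:
        print(v.voter_id, "->", contact_strategy(v))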

Voter tracking also doesn’t end online. Shortly after the Iowa caucus in early 2016, the CEO of “a big data intelligence company” called Dstillery told the public radio program Marketplace that the company had tracked 16,000 caucus-goers via their phones to match them with their online profiles. Dstillery was able to learn curious facts, such as that people who loved to grill or work on their lawns overwhelmingly voted for Trump in Iowa.

All of these efforts to use data, profiling, and targeting to change voters’ minds make it incredibly hard for any one of these data companies to singlehandedly manipulate the outcome of an entire election.
So Cambridge Analytica is a snake oil vendor and I shouldn’t be worried?

No, no, you should definitely be worried!

Using profiling to micro-target, manipulate, and persuade individuals is still dangerous and a threat to democracy. The entire point of building intimate profiles of individuals, including their interests, personalities, and emotions, is to change the way that people behave. This is the definition of marketing, political or commercial. Just as companies use the knowledge that you are depressed or feeling lonely to sell you products you otherwise wouldn’t want, political campaigns and lobbyists around the world can do the same: target the vulnerable, and manipulate the masses.

We are moving towards a world where your hairbrush has a microphone and your toaster a camera; where the spaces we move in are equipped with sensors and actuators that make decisions about us in real time. All of these devices collect and share massive amounts of personal data that will be used to make sensitive judgements about who we are and what we are going to do next.
Is this even legal?

A good question that calls for a lawyer’s answer: it depends. There are vast differences in the way that data is regulated in the US, around the world, and currently even within different countries of the EU.

Nearly every single 2016 US presidential candidate has either sold, rented, or loaned their supporters’ personal information to other candidates, marketing companies, charities, or private firms. Marco Rubio alone made $504,651 by renting out his list of supporters. This sounds surprising but can be legal as long as the fine print below a campaign donation says that the data might be shared.


Under UK and European data protection law, the situation is slightly different. Data protection regulates the way in which organisations can process personal data. You need a legal ground for obtaining, analysing, selling, or sharing data, and even then, the processing needs to be fair and not excessive. This is why the UK Information Commissioner’s Office is currently investigating whether Cambridge Analytica and others might have violated these rules, and some have argued that there is evidence they did.

What is also important to know: under the UK Data Protection Act 1998, which implements EU Data Protection Directive 95/46/EC, any individual whose data is processed in the UK has the right to access it (section 7 of the Act), regardless of nationality.

Profiling is specifically addressed by the upcoming General Data Protection Regulation (GDPR), which gives individuals more rights to information and objection. It contains more explicit requirements for consent than previous legislation, and the penalties for violations can be much higher. The regulation is a good start, but it won’t solve all problems.
I want to read more about this!

Sure, here are some more resources.

We have an irregular newsletter that informs you about recent news on data exploitation.

If you want to understand the legal basis for profiling in the UK, the Information Commissioner’s Office has some good resources on its website, including a guide on how to file a data subject access request and raise a concern. We are contributing to ongoing consultations about profiling under the GDPR, both in Brussels and with the ICO; check our website for updates.

Here’s an excellent Twitter feed that sets out how Cambridge Analytica might have violated the UK Data Protection Act 1998.

David Carroll filed a data subject access request to Cambridge Analytica and shared some of his data on Twitter.

Wolfie Christl and Sarah Spiekermann wrote a superb report on corporate surveillance and digital tracking with lots of timely examples from finance to employment and marketing.

In 2012, ProPublica investigated how political campaigns use data about voters to target them in different ways.

In 2014, the US Federal Trade Commission published a report on data brokers in the US, called “A Call for Transparency and Accountability”. Tighter regulations of data brokers would affect the way that campaigns use data.