bit.ly/1UAPtrK image credit Mark Dodds/Flickr

The illusion that might cheat us: ethical data science vision and practice

This blog post is also available as an audio file on SoundCloud.


Anais Nin wrote in her 1946 diary of the danger she saw in the growth of technology: that it would expand our potential for connectivity through machines but diminish our genuine connectedness as people. She could hardly sound more contemporary today:

“This is the illusion that might cheat us of being in touch deeply with the one breathing next to us. The dangerous time when mechanical voices, radios, telephone, take the place of human intimacies, and the concept of being in touch with millions brings a greater and greater poverty in intimacy and human vision.”
[Extract from volume IV 1944-1947]

Echoes from over 70 years ago can be heard in the more recent comments of entrepreneur Elon Musk. Both are concerned with simulation, a lack of connection between the perceived and reality, and the jeopardy this presents for humanity. But both also have a dream. A dream based on the positive potential society has.

How will we use our potential?

Data is the connection we all have between us as humans and what machines and their masters know about us. The values with which those masters underpin their machine design will determine the effect that the machines, and the knowledge they deliver, have on society.

In seeking ever greater personalisation, a wider dragnet of data is putting together ever more detailed pieces of information about an individual person. At the same time data science is becoming ever more impersonal in how we treat people as individuals. We risk losing sight of how we respect and treat the very people whom the work should benefit.

Nin grasped the risk that a wider reach can mean more superficial depth. Facebook might be a model today for the large circle of friends you might gather, but how few of them you trust with confidences, with personal knowledge about your own life, and what a privilege it is when someone chooses to entrust that knowledge to you. Machine data mining increasingly tries to get an understanding of depth, and may also add new layers of meaning through profiling, comparing our characteristics with others in risk stratification.
Data science, research using data, is often talked about as if it is something separate from using information from individual people. Yet it is all about exploiting those confidences.

Today the reach has grown in what it is possible for a few people in institutions to gather about most people in the public, whether in scientific research or in surveillance of different kinds. As it has, we hear experts repeatedly talk of the risk of losing the valuable part: the knowledge, the insights that benefit us as a society if we can act upon them.

We might know more, but do we know any better? To use a well known quote from her contemporary, T S Eliot, ‘Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?’

What can humans achieve? We don’t yet know our own limits. What don’t we yet know? We have future priorities we aren’t yet aware of.

To be able to explore the best of what Nin saw as ‘human vision’ and Musk sees in technology, the benefits we have from our connectivity, our collaboration and shared learning, need to be driven with an element of humility, accepting values that shape the boundaries of what we should do, while constantly evolving with what we could do.

The essence of this applied risk is that technology could harm you more than it helps you. How do we avoid this and develop instead the best of what human vision makes possible? Can we also exceed our own expectations of today, to advance in moral progress?

What values underpin your policy and practice – reflections on the framework

Perhaps this is the core data dilemma for the civil service. Ethical data practice and demands of policy may diverge. Can you apply integrity to data models and serve two masters?

“It is difficult to translate a complicated policy into a coded algorithm to make operational decisions […] and new data will change the way some algorithms make decisions to get their target answer.”

How are the complexities of policy to be translated to the public? Language will matter in its communication. Ensuring “no one suffers unintended negative consequences” is a double negative that sounds positively Orwellian. Seeing harm as legitimate and acceptable, pre-evaluating a deserved negative impact from data is problematic.

The potential for a reduction in the application of human rights under any part of a Trump-itCon future is real, so transparency about the decision making, and about whom the unintended negative consequences will affect, matters as much as, if not more than, the benefit.

Fighting fraud is a legitimate use of data. Safeguards must still be clear in the case of mistakes. Creating ‘robust data science models’ and saying data users should ‘flag if algorithms are using protected characteristics to make decisions’ offers little on how to meet the specific safeguards, transparency or recourse required on the ground when sitting in a DWP office with a distraught PIP client after an automated decision says no. How would they know if the tech was broken, or whether they could appeal at all, if the algorithm is hidden?

This is an area that could be further developed.

Much like the Department for Education statutory guidance published in May that will mandate the web monitoring of every child’s Internet use in school from September 2016, the how is open to interpretation. We need to support local staff to have the freedom to trust their own judgements, and avoid an overly risk averse culture, in which the algorithm is always right. Where we see third party machines replace a class teacher’s informed judgement, it enters children into a postcode lottery of privacy intrusion, and offers little support to staff accountable for the thinking behind a decision if they cannot see inside.

The case study on page ten, targeting individuals in receipt of welfare to send energy company marketing materials, appears to flag precisely one of the privacy flaws designed into the most recently added of the six Cabinet Office data sharing plans in ‘Better use of data in government’. It is at odds with the legislation it is designed to accompany.

Three conclusions the M.O.D. expert draws in this blog are insightful and interesting (and other ciders are available).

1. The public wants involvement and transparency.
2. The commercial sector did not enjoy the same level of trust as the public sector.
3. Don’t forget the people that the data might represent and those the data does not.

The 2014 Science and Technology Select Committee report “Responsible Use of Data” recommended that “the Government has a clear responsibility to explain to the public how personal data is being used.”

What potential there remains to do that well. What boldness and vision it would take.

The blog on data science is absolutely right to conclude that it is clear that “the public welcomed the opportunity to discuss government use of data science, and were broadly reassured by the presence of a framework that policy makers and data scientists can use to make sure what they are doing is ethically appropriate.”

However, the truth is that today various departments do not act in ways that are ethically appropriate. What you don’t know, you can’t object to.

All the work on privacy principles could be applied well across the board in government use of data. For me, the current consultation started over two years ago, and it shows. It’s already out of date. If today’s practices are already outdated, on what foundation are you building? What we are seeing built now goes far beyond privacy and data protection compliance. Data is re-shaping power between society and authorities, and looping back to change technology. Human rights and ethics have supra-legal statuses that are in flux.

Change is not easy and cannot be done by government alone.

“Privacy and data protection are part of the solution, not the problem. For the time being, technology is controlled by humans. It is not easy to classify neatly these potential developments as good or bad, desirable or harmful, advantageous or detrimental.

“Policy makers, technology developers, business developers and all of us must seriously consider if and how we want to influence the development of technology and its application. But equally important is that the EU consider urgently the ethics and the place for human dignity in the technologies of the future.”

[source: European Data Protection Supervisor (EDPS), Giovanni Buttarelli, September 2015, Towards a new digital ethics: Data, dignity and technology.]

If we believe that this data science ethical framework can legitimately stand in place of an ethical framework for decision making, it is “an illusion which might cheat us.”

It is an illusion that could not only cheat the public of our rights, but of the potential that the future holds for public interest and the economy if using data and technology well.

Perhaps this is a core shift in power towards the civil service to shape delivery designed beyond the demands of political policy. It is therefore in all governments’ own interests to ensure that innovation and public interest technology are transparent and open to scrutiny.

What is your vision that those values help steer towards?

Ethics must bridge the gap between machine analytics of data science and human decision making, law and accountability. Values must underpin how the human design, the human purposes, are then adopted in practice by data engineers designing the machine thinking, and how they interpret and apply the output.

All these separate factors in any data science process need to be glued together into a consistent, reliable model using ethics.

You wouldn’t use a simple Pritt Stick to keep the lid on a jack-in-the-box. So why use a weak ethics framework in the discussion of the toughest challenges of what could and should be in today’s data science models, if they are to be fit for the future? A.I. alone has great potential to surprise us if we do not contain excess human ego.

When we need the best of human vision, we need more than a nice looking app. We must be careful to avoid the presentation of one thing, as an illusion of another. As Nin might have said, we have been offered the bread, when what we need is the wafer.

Data users need tools, guidance and laws that transcend the current thinking on data use and cater for unseen risk and potential future uses of data collected today.

Giovanni Buttarelli, the European Data Protection Supervisor, concluded on ethics in the 2015 report:

“today’s trends may require a completely fresh approach. So we are opening a new debate to what extent the application of the principles such as fairness and legitimacy is sufficient.

“With technology, global innovation and human connectedness developing at breakneck speed, we have an opportunity to […] build a consensus.”

That consensus is not yet reached if the Cabinet Office data sharing consultation output is their final say on what ‘better use of data in government’ should be.

I hope the call is genuine, and that it will go bolder and broader, when it says “findings will be explored by data scientists, analysts, policymakers and external experts, and used to improve the first version of the ethical framework, launched today.”

How many ethicists were consulted designing this data science ethical framework?

It must yet be strengthened because mindful is not enough.

“There is a clear need in industry and the public and third sectors for a greater understanding [of use of social media used in crime prevention.]” is a call echoed by academics in the School of Social Sciences.

As the MOD expert writes in the data science ethics blog, “if we remain mindful of ethics then we can be sure, and the public can be reassured, that we are not moving towards Orwell’s 1984.”

As Giovanni Buttarelli, the European Data Protection Supervisor, wrote recently, George Orwell warned against big brother. He didn’t realise at the time that we’d also need to pay attention to big data.

Without using strong ethics as government sticks ever more sources of data together from all areas of our life, public trust in administrative use of our data will not be better, but will instead fall apart rapidly, and risks taking the public benefit with it.

The ethical framework ideas in a primer for Information Governance outlined by Castlegate Associates also challenge us on some of the practical reasons why ‘common good’ is not the only thing that matters, as in the Tuskegee syphilis case study.

Behavioural insights manipulation. Smart cities. Neural networks learning based on our Big Data. Proprietary A.I. War machines we never see, used high in foreign skies or in future plans for on the ground. The public at large is not permitted insight into who is working on these things and how, based on whose rules, for what purpose. These are big, tough, moving targets.

If Elon Musk feels, “Not all A.I. futures are benign,” we should understand why he feels it is important to keep artificial intelligence open, and to keep it democratic.

We can do better for Better use of Data in Government

As Nin said in 1946, I too am, “curious about tomorrow, about what new places we were going to discover.”

Not everything past is still true today. Ethics and the best of intents from eminent people can both transcend and be seen as flawed in time. Science has had its own Muirfield moment when Tim Hunt had trouble with girls in 2015. The brilliance of Watson dimmed in public perception after racist remarks in 2007. Some of the comments in William Bodmer’s papers from the early days of population-wide DNA and the Human Genome project similarly reflect the ethics of their time. Ethics are both universal and cultural and of their time.

The efforts to achieve a truly ethical framework must not only dig deeper into the academic expertise that we excel at in this country in cyber security, in privacy technologies, in bioethics, and put human vision into how we will use data from individuals to better their lives, but open up data sciences policy and practice of authorities such as police and government to transparent ethical oversight.

That is something additional that could be further developed. That doesn’t have to mean by default compromising how they work. But it would increase public trust in the belief that they do.

The measure of this ethical framework is the intended legacy current politicians and civil servants want it to leave our children, and theirs. How will your own children be exploited by or benefit from these uses of their personal data?

To have the policy we need today, we first need to put the humanity and accountability back into current data use in government, while looking forward far into the future at what we will need. Ensuring human dignity is at the heart of decision making needs long term vision, beyond 2020.

That might be a bigger ask than an ethical data science framework is capable of, unless politicians have the vision and humility to accept how vital accurate analysis of data and its application has become, and how much they depend on it in their own everyday decision making.

“I don’t think the [American] obsession with politics and economics has improved anything. I am tired of this constant drafting of everyone, to think only of present day events,” said Nin seventy years ahead of the Brexit debate and long term economic plan.

The conflation of different purposes of data science in this legislation results in a simulation of data science ethics. Imposing what was intended for research purposes upon data use for interventions, some of whose purposes will have intended negative consequences, doesn’t really work in practice.

Fraud may mean money owed to a local authority through criminal intent. Debt may mean money owed to a local authority from hardship. When and how should the algorithm send round the bailiffs? Will the bailiffs have their own access to the data, if they meet the prescribed persons in the planned Tailored Public Services part of the new legislation? How does a target-driven cash collection programme fit into that in practice? How do we avoid making the vulnerable even more vulnerable, further disempowered by data?

What may be obvious in one area may not be in another, and this will lead to a postcode lottery of privacy intrusion, facilitated by the interpretation that this data sharing legislation widens the prescribed persons for public services who can access public data, and why.

In the Data Science Ethics Dialogue engagement film, participants show how poorly informed some are, and all ask for more knowledge, and for change in existing practices on pigeonholing and restrictions on commercial use. Above all they ask government to be open and honest on purposes and accountability.

We don’t only need better data, but to be better in how we approach the data package as a whole, held together in do-no-harm ethics:

  • Better transparency: Transparency is needed in the design and application of technology. The public needs to see how the ethical values underpinning policy intents are coded, to leave sufficient room for human interaction, human dignity, and accountability. If not, we build a ‘poverty in intimacy and human vision’ into machine based data science as a feature, not a bug.
  • Better governance: If data science in government is to remain democratic, design must also take into account what the public does not want, not only what government does. Ethics can be more obvious to define when we see what is lacking or something is not as we like. As the data science dialogue blog writer said, “The group wanted to ensure that we worked with purpose rather than simply rummaging through people’s lives.”
  • Better engagement: Is this engagement just another listening exercise, or does someone have responsibility to act on that feedback? Will it be the start of a long term process of education and exchanging ideas? We need to design an iterative change process as technology, scale and public experience progress.
  • Better vision: I would like to see leadership vision and joined up thinking that says this is our government data utopia for society. We need to see it now as we already live in smart cities with increasing amounts of unseen technology that underpins them. If this vision exists, it’s not coming across.
  • Better laws for bona fide purposes: Using administrative data ethically means we see inspiring and exciting things come from using public data. The research and data science community must not allow the eleventh hour sidetracking of purposes in the consultation to detract from the parts of legislation which should work well. But there are parts that need to be scrapped before they begin.

To borrow a phrase from and for our civil service, Be bold. Be bold in protecting personal rights, and you will protect trust in public data use, and with it, the future of professional data access in the UK. Inspire the public and politicians alike with what is possible.

The future holds too much potential to be jeopardised by puppet masters who seek to control the strings of the UK data debate, putting their own commercial data access before public benefit.

Ipsos MORI found in 2010 that half of the public believe that the Government’s top priority for public services should be what is good for everyone in society as a whole, while recognising that different aspects of fairness can be in tension.

What are the implications for people of using data responsibly, of using it irresponsibly, or of not using it at all? What vision do we communicate to the public in parallel with the coming changes in data protection rights compliance and risk management, and as we catch up with what is already being decided by the companies who want to exploit data? Musk sees the goal of technology as increasing the probability that the future will be good. [From 43:45] That’s a future for all, not only fearing risks for individuals but fixing them to get beyond them to the vision of the possible.

I want to see how the vision for our public data will achieve that.

Life can’t only be about solving problems, [13:25]: “there have to be things that are inspiring and exciting, and make you glad to be alive.”

image credit: cc Mark Dodds/Flickr