This is a repost of my opinion piece published in StatsLife in June 2015.
The majority of the public supports the concept of using data for public benefit. But the measurable damage done in 2014 to the public’s trust in data sharing  and reasons for it, are an ongoing threat to its achievement.
Rebuilding trust and the public legitimacy of government data gathering could be a task for Sisyphus, given the media atmosphere clouded by the smoke and mirrors of state surveillance. As Mark Taylor, chair of the NHS’s Confidentiality Advisory Group wrote when he considered the tribulations of care.data  ‘…we need a much better developed understanding of ‘the public interest’ than is currently offered by law.’
So what can we do to improve this as pilot sites move forward and for other research? Can we consistently quantify the value of the public good and account for intangible concerns and risks alongside demonstrable benefits? Do we have a common understanding of how the public feels what is in its own best interests?
And how are shifting public and professional expectations to be reflected in the continued approach to accessing citizens’ data, with the social legitimacy upon which research depends?
Listening and lessons learned
Presented as an interval to engage the public and professionals, the 18 month long pause in care.data involved a number of ‘listening’ events. I attended several of these to hear what people were saying about the use of personal health data. The three biggest areas of concern raised frequently  were:
- Commercial companies’ use and re-use of data
- Lack of transparency and control over who was accessing data for what secondary purposes, and
- Potential resulting harms: from data inaccuracy, loss of trust and confidentiality, and fear of discrimination.
It’s not the use of data per se that the majority of the public raises objection to. Indeed many people would object if health data were not used for research in the public interest. Objections were more about the approach to this in the past and in the future.
There is a common understanding of what bona fide research is, how it serves the public interest, and polls confirm a widespread acceptance of ‘reasonable’ research use of data. The HSCIC audit under Sir Nick Partridge  acknowledged that some past users or raw data sharing had not always met public expectations of what was ‘reasonable’. The new secure facility should provide a safe setting for managing this better, but open questions remain on governance and transparency.
As one question from a listening event succinctly put it :
‘Are we saying there will be only clinical use of the data – no marketing, no insurance, no profit making? This is our data.’
Using the information gleaned from data was often seen as exploitation when used in segmenting the insurance markets, consumer market research or individual targeting. There is also concern, even outright hostility, to raw health data being directly sold, re-used or exchanged as a commodity – regardless whether this is packaged as ‘for profit’ or ‘covering administrative costs’.
Add to that, the inability to consent to, control or find out who uses individual level data and for what purpose, or to delete mistakes, and there is a widespread sense of disempowerment and loss of trust.
Quantifying the public perception of care.data’s value
While the pause was to explain the benefits of the care.data extraction, it actually seemed clear at meetings that people already understood the potential benefits. There is clear public benefit to be gained for example, from using data as a knowledge base, often by linking with other data to broaden scientific and social insights, generating public good.
What people were asking, was what new knowledge would be gained that isn’t gathered from non-identifiable data already? Perhaps more tangible, yet less discussed at care.data events, is the economic benefits for commissioning use by using data as business intelligence to inform decisions in financial planning and cost cutting.
There might be measurable economic public good from data, from outside interests who will make a profit by using data to create analytic tools. Some may even sell information back into the NHS as business insights.
Care.data is also to be an ‘accelerator’ for other projects . But it is hard to find publicly available evidence to a) support the economic arguments for using primary care data in any future projects, and b) be able to compare them with the broader current and future needs of the NHS.
A useful analysis could find that potential personal benefits and the public good overlap, if the care.data business case were to be made available by NHS England in the public domain. In a time when the NHS budget is rarely out of the media it seems a no-brainer that this should be made open.
Feedback consistently shows that making money from data raises more concern over its uses. Who all future users might be remains open as the Care Act 2014 clause is broadly defined. Jamie Reed MP said in the debate : ‘the new clause provides for entirely elastic definitions that, in practice, will have a limitless application.’
Unexpected uses and users of public data has created many of its historical problems. But has the potential future cost of ‘limitless’ applications been considered in the long term public interest? And what of the confidentiality costs ? The NHS’s own Privacy Impact Assessment on care.data says :
‘The extraction of personal confidential data from providers without consent carries the risk that patients may lose trust in the confidential nature of the health service.‘
Who has quantified the cost of that loss of confidence and have public and professional opinions been accounted for in any cost/benefit calculations? All these tangible and intangible factors should be measured in calculating its value in the public interest and ask, ‘what does the public want?’ It is after all, our data and our NHS.
Understanding shifting public expectations
‘The importance of building and maintaining trust and confidence among all stakeholder groups concerned – including researchers, institutions, ethical review boards and research participants – as a basis for effective data sharing cannot be overstated.’ – David Carr, policy adviser at the Wellcome Trust 
To rebuild trust in data sharing, individuals need the imbalance of power corrected, so they can control ‘their data’. The public was mostly unaware health records were being used for secondary purposes by third parties, before care.data. In February 2014, the secretary of state stepped in to confirm an opt-out will be offered, as promised by the prime minister in his 2010 ‘every patient a willing research patient’ speech.
So leaving aside the arguments for and against opt-in versus opt-out (and that for now it is not technically possible to apply the 700,000 opt-outs already made) the trouble is, it’s all or nothing. By not offering any differentiation between purposes, the public may feel forced to opt-out of secondary data sharing, denying all access to all their data even if they want to permit some uses and not others.
Defining and differentiating secondary uses and types of ‘research purposes’ could be key to rebuilding trust. The HSCIC can disseminate information ‘for the purposes of the provision of health care or adult social care, or the promotion of health’. This does not exclude commercial use. Cutting away commercial purposes which appear exploitative from purposes in the public interest could benefit the government, commerce and science if, as a result, more people would be willing to share their data.
This choice is what the public has asked for at care.data events, other research events  and in polls, but to date has yet to see any move towards. I feel strongly that the government cannot continue to ignore public opinion and assume its subjects are creators of data, willing to be exploited, without expecting further backlash. Should a citizen’s privacy become a commodity to put a price tag on if it is a basic human right?
One way to protect that right is to require an active opt-in to sharing. With ongoing renegotiation of public rights and data privacy at EU level, consent is no longer just a question best left ignored in the pandora’s box of ethics, as it has been for the last 25 years in hospital data secondary use. 
The public has a growing awareness, differing expectations, and different degrees of trust around data use by different users. Policy makers ignoring these expectations, risk continuing to build on a shaky foundation and jeopardise the future data sharing infrastructure. Profiting at the expense of public feeling and ethical good practice is an unsustainable status quo.
Investing in the public interest for future growth
The care.data pause has revealed differences between the thinking of government, the drivers of policy, the research community, ethics panels and the citizens of the country. This is not only about what value we place on our own data, but how we value it as a public good.
Projects that ignore the public voice, that ‘listen’ but do not act, risk their own success and by implication that of others. And with it they risk the public good they should create. A state which allows profit for private companies to harm the perception of good research practice sacrifices the long term public interest for short term gain. I go back to the words of Mark Taylor :
‘The commitment must be an ongoing one to continue to consult with people, to continue to work to optimally protect both privacy and the public interest in the uses of health data. We need to use data but we need to use it in ways that people have reason to accept. Use ‘in the public interest’ must respect individual privacy. The current law of data protection, with its opposed concepts of ‘privacy’ and ‘public interest’, does not do enough to recognise the dependencies or promote the synergies between these concepts.’
The economic value of data, personal rights and the public interest are not opposed to one another, but have synergies and a co-dependency. The public voice from care.data listening could positively help shape a developing consensual model of data sharing if the broader lessons learned are built upon in an ongoing public dialogue. As Mark Taylor also said, ‘we need to do this better.’
 according to various polls and opinions gathered from my own discussions and attendance at care.data events in 2014 [ refs: 2, 4. 6. 12]
 The data trust deficit, work by the Royal Statistical Society in 2014
 M Taylor, “Information Governance as a Force for Good? Lessons to be Learnt from Care.data”, (2014) 11:1 SCRIPTed 1 http://script-ed.org/?p=1377
 Communications and Change – blogpost https://jenpersson.com/care-data-communications-change/
 HSCIC audit under Sir Nick Partridge https://www.gov.uk/government/publications/review-of-data-releases-made-by-the-nhs-information-centre
 Listening events, NHS Open Day blogpost https://jenpersson.com/care-data-communications-core-concepts-part-two/
 Accelerator for projects mentioned include the 100K Genomics programme https://www.youtube.com/watch?v=s8HCbXsC4z8
 Hansard http://www.publications.parliament.uk/pa/cm201314/cmhansrd/cm140311/debtext/140311-0002.htm
 Confidentiality Costs; StatsLife http://www.statslife.org.uk/opinion/1723-confidentiality-costs
 care.data privacy impact assessment Jan 2014 [newer version has not been publicly released] http://www.england.nhs.uk/wp-content/uploads/2014/01/pia-care-data.pdf
 Wellcome Trust http://blog.wellcome.ac.uk/2015/04/08/sharing-research-data-to-improve-public-health/
 Dialogue on Data – Exploring the public’s views on using linked administrative data for research purposes: https://www.ipsos-mori.com/researchpublications/publications/1652/Dialogue-on-Data.aspx
 HSCIC Lessons Learned http://www.hscic.gov.uk/article/4780/HSCIC-learns-lessons-of-the-past-with-immediate-programme-for-change
The views expressed in this article originally published in the Opinion section of StatsLife are solely mine, the original author. These views and opinions do not necessarily represent those of The Royal Statistical Society.