Care.data and the community…

care-data_328x212

The latest piece of health data news, that, according to the Telegraph, the hospital records of all NHS patients have been sold to insurers, is a body-blow to the care.data scheme, but make no mistake about it, the scheme was already in deep trouble. Last week’s news that the scheme had been delayed for six months was something which a lot of people greeted as good news – and quite rightly. The whole project has been mismanaged, particularly in terms of communication, and it’s such an important project that it really needs to be done right. Less haste and much more care is needed – and with the latest blow to public confidence it may well be that even with that care the scheme is doomed, and with it a key part of the UK’s whole open data strategy.

The most recent news relates to hospital data – and the details such as we know them so far are depressingly predictable to many of those following the story for a while. The care.data scheme relates to data currently held by GPs – the new scandal relates to data held by hospitals, and suggests that, as the Telegraph puts it:

“a report by a major UK insurance society discloses that it was able to obtain 13 years of hospital data – covering 47 million patients – in order to help companies “refine” their premiums.”

That is, that the hospital data was given or sold to insurers not in order to benefit public health or to help research efforts, but to help business to make more money – potentially to the detriment of many thousands of individuals, and entirely without those individuals’ consent or understanding. This exemplifies some of the key risks that privacy campaigners have been highlighting over the past weeks and months in relation to the care.data – and adds fuel to their already partially successful efforts. Those efforts lay behind the recently announced six month delay – and unless the backers of care.data change their approach, this last story may well be enough to kill the project entirely.

Underestimating the community

One of the key features of the farrago so far has been the way that those behind the project have drastically underestimated the strength, desire, expertise and flexibility of the community – and in particular the online community. That community includes many real experts, in many different fields, whose expertise strike at the heart of the care.data story. As well as many involved in health care, there are academics and lawyers whose studies cover privacy, consent and so forth who have a direct interest in the subject. Data protection professionals with real-life knowledge of data vulnerability and the numerous ways in which the health services in particular have lost data over the years – even before this latest scandal. Computer scientists, programmers and hackers, who understand in detail the risks and weaknesses of the systems proposed to ‘anonymise’ and protect our data. Advocates and campaigners such as Privacy International, the Open Rights Group and Big Brother Watch who have experience of fighting and winning fights against privacy-invasive projects from the ID card plan to the Snoopers Charter.

All of these groups have been roused into action – and they know how to use the tools of a modern campaign, from tweeting and blogging to making their presence felt in the mainstream media. They’ve been good at it – and have to a great degree caught the proponents of care.data on the hop. Often Tim Kelsey, the NHS National Director for Patients and Information and leader of the care.data project, has come across as flustered, impatient and surprised at the resistance and criticism. How he reacts to this latest story will be telling.

Critical issues

Two specific issues have been particularly important: the ‘anonymisation’ of the data, and the way that the data will be sold or made available, and to whom. Underlying both of these is a more general issue – that people DO care about privacy, no matter what some may think.

“Anonymisation”?

On the anonymisation issue, academics and IT professions know that the kind of ‘de-identification’ that care.data talks about is relatively easily reversed. Academics from the fields of computer science and law have demonstrated this again and again – from Latanya Sweeney as far back as 1997 to Arvind Narayanan and Vitaly Shmatikov’s “Robust De-anonymization of Large Sparse Datasets” in 2008 and Paul Ohm’s seminal piece in 2009 “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization”. Given this, to be told blithely by NHS England that their anonymisation system ‘works’ – and to hear the public being told that it works, without question or doubt, naturally raises suspicion. There are very serious risks – both theoretical and practical that must be acknowledged and taken into account. Right now, they seem to either be denied or glossed over – or characterised as scaremongering.

The sale or misuse of data

The second key issue is that of the possible sale and misuse of data – one made particularly pertinent by the most recent revelations, which have confirmed some of the worst fears of privacy campaigners. Two factors particularly come into play. The first is that the experience of the last few years, with the increasing sense of privatisation of our health services, makes many people suspicious that here is just another asset to be sold off to the highest bidder, with the profits mysteriously finding their way into the pockets of those already rich and well-connected. That and the way that exactly who might or might not be able to access the data has remained apparently deliberately obscure makes it very hard to trust those involved – and trust is really crucial here, particularly now.

Many of us – myself included – would be happy, delighted even, for our health data to be used for the benefit of public health and better knowledge and understanding, but far less happy for our data to be used primarily to increase the profits of Big Pharma and the insurance industry, with no real benefit for the rest of us at all. The latest leak seems to suggest that this is a distinct possibility.

The second factor here, and one that seems to be missed (either deliberately or through naïveté) is the number of other, less obvious and potentially far less desirable uses that this kind of data can be put to. Things like raising insurance premiums or health-care costs for those with particular conditions, as demonstrated by the most recent story, are potentially deeply damaging – but they are only the start of the possibilities. Health data can also be used to establish credit ratings, by potential employers, and other related areas – and without any transparency or hope of appeal, as such things may well be calculated by algorithm, with the algorithms protected as trade secrets, and the decisions made automatically. For some particularly vulnerable groups this could be absolutely critical – people with HIV, for example, who might face all kinds of discrimination. Or, to pick a seemingly less extreme and far more numerous group, people with mental health issues. Algorithms could be set up to find anyone with any kind of history of mental health issues – prescriptions for anti-depressants, for example – and filter them out of job applicants, seeing them as potential ‘trouble’. Discriminatory? Absolutely. Illegal? Absolutely. Impossible? Absolutely not – and the experience over recent years of the use of black-lists for people connected with union activity (see for example here) shows that unscrupulous employers might well not just use but encourage the kind of filtering that would ensure that anyone seen as ‘risky’ was avoided. In a climate where there are many more applicants than places for any job, discovering that you have been discriminated against is very, very hard.

This last part is a larger privacy issue – health data is just a part of the equation, and can be added to an already potent mix of data, from the self-profiling of social networks like Facebook to the behavioural targeting of the advertising industry to search-history analytics from Google. Why, then, does care.data matter, if all the rest of it is ‘out there’? Partly because it can confirm and enrich the data gathered in other ways – as the Telegraph story seems to confirm – and partly because it makes it easy for the profilers, and that’s something we really should avoid. They already have too much power over people – we should be reducing that power, not adding to it.

People care about privacy

That leads to the bigger, more general point. The reaction to the care.data saga so far has been confirmation that, despite what some people have been suggesting, particularly over the last few years, people really do care about privacy. They don’t want their most intimate information to be made publicly available – to be bought and sold to all and sundry, and potentially to be used against them. They have a strong sense that this data is theirs – and that they should be consulted, informed, and given some degree of control over what happens to it. They particularly don’t like the feeling that they’re being lied to. It happens far too often in far too many different parts of their lives. It makes them angry – and can stir them into action. That has already happened in relation to care.data – and if those behind the project don’t want the reaction to be even stronger, even angrier, and even more likely to finish off a project that is already teetering on the brink, they need to change their whole approach.

A new approach?

  1. The first and most important step is more honesty. When people discover that they’re not being told the truth – they don’t like it. There has been a distinct level of misinformation in the public discussion of care.data – particularly on the anonymisation issue – and those of us who have understood the issues have been deeply unimpressed by the responses from the proponents of the scheme. How they react to this latest revelation will be crucial.
  2. The second is a genuine assessment of the risks – working with those who are critical – rather than a denial that those risks even exist. There are potentially huge benefits to this kind of project – but these benefits need to be weighed properly and publicly against the risks if people are to make an appropriate decision. Again, the response to the latest story is critical here – if the authorities attempt to gloss over it, minimise it or suggest that the care.data situation is totally different, they’ll be rightly attacked.
  3. The idea that such a scheme should be ‘opt-out’ rather than ‘opt-in’ is itself questionable, for a start, though the real ‘value ‘ of the data is in it’s scale, so it is understandable that an opt-out system is proposed. For that to be acceptable, however, we as a society have to be the clear beneficiaries of the project – and so far, that has not been demonstrated – indeed, with this latest story the reverse seems far more easily shown.
  4. To begin to demonstrate this, particularly after this latest story, a clear and public set of proposals about who can and cannot get access to the data, and under what terms, needs to be put together and debated. Will insurance companies be able to access this information? Is the access for ‘researchers’ about profits for the drugs companies or for research whose results will be made available to all? Will any drugs developed be made available at cheap prices to the NHS – or to those in countries less rich than ours? We need to know – and we need to have our say about what is or is not acceptable.
  5. Those pushing the care.data project need to stand well clear of those who might be profiting from the project – in particular the lobby groups of the insurance and drug companies and others. Vested interests need to be declared if we are to entrust the people involved with our most intimate information. That trust is already rapidly evaporating.

Finding a way?

Will they be able to do this? I am not overly optimistic, particularly as my only direct interaction with Tim Kelsey has been on Twitter where he first accused me of poor journalism after reading my piece ‘Privacy isn’t selfish’ (I am not and have never presented myself as a journalist – as a brief look at my blog would have confirmed) and then complained that a brief set of suggestions that I made on Twitter was a ‘rant’. I do rant, from time to time, particularly about politics, but that conversation was quite the opposite. I hope I caught him on a bad day – and that he’s more willing to listen to criticism now than he was them. If those behind this project try to gloss over the latest scandal, and think that this six month delay is just a chance for them to explain to us that we are all wrong, are scaremongering, don’t understand or are being ‘selfish’, I’m afraid this project will be finished before it has even started. Things need to change – or they may well find that care.data never sees the light of day at all.

The community needs to be taken seriously – to be listened to as well as talked to – and its expertise and campaigning ability respected. It is more powerful than it might appear – if it’s thought of as a rag-tag mob of bloggers and tweeters, scaremongerers, luddites and conspiracy theorists, care.data could go the way of the ID card and the Snoopers Charter. Given the potential benefits, to me at least this could be a real shame – and an opportunity lost.

About these ads

8 thoughts on “Care.data and the community…

  1. Another problem is enforcement. If some company misused care.data (and of course they would), nobody would ever know – but if anyone did find out, it would be authoritatively denied, then cursorily investigated and denied again. Eventually some more or less independent inquiry with no real powers, limited terms of reference, and led by some “safe pair of hands” would deliver a voluminous report containing a few slightly critical sentences and the miscreant would be urged to apologise for something or other and learn appropriate lessons, many years after the offence.

    On the other hand a strict and intrusive compliance regime with real teeth – including the power and inclination to impose punitive fines and prison sentences – would get in the way of legitimate data users while being evaded and subverted by the rogues.

  2. […] As I mentioned in my previous post on this topic, I’m not so concerned about personal privacy/personal data leakage as I am about trying to think trough the possible consequences of making bulk data releases available that can be used as the basis for N=All/large scale data modelling (which can suffer from dangerous (non)sampling errors/bias when folk start to opt-out), the results of which are then used to influence the development of and then algorithmic implementation of, policy. This issue is touched on in by blogger and IT, Intellectual Property and Media Law lecturer at the University of East Anglia Law School, Paul Bernal, in his post Care.data and the community…: […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s