Following last month’s announcement of a £1m nationwide spam drop, what now for care.data, the NHS's latest multi-million pound big data project?
Is it, as the carefully managed news release implied, merely taking its time – in fact, delaying a key project by almost a year – so as to nail issues of patient confidentiality? Or is it already in deeper mire, and using data protection issues as a figleaf to cover up more significant problems with system delivery?
And if it does go bad, will we ever find out why and how (and how much it cost)? Or is the new arms-length NHS wholly immune from parliamentary scrutiny?
Let’s start with the mire.
The theory behind care.data is straightforward enough. Data from all (non-dissenting) UK patients is to be lodged in a central database, from where it may be used for admin purposes, for statistical analysis by the NHS or sold on to select research companies.
It is managed under the auspices of the Health and Social Care Information Centre (HSCIC), part of the new devolved NHS England, and intended to be “a modern information service on behalf of the NHS”, “using information from a patient’s medical record to improve the way that healthcare is delivered for all”.
Moreover: “The service will only use the minimum amount of information needed to help improve patient care and the health services provided to the local community”.
So far, (sounds) so good.
care.data now forms a significant part of the Secondary Uses Service (SUS), initially set up as part of the ill-fated National Programme for IT in the NHS. The ambition of using one major supplier – BT – as national application service provider has now been replaced by an open data platform (ODP) approach. This is in line with the principles of the Government ICT Strategy, which separates technical components into those, such as data storage and processing, that need to be delivered centrally, and apps for turning data into information, which may more sensibly be developed to meet specific needs.
According to the HSCIC, the key components of the platform's architecture look like this:
You want to extract what?
According to a GP toolkit, published earlier this year by NHS England, the system build process starts with the General Practice Extraction Service (GPES), a centrally managed extraction service divided into two parts: ATOS provides the extract query tool, while the extractions will be carried out by the GP practice system suppliers, including EMIS, TPP, Microtest and INPS.
In addition to data set out in the main spec, the GPES will also hoover up personal identifiers: NHS number, gender, date of birth, postcode and ethnicity are among the eight criteria required.
All data will first be uploaded to a Data Management Environment (DME) within the HSCIC. Initial upload may be to HSCIC direct or to one of a number of regional Data Management Integration Centres (DMICs). From there, it is matched to an index file (HES index file): the initial data upload is then deleted; and further secondary data may be matched in, also using the HES index. Unfortunately, that appears to be most of what is known publicly.
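For readers who think in code, the flow just described (upload, match against an index, delete the initial upload) can be pictured roughly as follows. This is a minimal sketch only: the record fields, the `match_to_index` helper, the index IDs and the choice to skip unmatched records are all assumptions made for illustration, not details published by the HSCIC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpRecord:
    nhs_number: str      # identifier, used here only for matching
    date_of_birth: str
    postcode: str
    diagnosis: str       # stands in for the clinical payload


def match_to_index(upload, hes_index):
    """Return linked rows keyed by the index ID, without the identifiers."""
    linked = []
    for record in upload:
        hes_id = hes_index.get(record.nhs_number)
        if hes_id is None:
            continue  # assumption: unmatched records are skipped
        linked.append({"hes_id": hes_id, "diagnosis": record.diagnosis})
    return linked


# Made-up data; the NHS numbers and index IDs are placeholders.
upload = [
    GpRecord("0000000001", "1970-01-01", "GU7 1AA", "diabetes"),
    GpRecord("0000000002", "1985-06-15", "GU7 1AB", "asthma"),
]
hes_index = {"0000000001": "HES-A", "0000000002": "HES-B"}

linked = match_to_index(upload, hes_index)
upload.clear()  # stands in for deleting the initial upload after matching
print(linked)
```

The point of the sketch is the order of operations: identifiers are needed at the moment of matching, and the privacy promise rests on the initial upload really being discarded afterwards.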
This leaves a host of questions, such as:
- Who are the lead providers?
- Have any of them been ejected?
- Are we getting multiple platforms that may or may not be able to interact?
- And the daddy of the lot: How much is care.data costing?
Despite our putting all these questions and more to the relevant parties over a period of time, few answers have been forthcoming. Nor are we likely to get more information any time soon, for the Health and Social Care Act 2012 (HSCA), which established NHS England, also effectively removed that body from parliamentary accountability.
The Department of Health passes such questions directly over to NHS England.
Repeated attempts to elicit comment from the office of opposition Health spokesman, Andy Burnham, MP, have also drawn a blank.
However, according to sources close to the project, technical issues are already surfacing, and not in a good way. The use of local IT providers to build the various DMICs means that while care.data itself is up and running, data loads are not: different local builders mean a range of different systems architectures and a system that is currently not interoperable.
A second source claims that the GPES itself is not working as it should, and that even if it wished to, HSCIC could not presently commence uploading data. We have asked NHS England for comment on both these claims – but so far no answer. Costs, with or without the impact of any technical glitches, remain a mystery.
How much? Oh, you can't tell us...
Officially, according to NHS England: “We are not yet in a position to provide the full costs of the programme.” They are working with HSCIC to do so and “anticipate” that further information may be forthcoming in the New Year.
However, a good starting point seems to be a briefing paper put out by the Informatics Services Commissioning Group that states that “care.data programme costs will be built on the current costs of the proposed Open Data Platform”, and that the ODP cost “at outline business case is estimated at £33m over three years”.
They add: “This figure, excludes any additional accelerator project costs, which remain to be determined” – though one such cost may be an extra £11.8m of funding that Councils will receive to support the move to a new social care data collection system, which appears to be part of the care.data ecosystem. Add to this the £1m to £2m minimum for the door drop now needed to meet the demands of the Information Commissioner.
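As a rough tally of only those figures quoted above, and assuming (which is far from certain) that the social care funding counts towards care.data at all, the known items add up as follows. The function name is ours, and the result is a floor on what has been disclosed, not an estimate of the programme's full cost.

```python
def known_cost_range():
    """Sum the publicly quoted figures, in £ millions."""
    odp_outline = 33.0      # ODP outline business case, over three years
    social_care = 11.8      # extra funding for councils (assumed in scope)
    door_drop = (1.0, 2.0)  # low and high figures for the leaflet drop

    low = odp_outline + social_care + door_drop[0]
    high = odp_outline + social_care + door_drop[1]
    return low, high


low, high = known_cost_range()
print(f"£{low:.1f}m to £{high:.1f}m")  # prints "£45.8m to £46.8m"
```

Anything described as an “accelerator project” is excluded, since no figure for it has been given.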
Meanwhile, care.data is starting to encounter opposition both for the enormity of what it intends to put in place, and the somewhat hamfisted way in which it has proceeded to date. For the vision is clear: under the HSCIC, what was once a simple data warehouse for producing statistical information on patient care is to be transformed into a whole life system of universal health surveillance.
According to the GP toolkit, the amount of personal, privacy-busting information to be uploaded is massive. Categories of information include diagnoses (anything from diabetes to schizophrenia), health group (including whether a patient smokes or has high cholesterol), interventions and prescriptions.
Those concerned about the scale of information being released might be relieved to learn that “sensitive information” – that’s information relating to subjects such as termination of pregnancy, convictions and domestic violence – is to be omitted. For now.
However, NHS England is keen to “listen to” calls by patient groups to open such data up, since its current omission might be considered “stigmatising”. Or in other words, you ain’t seen nothing yet – and information currently considered too sensitive for inclusion may yet be added.
The first iteration of care.data is also relatively limited, compared to what is already in the pipeline: hospital data is due to be added a year after GP data; and social care data a year after that.
Concerns over confidentiality fall under two heads. First, despite assurances that security is “the most important priority of the HSCIC”, and that care.data “will conform to the same strict standards of data security and confidentiality that have governed the use of HES for many years”, experience suggests otherwise. Government and data security, many would argue, are mutually exclusive things, and history seems largely to prove that where security can be breached, it will be. Given the literally career-changing nature of some of the data soon to be passed around, the risk, critics argue, is not worth it.
Accidents apart, the uses that the HSCIC intends for the data have raised eyebrows. Outputs for release 1 distinguish between a range of aggregate statistics, much as before, and pseudonymous statistics: that’s data anonymised, according to the HSCIC, sufficiently to resist a “jigsaw attack”.
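A “jigsaw attack” pieces together fields that are harmless on their own until they single out one person. The HSCIC has not, as far as we can tell, published how it tests for this, so the sketch below swaps in a generic k-anonymity-style check instead: it counts how many records share each combination of fields and reports the smallest group. The field names and sample records are invented for illustration.

```python
from collections import Counter


def smallest_group(records, fields):
    """Size of the rarest combination of the given fields."""
    combos = Counter(tuple(r[f] for f in fields) for r in records)
    return min(combos.values())


# Invented records with no clinical content, only quasi-identifiers.
records = [
    {"gender": "F", "birth_year": 1970, "postcode_district": "GU7"},
    {"gender": "F", "birth_year": 1970, "postcode_district": "GU7"},
    {"gender": "M", "birth_year": 1985, "postcode_district": "GU7"},
]

# All three fields together isolate one person: a group of size 1.
print(smallest_group(records, ["gender", "birth_year", "postcode_district"]))

# Postcode district alone hides everyone in a group of 3.
print(smallest_group(records, ["postcode_district"]))
```

A smallest group of one means somebody in the release is unique on those fields, and anyone holding a second dataset with the same fields could in principle put a name to the record.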
Money-spinning datasets to cost, er... one quid
Such relatively limited data releases will be within the NHS family (GPs and managers who need information to hone the service they provide) and to “customers” as approved by the HSCIC’s Data Access Advisory Group. Companies such as BUPA, Dr Foster and Civil Eyes Research are among the early approvals, and are likely to benefit greatly from plans to make extracts available commercially for no more than £1. (That appears to be the price for whole datasets.)
While initial releases of data will be anonymised, the scope remains to match back to personal identifiers and make what is described in the literature as a s251 release. According to the Health Research Authority, this – s251 of the NHS Act 2006 – allows “confidentiality to be overridden to enable disclosure of confidential patient information for medical purposes, where it was not possible to use anonymised information and where seeking consent was not practicable, having regard to the cost and technology available.”
Patient confidentiality may therefore be overridden wherever the Secretary of State feels there is a sufficient case for doing so. Or as HSCIC puts it: “Release 2 will further consider the outputs to be provided from care.data to each of the receiving types of organisation”. Procedures are already in place to formalise the data sharing process with third parties.
Confidential? Define confidential.
Besides, an s251 exemption now allows use of identifiable data for commissioning purposes. In practice this appears to mean that identifiable patient data can be passed around routinely for non-direct care purposes – including admin, audit, and invoice reconciliation – at national (NHS England), regional (NHSE Area Teams, CSUs) and local (CCG, local authority) level.
In other words, mission creep is already happening. Use of the GPES is governed by an Independent Advisory Group, drawn from medical professionals and lay advisors. Minutes of this group’s meeting from September 2013 highlight concerns that deidentified data could be reidentified by commercial customers of HSCIC: their solution was to require customers to sign an undertaking to the effect that they would not do this (PDF).
This follows a decision by the same group, in August 2013, to permit the storing of NHS numbers and practice IDs within the HSCIC DME for a month at a time in respect of a project on diabetic retinopathy screening. This, the group agreed (PDF), “could raise some privacy risks” – in fact it breaches the initial premise that, once upload had taken place, DME data would be deleted after matching – but that was OK as “this data would be stored securely and encrypted”.
care.data has also had something of a setback recently over its failure to communicate adequately with GPs or patients.
It certainly did not help that the request for a large new GP dataset to be supplied from practices was sprung out of the blue on Dr Paul Cundy, joint chairman of the BMA and RCGP's joint IT committee, in January of this year – or that the first many GPs heard of it was when an information pack arrived at their practice in late August, informing them that they had just eight weeks to make patients “aware” of the scheme.
Many GPs were concerned that, as data controllers for their patient data, they were about to breach the Data Protection Act (DPA) and therefore open themselves to expensive legal action. On the other hand, if they did NOT support the upload of patient data, they might also find themselves in breach of the law, as the HSCA, responsible for the recent re-invention of the NHS, also empowered the HSCIC to require providers of NHS care to send it confidential data in certain circumstances – such as when the Secretary of State for Health orders them to.
MedConfidential, jointly co-ordinated by Phil Booth and Terri Dowty, who were previously movers behind No2ID and ARCH (Action on Rights for Children), have started a campaign urging patients to opt out of care.data – something that Health Secretary Jeremy Hunt had previously agreed was permitted.
In August, the Information Commissioner joined the fray, expressing concerns that patient data was about to be processed in breach of the DPA. Patients should be aware of how their data might be processed and there needed to be reasonable assurance that such steps had been taken.
In fact, the ICO position, confirmed last week, was that “as far as practically possible, all patients” [should be] “aware of these changes”. They are not, however, prepared to state exactly what constitutes the level of awareness required.
However, advice by NHS England that patient awareness would be sufficiently raised by putting up posters in GP surgeries and communicating through “routine communications” such as practice newsletters – or the eight-week window within which practices were supposed to do all of this – certainly did not cut it.
Hence last month’s surprise announcement that NHS England is now to distribute leaflets to all 22 million households likely to be affected at a cost of 8p per household: £1.76m – or £1 million, as they more economically reported!
Although presented as a positive, this is something of an embarrassment for that organisation, which had already stated, somewhat bullishly, in its business plan for 2013/14-2015/16, that “75 per cent of GP practices will be providing the full extract to care.data by September 2013”. Er, no.
The door-drop is due to go out in January 2014. Thereafter, assuming that the ICO deems patient awareness to have climbed to a sufficient level, uploading will commence in spring or summer.
But is that really the issue? Is the current willingness to go along with ICO recommendations a face-saving excuse for delay while HSCIC get on with fixing system glitches behind the scenes?
The problem is: we just don't know. On the one hand, the HSCIC has opened up a little, making public both physical architecture and debates about process. On the other, key questions – what all this will cost, who the main providers are – remain closed. On cost, supposedly, some two months after the project was due to go live, they are not in a position to say.
And there, beyond the technology, beyond issues of confidentiality and privacy, lies the real issue: that in the end, all might go swimmingly, but along the way, the right of the public, politicians or anyone else to engage, to point out potential pitfalls, is now seriously, officially limited. ®