There are many things in the new draft Investigatory Powers Bill that need very careful attention – some of which may be cautiously welcomed, some of which need to be taken with a distinct pinch of salt. The issues surrounding ‘Bulk Powers’ (which we’re not supposed to call ‘mass surveillance’) and ‘Equipment Interference’ (which I presume we’re not supposed to call hacking) will be examined in great detail, and quite rightly so because they’re of critical importance, and clearly recognised as such. The issue of ‘Internet Connection Records’, on the other hand, does not yet seem to have been given the attention it deserves. I am sure that will change, because their collection has massive significance and represents a major change in surveillance, even though they are described in the introduction to the bill as just ‘restoring capabilities that have been lost as a result of changes in the way people communicate’. They don’t restore capabilities: they provide unprecedented intrusion into people’s lives.
Internet Connection Records (ICRs)
The description of ICRs in the bill leaves quite a lot to be desired. In the introductory explanation they are set out as:

In accordance with the bill, these ICRs will be captured and stored for a year by the communications providers. This means, essentially, that a rolling record of a year of everyone’s browsing history will be stored. It does not, it seems, go beyond the top level of a website: it would show that you’ve visited ‘www.bbc.co.uk’ but not each individual page within that website, nor what you have ‘done’ on that website. The significance of this data is very much underplayed: it is presented as just a way of checking that so-and-so accessed Facebook at a particular time, in a similar way to saying ‘so-and-so called the following number’ on the phone, hence the supposed ‘restoring of capabilities’ referred to. That, however, misunderstands both the significance of the data and the way that we use the technology.
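To make concrete what a record kept only at ‘the top level of website’ would and would not retain, here is a minimal Python sketch. It assumes, purely for illustration, that an ICR can be modelled as a subscriber identifier, a timestamp and a hostname. The draft bill does not specify any such format, and the names `ConnectionRecord` and `to_connection_record`, the subscriber label and the example URL are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from urllib.parse import urlsplit


@dataclass(frozen=True)
class ConnectionRecord:
    """Hypothetical shape of one ICR; the draft bill defines no such format."""
    subscriber: str
    when: datetime
    host: str


def to_connection_record(subscriber: str, when: datetime, url: str) -> ConnectionRecord:
    """Keep only the hostname of a visited URL, discarding the path and query."""
    host = urlsplit(url).hostname or ""
    return ConnectionRecord(subscriber, when, host)


# Invented example visit: the article path and query string are dropped.
record = to_connection_record(
    "subscriber-001",
    datetime(2015, 11, 5, 23, 40),
    "https://www.bbc.co.uk/news/some-article?ref=home",
)
print(record.host)
```

Even in this stripped-down form, each record still ties a person to a site and a moment in time, which is the part that matters for what follows.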
The latter part is perhaps the most easily missed. Our ‘online life’ isn’t just about what is traditionally called ‘communications’, and isn’t the equivalent of what we used to do with our old, landline phones. For most people, it is almost impossible to find an aspect of their life that does not have an online element. We don’t just talk to our friends online, or just do our professional work online: we do almost everything online. We bank online. We shop online. We research online. We find relationships online. We listen to music and watch TV and movies online. We plan our holidays online. Monitoring the websites we visit isn’t like having an itemised telephone bill (an analogy that more than one person used yesterday). It’s like following a person around as they visit the shops (both window shopping and the real thing), go to the pub, go to the cinema, turn on their radio, go to the park, visit the travel agent, look at books in the library and so forth.
That, however, is only part of the problem. The other aspect is perhaps even more important – the inferences that can be gleaned from analysis of the ICRs. There are two different sides to this:
- The first is the ‘logical’ analysis of web browsing data: the kind of inferences that can be made by looking at the kinds of sites visited, the times that they are visited and so forth. This can be very direct, like using knowledge that a person visited sites connected with a particular religion to ‘guess’ their own religion, or that they visited sites connected with a particular health condition to ‘guess’ that they might be concerned about their own health. It can also be less direct but similarly logical – men who spend a lot of time watching Top Gear might be thought to have sympathy for Jeremy Clarkson’s views on ‘political correctness’ or be skeptical about climate change, or people who visit a lot of ‘news’ websites might be particularly interested in politics. People who visit pizza delivery websites regularly might be ‘guessed’ to have unhealthy lifestyles. The number of possibilities is huge – and they relate not just to the actual sites visited, but to the time and pattern of those visits. Browse a great deal in the middle of the night, and that says something very different to browsing only during working hours.
- The second is perhaps even more concerning: the ‘big data’ analysis of ICRs. One of the critical aspects of ‘big data’ is that it picks up traits and establishes correlations rather than seeking to find logical connections for things. This has been studied by academics, with some surprising findings – the story from one such study that ‘liking’ (in Facebook terms) curly fries correlates to higher intelligence makes the point. This kind of data – and it really is ‘big data’ – allows far more inferences to be drawn than are immediately obvious. Moreover, it is a kind of analysis that is being worked on, and worked on extensively, by some of the biggest, most powerful and most technologically advanced corporations in the world. What Google, Facebook and others develop in order to identify target audiences for advertising or markets for products is just as suitable for identifying people with particular political views.
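The point about timing in the first kind of analysis can be illustrated with a short Python sketch. It assumes hypothetical records made up of a hostname and a timestamp, and an arbitrary definition of ‘night’ as 00:00 to 05:59 and ‘working hours’ as 09:00 to 16:59. None of this comes from the bill, the hostnames and times are invented, and real analysis would be far more sophisticated.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records: (hostname, time of connection). Invented for illustration.
records = [
    ("www.bbc.co.uk", datetime(2015, 11, 5, 9, 15)),
    ("www.example-pizza.test", datetime(2015, 11, 5, 23, 50)),
    ("www.example-health.test", datetime(2015, 11, 6, 2, 30)),
    ("www.example-health.test", datetime(2015, 11, 6, 3, 5)),
]


def time_profile(rows):
    """Count connections falling in night, working and other hours."""
    profile = Counter()
    for _host, when in rows:
        if when.hour < 6:
            profile["night"] += 1
        elif 9 <= when.hour < 17:
            profile["working"] += 1
        else:
            profile["other"] += 1
    return profile


def hosts_seen_at_night(rows):
    """Return the set of hostnames contacted between 00:00 and 05:59."""
    return {host for host, when in rows if when.hour < 6}


profile = time_profile(records)
night_hosts = hosts_seen_at_night(records)
print(dict(profile), night_hosts)
```

Even this crude tally, with no page content at all, singles out a health-related site visited repeatedly in the small hours, which is exactly the kind of ‘guess’ described above.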
The problems with these inferences should not be underestimated. If they’re accurate, they represent major intrusions into people’s privacy – sometimes they allow the analysts to predict behaviour better than the people themselves can predict it – whilst if they’re inaccurate they can mean that terrible decisions are made about people. When this is confined to advertising the impact is rarely that significant (though it can be, as the non-apocryphal stories of revealed pregnancies and sexuality have shown) but if decisions are made on a similar basis by law enforcement or security services they could be hideous.
So we should not underplay the importance of Internet Connection Records. They matter a great deal – and gathering them is a major step in surveillance. What is more, asking communications service providers to gather and hold them adds a whole raft of new vulnerabilities. The TalkTalk hack – and TalkTalk are precisely the kind of company who would be forced to hold this kind of data – should make the vulnerability to hacking crystal clear. This kind of data is perfect for identity theft, scamming, blackmail (Ashley Madison style) and many other crimes, and the servers holding it might as well have big red signs on them saying ‘hack me please’. The chance of individual misuse of the information should also not be downplayed – in the initial draft of the Bill it looks as though access to the data will not be via warrant, but through the ‘Designated Person’. The past has shown how individuals can misuse systems for personal reasons – this kind of data can be very tempting.
The chance of ‘function creep’ is perhaps even more concerning. Where systems are built and data gathered for one purpose, it is hard to resist using them for another, seemingly obvious and sensible, purpose. That’s how RIPA ended up being used for dog fouling, fly-tipping and school catchment enforcement when it was intended for terrorism and serious crime. If you build it, it will be used, and not just for the original purpose.
None of this is to say that Internet Connection Records should definitively not be collected – but that the ‘mature debate’ that has been called for on surveillance should be about what they can really be used for, and the depth of the intrusion into people’s lives that they really represent. The bar should be set very high here, and the case to gather and hold this information needs to be a very good one indeed. The arguments put forward so far do not seem strong enough to me – perhaps more will be provided in the process through which the bill is scrutinised over the next few months. If not, this is a part of the bill that should be opposed very strongly.