What they mean when the government says “We do not have ‘direct’ access to your info”
Summary: Even the best journalists and national security experts have difficulty with technical stories like the recent NSA revelations. Today Marcus Ranum (bio) cuts through the government’s lies, explaining the truth behind the NSA’s tapping vital telephone and email communication systems.
When politicians and spokespeople choose their words with exquisite care, then it’s time to examine them with extra care. Let’s talk a little bit about the realities of how one might monitor a data center, shall we?
“We have no direct access to their systems.”
Of course you don’t. By “direct access” you mean that you can log in and collect data directly from the system, or have database administrators’ credentials and can issue queries, or whatever. You wouldn’t want that, anyway, because the queries and the activities might then become public knowledge — those are traceable, you know.
When someone logs into a system, gains administrative rights, and looks at someone’s email in-box, that leaves traces in the system logs. That’s completely unacceptable, because what you’re querying for is classified and suddenly those system logs contain extremely sensitive data indeed.
Here’s how you do it
Those big outfits decrypt all their traffic at the edges of the network using a load-balancer/redirector that’s capable of offloading the CPU-intensive activity of decryption from the backend servers. Inside the provider’s core network, the traffic carried within their switches is all in the clear.
You show up with a national security letter and maybe a warrant and tell the provider that you’ve got a system that does classified stuff, and that they’re going to plug it into their network. The core switches will span (that is, mirror) some of the traffic, say between the mail servers and everything else, and between the user authentication servers and everything else, and send a copy of that traffic to the mystery box (or boxes, depending on the load you need to consume). That’s it.
There’s no need even to give the box an IP address, which is a feature also, because that makes the box impossible for anyone to see other than in the configuration of the core switch or if they get into the special locked room in the data center and count the number of boxes in the rack there.
Sniffing traffic is fairly straightforward
You collect raw packets, reassemble them into virtual streams, collect statistics about each stream, extract whatever data you’re interested in from the stream, and do whatever analysis you want on that data. This is how load balancers (like the SSL accelerator I referred to earlier) and intrusion detection systems work.
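To make those steps concrete, here is a minimal sketch in Python. It is not anyone's actual collection code: the packet format (a tuple of source, destination, byte offset, and payload) and the function names are assumptions made up for illustration, and a real sniffer also has to deal with sequence-number wraparound, connection setup and teardown, and far higher volume.

```python
from collections import defaultdict


def reassemble(packets):
    """Group packets by flow and rebuild each byte stream in offset order.

    Each packet is a hypothetical (src, dst, seq, payload) tuple, where seq
    is the byte offset of payload within the stream.
    """
    flows = defaultdict(dict)
    for src, dst, seq, payload in packets:
        # setdefault keeps the first copy of a retransmitted segment.
        flows[(src, dst)].setdefault(seq, payload)

    streams = {}
    for flow, segments in flows.items():
        data = bytearray()
        for seq in sorted(segments):
            if seq > len(data):
                break  # a segment is missing, so stop at the hole
            overlap = len(data) - seq  # bytes we already have
            data += segments[seq][overlap:]
        streams[flow] = bytes(data)
    return streams


def flow_stats(streams):
    """Collect simple per-stream statistics (here, just the byte count)."""
    return {flow: {"bytes": len(data)} for flow, data in streams.items()}


def extract_headers(stream, wanted=(b"From", b"To", b"Subject")):
    """Pull selected header fields from a mail-like stream, ignoring the body."""
    head = stream.split(b"\r\n\r\n", 1)[0]
    found = {}
    for line in head.split(b"\r\n"):
        name, sep, value = line.partition(b":")
        if sep and name in wanted:
            found[name.decode()] = value.strip().decode()
    return found
```

Feed it segments that arrive out of order, with a duplicate thrown in, and the stream still comes back whole. Whether you then keep only the headers or the full contents is a one-line change in the extraction step.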
This is ideal for a classified program, since the actual analysis method used (what data is collected, whether it’s message headers or full contents, etc.) can remain completely internal to the collecting device. That way there’s no need to pester the security people at the provider if you want to update your collection rules: you just give yourself a classified order to now start collecting something new, on your box, in their data center.
Remember how Google famously claimed that their wireless sniffers weren’t collecting sensitive data, and they later “remembered” that oops, dear me, the system was “misconfigured” to collect too much? That’s how you do it. Except if you can bury the whole thing under layers of classification, it’s even harder for anyone to learn what’s going on. Besides, not having an IP and not actually having to touch the providers’ systems keeps you out of the potential problem-space of being a cause of failure; Facebook or Google are going to be mighty touchy about down-time caused by your collection system, and you can take your sniffer offline whenever you want to, without impacting any of the surrounding systems.
In other words, having “direct access” would be a huge disadvantage for you, because you now have a greater potential for information about your collection program leaking out. No, they do not have “direct access” to Google, Facebook, Twitter, etc. They have something better: field-programmable, completely invisible, classified, and unregulated access.
“We have never heard of a program called PRISM”
So says the corporate spokesperson. Well, that’s true. Because you’re a marketing spokesperson and you’re completely unaware of the boxes in the special room in the data center. But even if you were, you wouldn’t call it “PRISM” because that was (until now) the classified name of the program. You’d call it “the special room in the data center” or “the spook closet” or “shut up we don’t talk about that.”
If you work for one of the agencies that consumes the data, you probably don’t even call it PRISM. To you it’s just “the data you work on” and how it was collected and managed is protected source and methods outside of your need to know.
“We don’t listen to all the phone calls …”
… note the key words: “all” and “listen.”
If you collect the call data while it’s moving on a network, it’s just encoded voice signals. If you run those through a speech-to-text engine and then search it as text you’re not “listening” to anything because no ears are involved and no sound is produced. Of course you run that text through a scoring engine and pattern-match for interesting keywords, and perhaps — just perhaps — a human analyst eventually plays the data to make sounds that they listen to with their ears.
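Here is a rough sketch of what such a scoring step could look like, assuming the speech-to-text engine has already produced transcripts. The watch list, weights, threshold, and function names are all hypothetical, invented for illustration; real scoring engines are far more elaborate than a keyword count.

```python
import re

# Hypothetical watch list: term -> weight. The values are purely illustrative.
WATCHLIST = {"shipment": 2.0, "border": 1.5, "meeting": 0.5}


def score_transcript(text, watchlist=WATCHLIST):
    """Return (score, hits) for one transcript; hits maps term -> count."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = {}
    for word in words:
        if word in watchlist:
            hits[word] = hits.get(word, 0) + 1
    score = sum(watchlist[word] * count for word, count in hits.items())
    return score, hits


def flag_for_review(transcripts, threshold=3.0):
    """Return calls scoring at or above the threshold, highest score first."""
    flagged = []
    for call_id, text in transcripts.items():
        score, hits = score_transcript(text)
        if score >= threshold:
            flagged.append((call_id, score, sorted(hits)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Note that nothing in this code "listens" to anything. Only the calls that come back from `flag_for_review` would ever reach a pair of human ears.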
If you’d like to see this kind of sophistry in action, I suggest you watch Mike McConnell’s comments in the Intelligence Squared debate with Bruce Schneier and Marc Rotenberg. The debate took place in 2010, and I think that you’re now a bit more qualified to read between the lines of what McConnell is saying.
First off, you’ll realize that he’s not lying out of ignorance; he’s very, very carefully spinning the truth. There’s one moment in the debate when McConnell rebuts Schneier by saying “If I wanted to tap your home phone, I’d have to get a warrant…” The key word is “home”. Presumably McConnell threw that in there deliberately to deceive the audience because he knew that mobile phones were being tapped without warrants. If you do decide to watch that video, I suggest you stop before you get to 7:51 into it, when McConnell displays disgusting cynicism:
I would summarize by saying: we have laws, and the key is getting the laws correct. If the law is written correctly and there’s the appropriate oversight committee – if you violate the law, you will be held accountable.
In a nation as free and as wonderful as ours is, leading the world in human rights and privacy and civil liberties it’s getting the debate framed right — to mitigate the risk to protect the nation, consistent with our values and our laws.
By “consistent with our values” I suppose McConnell meant “us, in the intelligence community.”
“There’s too much data to handle”
This is another trope I’ve often heard. Oddly, it’s often uttered by people who use google: a service that collects practically every page of data on the entire web, indexes it, and searches for complex combinations of keywords in microseconds.
How is this done? By massively parallelizing processing: big data is managed, massaged, and refined as it moves up the data pyramid, with additional hints and analysis propagated up to each layer. The underlying data is retained so it can be referred to when necessary, and as long as there are at least two copies somewhere, nothing is ever lost.
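A toy version of the idea, sketched in Python: split the documents across shards, index each shard separately, and fan each query out to every shard at once. This is only an illustration of the general technique, not a description of how Google or any agency actually does it. The function names are made up, and a thread pool on one machine stands in for what would really be many machines.

```python
from concurrent.futures import ThreadPoolExecutor


def build_shard(docs):
    """Build an inverted index (word -> set of doc ids) for one shard."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index


def search_shard(index, terms):
    """Return ids of documents in this shard that contain every term."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def search(shards, query):
    """Query every shard concurrently and merge the partial results."""
    terms = query.lower().split()
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda index: search_shard(index, terms), shards))
    return sorted(set().union(*partials))
```

Each shard only ever looks at its own slice of the data, so adding more data means adding more shards rather than making any one query slower.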
When you start using hierarchical storage systems, you can store (for all intents and purposes) infinite amounts of data, as long as you can keep buying media for your robot jukeboxes. The point, again, is not that someone looks at all the messages. They are stored, scored, analyzed, clustered, and if something appears interesting (based on origin, destination, keywords, clustering, vocabulary used, etc) then maybe someone has to look at it.
Does anyone here actually think data mining is a new idea? SAS and IBM have been doing it since the 1970s and post 9-11 there are a raft of companies building new data mining and correlation tools for the security industrial complex.
“The identity and location of all communicants”
Judith Emmel, an NSA spokeswoman, told the Guardian in a response to the latest disclosures: “NSA has consistently reported – including to Congress – that we do not have the ability to determine with certainty the identity or location of all communicants within a given communication. That remains the case.” (Source: Glenn Greenwald)
I’m pretty sure Emmel was speaking the literal truth, but only because she carefully placed the words “all” and “certainty” in the second-to-last sentence.
What about “most of“? What about “fairly confident”? This is more sophistic world-slicing: deny that you have a certain capability while strategically neglecting to mention that you’re trying to have that capability. I’m sure Tiger Woods could honestly say “I lack the ability to always knock the ball into the cup” but that sure as hell doesn’t mean he’s not a really good golfer.
Google and Facebook respond
David Drummond, Chief Legal Officer of Google, sent a letter to the Attorney General and the Federal Bureau of Investigation. It denies the allegations about Google’s data-sharing with the government, in very carefully worded terms.
Mark Zuckerberg, CEO of Facebook, denied the allegations in a posting at Facebook.
For More Information
- An archive of articles about the government’s increasing surveillance of Americans, at ProPublica
- A summary of the many different means by which the government spies upon us at Washington’s Blog, 6 June 2013
- “Fair Warning: Julian Assange’s Cypherpunks“, Adam Morris, Los Angeles Review of Books, 28 April 2013 — Assange warned us
- “The NSA Doppelganger“, Rick Perlstein, The Nation, 7 June 2013
- “The National Security Agency: surveillance giant with eyes on America“, Glenn Greenwald, Guardian, 6 June 2013 — “The NSA is the best hidden of all the US intelligence services – and its secrecy has deepened as its reach has expanded”
- “NSA taps in to internet giants’ systems to mine user data, secret files reveal“, Glenn Greenwald, Guardian, 7 June 2013 — “Secret PRISM program gives intelligence agency access to web and email of Google, Facebook and Apple customers”
- “U.S. mining data from 9 leading Internet firms; companies deny knowledge“, Washington Post, 6 June 2013
- “Boundless Informant: the NSA’s secret tool to track global surveillance data“, Glenn Greenwald, Guardian, 9 June 2013 — “Revealed: The NSA’s powerful tool for cataloguing data – including figures on US collection”
- “Edward Snowden: the whistleblower behind the NSA surveillance revelations“, Glenn Greenwald, Guardian, 9 June 2013 — “The 29-year-old source behind the biggest intelligence leak in the NSA’s history explains his motives, his uncertain future and why he never intended on hiding in the shadows”
- Important: The NSA Files at The Guardian — archive of the leaked documents and news articles
- Priorities of Justice in New America: “Judge behind phone records threw out Obamacare“, AP, 6 June 2013
- “Could There (Finally) Be a Backlash Against Domestic Surveillance?“, John Sides (Prof, Political Science, George Washington U), The Monkey Cage, 6 June 2013 — The answer is no.
- “The Perils of (Vague Delegations of) Power“, Andrew Rudalevige (Professor of government, Bowdoin), The Monkey Cage, 6 June 2013 — Why could the government collect data on pretty much every phone call you make? The answer gives a lesson in legislative drafting
- Section 215: The White House’s Bullshit Talking Points, Marcy Wheeler, 6 June 2013
- “What’s the Matter with Metadata?“, Jane Mayer, The New Yorker, 6 June 2013
- “Public Documents Contradict Claim Email Spying Foiled Terror Plot“, BuzzFeed, 7 June 2013 — “Defenders of “PRISM” say it stopped subway bombings. But British and American court documents suggest old-fashioned police work nabbed Zazi.”
- “The tangled web of empire“, Stephen M. Walt (Prof of International Affairs, Harvard), Foreign Policy, 7 June 2013 — The government’s growing power is an inevitable result of the Long War.
- “Security-State Creep: The Real NSA Scandal Is What’s Legal“, Rebecca J. Rosen, The Atlantic, 7 June 2013 — “The Court has failed to develop a robust system for applying the Fourth Amendment meaningfully to the questions of the 21st century”
Consequences: the US government has just trashed the overseas reputation of our tech & telecom industries
- “How PRISM could ruin Apple, Google, and every other big tech company“, Farhad Manjoo, Slate, 7 June 2013
- “NSA Surveillance Threatens US Competitiveness“, Richard Stiennon, Forbes, 7 June 2013
Posts about these revelations, and what they show about America:
- Attention fellow sheep: let’s open our eyes and see the walls of our pen, 2009 — Five years ago these programs, and their growth, were easily visible. We just didn’t want to see.
- The NSA news might be a birthday for the New America!, 7 June 2013
- The US government spys on us because America is a democracy, 8 June 2013
- Our opinion leaders defend the government’s surveillance programs, 10 June 2013
We must conquer the future, or it will conquer us