In my last post, I asked the question “where are the servers”. And, of course, folks sent me links to the Utah data center.
Good response, but I was trying to go somewhere else… That’ll teach me to bury my lead.
Finding a physical place for 1,000,000 servers is easy if you are the US government. The government owns a lot of space that it can use.
The problem is more about how in God’s name the government is buying, managing and using 1,000,000 servers.
The scale of the equipment required, and the challenge of managing that scale, is mind-boggling given that it would dwarf the biggest commercially built systems.
Buying
Look, the US government relies on outside contractors for its really big supercomputers. It doesn’t have the in-house skill to build one of these things.
And the scale of the equipment would make the US government an insanely huge part of the tech market, and that is mind-boggling. Basically, for every server CPU that is purchased commercially, the NSA purchases another one. Which means the total commercial market is smaller than we think. Which means if you are making business plans based on IDC numbers for the market size you are, well, wrong.
And this takes into account just the servers. Never mind networking, etc.
Counter 1: But they don’t have to buy them all at once
That’s strictly true, but misses several key insights.
One, Google, FB, et al. are increasing their capacity very quickly. And there are other online services that store data and offer collaboration (Box, Dropbox, etc.). The number of services and the amount of data are increasing, not shrinking, over time.
To keep up, the NSA has to buy as much total capacity as everyone else is creating. And since everyone else keeps buying, the NSA has to keep buying as well.
The other thing this doesn’t account for is that Google and FB are replacing older servers as they age out and die. And this probably happens on a 4-5 year time scale.
And finally, my 1,000,000 was based on data that is 1-3 years out of date, so the numbers are probably bigger.
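To put the replacement point in rough numbers, here is a back-of-the-envelope sketch. It uses only the two figures from this post (1,000,000 servers, a 4-5 year lifetime) and assumes, purely for illustration, a steady-state fleet that ages out evenly. The function name is made up for this example, and none of this is real procurement data.

```python
# Back-of-the-envelope: how many servers a steady-state fleet must replace per year.
# Assumption (mine, for illustration): servers age out evenly over their lifetime.

def yearly_replacements(fleet_size: int, lifetime_years: int) -> int:
    """Servers to replace each year just to keep the fleet the same size."""
    return fleet_size // lifetime_years

FLEET_SIZE = 1_000_000  # the figure used in this post

for lifetime in (4, 5):  # the 4-5 year replacement cycle guessed at above
    per_year = yearly_replacements(FLEET_SIZE, lifetime)
    print(f"{lifetime}-year lifetime: about {per_year:,} servers replaced per year")
```

So even if the fleet never grew, you would still be looking at something like 200,000 to 250,000 servers a year just to stand still.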
Counter 2: CPUs aren’t getting faster.
Server performance and capacity are increasing. Although CPUs haven’t gotten faster, the number of cores has increased. Which means that the NSA has to buy enough capacity to match the utilization levels of Google, FB, etc. Given that Google, FB and others go to great lengths to improve utilization, this suggests that their server counts are representative of the NSA’s capacity needs.
Counter 3: It’s not that many servers
This is a reasonable argument. This data suggests that 1 million servers is really only 1/350 of the total servers sold globally.
People
If you consider the sheer intellectual horsepower at Google, then you start to scratch your head about where the people who built this thing are.
Seriously.
Because the NSA, thanks to federal law, can’t hire outside of the US.
So maybe the NSA can offer green cards and citizenship super-fast … But then who is doing the hiring? It’s not happening on college campuses and the best and the brightest are not going to IBM and Raytheon.
So where is the army of technically savvy people being hired from?
Managing
Having managed many servers at Zynga, I can tell you that managing infrastructure at even a fraction of this scale is not easy.
Having been part of the team that built what was, at one point, the world’s largest private cloud, I know that off-the-shelf software to manage infrastructure at this scale simply does not exist.
To make it even remotely tractable, you need a lot of sophisticated software just to bring machines up and to take machines down.
This kind of software is not simple. And it requires big brains to assemble. And it would have to be built from scratch as nothing on the market exists that can replicate it.
Ingest
What nobody talks about is the complexity of managing the ingest of data. Let’s assume you’ve solved the infrastructure, the hiring and the purchasing. Now we’re talking about magical software that is able to handle data coming from a myriad of different companies.
Each company has its own evolving set of data.
So either you have to deal with the unstructured format (which explodes the computational cost) or you’ve got teams of people at each of those companies whose job it is to pre-process the data before it leaves their site.
In short
This smells of a fabrication. To what end, I don’t know.