Author Archives: specialk

PRISM – and the dominoes fall

First, journalists revealed that Mr. Snowden revealed nothing new. After all, we all knew that the NSA was monitoring everything.

http://www.businessinsider.com/60-minutes-reported-nsa-spying-in-2000-2013-6

Business Insider basically described Mr. Snowden as a dim-witted high-school dropout who caused himself a lot of pain over nada.

And then we have this article that provides more insight into what PRISM does.

http://www.businessinsider.com/how-prism-surveillance-works-2013-6

Essentially, what PRISM does is give a narrow view into a small set of users for whom the NSA has legally obtained wiretaps, allowing it to look at their online activity in real time.

That is relatively simple to do and does not require mythical amounts of infrastructure…

My original assessment still holds: the technological illiteracy of the average reporter and the rabid anti-government bias of too many folks in SillyValley allowed a low-level functionary to convince them that alien technology had suddenly materialized in the bowels of the US government.

PRISM – Where are the servers, revisions

In my analysis I claimed that the NSA had to buy lots and lots of servers: something like 1 million.

What friends of mine, correctly, pointed out is that most servers render data and do not store data.

Which, of course, reveals my bias. At Zynga we don’t render the data for the user; we just process it. At other web companies, most servers render the thing the user sees.

Mea Culpa. 

The reality is that of the 1 million servers, for many web properties, only a fraction stores data. So let’s say 10%, which is probably fair. That reduces the problem to 100,000 servers.
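The arithmetic here is trivial, but worth making explicit; both numbers are rough assumptions from this post, not measurements:

```python
# Back-of-envelope: if only ~10% of a web fleet stores data (an assumed
# fraction), the storage-tier estimate shrinks by 10x.
total_servers = 1_000_000    # rough fleet estimate from the post
storage_fraction = 0.10      # assumed share of servers that store data

storage_servers = int(total_servers * storage_fraction)
print(storage_servers)  # 100000
```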

Except …

FB, Yahoo, and Google are probably just a few of the interesting places people store data.

They also store data on Box, DropBox, S3, EBS, tumblr, etc, etc, etc. Any application that stores data for sharing is a target for the NSA.

They also store data in Hotmail (now known as Outlook… Really … )…

The point is that you can easily shrink the problem down, and then I can easily grow it.

And then the interesting problem that the folks at the NSA have to solve isn’t just storing the data but finding connections across the data. Just the size of the data motion and data indexing boggles the mind.

The point is that this is a huge infrastructure.

And that the problems of management, scaling, and operations remain real even before we get to the really interesting question of data analysis.

Now it’s entirely possible that there are researchers in the NSA that have solved all of big-data’s problems that the rest of us are working on. It’s possible.

And unicorns might exist.

Look if this is real, it means that my understanding of where the state of the art is, is about 10 years behind the curve. And if the US government has sat on this kind of advanced software, then the entire decade we spent figuring this shit out was … wasted.

And if they are that good, that means entire areas of human endeavor could be accelerated if they gave that software away. Just think about what we could do with that kind of real-time analysis. What would we do if we could sift through all the data about all of humanity in real time…

At the end of the day, I am having a hard time believing that the rest of the planet is 12 years behind some mysterious dark organization.

Is it possible? Absolutely. Likely, no.

PRISM – Person of Interest meets Reality Show

Everyone gets their 15 minutes of fame. The NSA dude, Edward Snowden, got more than his fair share.

What I found fascinating was his description of how the system creates connections from things you have done and weaves an artificial and suspicious narrative.

As I listened to him talk, I remembered where I had seen this theory before — it’s in a TV show called Person of Interest.

The central conceit of the show is that there is a computer that has access to every data source on the planet and is making connections and finding bad guys before anyone else can.

Which made me laugh. Occam’s razor says that the simplest solution is the likeliest. So what’s more likely: that some low-level person invented an impossible-to-disprove conspiracy theory based on a hit TV show, or that the NSA is monitoring computer systems without the smartest minds of my generation figuring it out?

When this is all said and done, we will discover that these are the false claims of a media-hungry person. And we will also rediscover that the press’s technology literacy is abysmal.

PRISM – Where are the frigging servers, part deux…

In my last post, I asked the question “where are the servers”. And, of course, folks sent me links to the Utah data center.

Good response, but I was trying to go somewhere else… Teach me to bury my lede.

Finding a physical place for 1,000,000 servers is easy if you are the US government. The government owns a lot of space it can use.

The problem is more about how in God’s name is the government buying, managing and using 1,000,000 servers.

The scale of the equipment required, and the challenge of managing that scale, is mind-boggling given that it would dwarf the hardest systems ever built commercially.

Buying

Look, for its really big supercomputers the US government relies on outside contractors. It doesn’t have the in-house skill to build one of these things.

And the scale of the equipment would make the US government an insanely huge part of the tech market, and that is mind-boggling. Basically, for every server CPU purchased on the open market, the NSA purchases another. Which means the total commercial market is smaller than we think. Which means if you are making business plans based on IDC numbers for the market size, you are, well, wrong.

And this accounts for just the servers. Never mind the networking, etc.

Counter 1: But they don’t have to buy them all at once

That’s strictly true, but misses several key insights.

One: Google, FB, et al. are increasing their capacity very quickly. And there are other online services that store data and support collaboration (Box, DropBox, etc.). The number of services and the amount of data are increasing, not shrinking, over time.

To keep up, the NSA has to buy as much total capacity as everyone else is creating. And since everyone else is buying, the NSA is buying as well.

The other thing this doesn’t account for is that Google and FB replace older servers as they age out and die. And this probably happens on a 4-5 year time scale.

And finally, my 1,000,000 estimate was based on data that is 1-3 years out of date; the real numbers are probably bigger.
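To see why “they don’t buy all at once” doesn’t help much, sketch the steady-state purchasing. The fleet size and lifetime come from this post; the 20% annual growth rate is an invented illustration, not a sourced figure:

```python
# Steady-state buying for a shadow fleet that tracks the industry.
fleet = 1_000_000       # total servers to mirror (this post's estimate)
lifetime_years = 4.5    # assumed hardware lifetime before age-out
growth_rate = 0.20      # assumed annual capacity growth (illustrative)

replacement_per_year = fleet / lifetime_years  # servers aging out and dying
growth_per_year = fleet * growth_rate          # new capacity to keep pace
annual_buy = replacement_per_year + growth_per_year
print(round(annual_buy))  # 422222 -- hundreds of thousands of servers, every year
```

Even spread out over time, the purchasing never drops to an inconspicuous level.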

Counter 2: CPU’s aren’t getting faster.

Server performance and capacity are still increasing. Although CPUs haven’t gotten faster, the number of cores per chip has increased. That means the NSA has to buy enough capacity to match the utilization levels of Google, FB, etc. Given that Google, FB, and others go to great lengths to improve utilization, server counts are a fair proxy for the NSA’s capacity needs.
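The utilization point can be made with two assumed numbers (both invented for illustration): if the companies being mirrored already run hot, a shadow system can’t get by with far fewer machines just by working its own servers harder.

```python
# If a 1M-server fleet already runs at ~60% utilization (assumed),
# a copycat at an optimistic 80% utilization barely shrinks the count.
target_capacity = 1_000_000 * 0.60  # work done by the monitored fleet
best_case_util = 0.80               # generous utilization for the shadow fleet

shadow_servers = target_capacity / best_case_util
print(round(shadow_servers))  # 750000 -- still the same order of magnitude
```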

Counter 3: It’s not that many servers

This is a reasonable argument. This data suggests that 1 million servers is really only about 1/350 of the total servers sold globally.

People

If you consider the sheer intellectual horsepower at Google, then you start to scratch your head: where are the people who built this thing?

Seriously.

Because the NSA, thanks to federal law, can’t hire outside of the US.

So maybe the NSA can offer green cards and citizenship super-fast… But then who is doing the hiring? It’s not happening on college campuses, and the best and the brightest are not going to IBM and Raytheon.

So where is the army of technically savvy people being hired from?

Managing

Having managed many servers at Zynga, I can tell you that managing infrastructure at even a fraction of this scale is not easy.

Having been part of the team that built what was, at one point, the world’s largest private cloud, I know that the software needed to manage this infrastructure simply does not exist.

To make it even remotely tractable, you need a lot of sophisticated software just to bring machines up and to take machines down.

This kind of software is not simple. And it requires big brains to assemble. And it would have to be built from scratch as nothing on the market exists that can replicate it.

Ingest

What nobody talks about is the complexity of managing the ingest of data. Let’s assume you’ve solved the infrastructure, the hiring, and the purchasing; now we’re talking about magical software that can handle data coming from a myriad of different companies.

Each company has its own evolving set of data.

So either you deal with the unstructured format (which explodes the computational cost), or you’ve got teams of people at those companies whose job it is to pre-process the data before it leaves their site.
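A minimal sketch of why ingest is painful. Every source ships records in its own shape, so the pipeline needs a per-source normalizer just to reach a common schema; all the field names and sources below are invented for illustration:

```python
# Hypothetical ingest pipeline: one normalizer per source, mapping each
# source's private record shape onto a shared schema.
def normalize_site_a(rec):
    return {"user": rec["uid"], "ts": rec["timestamp"], "body": rec["msg"]}

def normalize_site_b(rec):
    return {"user": rec["account"]["id"], "ts": rec["t"], "body": rec["text"]}

NORMALIZERS = {"site_a": normalize_site_a, "site_b": normalize_site_b}

def ingest(source, rec):
    # Any schema change at any source silently breaks its normalizer --
    # which is why this needs either ongoing per-company cooperation
    # or brute-force unstructured processing.
    return NORMALIZERS[source](rec)

print(ingest("site_a", {"uid": 7, "timestamp": 123, "msg": "hi"}))
```

Multiply this by every company on the list, each with a format that evolves on its own schedule, and the maintenance burden alone is enormous.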

In short

This smells of a fabrication. To what end, I don’t know.


PRISM – where are the frigging servers?

Over the last two weeks the 1% and its wannabe cohorts have been obsessively worrying about government spying. The rest of the world has tried to keep their jobs and pay their bills.

What’s weird is that the same guys who think the black helicopter conspiracy theorists are “nuts” are finding common cause with those conspiracy theorists.

What astonished me in this whole discussion was the really basic question of where all the NSA’s servers are. Most reporters focused on the technological feasibility of such a system; I want to ask the mind-numbingly basic question: where the hell does the data live? And where is the infrastructure that computes on it?

Since most of us are software guys, yours truly included, we never ask where the physical systems that run our software are. But in this case, I want to.

Let’s speculate that to collect data in real time and analyze it in real time, you need an infrastructure as big as the one you are monitoring. What I am saying is that if FB requires 1 CPU cycle and 1 byte to store a piece of data as it comes in, the system monitoring FB needs no less than 1 CPU cycle and 1 byte to handle that same data. And that assumption is probably too generous: in reality the monitoring system has to spend more CPU cycles than FB does to analyze the data, and it can’t store much less of it. But we’ll stick with that assumption.

The server infrastructure the NSA would have to build is bigger than the combined infrastructure of FB, Yahoo, and Google. In plain English: the most advanced technology companies on the planet have built something that, compared to what the NSA has supposedly built, is a toy.

Just to put some numbers on this: FB had about 180,000 servers in 2012, Google was using about 900,000 servers in 2011, and Yahoo, according to this report, had 100,000, though that seems to count only a small piece of Yahoo’s business.

We’re talking about over 1 million servers here (assuming 2012 numbers with no growth). You don’t just have 1 million servers, with their switches and racks and disk drives, sitting around… This infrastructure would represent a huge chunk of corporate America’s revenue (just think of Cisco, and Intel for the frigging processors). This kind of deployment would literally show up as a significant line item in their balance sheets.
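Adding up the public fleet estimates cited above (2011-2012 figures, so already stale):

```python
# The post's cited fleet sizes; the total is what a 1:1 shadow
# infrastructure would have to match just for these three companies.
fleets = {"Facebook": 180_000, "Google": 900_000, "Yahoo": 100_000}

total = sum(fleets.values())
print(total)  # 1180000 -- well over a million servers
```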

Where the f*k do you put 1 million servers? That’s a f*k load of power and networking.

If the NSA really has this kind of off-the-grid infrastructure, the logistics of purchasing, shipping, and secrecy astonish me far more than the already insanely difficult problem of spying on FB in such a way that its top engineers don’t notice.

The fact that no one knows about this much infrastructure should convince us that this is an absurd tale.

But then again we fought a world war and built a bomb and nobody knew about it…

So when someone tells you the government is full of incompetent morons, just tell them: absolutely not, they put together the world’s largest computing infrastructure, it took a low-level systems analyst to spill the beans, and none of the press asked: where the hell are the machines?

Too many games, too little time

Recently, my employer has been launching a lot of games.

And in different genres.

Wanna play a running game? Try out Running with Friends.


Wanna go on a long quest to defeat some dark lords? Try Battlestone.


And then if you want to engage in groups of battles just wait for Solstice…

And if you need something more sedate, more cerebral, but still fun, try War of the Fallen… the first card battler that was simple enough for me to play.

In the past, Zynga released a lot of games in similar genres. Which was great, but it did mean that when you wanted to play a different kind of game, you played someone else’s game…

But no more.

And as someone who likes to stay on top of our games, and the player experience, I am spending too much time playing games…

Well, maybe my life isn’t SOOOO bad….


The Power Supply Issue with PC hardware


Steve Ballmer must hate his life. His company builds this software, they then hand it to these bozos at Lenovo, and all of a sudden shit happens.

Latest problem.

The Lenovo Ideapad y500 has a dual SLI configuration for its 3D hardware. The problem with an SLI configuration is that it consumes a lot of power. I mean a lot of power.

To actually get the graphics hardware to run, you need a 170W power supply.

Which is fine.

If you don’t need the graphics card, then a 90W power supply will do just fine. And that is great, because a 90W power supply costs $25 these days and you can have several in your house.

And so here’s where the shit hits the fan. Last night I used a 90W power supply because I never paid attention to the 170W power requirements of my graphics cards.

And I spent two hours trying to figure out why my laptop was suddenly dropping frames, etc.

It was the frigging power supply.

Now I ask, why oh why, could Lenovo’s hardware monitors not just tell me that the problem was the power-supply? A simple warning? A notification? Something?

But no. Nothing.

Maybe there is a BIOS option for that…