Monthly Archives: May 2013

NUMA results from Google and SGI

Saw this on High Scalability. Google performed an analysis of NUMA, and in that analysis they discovered many of the same results we uncovered at SGI in the mid-90’s. And that is super cool, because it suggests we, at SGI, were on the right track when we worked on the problem.

At the core of the findings is the observation that NUMA is NUMA and not UMA: to get performance you need to understand the data layout, and performance depends on the application’s data access patterns.

What I find really cool is this:

Based on our findings, NUMA-aware thread mapping is implemented and in the deployment process in our production WSCs. Considering both contention and NUMA may provide further performance benefit. However the optimal mapping is highly dependent on the applications and their co-runners. This indicates additional benefit for adaptive thread mapping at the cost of added implementation complexity.

Back at SGI we spent a lot of time trying to figure out how to get NUMA scheduling to work, and how to spread threads around to get good performance based on application behavior. One of the key technologies we invented was dplace. dplace placed threads on CPUs based on some understanding of the topology of the machine and the way memory would be accessed.
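The flavor of topology-aware placement can be sketched in a few lines of Python. This is a toy illustration, not dplace’s actual algorithm: the two-node topology and the round-robin policy are made up for the example.

```python
def plan_placement(num_threads, topology):
    """Assign each thread to a CPU, spreading threads round-robin across
    NUMA nodes so each thread's memory can live on its local node.
    `topology` maps node id -> list of CPU ids (illustrative only)."""
    nodes = sorted(topology)
    next_cpu = {n: 0 for n in nodes}
    plan = {}
    for t in range(num_threads):
        node = nodes[t % len(nodes)]             # spread across nodes first
        cpus = topology[node]
        plan[t] = cpus[next_cpu[node] % len(cpus)]
        next_cpu[node] += 1
    return plan

# A hypothetical 2-node machine with 2 CPUs per node:
plan = plan_placement(4, {0: [0, 1], 1: [2, 3]})
# On Linux, such a plan could be applied with os.sched_setaffinity().
```

The point of the round-robin-across-nodes choice is that sibling threads end up with local memory on different nodes, rather than all contending for one node’s memory controller.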

So it’s nice to see someone else arrive at the same conclusion because it probably means we are both right …



Real Time Disruption

One of my favorite things to watch is how industries get disrupted.

The thing that’s amazing is that, in spite of all the information we have in both research and practice, the same story plays out over and over and over again.

There are three that I am paying very close attention to.

  1. The disruption of college-level and high-school-level education by online courses.
  2. The disruption of the combustion engine by the electric car.
  3. The disruption of the legal profession by online legal services that address the most common legal issues.

What’s interesting about 1 and 3 is that the disruptions took place as natural responses to market opportunities. There isn’t a single force of nature causing the disruption.

What is extraordinary about 2, the electric car, is that this is a case where a visionary leader is actually creating the disruption through sheer force of will.

One of the key misunderstandings of the disruptee and their defenders is the assumption that technological and supply chain obstacles are insurmountable.

For example, how do you power your car when you go across the country?

What the disruptees don’t realize is that as the supply of electric cars increases, the demand for charging stations increases, and so the supply of charging stations will increase as well.

This takes time. Except when a visionary leader decides to make things go faster…

Which is what Elon Musk is doing again…

 The stations are only on the East Coast and in California today, but CEO Elon Musk announced this week that Tesla will triple the size of the supercharger network in the next month, according to AllThingsD. The network will span most of the metro areas in the U.S. and Canada by the end of 2013–meaning it will be possible to take a long-distance road trip in a Tesla without worrying about running out of power. Musk has said in the past that the company plans to install over 100 Supercharger stations by 2015.

A better language for Lua?

Given that I work in the gaming industry, I am always fascinated with what people will do with Lua.

The Terra project struck me as an interesting investigation into building a better low-level counterpart to Lua that is not C.

The key claim to fame for Terra is near-native performance: it is staged from a dynamic language (Lua) but is itself much less dynamic than Lua.

The idea behind Terra and Lua is to use Lua as a scripting language for rapid prototyping and Terra for optimized components, without having to deal with the messiness of dropping into C. What makes this particular system intriguing is that the Terra functions live in the same lexical environment as the Lua functions, so they inter-operate seamlessly, while the Terra functions execute outside of the Lua VM… as per their abstract:

High-performance computing applications, such as auto-tuners and domain-specific languages, rely on generative programming techniques to achieve high performance and portability. However, these systems are often implemented in multiple disparate languages and perform code generation in a separate process from program execution, making certain optimizations difficult to engineer. We leverage a popular scripting language, Lua, to stage the execution of a novel low-level language, Terra. Users can implement optimizations in the high-level language, and use built-in constructs to generate and execute high-performance Terra code. To simplify meta-programming, Lua and Terra share the same lexical environment, but, to ensure performance, Terra code can execute independently of Lua’s runtime. We evaluate our design by reimplementing existing multi-language systems entirely in Terra. Our Terra-based auto-tuner for BLAS routines performs within 20% of ATLAS, and our DSL for stencil computations runs 2.3x faster than hand-written C.
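The staging idea, in which a high-level language generates and compiles a specialized function up front, can be loosely sketched even in plain Python. This is an analogy only, not Terra’s API: Terra emits machine code through LLVM, while this merely builds a specialized Python function from generated source.

```python
def make_power(n):
    """'Stage' a function specialized for a fixed exponent n by generating
    source code and compiling it once, up front (a rough analogy to Lua
    generating a Terra function)."""
    # Unroll x**n into explicit multiplications at staging time.
    body = " * ".join(["x"] * n) if n > 0 else "1"
    src = "def power(x):\n    return " + body + "\n"
    namespace = {}
    exec(src, namespace)        # compile the specialized function
    return namespace["power"]

cube = make_power(3)            # specialized once, then reused
```

The generator runs in the flexible language; the generated function is straight-line code with the dynamism stripped out, which is the same division of labor Lua and Terra aim for.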

I don’t have enough experience with Lua to offer any insight as to whether this is a good idea… but it bounced along the internet superhighway so I’ll try to take a look.

Reading the paper, I realized there is a lot more there in terms of the sophistication and science of the challenge of integrating these two different languages. Okay… I’ll have to read and noodle.

Are the 300k servers Microsoft promised game changing?

Recently, as part of the Xbox Live announcement, Microsoft announced a dramatic expansion in the amount of compute they intended to add to their infrastructure. The plan, as announced, was to grow the server count from 15k to 300k, a 20-fold increase.

This is an astonishing number of new servers to add to any new service, especially if you are not expecting huge growth in the number of users.

The marketoon hypothesis

One hypothesis is that some guy in marketing asked some woman in engineering how many servers the data center could hold; she said it could hold 300k, and the bozo figured that would make an awesome press release.

If this is true, the groans in Microsoft Engineering would be vast and awesome…

They are trying to do something different. 

Another, more interesting hypothesis is that they are actually trying to do this:

Booty says cloud assets will be used on “latency-insensitive computation” within games. “There are some things in a video game world that don’t necessarily need to be updated every frame or don’t change that much in reaction to what’s going on,” said Booty. “One example of that might be lighting,” he continued. “Let’s say you’re looking at a forest scene and you need to calculate the light coming through the trees, or you’re going through a battlefield and have very dense volumetric fog that’s hugging the terrain. Those things often involve some complicated up-front calculations when you enter that world, but they don’t necessarily have to be updated every frame. Those are perfect candidates for the console to offload that to the cloud—the cloud can do the heavy lifting, because you’ve got the ability to throw multiple devices at the problem in the cloud.” This has implications for how games for the new platform are designed.

One of the limitations of systems like the Xbox is the 5-7 year upgrade cycle. The problem with a 5-7 year upgrade cycle is the difficulty of delivering better and better experiences: extracting ever more performance requires more and more software tuning, until the platform is unable to give any more.

The approach Microsoft is taking is to shift some of the computational effort to the cloud, leveraging the faster upgrade cycles they control to deliver a better experience to their users without forcing those users to buy more hardware.
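The pattern Booty describes, recomputing expensive but latency-insensitive state out of band while the frame loop keeps using the last result, might look roughly like this. It is a toy sketch: the "cloud" here is just a local thread pool, and the lighting computation, function names, and scene representation are all made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def compute_lighting(scene):
    # Stand-in for an expensive up-front calculation (e.g. volumetric fog).
    return sum(scene) * 2

cloud = ThreadPoolExecutor(max_workers=1)   # stand-in for remote servers
lighting = compute_lighting([1, 2, 3])      # computed when entering the world
pending = None

def on_scene_change(scene):
    """Kick off recomputation in the 'cloud'; never block the frame loop."""
    global pending
    pending = cloud.submit(compute_lighting, scene)

def render_frame():
    """Each frame uses whatever lighting is ready; stale data is acceptable."""
    global lighting, pending
    if pending is not None and pending.done():
        lighting = pending.result()
        pending = None
    return lighting

on_scene_change([4, 5, 6])
frame = render_frame()          # may still use the stale value this frame
cloud.shutdown(wait=True)       # wait for the "cloud" job to finish
final = render_frame()          # now sees the recomputed value
```

The key design point is that `render_frame` never waits: frames render with the last good value, which is exactly why only latency-insensitive work is a candidate for offloading.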

Several startups, now failed, have demonstrated that it is possible to stream a AAA title to a device. So the idea of doing this is not implausible.

With this theoretical approach, the folks at Microsoft are attempting to square the circle: they have a stable but rapidly decaying platform in people’s homes, and they use the hardware in their data centers to deliver increasingly better graphics through a vast amount of pre-computed data.

The problem, of course, is that in practice the amount of data you have to pre-compute and store for 3D immersive worlds is so vast as to be almost impractical. Well, perhaps except if you had 20x more servers per user…

In the 2D space, this was basically the solution Google adopted for Google Maps. Confronted with the problem of dynamically rendering every tile on the client, they pre-rendered the tiles on the server and had the client stream them.
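Pre-rendering on the server works because there is a fixed mapping from world coordinates to tile addresses, so every client asks for the same pre-cut images. Assuming the standard Web Mercator tiling used by slippy-map systems (a reasonable guess for how such a scheme works, not a description of Google’s internal implementation), the mapping looks like this:

```python
import math

def latlon_to_tile(lat, lon, zoom):
    """Map a latitude/longitude (degrees) to (x, y) tile indices at a zoom
    level, using the standard Web Mercator tiling of slippy maps."""
    n = 2 ** zoom                # tiles per axis double at each zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y
```

Because the tile grid is deterministic, every tile can be rendered once, stored, and served to any client that asks, which is the whole economy of the pre-compute approach.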

This is going to be very interesting to see… Although my money is still on the marketoon hypothesis…


Why Apple is Winning – or why Dell and Lenovo Deserve to die


If I were Steve Ballmer, I would want to wring the scrawny or fat necks of both Chairman Dell and the head of the Lenovo Consortium.

A Microsoft customer buys a Lenovo laptop. And the Lenovo laptop proudly proclaims to the world that it has always-on USB, a feature that allows you to charge your devices through the USB port without powering on the laptop.

When I say proclaims to the world, I mean there is a little decal on the keyboard that advertises this feature. This is not some obscure little thing…

This is a very useful feature, especially when you are travelling.

So I insert my USB cable and … I feel like I am stuck in a bad episode of Revolution.

What the ?????

It turns out that this cool new feature, advertised in such a prominent way, REQUIRES that I go into the BIOS and navigate a 1980’s text interface to turn it on.

For some reason, some bozo Product Manager somewhere in the bowels of Lenovo’s design team decided to ship with the feature turned OFF in the BIOS.




If I am Steve Ballmer, I am thinking: how do I build up my PC business so I stop having to deal with these bozos who are destroying my brand… Because it’s the software that gets the blame, not the POS hardware those idiots build….