This spring, the Texas Advanced Computing Center (TACC) brought online what they bill as the largest open-access supercomputer in the world. True to form, they've named it something distinctly Texan: Ranger. Other systems at TACC go by names such as Lonestar, Maverick, and Ranch. Ranger has a whopping 504 teraflops of peak performance, 125 terabytes of memory, and 1.7 petabytes of disk space. And it doesn't stand a chance against a problem studied by a 19th-century Frenchman named Simeon Denis Poisson.

Poisson's equation is quite elegant. It can be written in just one line, as you can see below.

$$\nabla^2 \phi = f$$

Yet this simple equation is a nightmare for the latest generation of supercomputers. Here's why.

In the 1980s and 1990s, advances in computer speed were all about faster processors. In about 10 years, processor speeds went from megahertz to gigahertz. For these computers, operation counts and memory size were the only things that really limited the problems that could be tackled. However, in the late 1990s, we started to hit a wall in terms of how fast you could make a single processor. Performance leveled out somewhere around 2 to 3 GHz, so that a 5 GHz chip really ended up doing about the same amount of work as a 2 GHz chip while using far more energy (and creating far more waste heat). The solution to this problem was to design supercomputers that had many processors. Today, this leaves us with Ranger, which has 15,744 quad-core chips for a total of over 60,000 cores. To use more than one of these 60,000 cores, you have to break your problem into pieces and send these pieces to individual cores, which do the work and then send the results back to the master processor. The master processor then puts the pieces back together and spits out the answer. That's the computer side of the problem.
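The split, work, and reassemble pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea: the helper names `split`, `work`, and `master` are made up for the example, the "work" is a stand-in sum of squares, and Python threads stand in for the cores. On a machine like Ranger the pieces would be sent between nodes by message passing instead.

```python
from concurrent.futures import ThreadPoolExecutor

def split(data, n_pieces):
    """Cut data into n_pieces contiguous chunks of nearly equal size."""
    base, extra = divmod(len(data), n_pieces)
    chunks, start = [], 0
    for i in range(n_pieces):
        # The first `extra` chunks each take one leftover element.
        stop = start + base + (1 if i < extra else 0)
        chunks.append(data[start:stop])
        start = stop
    return chunks

def work(chunk):
    """Hypothetical stand-in for the real computation done on one core."""
    return sum(x * x for x in chunk)

def master(data, n_workers):
    """Scatter chunks to workers, then gather and combine the partial results."""
    chunks = split(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(work, chunks))
    return sum(partials)

total = master(list(range(1000)), n_workers=4)
print(total)
```

This works nicely because each chunk can be processed without knowing anything about the others. The trouble starts when that stops being true.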

Now let's look at the other side of the problem: the math. Let's compare Poisson's equation to the wave equation, given below.

$$\nabla^2 \phi - \frac{1}{c^2} \frac{\partial^2 \phi}{\partial t^2} = f$$

When you write the wave equation like this, it is easy to see that you get Poisson's equation in the limit that the wave speed, c, goes to infinity. Since the wave speed is the speed at which information propagates through a system, this means that when we solve Poisson's equation, we are saying that every part of our physical domain is instantaneously aware of every other part of the domain.
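To make the "instantly aware" idea concrete, here is a minimal Python sketch on a one-dimensional grid. It is an illustration under stated assumptions, not the method used in any production code: the function names `solve_poisson_1d` and `wave_step`, the 21-point grid, the zero boundary values, and the time step are all choices made for the example. It places a single point source in the middle of the grid and counts how many grid points feel it.

```python
def solve_poisson_1d(f, h=1.0):
    """Solve u'' = f on a uniform grid with u = 0 at both ends (Thomas algorithm).
    f holds the source at the interior points only."""
    n = len(f)
    # Tridiagonal system: u[i-1] - 2 u[i] + u[i+1] = h^2 f[i]
    c = [0.0] * n   # modified upper diagonal
    d = [0.0] * n   # modified right-hand side
    c[0] = 1.0 / -2.0
    d[0] = h * h * f[0] / -2.0
    for i in range(1, n):
        denom = -2.0 - c[i - 1]
        c[i] = 1.0 / denom
        d[i] = (h * h * f[i] - d[i - 1]) / denom
    # Back substitution.
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u

def wave_step(u, u_old, f, c, dt, h=1.0):
    """One explicit leapfrog step of u_tt = c^2 (u_xx - f), with u = 0 at both ends."""
    n = len(u)
    new = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        lap = (left - 2.0 * u[i] + right) / (h * h)
        new[i] = 2.0 * u[i] - u_old[i] + (c * dt) ** 2 * (lap - f[i])
    return new

n = 21
source = [0.0] * n
source[n // 2] = 1.0          # a single point source in the middle

# Poisson: the point source changes the answer at every grid point at once.
u_poisson = solve_poisson_1d(source)
poisson_touched = sum(1 for x in u_poisson if abs(x) > 1e-12)

# Wave equation: starting from rest, the source's influence spreads
# one grid point per step in each direction.
u_old, u = [0.0] * n, [0.0] * n
wave_touched = []
for step in range(3):
    u, u_old = wave_step(u, u_old, source, c=1.0, dt=0.5), u
    wave_touched.append(sum(1 for x in u if x != 0.0))

print(poisson_touched, wave_touched)
```

In the Poisson solve, all 21 points respond to the source. In the wave solve, only one point has changed after the first step, three after the second, and five after the third, so each point only ever needs news from its immediate neighbors.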

You've probably already figured out that when a computer splits a problem into 60,000 pieces and every piece needs to know exactly what is going on in every other piece, the result is a lot of communication. In fact, the communication scales as the number of processors squared. What's worse is that on many new systems, there are fewer network connections than processors. On Ranger, for example, there is one network connection for every 16 processors. That's like telling a thousand people that they all need to call each other when there are only 60 telephones. Instead of getting more work done as you add processors, past some number of processors the extra ones simply spend all of their time talking to other processors rather than doing any real work. So while many hands make light work, too many chefs still spoil the soup.
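A toy cost model shows why adding processors eventually backfires. Everything numerical here is hypothetical and chosen only to make the trend visible: the amount of work, the time per operation (`t_flop`), and the time per message (`t_msg`) are not measurements from Ranger. Only the 16 processors per network connection comes from the description above.

```python
def all_to_all_messages(p):
    """Messages per exchange when each of p processors must hear from every other one."""
    return p * (p - 1)

def neighbor_messages(p):
    """Messages per exchange when processors in a 1D chain only talk to adjacent ones."""
    return 2 * (p - 1)

def step_time(p, work=1.0e9, t_flop=1.0e-9, t_msg=1.0e-6, procs_per_link=16):
    """Toy estimate of the wall-clock time for one step (all inputs hypothetical).
    Compute time shrinks as 1/p; all-to-all traffic grows as p^2 and is shared
    by p / procs_per_link network links."""
    links = max(1, p // procs_per_link)
    compute = work * t_flop / p
    communicate = all_to_all_messages(p) * t_msg / links
    return compute + communicate

# Try processor counts from 16 up to 65,536, doubling each time.
candidates = [2 ** k for k in range(4, 17)]
best = min(candidates, key=step_time)

for p in candidates:
    print(p, all_to_all_messages(p), neighbor_messages(p), round(step_time(p), 5))
print("fastest at", best, "processors")
```

With these made-up numbers, the run time bottoms out at a few hundred processors and then climbs as the all-to-all traffic swamps the shrinking compute time. The neighbor-only message count, by contrast, grows only linearly with the number of processors.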

So how do you solve a Poisson equation on a supercomputer like Ranger? Well, right now the answer is you don't. Our code, ASH, is currently limited to using small chunks of Ranger at any given time. There are a number of promising ideas on the horizon involving multigrid schemes. Another approach is to solve the more general wave equations, which have finite wave speeds, so that processors only need to know what is going on with their neighbors rather than with every other processor. Finally, there are some machines on the horizon with faster networks and fewer processors per connection. All of these things, however, are unproven and may or may not improve the situation. For now, we're stuck unable to solve a roughly 200-year-old equation.

Simeon Denis Poisson must be laughing his head off.

Nick, I agree having massively parallel communication is a major issue.

However, I am confused. Can't you just solve Poisson's equation using 16 processors connected on the same network? Then you shouldn't have a communication problem right? Or solve it on the 4 processors that are on the same quad-core chip?

However, having the whole machine solve it at once would obviously be a nightmare.

Joe, the problem really lies in the fact that we are solving the MHD equations which are all coupled and only one of them is a Poisson equation. Sixteen processors is fine for low resolution jobs, but when we want to push resolution up to something like 512^3 or higher, we really need hundreds of processors in order to be able to do all of the computation required at each point for each time step.

It would be very handy if we could somehow de-couple the Poisson equation from the rest of the equation, but that requires doing things like assuming that bulk velocity, temperature, and magnetic fields do not influence the pressure balance in the gas, which are terribly bad assumptions to make.

So for now, if we use hundreds or thousands of processors to solve the problem, we have to split the Poisson equation up into hundreds or thousands of pieces too.

"Joe, the problem really lies in the fact that we are solving the MHD equations"

Oh Nick, I solve equations like this on my personal Core-Duo laptop just as a quick test to see if my MPI libraries are working.

I'm kidding of course. Yeah, I can see why doing Poisson's equations on an MHD level could be tricky on as few as 16 processors.

(Why aren't you using Matlab? I thought as an undergraduate I was told Matlab was the best thing to learn for doing computational physics?) :)