Monday, June 20, 2011

Loving the idea: Apple enters HPC with "custom" GPUs and Thunderbolt optical cables

There is a little-known, barely marketed GPU designer called Imagination Technologies. Yet they sell more GPU licenses than ARM.

Their PowerVR SGX 5 series GPUs are capable of great 3D graphics, as seen on the iPhone, the iPad, and other copycat devices.

Apple holds some shares in the company, and I've read that they have already licensed the PowerVR SGX 6 series.

Imagination Technologies says that its SGX 6 series GPUs will be extremely scalable, to the point that they could build PCIe discrete GPUs competing with AMD and NVIDIA, and outperforming them on performance per watt (the key question today).

Apple (read: Steve Jobs) loves to control all the basic technologies in its products, and loves to power-optimize whatever it can control. So what a good deal.

The GPU availability on the Mac Pro is, in my personal opinion, pathetic. What am I doing with a 5000€ machine if I can only fit a single Radeon HD 5870 with ONLY 1GB of GDDR5? There isn't even a GeForce GTX 400 or GTX 500 series offering!! It's clear that they need to do something: either quit the server business, or do something revolutionary.

Well, I wonder what would be possible with an SGX 600MP. How many cores? How many watts? And how many GPU chips on a single-slot PCIe card? Will Apple design a GPU interconnect like SLI or CrossFire? A torus-like interconnect would be crazy love for computational codes, much like the brand-new Japanese Fujitsu K computer (wow!! SPARC is not dead!! It's in 1st position on the TOP500 list!!).

And if, as some sources say, the new Mac Pro will optionally be rack-mountable, that could make a lot of sense. Tremendous graphics and processing power with no competition. That would be pretty for the marketing. And Apple would keep a bigger piece of the product cake. Juicy!!

What about the Intel MIC?? I saw it just a minute ago, from ISC 11. We'll see. Maybe this will be another option. Who knows.

I can't wait!!

What's this blog about?

Hello World!

This blog is just a micron of the Internet where I want to write about what I do, read, and think about HPC. That means software, libraries, languages, operating systems and runtimes, and of course hardware!!!

It's a nice time to be in HPC (High Performance Computing, for those who don't know the acronym). There is an explosion of architectures and derivative software stacks. Imagination and creativity are a big part of the equation when designing and developing computer systems, and all of that is not restricted to HPC alone.

As some may know, thanks to certain technological limits, the whole computing ecosystem has to go down some dark roads. How much of "what" do I have to put on what sort of processor? Who will program it, how, and for which purposes? How general or specific, homogeneous or heterogeneous, should my future architecture be? How can I give C/C++ access to all that madness? Crazy questions.

And even more so now that a Microsoft engineer has put the cloud into the equation. In fact, the cloud is a heterogeneous distributed system!!

So you see: I want to write a game that runs super-fast and super-smooth on next-generation mobile devices. OK, no problem; learn some multicore CPU programming if you want it to really use the hardware, because it will be a dual- or quad-core ARM processor. And if you need more, learn OpenCL and write your super-low-level, super-tuned shaders to use the 12 or more GPU cores that sit on the same chip. And we are talking about a mobile device.

Concurrency and heterogeneity. These are the two main trends.

But why? Take a look at the AMD Fusion Developer Summit rebroadcasts, and you'll find some pretty good stuff from an ARM man, talking about "why oh why?".

It's almost all about the so-called Power Wall. It sounds pretty simple: power consumption limits the GHz I can get out of a processor. Well, power is not the only reason for the lack of 200GHz CPUs, but it is one of them. And it forces us to use multicore CPUs.
The other factor, and probably the most important one, is: how much power does a transistor consume? If I make it smaller, how many extra transistors can I fit in the same area, and how much less power will each of them consume? Well, the thing is that we can make transistors shrink much faster than we can make them greener. So with every transistor size reduction, we end up with a more power-hungry chip if we keep using the same die area.
That's why some are proposing heterogeneous computing as a solution: that way we can use only the transistors needed for a given task, and switch off the rest to avoid "frying-egg" chips. And that's also the reason why Intel is investing in transistor research, trying to make transistors not only smaller, but much less power-hungry than anyone else's offering.
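
To put some rough numbers on that argument, here is a minimal back-of-the-envelope sketch in C. It uses the classic dynamic power formula P ≈ C·V²·f, and the scaling factors are purely illustrative assumptions (a ~0.7x linear shrink per node), not real process data:

/* Back-of-the-envelope sketch of the Power Wall argument above.
 * Dynamic switching power per transistor: P = C * V^2 * f.
 * All scaling factors are illustrative assumptions, not process data.
 * Build with: cc powerwall.c -lm */
#include <math.h>
#include <stdio.h>

int main(void) {
    const double s = 0.7;  /* assumed linear shrink per process node */
    for (int node = 0; node <= 4; ++node) {
        double density = 1.0 / pow(s, 2 * node); /* transistors per die area */
        double c       = pow(s, node);           /* capacitance shrinks with size */
        double f       = 1.0 / pow(s, node);     /* smaller gates switch faster */

        double v_dennard = pow(s, node); /* classic scaling: voltage shrinks too */
        double v_flat    = 1.0;          /* recent reality: voltage barely drops */

        /* power per unit die area, assuming every transistor switches at f */
        double pd_dennard = density * c * v_dennard * v_dennard * f;
        double pd_flat    = density * c * v_flat * v_flat * f;

        printf("node %d: %6.1fx transistors, power density %4.2fx (Dennard) "
               "vs %5.2fx (flat voltage)\n", node, density, pd_dennard, pd_flat);
    }
    return 0;
}

With voltage scaling, power density stays constant no matter how many transistors we pack in; with today's nearly flat voltages, it roughly doubles at every node on the same die area. That's the frying-egg chip.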

So here comes the battle. And meanwhile, programmers are eager to use all that heterogeneity in their programs... but what's out there? CUDA, OpenCL, OpenMP, MPI, HMPP, pthreads?? Etc, etc, etc... Well, for some, OpenCL is the only way in the heterogeneous world, since it's an open standard intended for any hardware. But it is based on C... no objects can execute on the GPU... and what about having two different memory spaces?...
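
To show what I mean by the two memory spaces, here is a minimal OpenCL host-code sketch in C (error checking stripped for brevity; assume it targets the first GPU of the first platform). Every byte that moves between the host array and the device buffer needs an explicit API call:

/* Minimal illustration of OpenCL's two memory spaces: the host array
 * and the device buffer are distinct allocations, and every transfer
 * between them is an explicit call. A real program must check every
 * cl_int return code; that is omitted here for brevity. */
#include <CL/cl.h>

int main(void) {
    float host_data[1024];                    /* host memory space */
    for (int i = 0; i < 1024; ++i) host_data[i] = (float)i;

    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);
    cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, NULL);

    /* device memory space: a separate allocation on the GPU */
    cl_mem dev_buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE,
                                    sizeof(host_data), NULL, NULL);

    /* explicit host -> device copy */
    clEnqueueWriteBuffer(queue, dev_buf, CL_TRUE, 0,
                         sizeof(host_data), host_data, 0, NULL, NULL);

    /* ... enqueue kernels operating on dev_buf here ... */

    /* explicit device -> host copy */
    clEnqueueReadBuffer(queue, dev_buf, CL_TRUE, 0,
                        sizeof(host_data), host_data, 0, NULL, NULL);

    clReleaseMemObject(dev_buf);
    clReleaseCommandQueue(queue);
    clReleaseContext(ctx);
    return 0;
}

And all of this is before a single kernel has been compiled or launched. That's the boilerplate problem.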

HPC users must be happy... you may say... Well, I love all that craziness, but I know lots of HPC users (Artificial Intelligence, Bioinformatics, Computational Chemistry, etc.) who don't!!! They have enough with their science. Don't ask them to even learn about concurrency!! And don't ask them to hire developers to take care of the performance and scalability of their code (and heterogeneity? hahaha!!). They don't have an oil company's budget, and worse, they don't want to understand anything about concurrency and so on.

So... what are people doing? Well, there are some brave heterogeneous crusaders who learnt CUDA or OpenCL on top of having a PhD in some HPC-user science. These brainy monsters are an exception, and we cannot rely on them and assume the technology will be widely used. We neeeed Fusion!! We need GMAC!! We need x86 virtual memory hardware support, and lots and lots of work that fortunately is already in the works.

For my part, I'm integrating a medical imaging OpenCL code into a visualization interface, and building a library for my own use. The amount of code a program needs is crazy if you use OpenCL from scratch. But a simple, generic, code-reducing OpenCL library makes things much shorter and easier. I'll probably publish it somewhere in open source form. Maybe it will be useful to others while waiting for a Fusion implementation and so on.
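
Just to give a taste of the idea (this is a hypothetical sketch, not my actual library), a single helper like the following already folds the create-buffer-plus-copy dance from the earlier example into one call:

/* Hypothetical sketch of the kind of code-reducing helper described
 * above; NOT the author's actual library. Creates a device buffer and
 * fills it with host data in one step. Returns NULL on failure. */
#include <CL/cl.h>

cl_mem upload_buffer(cl_context ctx, cl_command_queue queue,
                     const void *host_ptr, size_t size) {
    cl_int err;
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE, size, NULL, &err);
    if (err != CL_SUCCESS)
        return NULL;
    err = clEnqueueWriteBuffer(queue, buf, CL_TRUE, 0, size,
                               host_ptr, 0, NULL, NULL);
    if (err != CL_SUCCESS) {
        clReleaseMemObject(buf);
        return NULL;
    }
    return buf;
}

A few dozen helpers like that, and the host side of an OpenCL program starts to look civilized.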

So that's an example of what this blog is about. I'll post whatever I have on my mind and want to share. Sometimes even business aspects of HPC, wearable HPC and so on. In the end it's all related: business drives the commodity hardware design that is so widely used in HPC.