Monday, August 18, 2008

"Speed is king" or "how to prepare for the future?"

Google says "Speed is King". If you don't believe it, try to build a better search engine than Google :) Why is Google so fast? What is the reason behind it? I invite you to "a Google Behind-the-Scenes Tour". Spend a few minutes going through the slides before continuing with this post.
After the tour, you will discover that Google is fast because it does distributed computing on a huge number of commodity computing units, which gives the best performance/cost ratio.
The key is that Google invented a simple programming model, called MapReduce (inspired by the map and reduce functions in functional programming), that applies to many large-scale computing problems.

MapReduce hides a lot of the complicated problems in distributed programming (like parallelization, load balancing, handling machine failures, robustness ...) from programmers, so that they can build efficient programs quickly and easily. (Learn more about MapReduce here, and the Google architecture here. For an overview, the tour above is enough :)
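To get a feel for the model, here is a minimal sequential sketch of a MapReduce-style word count in Python. It only illustrates the three phases (map, shuffle, reduce); the real framework's distribution and fault tolerance are exactly what is omitted, and the function names `map_fn`, `shuffle` and `reduce_fn` are my own, not part of any API:

```python
from collections import defaultdict

# Map phase: each document independently emits (word, 1) pairs.
def map_fn(doc):
    for word in doc.split():
        yield (word, 1)

# Shuffle phase: group intermediate values by key.
def shuffle(pairs):
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

# Reduce phase: combine all values for one key into a result.
def reduce_fn(key, values):
    return (key, sum(values))

def mapreduce(docs):
    pairs = [p for doc in docs for p in map_fn(doc)]
    return dict(reduce_fn(k, v) for k, v in shuffle(pairs).items())

counts = mapreduce(["speed is king", "speed matters"])
print(counts)  # {'speed': 2, 'is': 1, 'king': 1, 'matters': 1}
```

The point of the model is that `map_fn` and `reduce_fn` are all the programmer writes; the framework can run thousands of map and reduce tasks on different machines because each one is independent.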

MapReduce usage statistics (the tour, slide 29) show that in September 2007 Google ran 2,200,000 jobs, with an average completion time of 6.5 minutes, using an average of 400 worker machines.

The good news is that you can clone the Google architecture using an open source software project called Hadoop: "a free Java software framework that supports data intensive distributed applications running on large clusters of commodity computers.[1] It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers." (from Wikipedia). Hadoop has been deployed at many big Web companies like Yahoo, Facebook, and Amazon.

The bad news is that if you are a student, you surely don't have enough money to buy hundreds of machines to deploy a MapReduce system (say each machine costs about $500 USD; 400 machines will cost $200,000 USD - not including setup and maintenance fees :(

Here comes a rescuer: the GPU (Graphics Processing Unit), a massively parallel processor for scalable high-performance computing (HPC) at a very cheap price. CUDA is one of the first and most comprehensive platforms (an overview here, more details here). What I mean by cheap is that for about $200 USD (as of August 2008) you can buy an Nvidia 8800 GTX card with 128 stream processors and 512MB DDR3 to play with parallel programming yourself.
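CUDA's core idea is data parallelism: you write one small kernel function, and the hardware runs it once per data element, one GPU thread per index. Here is a sequential Python sketch of that pattern using the classic SAXPY operation (y = a*x + y); the `saxpy_kernel`/`launch` names are mine, and the loop simulates what the GPU would do in parallel:

```python
# In CUDA this body would be a __global__ kernel; each GPU thread
# handles exactly one index i. Here we simulate that work serially.
def saxpy_kernel(i, a, x, y):
    y[i] = a * x[i] + y[i]

def launch(kernel, n, *args):
    # A real GPU launches n threads at once; we loop instead.
    for i in range(n):
        kernel(i, *args)

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
launch(saxpy_kernel, len(x), 2.0, x, y)
print(y)  # [12.0, 24.0, 36.0, 48.0]
```

Because every index is independent, a card with 128 stream processors can work on 128 elements simultaneously - that independence is what makes the model scale.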

Actually, there are a lot of differences between the MapReduce and CUDA programming models. But I don't see them competing with each other; instead, they complement each other. I think that a Hadoop+CUDA combination will give individuals a lot of power, utilizing the best performance/cost ratio that the current hardware industry can offer. It's also good practice to prepare for "the next milestone in computing history". Welcome to the world of parallel computing.

As a programmer you may say, "I don't care. I still code-for-food with my sequential programming techniques. My programs run very fast on the latest CPUs. That's all my IT company needs."

Well well well, here is the point. Nowadays, Moore's law no longer translates into faster single cores: since 2002, processor performance has improved less than 20% per year [link]. Then multi-core processors appeared, and the hardware industry started switching from single CPUs to multi-core processors. Now you can buy a quad-core Intel processor for around $1000 USD. Intel also plans a 32-core processor by 2009/2010, and AMD has its own many-core processor plans :). Now I can see you starting to feel nervous. Of course, if your program can only run on one core, while another programmer (who has learned parallel programming techniques) writes a program with the same logic that runs on 32 cores, your program could be up to ~30 times slower than the new one. If you are a user, which program do you want to use? If you are a CEO who runs an IT company, which programmer do you want to hire?
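To make the one-core vs many-core point concrete, here is a minimal sketch using Python's standard multiprocessing module: the same work function run sequentially, then spread across a pool of worker processes. The toy task and the pool size are my own illustrative choices; the achievable speedup is bounded by the number of physical cores (and by Amdahl's law):

```python
from multiprocessing import Pool

# A CPU-bound toy task: sum of squares below n.
def work(n):
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    inputs = [100_000] * 8

    # Sequential: one core grinds through all 8 tasks.
    seq = [work(n) for n in inputs]

    # Parallel: 4 worker processes share the tasks; on a
    # quad-core CPU this can finish up to ~4x sooner.
    with Pool(processes=4) as pool:
        par = pool.map(work, inputs)

    assert seq == par  # same results, just computed on more cores
```

Notice that the program logic (`work`) did not change at all; only the way it is scheduled onto cores did. That restructuring is exactly the skill sequential programmers will need to pick up.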

I believe that by 2010 there will be a huge demand for parallel programming jobs, just to utilize the new parallel processor architectures to improve existing programs and develop new ones as fast as possible. From now, you have 16 months to learn parallel programming. You should hurry :))

Now I will tell you that I'm in the same situation as you. I know almost nothing about parallel programming, because universities didn't offer me parallel programming courses, and jobs didn't demand parallel programming techniques from me. But I will prepare myself for a near future where every desktop computer will have 32 (64, 128 or even 256) cores, and GPUs with computing power of thousands of GFLOPS (imagine a near-future desktop as powerful as the 400 worker machines in the Google MapReduce example above). In that concurrent world, the ones who master parallel programming techniques can write the fastest programs; again, "speed is king" :)

And please remember: "The world is concurrent, applications are concurrent, and hardware is concurrent too". Programming needs to catch up.

Google bought PeakStream (a GPU and multi-core computing solutions startup) in the middle of 2007. I don't know what Google is going to do with PeakStream. My guess is that they will use it to solve biology problems or build the next generation of server farms.
