Is it Getting Hot in Here?
Microsoft’s recent announcement that it is ending support for the Intel Itanium processor in the Windows HPC Server product got me thinking again about specialized processors versus low-cost commodity processors for high-performance computing (HPC) in general, and HPC in the cloud in particular.
Low-cost, general-purpose processors have made the Cloud possible, not only economically but also through their broad support for programming languages, compilers and tools. The tools for specialized processors, by contrast, are most often special-purpose and limited, which requires highly trained staff and results in a significantly higher Total Cost of Ownership. However, there are certainly performance advantages that in some cases make the trade-offs worth it.
Plus, there is nothing sexier (calm down, I’m talking computers here) than an ultra-scale High Performance Computer running Itanium or Tile-Gx specialty processors. The harmonics of the heat sinks as they dissipate the inferno created by 100 screaming cores annihilating billions of complex calculations per second are music to the ears. Add NVIDIA’s Tesla GPUs to this powerhouse and you have got yourself one serious ox, capable of pulling a plow through six feet of mud.
From Multiple Processors to Multi-tenant
Alas, the sex appeal of the specialty processor has been diminished by the economic realities of low-cost commodity processors. In addition to significantly lower capital and operational costs, the technical expertise needed to run a “generic” shop is far more plentiful and less expensive as well.
It’s the Ford truck of computing models: “Never runs great, but runs forever.” This approach has a long history of stable, reliable performance dating back to the 1960s, when computer design was largely focused on adding as many instructions as possible to the machine’s CPU.
It was also at this time that “parallel computing” emerged, and with it the multiple-processor, general-purpose computer design. The system divided the workload by distributing parts of the problem to each CPU and consolidating the results into a single answer.
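That scatter/compute/gather pattern is still how most parallel work gets divided today. A minimal sketch in modern Python (illustrative only; the ILLIAC-era machines did this in hardware, and the sum-of-squares workload here is just a stand-in problem):

```python
# Scatter/gather in miniature: split a problem into parts, hand each
# part to a worker process, then consolidate the partial results
# into a single answer.
from multiprocessing import Pool

def partial_sum(chunk):
    """Each worker computes its share of the problem."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=4):
    # Scatter: divide the data into one slice per worker.
    chunks = [list(data)[i::workers] for i in range(workers)]
    # Compute in parallel, then gather and consolidate the results.
    with Pool(workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    print(parallel_sum_of_squares(range(1000)))  # matches the serial sum
```

The interesting part is not the arithmetic but the shape: the coordinator only splits and merges, while all the real computation happens in the workers.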
In 1965, in an effort to capture market share in the scientific field, Burroughs picked up where Westinghouse left off with its Solomon high-performance computing initiative. In a shared-risk project, Burroughs teamed up with the University of Illinois as a development partner to build what would become the last generation of the ILLIAC family. Because the system’s computational resources would far exceed what the University could use, they decided to “rent” capacity to commercial customers. This may have been the first true example of a multi-tenant cloud computing model.
Time has proven the ILLIAC’s design to be effective for technical computing applications. Today, supercomputers are almost universally built from large numbers of commodity computers, precisely the concept the ILLIAC pioneered. What the Burroughs engineers did not realize was that they were laying the foundation for the mega data centers that power cloud computing today.
Does Technical Computing Require a Dedicated Cloud?
There remains a question as to whether the Ford-truck design will be sufficient for applications that require high-performance computing. The Cloud was not designed with technical or scientific computing in mind; it was designed for reliability and steady, predictable performance. The biggest challenge is that not all HPC applications lend themselves to this type of processing, and extracting “high performance” from this design was, and still is, problematic.
The Cloud presents some interesting possibilities for HPC and there appears to be a divide forming between traditional cloud models and those that are designed specifically for high performance computing applications.
SGI recently introduced its Cyclone HPC-in-the-Cloud service, which comprises some of the world’s fastest supercomputing hardware architectures, including Intel Xeon and Itanium processor-based SGI Altix scale-up, Altix ICE scale-out and Altix XE hybrid clusters. Cyclone also incorporates high-performance SGI InfiniteStorage systems for scratch space and long-term archival of customer data, another prerequisite for HPC.
The technology at Cyclone’s core is highly specialized and designed exclusively for HPC workloads, something traditional Cloud designs lack. While traditional Cloud computing designs can impersonate an HPC environment, they do not possess the true performance characteristics necessary for complex scientific and technical computation.
As Seymour Cray once remarked, “If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?”
It is my opinion that true HPC workloads will require specialized Clouds, and we will see more models like SGI’s in the near future. As a result, the division between low-cost commodity Clouds and specialized Clouds will become clearer as HPC and non-HPC workloads are better defined.