Challenges and stakes of high performance computing

Philippe Ricoux / Executive in charge of Numerical Processing and Modeling, Scientific Division, Total / November 30th, 2011

To understand tsunamis or locate oil slicks, scientists are running ever more complex models on ever more powerful machines. Some can now perform nearly ten million billion operations per second. Welcome to the world of HPC (High Performance Computing), where technical challenge meets major industrial stakes.

ParisTech Review – In the field of supercomputers, boundaries keep receding with each generation of machines: the petaflop mark (a million billion operations per second) was reached by an American machine in 2008, 10 petaflops was reached recently in Japan, and a staggering 20 petaflops has already been forecast for 2012. Americans, Japanese and Chinese vie for first place. What exactly is at stake in this computing race?

Philippe Ricoux – This is not merely a technological competition, but an operational response to a scientific challenge: to better recreate the reality of complex phenomena. We are heading towards an ever finer modeling of these phenomena, which will ultimately allow us to better understand them and, to some extent, to better anticipate them. One must realize that, at the present time, we still do not know exactly what goes on inside a cumulonimbus. That is food for thought. Hurricanes, tsunamis and earthquakes are events in which an extremely high number of parameters are in play, and modeling events of such magnitude requires enormous computing power. Among the stakes that relate to supercomputers, this one is at the forefront.

Computer simulation of these phenomena also responds to another constraint: in a number of cases, such as seismology or certain aspects of meteorology, the events are rare, hence experiments are almost impossible. In other cases, such as nuclear issues or a deeper understanding of fuel explosions (on which we are currently working at Total), experimenting is dangerous. It is precisely for these rare and hazardous occurrences that a better understanding is essential. In that respect, simulation through high performance computing offers a pertinent alternative.

Of course, we are still talking about a model that will never exactly recreate reality, and this explains why some professionals are still reluctant to rely on virtual models, even the most sophisticated ones. For example, in aerodynamics, HPC provides solutions to better understand the turbulence associated with the flow of fluids along an airplane wing. We are even beginning to dream of a digital plane, entirely designed by calculation. But some airplane makers remain cautious, their engineers pointing out that there are lives at stake and that you cannot beat experience. They are right … but there are complementarities to be engineered. And at the same time, there are areas, such as geoscience, where experience is not applicable.

What are the current perspectives in terms of computing power?

The most efficient supercomputer, which is Japanese, has a power of more than 8 petaflops – one petaflop being one million billion operations per second. The champions are Japan, China, and the United States, but Europe is by no means eclipsed; for instance in France we now have four petaflop machines. These rankings evolve at a very fast pace, suggesting the existence of numerous needs.

The exaflop (1000 times the petaflop) is in sight: we should get there by the end of the decade. The zettaflop (one million petaflops) could be achieved by mid-century. To give you an idea of what is at stake, it will take zettaflops to run complete seismic models. For the time being, in this field, our computing capabilities are still far from what is needed. In the case of the digital plane we would also be dealing with things on a zettaflop scale.

Computing capabilities of such magnitude pose problems which we will come back to, not least of which is the energy consumption involved: today the power consumption for a petaflop is one megawatt. Under such conditions, an exaflop would require a nuclear power plant! Fortunately, energy efficiency is regularly being improved and the goal is 20 megawatts per exaflop. That would already be significant progress, as it means dividing the power consumption of the best supercomputers of today by a factor of 50.
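The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted above (one megawatt per petaflop today, a target of 20 megawatts per exaflop); the function and constant names are illustrative assumptions, not part of the interview.

```python
# Figures quoted in the interview (2011).
MW_PER_PETAFLOP_TODAY = 1.0    # power drawn per petaflop today
PETAFLOPS_PER_EXAFLOP = 1000   # an exaflop is 1000 petaflops
TARGET_MW_PER_EXAFLOP = 20.0   # stated efficiency goal


def exaflop_power_mw(mw_per_petaflop: float) -> float:
    """Power needed for one exaflop if efficiency stays unchanged."""
    return mw_per_petaflop * PETAFLOPS_PER_EXAFLOP


def required_efficiency_gain(mw_per_petaflop: float, target_mw: float) -> float:
    """Factor by which power per operation must fall to reach the target."""
    return exaflop_power_mw(mw_per_petaflop) / target_mw


print(exaflop_power_mw(MW_PER_PETAFLOP_TODAY))  # 1000.0 MW
print(required_efficiency_gain(MW_PER_PETAFLOP_TODAY, TARGET_MW_PER_EXAFLOP))  # 50.0
```

At today's efficiency an exaflop would draw about 1000 megawatts, which is the order of magnitude of a single nuclear reactor's electrical output, hence the factor of 50.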

In practical terms, a supercomputer is an array of processor cabinets, which are installed in a 100 to 500 m² hangar. One of the limitations we face is simply the very fact that we have to work with matter: as long as we work with silicon processors, however much we may reduce their size, the issue will still be about moving electrons around no matter what – and atoms need a certain amount of room. Furthermore the energy released, particularly in the form of heat, imposes limits on spatial concentration. Imagine cramming a million iPhones into an iPhone! Well, that’s exactly what the challenge of the zettaflop is about. Progress is being made, and solutions are emerging to move to the exaflop, but it will probably take a revolution of sorts to reach the zettaflop. The quantum computer, an idea that we started investigating recently, may be the solution. In the meantime, we need results, and therefore we work with compute farms or with cloud computing, spreading many calculations across different machines.

Is today’s trend towards supercomputers or towards cloud computing?

It all depends on the desired use. Both models have their relevance, but their potential for applications differs. For example, a computer with very heavy computing power will be relevant to answer a single particularly complex question. On the other hand, you may have to answer a simpler question a very large number of times simultaneously, and in this case what is better suited is a cloud, consisting of a large number of networked computers that are not necessarily very powerful. Cloud computing finds its relevance in calculations that pose stochastic problems, with random variables: in such a case what we are dealing with is, for instance, processing the same calculation a million times, while changing the variable. So what we have here is not one single “big” problem, but, say, a million times the same problem, with a million different initial conditions. The cloud is relevant, for instance, in medical studies and in genetics.
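The "same problem, a million different initial conditions" pattern can be sketched in a few lines. This is a minimal illustration under stated assumptions: the toy model, the names `run_case` and `run_ensemble`, and the use of a local thread pool standing in for a cloud of machines are all hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_case(seed: int) -> float:
    """One independent run: same model, different random initial condition."""
    rng = random.Random(seed)      # each case gets its own generator
    x0 = rng.uniform(0.0, 1.0)     # hypothetical random initial condition
    return x0 * x0                 # toy stand-in for the real calculation


def run_ensemble(n_cases: int, workers: int = 4) -> list:
    """Dispatch every case to a pool; no case needs data from another."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_case, range(n_cases)))


results = run_ensemble(1000)
print(len(results), sum(results) / len(results))
```

Because no case depends on any other, the order and location of execution do not matter, which is what makes a loosely connected set of modest machines sufficient.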

Supercomputers are an obvious choice, however, when an equation cannot be easily parallelized – in layman’s terms, when the task at hand cannot be broken down into multiple independent sub-tasks to be executed at the same time on parallel architectures. This is the case for example when trying to model a transport network’s operative pattern, where diverse parameters interact with each other in time and space: it is difficult to break them down into anything, as they form a whole. Another example is seismology, where you also find that kind of interaction, and in which processors need to communicate with each other when you test a model. Basically, when all the data is interrelated, you need a supercomputer.
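To see why interrelated data forces processors to talk to each other, consider a toy one-dimensional grid where each point is updated from its two neighbors. The sketch below is an illustrative assumption, not a seismic code: splitting the grid in two only gives the right answer if each half first receives the boundary value held by the other (a "halo" cell).

```python
def step(u):
    """One averaging step on a 1D grid; the two end values stay fixed."""
    interior = [(u[i - 1] + u[i + 1]) / 2 for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


def step_split(u, cut):
    """Same step computed as two subdomains that exchange one halo cell."""
    left, right = u[:cut], u[cut:]
    # Communication: each side needs the neighbor's boundary value.
    left_ext = left + [right[0]]
    right_ext = [left[-1]] + right
    new_left = step(left_ext)[:-1]     # drop the borrowed halo cell
    new_right = step(right_ext)[1:]    # drop the borrowed halo cell
    return new_left + new_right


u = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
print(step(u))
print(step_split(u, 3))
```

The exchange has to happen at every step, so the speed of the links between processors matters as much as the processors themselves.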

In both cases, the stake clearly is to accelerate calculation so as to obtain an answer to a problem within a reasonable time frame. A practical aspect of this question is to get to the point where you can work in “engineer time”: a calculation is launched at 6 P.M. as you leave the office, and you get the result at 9 o’clock the next morning.

Does acceleration depend more on software or on hardware?

In fact the two issues are interdependent, and for a reason that precisely has to do with the energy issues that we mentioned earlier. In terms of hardware, consumption is fundamental, but one should be aware that in a processor, the power-hungry element is primarily the memory. One can therefore imagine millions of processors working in parallel, but each having relatively little memory.

This brings us to the software part, for the challenge from this point on lies in learning to program differently and in getting 10 million processors to work simultaneously. To achieve that, there is no other way but to parallelize operations, and that is no walk in the park, because you and I are used to functioning sequentially, following a pattern you could call linear thinking. It is precisely this type of thinking that must be superseded, by reorganizing the way we program in an entirely new manner. In practical terms, one can proceed by creating levels, stacking equations and systems to be calculated that have diverse time constants. At the top level there would be the slowest systems, then a second level for systems that are a little faster, and so on, just like interlocking gears of different sizes. Today we are able to stack and articulate four to five levels in this manner, and ideally we should push for up to seven or eight levels. Besides, we are currently able to do that on a scale of 100,000 processors, and we must learn to work on a scale of one million or 10 million… which again would imply adding levels. However, this breaking down of the computing process into levels is in itself an intellectual endeavor. It is entirely new paradigms that we are working on.
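The image of interlocking gears can be made concrete with a small scheduling sketch. It does not reproduce any actual production scheme: it only assumes that each level takes a fixed number of steps for every step of the level above it, and counts how often each level is updated. The names `advance`, `updates_per_top_step` and the ratios are hypothetical.

```python
def advance(level, ratios, counts):
    """Advance one step at `level`, then sub-cycle the faster level below."""
    counts[level] += 1  # placeholder for updating this level's equations
    if level < len(ratios):
        for _ in range(ratios[level]):
            advance(level + 1, ratios, counts)


def updates_per_top_step(ratios):
    """Number of updates at each level for one step of the slowest level."""
    counts = [0] * (len(ratios) + 1)
    advance(0, ratios, counts)
    return counts


# Four levels, each running 4 steps per step of the level above it.
print(updates_per_top_step([4, 4, 4]))  # [1, 4, 16, 64]
```

Each extra level multiplies the work at the bottom of the stack, which is one way to see why adding levels goes hand in hand with moving to larger processor counts.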

One significant hurdle is that not everybody will be able to jump on the bandwagon of this new approach. I mean, there will necessarily be a complex and delicate coupling between hardware, algorithms and applications, which will mobilize extremely specialized skills and, at the same time, a real multidisciplinary approach. Industrial or public stakeholders able to combine all these skills at once are quite rare.

Another problem is that in breaking down computation into levels, equations are bound to be distorted, because the basic transport equation, for instance, remains sequential no matter what. So, to minimize the problems arising from this gap between the original equation and its operational implementation, it is essential to develop architecture and algorithm in parallel. More generally, computer science, numerical analysis and physics must work hand in hand. Thus, there are machines whose hardware was designed for a specific application, such as the Anton supercomputer, which was designed for molecular dynamics simulations. If an application is set up once and for all and remains stable, it makes sense to build a machine ad hoc, and makers do offer them.

Given these conditions, is it not in manufacturers’ interest to develop their own machines?

Whenever they can afford it, it is indeed in their interest. It actually depends on how close the calculations carried out in high performance mode are to their primary business. If these calculations are of a strategic nature, they tend to opt for internal solutions.

It should also be noted that industry is well represented in the “Top 500” list of the most powerful machines. Manufacturers account for 57% of the Top 500 by number of machines, and 26% in terms of computing power. By number, they represent 8% of the Top 100. We can therefore observe that HPC is by no means limited to research centers. We can also measure the lag: the 500th-ranking machine has the power that the first one had six years ago. This is true today, and the gap has been roughly constant since 1980. This allows us to size up rather accurately the incubation time of industry, which remains a major player. Finance is another private player, even if some of the calculations used in finance are repetitive and are thus handled through cloud computing solutions.

Do programs developed in-house by manufacturers combine with choices of an open source type?

Yes, and for several reasons. The first is the pricing model of software publishers, as software can be very expensive when it comes to thousands of processors connected in parallel: when you have a machine with a thousand cores, you are willing to pay for one license, but not for a thousand! A first challenge is to escape the software giants.

Secondly, what we are talking about here is solving complex, multi-scale problems. It would be absurd to put it all into a single code, because it would be millions of lines long and would probably be very fragile. Hence the solution of coupling codes. For instance you associate a code that solves transport equations, a code that is in charge of thermodynamics, another that takes care of heat equations… physics is split into modules, and you couple the modules you need. The focus now is therefore on developing module by module, hence the importance of exchanging modules, and thus of having open source modules, which have the added advantage of being developed with universities and being easy to debug. What will be closely protected and kept secret, however, is the particular coupling of diverse modules, or in a sense, the end result. In short, it is the particular use of different modules and their arrangement in a “business code” that constitutes in-house expertise and is therefore liable to remain secret.
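A minimal sketch of what coupling modules can look like follows. Everything in it is an assumption made for illustration: the shared dictionary state, the two toy physics modules and the `couple` driver are hypothetical and far simpler than real coupled codes.

```python
from typing import Callable, Dict, List

State = Dict[str, float]
Module = Callable[[State, float], State]


def transport(state: State, dt: float) -> State:
    """Toy transport module: moves a quantity at constant velocity."""
    new = dict(state)
    new["position"] = state["position"] + state["velocity"] * dt
    return new


def heat(state: State, dt: float) -> State:
    """Toy heat module: relaxes temperature toward the ambient value."""
    new = dict(state)
    new["temperature"] = state["temperature"] + dt * 0.5 * (
        state["ambient"] - state["temperature"]
    )
    return new


def couple(modules: List[Module], state: State, dt: float, n_steps: int) -> State:
    """The coupling: which modules run, in which order, on which shared state."""
    for _ in range(n_steps):
        for module in modules:
            state = module(state, dt)
    return state


start = {"position": 0.0, "velocity": 2.0, "temperature": 100.0, "ambient": 20.0}
end = couple([transport, heat], start, dt=0.5, n_steps=2)
print(end["position"], end["temperature"])  # 2.0 65.0
```

In this picture the individual modules can be shared openly, while the choice and ordering made in `couple` correspond to the in-house arrangement described above.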

However, identifying high-performance computing as a strategic resource is not always obvious. Let us take the example of oil. It is easily understandable that for oil companies, a thorough knowledge of the subsoil is essential, both to assess seismic hazards and to locate the resource more accurately. Drilling an oil well costs an average of $100 million, and it turns out to be the wrong spot one time in three. Back in the 1990s, it was two times out of three, and the improvement was largely made possible by the advances in high performance computing and the development of ever more accurate models. That is not all: one still has to demonstrate value, which always takes time. At Total, prior to 2009, we had not reached that point. Until then, we surmised of course that advances were made possible by HPC, but there was a measure of skepticism in the air. One must also be aware that a petaflop computer costs around 20 million euros (in 2011), not including subsequent investment in software development. So appropriate decisions must be made, and they need to be based on the most accurate information.
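The drilling figures can be turned into a rough order of magnitude. The sketch below assumes, purely for illustration, that every well costs the quoted $100 million and that attempts are independent, so the average spend per successful well is the cost divided by the success rate. The function name is hypothetical.

```python
from fractions import Fraction


def cost_per_successful_well(cost_musd, miss_rate):
    """Average spend (in millions of dollars) per well drilled in the right spot."""
    return Fraction(cost_musd) / (1 - miss_rate)


today = cost_per_successful_well(100, Fraction(1, 3))    # wrong one time in three
nineties = cost_per_successful_well(100, Fraction(2, 3))  # wrong two times in three
print(today, nineties)  # 150 300
```

Under these assumptions the average cost of a successful well falls from about $300 million to about $150 million, which can be set against the roughly 20 million euros quoted for a petaflop machine.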

Some companies have been insightful on these issues, others still hesitate. It actually depends more on firms than on their sectors. In a broader sense, in developed economies but also in major emerging markets, access to high performance computing is now recognized as a key stake for competitiveness and innovation. There are programs of public support for the development of supercomputers, such as PRACE in Europe, which aims to give Europeans access to supercomputers, thus sparing them the purchase of millions of computing hours from Americans. China is now also well advanced: the strategic value that is at stake has been clearly identified throughout the world. All the better for science, for a lot remains to be understood!

Note from the editors: Total is a patron of ParisTech Review.

