John Merlin and Vincent Schuster
The Portland Group, Inc. (PGI)
9150 SW Pioneer Court, Suite H, Wilsonville, OR 97070, USA.
High Performance Fortran (HPF) is a portable, high-level language for programming parallel architectures with non-uniform memory access, especially distributed memory systems. However, there is now considerable interest in hierarchical memory architectures, in which memory is distributed but each memory module is shared by multiple processors, such as SMP clusters and cc-NUMA machines. While HPF is portable to such systems, it does not provide explicit shared memory programming features and hence does not exploit such architectures to their full potential. In contrast, OpenMP provides a set of directives and library routines for shared memory programming, but has no features for controlling data locality on distributed memory platforms.
This paper therefore explores adding OpenMP extensions to HPF in order to support hierarchical memory systems. We aim to preserve the semantics of both HPF and OpenMP, so that an HPF-OpenMP program can be compiled as standard HPF or as standard OpenMP for execution on 'pure' distributed memory or SMP platforms respectively, as well as by an HPF-OpenMP compiler for SMP clusters and other hierarchical architectures.
We describe the programming model and semantics of HPF-OpenMP, with examples. We also describe its execution model, a dual-level mechanism in which lightweight threads are used within a collection of SMP processors under the control of higher-level node processes. Finally, we present an implementation based on fairly straightforward modifications to the normal HPF compilation scheme.