Title; Optimization of Element-by-Element FEM in HPF

Authors; Hiroshi Okuda, Norihisa ANAN

Department of Mechanical Engineering and Materials Science, Yokohama National University 79-5 Tokiwadai, Hodogaya-ku, Yokohama 240-8501, Japan

TEL: +81-45-339-3864, FAX: +81-45-331-6593

Email: okuda@typhoon.cm.me.ynu.ac.jp, anan@typhoon.cm.me.ynu.ac.jp

 

ABSTRACT

It has been known that the disadvantage of HPF is the difficulty in attaining the high parallel efficiency for problems involving unstructured grids, such as the finite elements. In HPF we have to program so that the compiler could get the parallelism, maintain the load balance and minimize communications as much as possible. For the large-scale analysis it is important to reduce unnecessary and redundant data-copying procedures made by the compilers in order to efficiently utilize memory storage.

In this study, Poisson's Equation was numerically evaluated by the Element-by-Element (EBE) Finite Element Method under parallel environment using the HPF. A principal virtue of this scheme is that element contributions are computed independently. The EBE methodology allows for the circumvention of the global matrix assembly procedure, and moreover, by restructuring the Conjugate Gradient algorithm at the element level, the total storage space can be economized. In order to attain the high parallel efficiency, all data structures have been completely altered to node-base, instead of the mixtures of node, element based data. The procedure above may be considered as Node-by-Node finite element scheme. It is worth noting that through the above process, the generation of useless data-copying procedures has also been reduced, because the frequency which refers to the indirect address decreases. 

In general, because the processors (PEs) are distributed, in order to efficiently carry out large-scale analysis in MPP (Massively Parallel Processors) environments, the DDM (Domain Decomposition Method) is applied to the analysis model. It has been known that the partitioning has great influence on the parallel efficiency. In this study, the Greedy partitioning algorithm for the message-passing style of parallel computations developed by Farhat was modified for the data-parallel EBE FEM to maintain load balance and to minimize communications among the processors. This was made possible by reordering the element and node ID numbers. The authors report its effectiveness in reducing the necessary computation time in MPP environments.

The parallel machine implemented for the computations is SX-4, made up of 4 nodes with total of 32 PEs, however, the inter-nodal computations were not permitted at this time. The HPF compiler used is the HPF/SX version 3.0 which supports HPF 1.1. The Poisson's Equation was numerically evaluated for a subway station model involving unstructured grids and approximately 200,000 nodes. The effect of the reordering scheme was studied through the acquired computation time and parallel efficiency. The CPU time for computing Poisson's equation was measured, which CG solver was cut off at 100 iterations. In the case of computation involving 16 PEs, reordering scheme has shown its effectiveness, where improvements were made from 17 seconds to 14 seconds in computational time, and from 59% to 75% in parallel efficiency, respectively. This result is equivalent to that of message-passing type code, where 13 seconds for computational time and 79% for parallel efficiency were recorded. Improvements in the parallel efficiency were achieved by the reordering of element and node numbers for three-dimensional problems involving unstructured grids in HPF.