Ahmed Eleliemy
HPC Group, University of Basel, Basel, Switzerland
Mahmoud Fayze
Fujitsu & Computer Science, Ain-Shams University, Cairo, Egypt
Rashid Mehmood
High Performance Computing Center, King AbdulAziz University, Jeddah, Saudi Arabia
Iyad Katib
High Performance Computing Center, King AbdulAziz University, Jeddah, Saudi Arabia
Naif Aljohani
High Performance Computing Center, King AbdulAziz University, Jeddah, Saudi Arabia
DOI: http://dx.doi.org/10.3384/ecp17142673
Published in: Proceedings of The 9th EUROSIM Congress on Modelling and Simulation, EUROSIM 2016, The 57th SIMS Conference on Simulation and Modelling SIMS 2016
Linköping Electronic Conference Proceedings 142:98, p. 673-679
Published: 2018-12-19
ISBN: 978-91-7685-399-3
ISSN: 1650-3686 (print), 1650-3740 (online)
Loadbalancing of computational tasks over heterogeneous architectures is an area of paramount importance due to the growing heterogeneity of HPC platforms and the higher performance and energy efficiency they could offer. This paper aims to address this challenge for a heterogeneous platform comprising Intel Xeon multi-core processors and Intel Xeon Phi accelerators (MIC) using an empirical approach. The proposed approach is investigated through a case study of the spin-image algorithm, selected due to its computationally intensive nature and a wide range of applications including 3D database retrieval systems and object recognition. The contributions of this paper are threefold. Firstly, we introduce a parallel spin-image algorithm (PSIA) that achieves a speedup of 19.8 on 24 CPU cores. Secondly, we provide results for a hybrid implementation of PSIA for a heterogeneous platform comprising CPU and MIC: to the best of our knowledge, this is the first such heterogeneous implementation of the spin-image algorithm. Thirdly, we use a range of 3D objects to empirically find a strategy to loadbalance computations between the MIC and CPU cores, achieving speedups of up to 32.4 over the sequential version. The LIRIS 3D mesh watermarking dataset is used to investigate performance analysis and optimization.
Keywords: heterogeneous architectures, MIC, spin-image algorithm, loadbalancing, performance analysis