Publication Date:
2016-06-30
Description:
The latest machine generation installed at supercomputer centres in Germany offers a peak performance in the tens of Tflop/s range. We study the performance and scaling that our quantum chromodynamics simulation programme BQCD achieved on two of these machines, an IBM Blue Gene/L and an SGI Altix 4700. We compare the performance of Fortran/MPI code with that of assembler code. The latter makes it possible to exploit concurrency at more levels, in particular by overlapping communication with computation and by prefetching data from main memory.
Keywords:
ddc:000
Language:
English
Type:
reportzib, doc-type:preprint
Format:
application/pdf
Format:
application/postscript