Library

  • 1
    Publication Date: 2022-03-11
    Description: The Portable Computing Language (PoCL) is a vendor-independent open-source OpenCL implementation that aims to support a variety of compute devices in a single platform. Evaluating PoCL against the Intel OpenCL implementation reveals significant performance drawbacks of PoCL on Intel CPUs, which run 92 % of the TOP500 list. Using a selection of benchmarks, we identify and analyse performance issues in PoCL with a focus on scheduling and vectorisation. We propose a new CPU device driver based on Intel Threading Building Blocks (TBB), and evaluate LLVM with respect to automatic compiler vectorisation across work-items in PoCL. Using the TBB driver, it is possible to narrow the gap to Intel OpenCL and even outperform it by a factor of up to 1.3× in our proxy application benchmark with a manual vectorisation strategy.
    Language: English
    Type: Conference object
  • 2
    Publication Date: 2020-10-16
    Description: Recently, Intel released the oneAPI programming environment. With Data Parallel C++ (DPC++), oneAPI enables codes to target multiple hardware architectures like multi-core CPUs, GPUs, and even FPGAs or other hardware using a single source. For legacy codes that were written for Nvidia GPUs, a compatibility tool is provided which facilitates the transition to the SYCL-based DPC++ programming language. This paper presents early experiences when using both the compatibility tool and oneAPI, as well as the employed extension to the SYCL programming standard, for the tsunami simulation code easyWave. A performance study compares the original code running on Xeon processors using OpenMP as well as CUDA with the performance of the DPC++ counterpart on multi-core CPUs as well as integrated GPUs.
    Language: English
    Type: Conference object
  • 3
    Publication Date: 2022-06-13
    Description: Large-capacity Storage Class Memory (SCM) opens new possibilities for workloads requiring a large memory footprint. We examine optimization strategies for a legacy Fortran application on systems with a heterogeneous memory configuration comprising SCM and DRAM. We present a performance study for the multigrid solver component of the large-eddy simulation framework PALM for different memory configurations with large-capacity SCM. An important optimization approach is the explicit assignment of storage locations depending on the data access characteristics to take advantage of the heterogeneous memory configuration. We are able to demonstrate that explicit control over memory locations provides better performance compared to transparent hardware settings. Because page management by the OS appears to be a critical performance factor on such systems, we also study the impact of different huge page settings.
    Language: English
    Type: Conference object
  • 4
    Publication Date: 2022-12-05
    Description: Solving PDEs on unstructured grids is a cornerstone of engineering and scientific computing. Heterogeneous parallel platforms, including CPUs, GPUs, and FPGAs, enable energy-efficient and computationally demanding simulations. In this article, we introduce the HPM C++-embedded DSL that bridges the abstraction gap between the mathematical formulation of mesh-based algorithms for PDE problems on the one hand and an increasing number of heterogeneous platforms with their different programming models on the other hand. Thus, the HPM DSL aims at higher productivity in the code development process for multiple target platforms. We introduce the concepts as well as the basic structure of the HPM DSL, and demonstrate its usage with three examples. The mapping of the abstract algorithmic description onto parallel hardware, including distributed memory compute clusters, is presented. A code generator and a matching back end allow the acceleration of HPM code with GPUs. Finally, the achievable performance and scalability are demonstrated for different example problems.
    Language: English
    Type: Article
  • 5
    Publication Date: 2022-12-12
    Description: Solving partial differential equations on unstructured grids is a cornerstone of engineering and scientific computing. Nowadays, heterogeneous parallel platforms with CPUs, GPUs, and FPGAs enable energy-efficient and computationally demanding simulations. We developed the HighPerMeshes C++-embedded Domain-Specific Language (DSL) for bridging the abstraction gap between the mathematical and algorithmic formulation of mesh-based algorithms for PDE problems on the one hand and an increasing number of heterogeneous platforms with their different parallel programming and runtime models on the other hand. Thus, the HighPerMeshes DSL aims at higher productivity in the code development process for multiple target platforms. We introduce the concepts as well as the basic structure of the HighPerMeshes DSL, and demonstrate its usage with three examples: a Poisson problem and a monodomain problem, each solved by the continuous finite element method, and the discontinuous Galerkin method for Maxwell’s equations. The mapping of the abstract algorithmic description onto parallel hardware, including distributed memory compute clusters, is presented. Finally, the achievable performance and scalability are demonstrated for a typical example problem on a multi-core CPU cluster.
    Language: English
    Type: Article