Library

  • 1
    Publication Date: 2020-10-16
    Description: Recently, Intel released the oneAPI programming environment. With Data Parallel C++ (DPC++), oneAPI enables codes to target multiple hardware architectures such as multi-core CPUs, GPUs, and even FPGAs or other hardware from a single source. For legacy codes that were written for Nvidia GPUs, a compatibility tool is provided which facilitates the transition to the SYCL-based DPC++ programming language. This paper presents early experiences with both the compatibility tool and oneAPI, as well as the employed extension to the SYCL programming standard, for the tsunami simulation code easyWave. A performance study compares the original code running on Xeon processors using OpenMP as well as CUDA with the performance of the DPC++ counterpart on multi-core CPUs as well as integrated GPUs.
    Language: English
    Type: conferenceobject , doc-type:conferenceObject
  • 2
    Publication Date: 2022-06-13
    Description: Large-capacity Storage Class Memory (SCM) opens new possibilities for workloads requiring a large memory footprint. We examine optimization strategies for a legacy Fortran application on systems with a heterogeneous memory configuration comprising SCM and DRAM. We present a performance study for the multigrid solver component of the large-eddy simulation framework PALM for different memory configurations with large-capacity SCM. An important optimization approach is the explicit assignment of storage locations depending on the data access characteristics to take advantage of the heterogeneous memory configuration. We are able to demonstrate that explicit control over memory locations provides better performance than transparent hardware settings. Because page management by the OS appears to be a critical performance factor on such systems, we also study the impact of different huge page settings.
    Language: English
    Type: conferenceobject , doc-type:conferenceObject
  • 3
    Publication Date: 2023-01-09
    Description: This work provides a brief description of Omni-Path Express and the current status of its development, stability, and performance. Basic benchmarks that highlight the gains of OPX over PSM2 are provided, and the results of an initial performance and scalability study of several applications are presented.
    Language: English
    Type: conferenceobject , doc-type:conferenceObject
  • 4
    Publication Date: 2023-07-17
    Description: Version 4.0 of the Message Passing Interface standard introduced the concept of Partitioned Communication, which adds support for multiple contributions to a communication buffer. Although initially targeted at multithreaded MPI applications, Partitioned Communication is currently attracting attention in the context of accelerators, especially GPUs. This publication demonstrates that the communication concept can also be implemented for SYCL-programmed FPGAs. This includes a discussion of the design space and the presentation of a prototypical implementation. Experimental results show that a lightweight implementation on top of an existing MPI library is possible. In addition, the presented approach reveals issues in both the SYCL and the MPI standards which need to be addressed for improved support of the intended communication style.
    Language: English
    Type: conferenceobject , doc-type:conferenceObject
  • 5
    Publication Date: 2024-03-27
    Description: Molecular dynamics simulations are one of the methods in scientific computing that benefit from GPU acceleration. For those devices, SYCL is a promising API for writing portable codes. In this paper, we present the case study of HAL’s MD package, which has been successfully migrated from CUDA to SYCL. We describe the different strategies that we followed in the process of porting the code. Following these strategies, we achieved code portability across major GPU vendors. Depending on the actual kernels, both significant performance improvements and regressions are observed. As a side effect of the migration process, we also obtained impressive speedups for execution on CPUs.
    Language: English
    Type: conferenceobject , doc-type:conferenceObject
  • 6
    Publication Date: 2024-03-27
    Description: The use of stand-alone, network-coupled Field Programmable Gate Array (FPGA) accelerators is intended to significantly increase the energy efficiency of HPC applications and thus also of HPC data centers. A loose coupling between the nodes of the HPC data center and the FPGAs is established through the high-speed network of the data center. This allows greater flexibility in combining different nodes and accelerators. Both the resulting energy savings and the increased flexibility through the network connection enable the economical use of FPGAs. This work presents a communication stack to integrate the so-called Network-attached Accelerator (NAA) into the HPC data center. A low-level Remote Direct Memory Access (RDMA) Application Programming Interface (API) and a high-level Remote Procedure Call (RPC) API are designed on top of the RDMA over Converged Ethernet v2 (RoCEv2) communication stack. The experimental results over 100 Gbps RoCEv2 show that our design and implementation deliver performance close to the theoretical maximum.
    Language: English
    Type: conferenceobject , doc-type:conferenceObject