6th Programming and Abstractions for Data Locality Workshop

4 September 2023, Monday (DAY 1)

16:00 – 18:00 | Compilers and DSLs

16:00 – 16:20 | ‘An approach to whole-program optimization in GT4Py’ by Mauro Bianco (ETH Zurich – CSCS)

Many approaches to program optimization suffer from several shortcomings: fragile integration into legacy application codes, reliance on data layout defined at the language level, difficulty in eliminating operations due to observability constraints, and lack of a programming model that covers the whole spectrum of supported computer architectures. GT4Py, or GridTools for Python, tries to overcome some of these limitations by offering a high-level domain-specific language for weather and climate applications, in which whole-program optimization is enabled by JIT compilation and delayed/lazy instantiation. While some limitations remain unsolved, our path is to develop a solution to the main issues that prevent code optimizations from being effective, while maintaining the readability and reusability of the source code, by proposing a change in direction with respect to traditional HPC application development.

16:20 – 16:40 | ‘Integrating Data Layout and Data Movement into Code Optimization’ by Mary Hall (University of Utah)

Machine imbalance has prioritized limiting data movement, in conjunction with maximizing parallelism, as the focus of optimization strategies. In this talk, we will describe the role of data layout and data movement optimizations in domain-specific optimizations for stencils, sparse tensors, and deep learning. The key ideas are as follows: (1) access data in the order in which it is stored, to accelerate requisite data movement and eliminate unnecessary data movement; (2) provide a mapping between physical and logical data layout that can be used during code generation; and (3) augment existing compiler technology to expose data layout and data movement to developers and tools.

16:40 – 17:00 | ‘Optimising for locality – automatically: Mapping the landscape of locality optimisation algorithms’ by Paul H J Kelly (Imperial College London)

Exploiting locality is fundamentally harder than finding parallelism. We have developed many tools for tackling this challenge by hand – sometimes with some abstraction. For me, though, the real way forward is to automate: to deploy algorithms that find a good locality+parallelism combination for us. We can then raise the level at which we think about the problem – this is what real abstraction looks like. This talk does not offer new research results – instead I aim to map out what we know about when we can automate locality optimisation, when it works and when it doesn’t. I’ll illustrate with some examples from my own group’s work and others’. The main objective is to promote and to frame discussion at the workshop.

17:00 – 18:00 | PANEL 2 (Moderator: Emmanuel Jeannot)