On-the-fly clustering for exascale molecular dynamics simulations.

Alizée Dubois and Thierry Carrard

Slide at 12:46

METHODOLOGY AND TECHNICAL BOTTLENECKS
CONNECTED COMPONENTS ANALYSIS

IMAGE PROCESSING COMMUNITY
Shared memory parallelization
Finite number of passes
Not scalable beyond a single node
Necessity to port the algorithm to a distributed memory system while limiting the number of passes

GRAPH PROCESSING COMMUNITY
Distributed memory
Iterative algorithm
Scalable, but requires multiple iterations
To be adapted for 3D images

A. DUBOIS - T. CARRARD - COMPUTER PHYSICS COMMUNICATIONS SEMINAR SERIES - 03/03/25

Summary (AI generated)

We propose a new approach for on-the-fly detection of particle aggregates and regions of interest in molecular dynamics simulations. The algorithm must be versatile, able to identify any zone of interest that can be binarized. It must be robust and scale to simulations involving billions of atoms. It must also be cheap relative to the overall cost of the simulation, so that physicists can obtain real measurements, including quantified error bars and qualified data. To meet these objectives, we chose connected components analysis.

Connected components analysis has two main branches: one developed by the image processing community and the other by the graph processing community. The image processing community typically uses shared-memory algorithms with a finite number of passes, which do not scale beyond a single node. Using them here therefore requires porting them to a distributed-memory system while limiting the number of passes. Conversely, the graph processing community offers distributed-memory algorithms that scale, but many are iterative, require multiple iterations, and would need to be adapted for 3D images.
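To make the contrast between the two families concrete, here is a minimal single-process sketch in Python. It is not the speakers' algorithm: `label_two_pass` is a classic union-find labelling in the style of image processing (a fixed number of passes over the grid), and `label_propagation` is a simple iterative minimum-label scheme in the style of graph processing. The function names, the nested-list grid layout, and the choice of 6-connectivity are all assumptions made for illustration.

```python
from itertools import product


def neighbours(v):
    """The six face-sharing neighbours of voxel v = (x, y, z)."""
    x, y, z = v
    return [(x - 1, y, z), (x + 1, y, z), (x, y - 1, z),
            (x, y + 1, z), (x, y, z - 1), (x, y, z + 1)]


def make_grid(shape, occupied):
    """Hypothetical helper: build a binary grid[x][y][z] from a set of voxels."""
    nx, ny, nz = shape
    return [[[(x, y, z) in occupied for z in range(nz)]
             for y in range(ny)] for x in range(nx)]


def label_two_pass(grid):
    """Finite-pass labelling: one merging pass, one relabelling pass."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    parent = {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    # Pass 1: raster scan; merge each foreground voxel with the
    # foreground neighbours that have already been visited.
    for v in product(range(nx), range(ny), range(nz)):
        x, y, z = v
        if not grid[x][y][z]:
            continue
        parent[v] = v
        for n in neighbours(v):
            if n in parent:  # only visited foreground voxels are in parent
                parent[find(v)] = find(n)
    # Pass 2: give every voxel the representative of its set as label.
    return {v: find(v) for v in parent}


def label_propagation(grid):
    """Iterative labelling: sweep until no label changes any more."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    labels = {v: v for v in product(range(nx), range(ny), range(nz))
              if grid[v[0]][v[1]][v[2]]}
    sweeps, changed = 0, True
    while changed:
        changed = False
        sweeps += 1
        new_labels = {}
        for v, lab in labels.items():
            # Each voxel takes the smallest label among itself and its neighbours.
            best = min([lab] + [labels[n] for n in neighbours(v) if n in labels])
            new_labels[v] = best
            changed = changed or best != lab
        labels = new_labels
    return labels, sweeps


def components(labels):
    """Group voxels by label so that two labellings can be compared."""
    groups = {}
    for v, lab in labels.items():
        groups.setdefault(lab, set()).add(v)
    return {frozenset(g) for g in groups.values()}
```

Both functions produce the same partition of the foreground voxels. The difference is in cost structure: the union-find version always makes two passes but relies on a shared `parent` table, whereas the propagation version only needs neighbour-local updates, at the price of a number of sweeps that grows with the extent of the largest component.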

We focus on images even though we start from atomic data, that is, the positions and motion of atoms. Rather than working on the atoms directly, we project our atomic quantities onto a regular grid. This projection will allow us to define our spatial subdomains effectively.
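The projection step can be sketched as follows. This is a minimal illustration rather than the method used in the talk: it assumes an orthorhombic box with its origin at zero, deposits each atom's quantity into the single cell that contains it, and binarizes with a simple threshold. The names `project_to_grid` and `binarize`, and the threshold criterion itself, are hypothetical.

```python
def project_to_grid(positions, values, box, shape):
    """Accumulate a per-atom quantity onto a regular grid.

    positions: iterable of (x, y, z) with 0 <= coordinate <= box length
    values:    one scalar per atom (for example 1.0 to count atoms)
    box:       (Lx, Ly, Lz) box lengths, origin assumed at zero
    shape:     (nx, ny, nz) number of grid cells per direction
    """
    nx, ny, nz = shape
    grid = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for (x, y, z), q in zip(positions, values):
        # Index of the cell containing the atom; atoms sitting exactly
        # on the upper box face are clamped into the last cell.
        i = min(int(x / box[0] * nx), nx - 1)
        j = min(int(y / box[1] * ny), ny - 1)
        k = min(int(z / box[2] * nz), nz - 1)
        grid[i][j][k] += q
    return grid


def binarize(grid, threshold):
    """Turn the projected field into a binary 3D image."""
    return [[[cell > threshold for cell in row] for row in plane]
            for plane in grid]
```

The resulting binary grid is the kind of 3D image on which a connected components analysis can then be run.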