On-the-fly clustering for exascale molecular dynamics simulations.
Alizée Dubois and Thierry Carrard
Summary (AI generated)
To evaluate the performance of our simulation, we ran a test on 81.9 million atoms. The accompanying graph shows the activity of each core over time, with the workload distribution identified by work ID. The lower graph integrates this information and shows how many workers are active on the cores throughout the simulation. We observed a bottleneck where information is communicated through the master MPI process; however, the time this takes is small compared with the overall analysis duration, so the limitation is manageable rather than critical.
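As an illustration of how a per-core activity trace can be integrated into an active-worker count, here is a minimal sketch. It assumes each unit of work is recorded as a (start, end) time interval; the function name `active_workers` and that input format are hypothetical and not taken from the simulation code.

```python
def active_workers(intervals):
    """Turn busy intervals into a step curve of (time, number of active workers).

    `intervals` is a list of (start, end) pairs, one per unit of work
    (hypothetical trace format, assumed for this sketch).
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a worker becomes active
        events.append((end, -1))    # a worker becomes idle
    # Sort by time; at equal times, handle ends (-1) before starts (+1).
    events.sort(key=lambda e: (e[0], e[1]))

    steps = []
    active = 0
    for time, delta in events:
        active += delta
        if steps and steps[-1][0] == time:
            # Several events at the same instant: keep only the final count.
            steps[-1] = (time, active)
        else:
            steps.append((time, active))
    return steps
```

Plotting the returned steps as a staircase gives a curve of the same kind as the lower graph: a dip in the count marks a period when some cores are waiting, for example on communication.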
Our goal is to find the partitioning that gives the algorithm its best performance for the Neural Network Potential (NNP), that is, the one that makes the Molecular Dynamics analysis most efficient. We use nodes of 64 cores each, which allows various decomposition configurations; for instance, 1 MPI process with 125 threads can be tested. Increasing the number of MPI processes increases the volume of messages, whereas adding threads can increase memory access time, which may not be optimal either. We therefore looked for the best trade-off and found that 4 MPI processes with 32 threads each give the best performance.
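The space of decomposition configurations described above can be enumerated with a short sketch. It assumes that a configuration is a pair (MPI processes, threads per process) whose product fills the node exactly; the function name `decompositions` and the parameter `threads_per_node` are hypothetical.

```python
def decompositions(threads_per_node):
    """List every (mpi_processes, threads_per_process) pair that fills a node.

    Assumption for this sketch: the product of the two numbers must equal
    the number of threads available on the node.
    """
    pairs = []
    for processes in range(1, threads_per_node + 1):
        if threads_per_node % processes == 0:
            pairs.append((processes, threads_per_node // processes))
    return pairs
```

Calling `decompositions(128)` yields eight candidates, from (1, 128) to (128, 1), including the (4, 32) configuration reported as the best. The value 128 is an assumption here, chosen because 4 MPI processes with 32 threads amount to 128 threads. Each candidate would then be timed on the same test case to pick the fastest.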
For the strong-scaling study, we keep the load constant at 81.9 million atoms while the computing resources increase; this problem size is relatively manageable for our system.
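Strong scaling is usually summarized by speedup and parallel efficiency relative to the smallest run. The sketch below shows that arithmetic under the assumption that wall-clock times are available per core count for the same fixed problem; the function name `strong_scaling` and the input format are hypothetical, and no measured values from the study are reproduced here.

```python
def strong_scaling(timings):
    """Compute speedup and efficiency for a fixed-size problem.

    `timings` maps a core count to the wall-clock time measured on that
    many cores (hypothetical input format). The smallest core count is
    used as the reference run.
    Returns a dict: core count -> (speedup, efficiency).
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    results = {}
    for cores in sorted(timings):
        speedup = base_time / timings[cores]   # how much faster than the reference
        ideal = cores / base_cores             # speedup under perfect scaling
        results[cores] = (speedup, speedup / ideal)
    return results
```

An efficiency of 1.0 means the run scales perfectly. Values that fall as the core count grows indicate that, at a constant load, overheads such as communication are taking a larger share of the run time.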