ChaNGa

Performance

Zooms with r431

LMC-mass cosmological zoom from romulus, baryons, particles, 87 pc softening. 4096 steps in total.

  • The first 5 real steps (to z=62.5972), with multicore-smp on a single 128-core node, take 1939.01 seconds
  • ATROCIOUS performance for my Intel + verbs build: ONE STEP takes longer than the whole 5 for 2 nodes!!
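
For scale, the single-node timing above works out to roughly 388 seconds per step. A minimal sketch of that arithmetic follows. The function names are made up for illustration, and the full-run projection ASSUMES every one of the 4096 steps costs the same as the first 5, which is probably not true this early in the run (high z).

```shell
# Average wall-clock seconds per step: total_seconds / n_steps.
per_step() {
    awk -v t="$1" -v n="$2" 'BEGIN { printf "%.3f\n", t / n }'
}

# Naive projection of the full run in days, ASSUMING every step costs the
# same as the measured ones (early high-z steps may not be representative).
projected_days() {
    awk -v t="$1" -v n="$2" -v total="$3" \
        'BEGIN { printf "%.1f\n", t / n * total / 86400 }'
}

per_step 1939.01 5              # seconds per step for the first 5 steps
projected_days 1939.01 5 4096   # days for all 4096 steps at that rate
```

At that rate the whole run would take on the order of 18 days on one node, so this is only a rough upper-level sanity check, not a prediction.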

AGORA SUPER

  • Why is this dang machine so atrociously slow? It has Zen2 cores (EPYC 7742)

pkdgrav3

Building

  • Load the following modules:
    • fftw/3.3.8, boost/1.75.0-intel20.4, anaconda3/2024.10-1, gcc/13.2.1-p20240113, and openmpi/5.0.3-gcc13.2.1
  • Activate my pkdgrav3 conda environment
    • conda activate pkdgrav3
  • Build:
    cmake -S . -B build
    cmake --build build
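
The steps above could be collected into one script so the build is repeatable. This is a sketch, not a tested recipe: `DRY_RUN`, `run`, and `build_pkdgrav3` are names invented for this example, the module list is copied from the notes, and in a non-interactive script `conda activate` usually needs conda's shell hook to be initialized first.

```shell
# DRY_RUN is a hypothetical switch for this sketch:
# 1 (default) only prints each command, 0 actually runs it.
DRY_RUN="${DRY_RUN:-1}"

# Module list copied from the notes above.
MODULES="fftw/3.3.8 boost/1.75.0-intel20.4 anaconda3/2024.10-1 gcc/13.2.1-p20240113 openmpi/5.0.3-gcc13.2.1"

# Print the command; also execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Load modules, activate the conda environment, then configure and build.
# Stops at the first step that fails.
build_pkdgrav3() {
    # $MODULES is intentionally unquoted so each module is its own argument.
    run module load $MODULES || return 1
    run conda activate pkdgrav3 || return 1
    run cmake -S . -B build || return 1
    run cmake --build build || return 1
}

build_pkdgrav3
```

Run it from the top of the pkdgrav3 source tree. With the default `DRY_RUN=1` it only echoes the four commands, which is a cheap way to check the module list before setting `DRY_RUN=0`.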

Performance

  • It appears to be incredibly fast (like beyond belief)
  • I’m having trouble getting it to actually run in parallel