Advanced N-body

Building an N-body Simulator with Post-Newtonian General Relativity

A fully relativistic N-body gravitational simulator for the inner solar system requires three interlocking components: the 1PN Einstein-Infeld-Hoffmann equations of motion (the same formulation JPL uses for DE440/DE441), a high-order adaptive integrator like IAS15 or DOP853 that handles velocity-dependent forces, and rigorous validation against JPL Horizons ephemeris data. The critical physics payoff is reproducing Mercury's anomalous perihelion precession of 42.98 arcseconds per century — the signature confirmation of general relativity — while achieving kilometer-level positional accuracy over multi-year integrations.

1. The Newtonian Foundation and Post-Newtonian Corrections

Newtonian pairwise acceleration

The starting point is the standard N-body gravitational acceleration. For body i in a system of N bodies:

r̈_i = Σ_{j≠i} μ_j · (r_j - r_i) / r_ij³

where μ_j = G·m_j is the gravitational parameter of body j, r_ij = |r_i − r_j| is the Euclidean distance. For the 5-body Sun-Mercury-Venus-Earth-Mars system, there are 10 unique pairwise interactions.

The 1PN equations: JPL's PPN formulation

The JPL planetary ephemerides DE440/DE441 use the parametrized post-Newtonian (PPN) form of the equations of motion. Setting β = γ = 1 (pure general relativity), the acceleration of body i in barycentric harmonic coordinates is:

r̈_i = Σ_{j≠i} (μ_j/r_ij³)(r_j − r_i) · {
    1
    − (4/c²) Σ_{k≠i} μ_k/r_ik
    − (1/c²) Σ_{k≠j} μ_k/r_jk
    + (v_i²/c²) + (2v_j²/c²)
    − (4/c²)(ṙ_i · ṙ_j)
    − (3/2c²)[(r_i − r_j)·ṙ_j / r_ij]²
    + (1/2c²)(r_j − r_i)·r̈_j^N
}
+ (1/c²) Σ_{j≠i} (μ_j/r_ij³) · [(r_i − r_j)·(4ṙ_i − 3ṙ_j)] · (ṙ_i − ṙ_j)
+ (7/2c²) Σ_{j≠i} μ_j · r̈_j^N / r_ij

This is Equation 8-1 from Standish & Williams in the Explanatory Supplement to the Astronomical Almanac, the exact equation used by JPL. This 1PN formulation captures Mercury's perihelion precession (42.98 arcsec/century), de Sitter geodetic precession, and the metric structure underlying the Shapiro time delay.

GM values and physical constants

GM (gravitational parameter) is known far more precisely than G or M individually. Always use GM directly, never compute G×M. Values from JPL DE440:

Body	GM (km³/s²)
Sun	132712440041.279419
Mercury	22031.868551
Venus	324858.592000
Earth	398600.435507
Mars system	42828.375816

Additional constants: c = 299792.458 km/s (exact), 1 AU = 149597870.700 km (IAU 2012, exact).

2. Numerical Integrators: From Leapfrog to IAS15

Symplectic leapfrog

The kick-drift-kick leapfrog (Stormer-Verlet) is the workhorse 2nd-order symplectic integrator. Symplecticity means energy error oscillates but never drifts secularly — a critical advantage for long integrations. The cost is just 1 force evaluation per step.

The symplecticity problem with PN forces: Post-Newtonian corrections introduce velocity-dependent forces, making the Hamiltonian non-separable. Workarounds include treating PN forces as small perturbative kicks or using a non-symplectic adaptive integrator like IAS15 that handles velocity-dependent forces natively.

IAS15: the gold standard for PN-corrected N-body

The IAS15 integrator (Rein & Spiegel 2015), used in the REBOUND code, is a 15th-order implicit method based on Gauss-Radau quadrature. Key advantages:

Near machine-precision accuracy — with ~100 steps per orbit, truncation error per step is ~(1/100)^15 ≈ 10^-30, far below double-precision roundoff (~10^-16). Handles velocity-dependent forces natively with no separability requirement. Automatic adaptive stepping via convergence of the highest-order coefficient. Only 8 force evaluations per step with ~2 predictor-corrector iterations.

Integrator comparison

Integrator	Order	Force evals/step	Symplectic	Handles PN	Adaptive
Leapfrog	2	1	Yes	No	No
Yoshida 4	4	3	Yes	No	No
RK4	4	4	No	Yes	No
DOP853	8(5,3)	12	No	Yes	Yes
IAS15	15	~16	No	Yes	Yes

3. Force Computation: Barnes-Hut and FMM

Direct summation: optimal for N ≤ 10,000

For 5 bodies, direct O(N²) summation computes all 10 unique pairs and is trivially optimal — tree-based methods carry significant constant-factor overhead for small N.

Barnes-Hut octree for O(N log N) scaling

The Barnes-Hut algorithm hierarchically decomposes space into an octree. Each internal node stores the total mass and center of mass of all descendants. The multipole acceptance criterion (MAC): if a cell of size s is at distance d from the target body and s/d < θ (typically θ = 0.5), the entire cell is treated as a single point mass.

function compute_force(body, node, θ):
    if node is leaf:
        if node.body == body: return 0
        return pairwise_acceleration(body, node.body)

    d = |node.center_of_mass - body.position|
    if node.size / d < θ:
        return node.total_mass / d² * unit_direction  // monopole approx

    return Σ compute_force(body, child_k, θ)  // recurse into children

Memory layout: Structure of Arrays

Structure of Arrays (SoA) stores each attribute contiguously across all bodies. For force computation, loading x[j:j+8] fills one cache line with 100% useful data and enables SIMD vectorization — AVX2 processes 4 doubles simultaneously from contiguous arrays. Measured speedups from AoS to SoA: 3x typical, up to 19x on Xeon Phi.

4. Validating Against JPL Horizons

The validation workflow

Step 1: Fetch initial conditions for all bodies at epoch t₀ from Horizons, using barycentric ICRF coordinates and TDB time scale.

Step 2: Integrate the system from t₀ to t₁ using your chosen integrator and force model.

Step 3: Fetch Horizons state vectors at regular comparison times between t₀ and t₁.

Step 4: Compute position error, RMS error, relative energy error, and angular momentum conservation at each comparison time.

Step 5: Plot error growth curves. Linear growth indicates systematic error (missing physics); quadratic growth suggests integrator drift; √t growth indicates optimal roundoff behavior.

Mercury perihelion precession as the key GR validation

To measure the GR-induced precession: detect perihelion passages by finding local minima of |r_Mercury − r_Sun|, compute the Laplace-Runge-Lenz vector at each perihelion, track the angle over time and fit a linear trend, then isolate the GR contribution by running simulations with and without 1PN corrections. The difference should yield 42.98 ± ~0.05 arcsec/century.

Mercury completes ~415.2 orbits per century, so the GR precession per orbit is ~0.1035 arcsec, corresponding to ~29 km of perihelion displacement per orbit — measurable even in short integrations.

Without GR (Newtonian only), Mercury's position error grows to ~100–1000 km per year as the missing ~0.1 arcsec/orbit precession accumulates. Over 100 years, this becomes catastrophic (>10⁵ km). Adding 1PN corrections reduces Mercury's error by 1–2 orders of magnitude.

5. Optimization: SIMD, Kahan Summation, Compiler Flags

SIMD vectorization of the inner force loop

With SoA layout, the inner loop processes 4 interactions simultaneously using AVX2 doubles. Measured speedup from AVX2 vectorization: ~42% over optimized scalar. AVX-512 doubles this again for supported hardware. Combine with OpenMP over the outer loop for multiplicative speedups: SIMD × thread count.

Kahan compensated summation

When summing N force contributions, naive summation accumulates O(ε·N) roundoff error. Kahan summation reduces this to O(ε) by tracking a running compensation term:

struct KahanAccumulator {
    double sum = 0.0, comp = 0.0;
    void add(double value) {
        double y = value - comp;
        double t = sum + y;
        comp = (t - sum) - y;
        sum = t;
    }
};

Compiler flags and their dangers

-ffast-math is dangerous: it enables reassociation of floating-point operations, which destroys Kahan compensated summation by optimizing away the compensation term (algebraically zero, but not in finite precision). Use -fno-math-errno -fno-trapping-math -ffinite-math-only instead.

Conclusion

The most important architectural decisions are: (1) use the exact 1PN PPN equation from Standish's Equation 8-1 with β = γ = 1, which is precisely what JPL uses for DE440; (2) choose IAS15 or DOP853 as the integrator, since velocity-dependent PN forces break symplecticity and these methods handle them natively at high order; (3) validate against Mercury's 42.98 arcsec/century perihelion precession as the definitive GR benchmark.

For the 5-body inner solar system, direct O(N²) summation with 10 pairs is trivially optimal. The performance-critical path is the PN-corrected force evaluation, where SoA memory layout, AVX2 SIMD, and Kahan summation together yield order-of-magnitude improvements in both speed and accuracy. The final system should achieve kilometer-level accuracy over years and reproduce the relativistic precession within 0.05% after a few simulated centuries.