Results

This page reports what seems to work, what broke along the way, and what would be worth improving next.

Successes

The project already does a few things reasonably well:

it trains one model across a family of double-pendulum parameter settings
it uses a mechanically structured architecture instead of direct black-box acceleration regression
it produces usable held-out rollouts
it supports some manual out-of-distribution parameter tests
it includes diagnostics that help inspect the learned structure, not just the final trajectories

Example results:

Rollout Predictions

These are examples of rollout prediction from the test set; the parameters of the pendula (blob masses and rod lengths) were not seen during training.

Energy Tests

The energy drift (which would be 0 for a perfect prediction) stays at or below 1% in most, if not all, cases. Note that the vertical axis is already expressed as a percentage. These plots refer to the same tests shown above in 'Rollout Predictions'.

Energy drift test 0 Energy drift test 3 Energy drift test 8
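
As a rough illustration of how such a percentage can be computed, the sketch below evaluates the total mechanical energy of an ideal point-mass double pendulum and reports its drift relative to the initial value. The function names, the state layout `(theta1, theta2, omega1, omega2)`, the convention that angles are measured from the downward vertical, the value of `g`, and the normalization by the initial energy are all assumptions made for this example; the project's own diagnostic may differ.

```python
import math

G = 9.81  # assumed gravitational acceleration in m/s^2


def total_energy(theta1, theta2, omega1, omega2, m1, m2, l1, l2, g=G):
    """Kinetic plus potential energy of a point-mass double pendulum.

    Angles are measured from the downward vertical (an assumption of this sketch).
    """
    kinetic = 0.5 * m1 * (l1 * omega1) ** 2 + 0.5 * m2 * (
        (l1 * omega1) ** 2
        + (l2 * omega2) ** 2
        + 2.0 * l1 * l2 * omega1 * omega2 * math.cos(theta1 - theta2)
    )
    potential = -(m1 + m2) * g * l1 * math.cos(theta1) - m2 * g * l2 * math.cos(theta2)
    return kinetic + potential


def energy_drift_percent(states, m1, m2, l1, l2):
    """Energy drift at each step, as a percentage of the initial energy.

    Assumes the initial energy is not zero.
    """
    energies = [total_energy(*state, m1, m2, l1, l2) for state in states]
    e0 = energies[0]
    return [100.0 * (e - e0) / abs(e0) for e in energies]
```

Normalizing by the initial energy is only one option. It becomes ill-conditioned when the initial energy is close to zero, in which case a fixed reference scale may be a better choice.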

The potential and kinetic energy representations that the model predicts are reasonable, but the kinetic energy prediction starts to differ qualitatively from the truth when approaching the edges, because of the lack of training data in that region.

Potential and Kinetic Energy Decomposition 0

Out of Distribution Predictions

The tests included some parameter and initial-condition pairs outside the ranges used in training, that is, positions, velocities, masses and lengths that fall outside the values seen during training.

Out of distribution test 0 Out of distribution test 1 Out of distribution test 2
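
A minimal way to flag a test case as out of distribution is to compare each quantity with the range sampled in training. The sketch below does this; the ranges and field names are hypothetical placeholders, not the values used in this project.

```python
# Hypothetical training ranges (min, max); placeholders, not the project's values.
TRAIN_RANGES = {
    "m1": (0.5, 2.0),       # mass of the first blob
    "m2": (0.5, 2.0),       # mass of the second blob
    "l1": (0.5, 1.5),       # length of the first rod
    "l2": (0.5, 1.5),       # length of the second rod
    "theta1": (-1.5, 1.5),  # initial angles
    "theta2": (-1.5, 1.5),
    "omega1": (-2.0, 2.0),  # initial angular velocities
    "omega2": (-2.0, 2.0),
}


def out_of_range_fields(sample, ranges=TRAIN_RANGES):
    """Return the sorted names of the fields in `sample` outside their training range."""
    return sorted(
        name
        for name, (low, high) in ranges.items()
        if not low <= sample[name] <= high
    )
```

A per-field check like this only catches values outside the sampled box. Combinations that are individually in range but rare in training would need a density-based check instead.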

Failures

The project also ran into plenty of failure modes on the way:

unstable or meaningless matrix structure before the kinetic branch was constrained properly
conditioning choices that looked reasonable and trained poorly
sensitivity to parameter sampling and target quality

That history is important because it explains why the current architecture is shaped the way it is.
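
A common way to keep a learned kinetic term well behaved is to parameterize the mass matrix through a Cholesky factor with a strictly positive diagonal, which makes it symmetric positive definite by construction. The sketch below illustrates that idea for the 2-DoF case. It is an assumption about what "constrained properly" could look like, not a description of this project's actual kinetic branch, and all names are hypothetical.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x)); always strictly positive."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def mass_matrix_from_raw(a, b, c, eps=1e-3):
    """Build a 2x2 symmetric positive-definite matrix M = L L^T.

    `a`, `b`, `c` stand in for unconstrained network outputs (hypothetical).
    L is lower triangular with a positive diagonal, so M is positive definite.
    """
    l11 = softplus(a) + eps
    l22 = softplus(c) + eps
    l21 = b
    return [
        [l11 * l11, l11 * l21],
        [l11 * l21, l21 * l21 + l22 * l22],
    ]


def kinetic_energy(mass, velocity):
    """Quadratic form 0.5 * v^T M v for a 2x2 symmetric matrix M."""
    v1, v2 = velocity
    return 0.5 * (
        mass[0][0] * v1 * v1 + 2.0 * mass[0][1] * v1 * v2 + mass[1][1] * v2 * v2
    )
```

With this construction the kinetic energy is positive for any nonzero velocity, whatever raw values the network emits, which rules out the sign-indefinite matrices that an unconstrained output can produce.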

Out of Distribution Failures

When the parameters and/or initial conditions fall too far from the training distribution, the error starts accumulating and, because the system is strongly nonlinear, the results diverge drastically. In the case below, the second blob of the ground truth falls short of a full swing at the very beginning, while the model predicts a full rotation; from there, the trajectories become completely different.

Out of distribution test 3 Simulation
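
One simple way to quantify where a rollout like this departs from the truth is to record the first step at which the predicted second blob is farther than some tolerance from the true one. The sketch below is illustrative: the function names, the `(theta1, theta2)` layout, the downward-vertical angle convention, and the tolerance are assumptions, not part of the project's code.

```python
import math


def second_blob_position(theta1, theta2, l1, l2):
    """Cartesian position of the second blob, with the pivot at the origin.

    Angles are measured from the downward vertical (an assumption of this sketch).
    """
    x = l1 * math.sin(theta1) + l2 * math.sin(theta2)
    y = -l1 * math.cos(theta1) - l2 * math.cos(theta2)
    return x, y


def first_divergence_step(true_angles, pred_angles, l1, l2, tol):
    """Index of the first step where the blob positions differ by more than `tol`.

    Returns None if the rollout stays within tolerance for every step.
    """
    for step, ((t1, t2), (p1, p2)) in enumerate(zip(true_angles, pred_angles)):
        tx, ty = second_blob_position(t1, t2, l1, l2)
        px, py = second_blob_position(p1, p2, l1, l2)
        if math.hypot(tx - px, ty - py) > tol:
            return step
    return None
```

Comparing Cartesian positions rather than raw angles avoids spurious errors from angle wrapping, since two angles that differ by a full turn describe the same configuration.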

Current Limitations

The main limitations are:

the implementation is specialized to the 2-DoF double pendulum
the learned energy representation lives in normalized coordinates, not fully physical coordinates
the current energy-conservation regularizer assumes one trajectory chunk per batch
several workflows are still driven by script constants and hard-coded examples

What To Improve Next

If this project were pushed further, the most useful next steps would probably be:

make the structured kinetic construction less hard-coded to the 2-DoF case
generalize training and evaluation so fewer settings live directly inside scripts
make the energy-conservation regularizer explicitly trajectory-local for more flexible batching
expand training to a wider distribution of parameters and initial conditions
expand evaluation beyond a few held-out and manual OOD examples
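
To make the trajectory-local regularizer item concrete, the sketch below computes an energy-conservation penalty separately for each trajectory in a batch and then averages the results. It assumes the penalty is the variance of the predicted energy along a trajectory, which is a guess at the form of the regularizer; the function name and the `trajectory_ids` argument are likewise hypothetical.

```python
from collections import defaultdict


def trajectory_local_energy_penalty(energies, trajectory_ids):
    """Mean over trajectories of the variance of energy within each trajectory.

    `energies[i]` is the predicted energy of sample i, and `trajectory_ids[i]`
    says which trajectory that sample belongs to (hypothetical batch layout).
    """
    groups = defaultdict(list)
    for energy, tid in zip(energies, trajectory_ids):
        groups[tid].append(energy)

    variances = []
    for values in groups.values():
        mean = sum(values) / len(values)
        variances.append(sum((v - mean) ** 2 for v in values) / len(values))
    return sum(variances) / len(variances)
```

With energies `[1, 1, 1, 5, 5, 5]` and ids `[0, 0, 0, 1, 1, 1]` the penalty is 0, whereas treating the whole batch as a single trajectory gives a variance of 4, which would wrongly penalize two trajectories that each conserve energy. In a training loop the same grouping would need to be written with differentiable operations in the framework being used.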