
Results

This page reports what seems to work, what broke along the way, and what would be worth improving next.

Successes

The project already does a few things reasonably well:

  • it trains one model across a family of double-pendulum parameter settings
  • it uses a mechanically structured architecture instead of direct black-box acceleration regression
  • it produces usable held-out rollouts
  • it supports some manual out-of-distribution parameter tests
  • it includes diagnostics that help inspect the learned structure, not just the final trajectories

Example results:

Rollout Predictions

These are example rollout predictions from the test set. The pendulum parameters (bob masses and rod lengths) were not seen during training.

Energy Tests

The energy drift (which would be 0 for a perfect prediction) stays at or below 1% in most, if not all, cases. Note that the vertical axis is already expressed as a percentage. These plots refer to the same tests shown above in 'Rollout Predictions'.

[Figures: Energy drift test 0; Energy drift test 3; Energy drift test 8]
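For readers who want to reproduce a percentage drift curve, here is a minimal sketch. It assumes an ideal double pendulum (point masses on massless rods, angles measured from the downward vertical) and assumes the drift is normalized by the absolute initial energy; the page does not state which normalization the plots use. The function names, the state layout `(theta1, theta2, omega1, omega2)`, and the value of `G` are all illustrative, not taken from the project.

```python
import math

G = 9.81  # assumed gravitational acceleration, m/s^2


def total_energy(theta1, theta2, omega1, omega2, m1, m2, l1, l2, g=G):
    """Total mechanical energy of an ideal double pendulum.

    Angles are measured from the downward vertical; the potential energy
    is zero at the height of the pivot.
    """
    kinetic = (
        0.5 * (m1 + m2) * l1**2 * omega1**2
        + 0.5 * m2 * l2**2 * omega2**2
        + m2 * l1 * l2 * omega1 * omega2 * math.cos(theta1 - theta2)
    )
    potential = (
        -(m1 + m2) * g * l1 * math.cos(theta1)
        - m2 * g * l2 * math.cos(theta2)
    )
    return kinetic + potential


def energy_drift_percent(states, m1, m2, l1, l2, g=G):
    """Energy drift along a rollout, in percent of |E(t0)|.

    `states` is a sequence of (theta1, theta2, omega1, omega2) tuples.
    Normalizing by |E(t0)| is an assumption and is undefined when the
    initial energy is exactly zero.
    """
    energies = [total_energy(*s, m1, m2, l1, l2, g) for s in states]
    e0 = energies[0]
    if e0 == 0.0:
        raise ValueError("initial energy is zero; choose another reference scale")
    return [100.0 * (e - e0) / abs(e0) for e in energies]
```

Applied to a predicted rollout, a perfectly energy-conserving model would return a list of zeros. Because the reference is the initial energy, the result depends on where the zero of potential energy is placed, so a different convention would rescale the curve.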

The potential and kinetic energy decompositions predicted by the model are reasonable, but the kinetic energy prediction becomes qualitatively different from the truth toward the edges, where training data is lacking.

[Figures: Potential and Kinetic Energy Decomposition 0; Potential and Kinetic Energy Decomposition 1]

Out of Distribution Predictions

The tests included some parameter and initial-condition pairs outside the ranges used in training, that is, positions, velocities, masses, and lengths beyond the values seen during training.

[Figures: Out of distribution test 0; Out of distribution test 1; Out of distribution test 2]
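One way to make "outside the training ranges" concrete is to check each field of a test case against the training bounds and report how far it overshoots. The sketch below is illustrative only: the `TRAIN_RANGES` values, the field names, and the `out_of_range` helper are hypothetical and are not the ranges or code used in this project.

```python
# Hypothetical training ranges; the real values would come from the
# project's own sampling configuration.
TRAIN_RANGES = {
    "m1": (0.5, 2.0),       # bob masses
    "m2": (0.5, 2.0),
    "l1": (0.5, 1.5),       # rod lengths
    "l2": (0.5, 1.5),
    "theta1": (-1.0, 1.0),  # initial angles, rad
    "theta2": (-1.0, 1.0),
    "omega1": (-2.0, 2.0),  # initial angular velocities, rad/s
    "omega2": (-2.0, 2.0),
}


def out_of_range(sample, ranges=TRAIN_RANGES):
    """Return {field: excess} for every field outside its training range.

    `excess` is the distance past the nearest bound, expressed as a
    fraction of the width of that field's training range.
    """
    report = {}
    for name, (low, high) in ranges.items():
        value = sample[name]
        if value < low:
            report[name] = (low - value) / (high - low)
        elif value > high:
            report[name] = (value - high) / (high - low)
    return report
```

An empty result means the case is in-distribution with respect to these per-field bounds. A non-empty result gives a rough measure of how aggressive a manual OOD test is, although per-field bounds say nothing about unusual combinations of in-range values.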

Failures

The project also ran into plenty of failure modes on the way:

  • unstable or meaningless matrix structure before the kinetic branch was constrained properly
  • conditioning choices that looked reasonable and trained poorly
  • sensitivity to parameter sampling and target quality

That history is important because it explains why the current architecture is shaped the way it is.
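As an illustration of what constraining a kinetic branch can look like, a common technique is to have the network output the entries of a lower-triangular Cholesky factor and build the mass matrix as M = L Lᵀ, with a strictly positive diagonal. The sketch below shows that construction for the 2-DoF case in plain Python. It is an assumption about the general approach, not a description of this project's actual architecture; the names `mass_matrix_from_raw`, `kinetic_energy`, and the `eps` offset are hypothetical.

```python
import math


def softplus(x):
    """Numerically stable softplus, log(1 + exp(x)); always positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mass_matrix_from_raw(a, b, c, eps=1e-3):
    """Build a symmetric positive-definite 2x2 matrix from 3 raw outputs.

    The raw values (a, b, c) stand in for unconstrained network outputs.
    They fill a lower-triangular factor L = [[l11, 0], [l21, l22]] whose
    diagonal is forced positive, and the result is M = L @ L.T.
    """
    l11 = softplus(a) + eps
    l21 = b
    l22 = softplus(c) + eps
    m11 = l11 * l11
    m12 = l11 * l21
    m22 = l21 * l21 + l22 * l22
    return ((m11, m12), (m12, m22))


def kinetic_energy(mass_matrix, qdot):
    """T = 0.5 * qdot^T M qdot for a 2-DoF system."""
    (m11, m12), (_, m22) = mass_matrix
    v1, v2 = qdot
    return 0.5 * (m11 * v1 * v1 + 2.0 * m12 * v1 * v2 + m22 * v2 * v2)
```

With this parameterization the determinant equals (l11 · l22)², so the matrix is positive definite in exact arithmetic for any raw outputs, and the kinetic energy is positive for any nonzero velocity. An unconstrained 2x2 output has no such guarantee.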

Out of Distribution Failures

When the parameters and/or initial conditions fall too far from the training distribution, the error accumulates and, because the system is highly nonlinear, the results diverge drastically. In the case below, the second bob of the ground truth falls short of a full swing at the very beginning, while the model predicts a full rotation; from there, the trajectories become completely different.

[Figures: Out of distribution test 3; Simulation]

Current Limitations

The main limitations are:

  • the implementation is specialized to the 2-DoF double pendulum
  • the learned energy story is in normalized coordinates, not fully physical coordinates
  • the current energy-conservation regularizer assumes one trajectory chunk per batch
  • several workflows are still driven by script constants and hard-coded examples

What To Improve Next

If this project were pushed further, the most useful next steps would probably be:

  1. make the structured kinetic construction less hard-coded to the 2-DoF case
  2. generalize training and evaluation so fewer settings live directly inside scripts
  3. make the energy-conservation regularizer explicitly trajectory-local for more flexible batching
  4. expand training to a wider distribution of parameters and initial conditions
  5. expand evaluation beyond a few held-out and manual OOD examples
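To illustrate the third item, here is a minimal sketch of a trajectory-local energy penalty. It assumes the regularizer penalizes the variance of the predicted total energy, and that each sample in a batch carries a trajectory identifier; neither detail is confirmed by this page, and the function names are hypothetical. The point is only to show why grouping by trajectory matters when a batch mixes chunks from different trajectories.

```python
from collections import defaultdict


def batch_energy_variance(energies):
    """Variance of energy over the whole batch.

    This is only meaningful when the batch holds a single trajectory chunk.
    """
    mean = sum(energies) / len(energies)
    return sum((e - mean) ** 2 for e in energies) / len(energies)


def trajectory_local_energy_penalty(energies, trajectory_ids):
    """Mean over trajectories of the within-trajectory energy variance.

    Different trajectories may legitimately have different energy levels,
    so only the spread inside each trajectory is penalized.
    """
    groups = defaultdict(list)
    for energy, tid in zip(energies, trajectory_ids):
        groups[tid].append(energy)
    return sum(batch_energy_variance(v) for v in groups.values()) / len(groups)


# Two perfectly conserved trajectories with different energy levels.
energies = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
trajectory_ids = [0, 0, 0, 1, 1, 1]
naive = batch_energy_variance(energies)
local = trajectory_local_energy_penalty(energies, trajectory_ids)
```

In this example the whole-batch variance is nonzero even though each trajectory conserves energy exactly, while the grouped version is zero. In a real training loop the same grouping would be done with tensor operations in whatever framework the project uses.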