Discovering State Variables Hidden in Experimental Data

Boyuan Chen, Kuang Huang, Sunand Raghupathi, Ishaan Chandratreya, Qiang Du, Hod Lipson

All physical laws are described as relationships between state variables that give a complete and non-redundant description of the relevant system dynamics. However, despite the prevalence of computing power and AI, the process of identifying the hidden state variables themselves has resisted automation. Most data-driven methods for modeling physical phenomena still assume that observed data streams already correspond to relevant state variables. A key challenge is to identify the possible sets of state variables from scratch, given only high-dimensional observational data. Here we propose a new principle for determining how many state variables an observed system is likely to have, and what these variables might be, directly from video streams. We demonstrate the effectiveness of this approach using video recordings of a variety of physical dynamical systems, ranging from elastic double pendulums to fire flames. Without any prior knowledge of the underlying physics, our algorithm discovers the intrinsic dimension of the observed dynamics and identifies candidate sets of state variables. We suggest that this approach could help catalyze the understanding, prediction and control of increasingly complex systems.

Overview Video (with narration and subtitles)


Main paper: arXiv:2112.10755 [cs.CV]

Download our Appendix for more information here

Code and Dataset

We release our codebase at this link. Please follow the instructions on our GitHub page to use the code and the dataset.


Columbia University


This research was supported in part by NSF AI Institute for Dynamical Systems #2112085, DARPA MTO Lifelong Learning Machines (L2M) Program HR0011-18-2-0020, NSF NRI #1925157, NSF DMS #1937254, NSF DMS #2012562, NSF CCF #1704833, and DE #SC0022317.


If you have any questions, please feel free to contact Boyuan Chen.