The Action-Reaction Learning system functions as a server that receives real-time multi-dimensional data from the vision systems and redistributes it to the graphical systems for rendering. Typically, during training, two vision systems and two graphical systems are connected to the ARL server. It is therefore natural to consider the signals and their propagation as multiple temporal data streams. Within the ARL server, perceptual data (tracked motions from the vision systems) are accumulated and stored explicitly as a finite-length time series of measurements.
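The accumulation step described above can be sketched as a fixed-capacity buffer that retains only the most recent measurements. The following is a minimal illustrative sketch, not the original ARL implementation; the class and method names are hypothetical.

```python
from collections import deque

import numpy as np

class MeasurementBuffer:
    """Hypothetical sketch of the ARL server's finite-length time series:
    incoming measurement vectors are appended and the oldest drop off."""

    def __init__(self, dim: int, max_length: int):
        self.dim = dim
        self.buffer = deque(maxlen=max_length)  # fixed-capacity ring buffer

    def push(self, vector: np.ndarray) -> None:
        """Store one measurement vector arriving from a vision system."""
        assert vector.shape == (self.dim,)
        self.buffer.append(vector.copy())

    def series(self) -> np.ndarray:
        """Return the stored time series as a (length, dim) array."""
        if not self.buffer:
            return np.empty((0, self.dim))
        return np.stack(self.buffer)

# Push 120 toy 30-dimensional vectors into a buffer of length 100:
buf = MeasurementBuffer(dim=30, max_length=100)
for t in range(120):
    buf.push(np.full(30, float(t)))
print(buf.series().shape)  # (100, 30) -- only the most recent 100 kept
```

The fixed capacity keeps memory bounded while always exposing the most recent window of measurements for training.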
For the head and hand tracking case, the vision systems generate two triples of Gaussian blobs (one triple for each human), which form 30 continuous scalar parameters that evolve as a multi-dimensional time series. Each set of 30 scalar parameters can be considered a 30-dimensional vector, denoted y(t), arriving at the ARL engine from the vision systems at a given time t. The ARL system preprocesses and then trains from this temporal series of vectors. However, certain issues arise when processing time series data and dealing with the temporal evolution of multi-dimensional parameters. The representation of this data is critical for reliably predicting and forecasting the evolution of the time series or, equivalently, estimating the parameters of the 6 blobs in the near future. The future value of y(t) will be referred to as ŷ(t), and it must be forecast from several measurements of the previous vectors, which together form a window of perceptual history.
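The forecasting setup above can be illustrated by forming training pairs from the time series: each input concatenates a window of previous measurement vectors, and the target is the vector that follows. This is an illustrative sketch under assumed conventions, not the original ARL preprocessing code; the function name and window size are hypothetical.

```python
import numpy as np

def make_forecast_pairs(series: np.ndarray, window: int):
    """Form (input, target) pairs from an (N, D) time series.
    Each input flattens the `window` previous vectors (the window of
    perceptual history); each target is the next vector in the series."""
    N, D = series.shape
    X = np.stack([series[t - window:t].ravel()   # flattened history
                  for t in range(window, N)])    # shape (N - window, window * D)
    Y = series[window:]                          # shape (N - window, D)
    return X, Y

# Toy 30-dimensional series of 10 time steps:
series = np.arange(10 * 30, dtype=float).reshape(10, 30)
X, Y = make_forecast_pairs(series, window=4)
print(X.shape, Y.shape)  # (6, 120) (6, 30)
```

A learning system can then be trained to map each flattened history window to the vector that followed it, which is the forecasting task the text describes.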