Video from a Single Coded Exposure Photograph




Cameras face a fundamental tradeoff between spatial and temporal resolution: digital still cameras can capture images with high spatial resolution, but most high-speed video cameras suffer from low spatial resolution. It is hard to overcome this tradeoff without incurring a significant increase in hardware cost.

In this project, we propose techniques for sampling, representing and reconstructing the space-time volume in order to overcome this tradeoff. Our approach has two important distinctions compared to previous works: (1) we achieve a sparse representation of videos by learning an over-complete dictionary on video patches, and (2) we adhere to the practical constraints on the sampling scheme imposed by the architectures of present image sensors. Consequently, our sampling scheme can be implemented on image sensors by making a straightforward modification to the control unit.
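
To make the approach concrete, here is a sketch of the capture and reconstruction model in notation introduced just for this summary (E is the space-time volume, S the per-pixel shutter function, I the captured photograph, D the learned dictionary, and alpha the vector of sparse coefficients); the exact patch geometry and solver are described in the papers below:

    I(x, y) = \sum_{t=1}^{T} S(x, y, t) \, E(x, y, t)

    \hat{\alpha} = \arg\min_{\alpha} \|\alpha\|_{0}
        \quad \text{subject to} \quad \|I_p - S_p D \alpha\|_{2} \le \epsilon,
    \qquad \hat{E}_p = D \hat{\alpha}

Here I_p denotes the coded-image pixels of a patch p and S_p the linear measurement operator that the shutter induces on that patch; the reconstructed patches are assembled into the output video.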

To demonstrate the power of our approach, we have implemented a prototype imaging system with per-pixel coded exposure control using a liquid crystal on silicon (LCoS) device. Using both simulations and experiments on a wide range of scenes, we show that our method can effectively reconstruct a video from a single image while maintaining high spatial resolution.

This project was done in collaboration with Yasunobu Hitomi and Tomoo Mitsunaga of Sony Corporation.

Publications

"Video from a Single Coded Exposure Photograph using a Learned Over-Complete Dictionary,"
Y. Hitomi, J. Gu, M. Gupta, T. Mitsunaga and S.K. Nayar,
IEEE International Conference on Computer Vision (ICCV),
Nov. 2011.

"Efficient Space-Time Sampling with Pixel-wise Coded Exposure for High Speed Imaging,"
D. Liu, J. Gu, Y. Hitomi, M. Gupta, T. Mitsunaga and S.K. Nayar,
IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol. 36, No. 2, pp. 248-260, Feb. 2014.

Images

  A fundamental tradeoff between temporal and spatial resolution:

Due to image sensor hardware constraints, spatial resolution decreases as the frame rate increases, which degrades image quality.

     
  The goal of our work:

The goal of our work is to design an imaging system that can capture videos with both high spatial and temporal resolutions. In this project, we focus on two problems: 1) sampling, and 2) representation of space-time volumes for designing practical compressive video acquisition systems.

     
  How to sample space-time volumes while accounting for the restrictions imposed by imaging hardware?:

For maximum flexibility in designing sampling schemes, it is important to have pixel-wise exposure control. At the same time, we design sampling functions that adhere to the practical constraints imposed by the architectures of present image sensors.
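
As an illustration, below is a minimal sketch (in Python/NumPy, with function names of our own) of one sampling function that respects such constraints, under the assumption that each pixel is exposed exactly once per captured frame, for a single continuous interval with a randomly chosen start time; the shutter-modulated space-time volume is then integrated into a single coded photograph:

    import numpy as np

    def coded_exposure_mask(height, width, T, bump=1, seed=0):
        """Per-pixel coded exposure: each pixel is 'on' for one continuous
        interval of `bump` sub-frames, starting at a random time."""
        rng = np.random.default_rng(seed)
        start = rng.integers(0, T - bump + 1, size=(height, width))
        t = np.arange(T)[None, None, :]                        # (1, 1, T)
        s = start[:, :, None]                                  # (H, W, 1)
        return ((t >= s) & (t < s + bump)).astype(np.float32)  # (H, W, T)

    def capture(volume, mask):
        """Simulate the single coded photograph by integrating the
        shutter-modulated space-time volume over time; inputs are (H, W, T)."""
        return (volume * mask).sum(axis=2)

    # Example: a 36-sub-frame volume captured as one 64x64 coded image.
    E = np.random.rand(64, 64, 36).astype(np.float32)
    S = coded_exposure_mask(64, 64, T=36, bump=1)
    I = capture(E, S)

A pattern of this form needs only a per-pixel exposure start and end time, which is the kind of straightforward control-unit modification referred to above.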

     
  How to efficiently represent space-time volumes for sparse reconstruction?:

We propose learning an over-complete dictionary from a large collection of videos, and representing any given video as a sparse linear combination of elements from the dictionary. The redundant nature of these dictionaries leads to highly sparse representations.
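
A minimal sketch of this dictionary-learning step, using scikit-learn's MiniBatchDictionaryLearning as a stand-in for the training procedure used in the papers, with random data in place of real video patches and purely illustrative sizes:

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    # Space-time video patches (here 5x5 pixels x 8 sub-frames, kept small
    # for illustration), flattened into vectors.  Random data stands in for
    # patches extracted from a large collection of videos.
    rng = np.random.default_rng(0)
    patch_dim = 5 * 5 * 8
    X = rng.random((2000, patch_dim)).astype(np.float32)

    # Learn an over-complete dictionary (more atoms than the patch dimension)
    # so that each patch is approximated by a sparse combination of atoms.
    learner = MiniBatchDictionaryLearning(
        n_components=2 * patch_dim,    # 2x redundancy (illustrative)
        transform_algorithm="omp",
        transform_n_nonzero_coefs=10,  # sparsity level (illustrative)
        batch_size=256,
        random_state=0,
    )
    learner.fit(X)
    D = learner.components_.T          # (patch_dim, n_atoms) dictionary
    alpha = learner.transform(X[:5])   # sparse codes for a few patches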

     
  Hardware prototype and experiments:

While we have not yet fabricated a CMOS image sensor chip with per-pixel exposure control, we constructed an emulation imaging system with an LCoS device to achieve pixel-wise exposure control. We show video reconstruction results for a variety of motions, ranging from simple linear translation to complex fluid motion and muscle deformations.
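
For completeness, here is a sketch (again Python, with scikit-learn's OrthogonalMatchingPursuit standing in for the sparse solver, and with function names of our own) of the per-patch reconstruction we assume: the coded-image pixels of a patch are related to the flattened space-time patch through a measurement matrix built from the shutter mask, and the patch is recovered via its sparse code over the learned dictionary:

    import numpy as np
    from sklearn.linear_model import OrthogonalMatchingPursuit

    def measurement_matrix(mask_patch):
        """Build Phi from a patch's shutter mask, mask_patch: (m_pixels, T),
        so that y = Phi @ E.reshape(-1) with E flattened as (pixel, time)."""
        m, T = mask_patch.shape
        Phi = np.zeros((m, m * T), dtype=np.float32)
        for i in range(m):
            Phi[i, i * T:(i + 1) * T] = mask_patch[i]
        return Phi

    def reconstruct_patch(y, Phi, D, n_nonzero=10):
        """Recover a space-time patch from its coded measurements y by
        solving y ~= (Phi @ D) alpha with alpha sparse, then returning
        the patch estimate D @ alpha."""
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero,
                                        fit_intercept=False)
        omp.fit(Phi @ D, y)
        return D @ omp.coef_

Overlapping patches can be reconstructed independently in this way and averaged to form the output video.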

     

Video

  ICCV 2011 Video:

This video summarizes the proposed coding and reconstruction schemes and shows some of the results of this project. (With narration)

     

Related Projects

Coded Rolling Shutter Photography: Flexible Space-Time Sampling

Programmable Imaging: Micro-Mirror Arrays

Temporal Modulation Imaging