Changxi Zheng wins NSF CAREER Award

Changxi Zheng, assistant professor of computer science, is the recipient of a National Science Foundation (NSF) CAREER Award for his proposal to create realistic, computer-generated sounds for immersive virtual realities. The five-year, $500,000 grant will fund his proposal “Simulating Nonlinear Audiovisual Dynamics for Virtual Worlds and Interactive Applications.”

“I am honored by the award and excited about the research it supports,” says Zheng, who co-directs the Columbia Computer Graphics Group (C2G2) in the Columbia Vision + Graphics Center. “The algorithms and tools used today for constructing immersive virtual realities are inherently appearance-oriented, modeling only geometry or visible motion. Sound is usually added manually and only as an afterthought. So there is a clear gap.”
While computer-generated imagery has made tremendous progress in recent years toward high levels of realism, the same effort has not yet been applied to computer-generated sound. Zheng is among the first working in the area of dynamic, physics-based computational sound for immersive environments, and his proposal aims to tightly integrate the visual and audio components of simulated environments. That would be a change from today’s practice, in which digital sound is usually created and recorded separately from the action and then dropped in when appropriate. In Zheng’s proposal, computational sound will be automatically generated using physics-based simulation methods and fully synchronized with the associated motion.

“What I propose to do about sound synthesis is analogous to the existing image rendering methods for creating photorealistic images,” says Zheng. “We would like to create audiovisual animations and virtual environments using computational methods.”

“In addition to the realism, sound that is not tightly coupled with motion loses some of its impact and detracts from the overall immersive experience,” says Zheng. “Attaining any type of realism in virtual worlds requires synchronizing the audio and visual components. What the user hears should spring directly and naturally from what the user sees.”
Achieving this will take new mathematical models and new computational methods. Sound is a physical phenomenon, and creating sound by computer requires understanding and simulating all the motions and forces that go into producing it. Computational methods will have to replicate the surface vibrations on an object that produce the pressure waves we hear as sound, while taking into account how frequency, pitch, and volume are affected by the object’s size, shape, weight, surface textures, and countless other variables. The way sound propagates also differs, depending on whether sound waves travel through air or water and what obstructions, or other sound waves, get in the way. It’s a dynamic situation where a slight adjustment to one variable produces nonlinear changes in another.
Zheng’s system will have to not only tackle the physical phenomena but do so in a computationally efficient way, so that there is no delay between an action and the resulting sound. “Realistic virtual environments require a great many complex computations. We will need fast, cost-effective algorithms. We need also to identify what calculations can be approximated while still maintaining realism.”
Computational sound is just the beginning. Zheng sees his proposal as laying the groundwork for building interactive applications and spurring new techniques for efficient digital production and engineering design. Educational outreach to disseminate what is learned from the research is also part of the proposal, with workshops planned for high-school students to encourage their interest in STEM fields and college courses to teach undergraduate and graduate researchers how to build audiovisual computational technologies.
For now, numerous challenges remain in computationally synthesizing realistic sound, but if the research is successful, it will enable new virtual-world applications with fully synchronized audiovisual effects, lead to new ways of creating and editing multimedia content, and bring computer-generated sound into the future.
BS, Shanghai Jiaotong University (China), 2005; PhD, Cornell University, 2012