Lorenzo Torresani wins the Google Faculty Research Award

Our own Lorenzo Torresani has won a Google Faculty Research Award. Dr. Torresani aims to use deep learning to discover compact representations of video that work well for classifying human pose dynamics.

Dr. Torresani proposed to learn semantic primitives to represent human actions in video. The primitives are learned by training deep convolutional neural networks to classify different human pose dynamics. Such a learned representation promises to significantly improve the accuracy of video understanding applications, including action recognition, semantic segmentation of video, and search and retrieval.

The technical novelty of the approach is twofold:

  1. Deep learning is used to discover compact representations of video that work well for the aforementioned tasks. In the last couple of years, deep networks have outperformed all other approaches for the problem of still-image recognition. There is promise that this project will deliver analogous improvements in the domain of video and action recognition.
  2. Also novel is the focus on describing human pose dynamics, i.e., the spatio-temporal variations of the joints of the human figure in the video. Pose dynamics have been shown to be crucially important cues for action recognition, yet most current approaches are based on much simpler and lower-level features, such as the appearance and motion of corners and edges.
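To make the idea of a compact representation of pose dynamics concrete, here is a toy NumPy sketch. It is not Dr. Torresani's actual method: the clip size, joint count, filter shapes, and the single hand-rolled temporal convolution are all illustrative assumptions. It only shows the general pattern of reducing a sequence of per-frame joint coordinates to a short, fixed-length descriptor that a classifier could consume.

```python
import numpy as np

# Illustrative sketch only (not the proposed method): a clip's pose
# dynamics as a (T, J, 2) array of 2-D joint coordinates over T frames,
# reduced to a compact descriptor by one temporal convolution layer.
rng = np.random.default_rng(0)

T, J = 16, 13          # frames and joints per frame (hypothetical sizes)
poses = rng.standard_normal((T, J, 2))

# Flatten each frame's pose into a vector: shape (T, J * 2).
frames = poses.reshape(T, -1)

# K random filters of temporal width W slide over the frame axis;
# ReLU and max-pooling over time yield a K-dimensional clip descriptor.
K, W = 8, 5
filters = rng.standard_normal((K, W, frames.shape[1])) * 0.1

responses = np.stack([
    np.array([(frames[t:t + W] * f).sum() for t in range(T - W + 1)])
    for f in filters
])                                                   # shape (K, T - W + 1)
descriptor = np.maximum(responses, 0.0).max(axis=1)  # shape (K,)

print(descriptor.shape)  # fixed-length representation of the whole clip
```

In an actual system the filters would be learned by backpropagation through many stacked layers rather than drawn at random, but the shape of the computation, from joint trajectories to a compact descriptor, is the same.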

Visit the website of the Visual Learning Group, led by Dr. Torresani, for more information on this and other exciting projects.