A Parallel Software-Only Video Effects
Processing System

by:

K. Meyer-Patel
Computer Science Division - EECS
University of California
Berkeley, CA 94720-1776

Abstract

Video is playing an increasingly important role as an Internet media data type. Internet video use, however, typically means streaming live or on-demand material without manipulation. One important class of manipulation is video effects processing, such as titling, compositing, and blending. Experience from the television, video, and film industries shows that video effects are an important tool for communicating information and maintaining audience interest. In most applications, video is created in traditional studio settings, edited with special-purpose hardware, and finally digitized and compressed for Internet streaming.

We envision that streaming video on the Internet will become a first-class data type that can be manipulated in real-time. As such, a network-based model centered around a compressed packet stream representation is needed instead of the traditional model centered around an uncompressed synchronous stream representation. In this new model, video sources will be compressed packet video streaming across a network from cameras connected to computers and video-on-demand archives. The destination of the processed video will include archival systems, content indexing systems, and viewers watching the video. In this way, video effects processing will be incorporated into a variety of applications including distance learning, collaborative virtual meetings, remote training, news and entertainment.

This dissertation describes a software-only video effects processing system designed for the compressed packet video environment. We call this system the Parallel Software-only Video Effects Processing system (PSVP). A software-only solution using commodity hardware provides the flexibility required to handle compressed video sources. Variable frame rates, packet loss, and jitter, which are attributes of Internet video, can be handled gracefully with dynamic adaptation. A software solution also provides the flexibility to adapt to new video formats and communication protocols and to benefit from continuing improvements in processor and networking technology.

The key to a software solution is exploiting parallelism. Currently, a single processor cannot produce a wide variety of real-time video effects, which is why conventional systems and early research systems use custom-designed hardware. Even as processors become faster, the demand for more complicated effects, larger images, and higher quality will increase the video effects processing requirements. A scalable software solution is required to meet these growing application demands. The quality of video used on the Internet today is quite poor, unlike CD-quality audio, which is near the limits of human perception, so demand for higher video quality has considerable room to grow. PSVP is a parallel solution that can incorporate additional computing resources to meet increased demands for higher quality.

Fortunately, video processing algorithms contain a high degree of parallelism. Three types of parallelism can be exploited when implementing these algorithms: functional, temporal, and spatial. Functional parallelism can be exploited by decomposing the video effect task into smaller subtasks and mapping these subtasks onto different computational resources. Temporal parallelism can be exploited by demultiplexing the stream of video frames to different processors and multiplexing the processed output. For example, one processor may deal with all odd-numbered frames while another deals with all even-numbered frames. Spatial parallelism can be exploited by assigning regions of each video frame to different processors. For example, one processor may process the left half of all video frames while another deals with the right half.
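The temporal and spatial decompositions described above can be illustrated with a minimal sketch. It is not PSVP's implementation: frames are assumed to be uncompressed, non-empty lists of pixel rows, and the function names (temporal_split, temporal_merge, spatial_split, spatial_merge) are hypothetical. The sketch shows only the demultiplex and multiplex steps, not the scheduling of work onto processors or the handling of compressed packet streams.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")
Frame = List[List[int]]  # assumption: a frame is a non-empty list of pixel rows


def temporal_split(frames: Sequence[T], n_workers: int) -> List[List[T]]:
    """Temporal parallelism: frame i is assigned to worker i % n_workers."""
    return [list(frames[w::n_workers]) for w in range(n_workers)]


def temporal_merge(parts: Sequence[Sequence[T]]) -> List[T]:
    """Re-interleave per-worker outputs back into the original frame order."""
    merged: List[T] = []
    longest = max((len(p) for p in parts), default=0)
    for i in range(longest):
        for part in parts:
            if i < len(part):
                merged.append(part[i])
    return merged


def spatial_split(frame: Frame, n_workers: int) -> List[Frame]:
    """Spatial parallelism: cut one frame into vertical strips, one per worker."""
    width = len(frame[0])
    # Column boundaries; the last strip absorbs any remainder.
    bounds = [width * w // n_workers for w in range(n_workers + 1)]
    return [
        [row[bounds[w]:bounds[w + 1]] for row in frame]
        for w in range(n_workers)
    ]


def spatial_merge(strips: Sequence[Frame]) -> Frame:
    """Stitch the vertical strips back together, row by row."""
    n_rows = len(strips[0])
    return [
        [pixel for strip in strips for pixel in strip[r]]
        for r in range(n_rows)
    ]


def invert(frame: Frame) -> Frame:
    """A stand-in per-pixel effect (hypothetical), assuming 8-bit pixel values."""
    return [[255 - p for p in row] for row in frame]
```

With two workers, temporal_split sends even-indexed frames to one worker and odd-indexed frames to the other, and spatial_split with two workers yields the left and right halves of a frame. A per-pixel effect such as invert gives the same result whether it is applied to the whole frame or to each strip before merging; effects that read neighboring pixels or neighboring frames would need overlap or communication between workers, which this sketch does not address.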

Taking advantage of these types of parallelism requires the solution of different problems. This dissertation describes our solution to some of these problems. Specifically:

Another problem encountered during this research is that video compression formats were designed for storage and transmission and not for manipulation. Transport protocols for packet video often assume that a video source originates from a single point in the network. These assumptions conflict with how a distributed software system, such as PSVP, might produce the video stream. The design choices made in building PSVP were heavily influenced and sometimes constrained by the earlier design choices made by those who developed these standards and protocols. The dissertation describes these influences and constraints. The overall lesson learned from developing PSVP is that video formats and protocols developed with transmission and storage as the primary applications create artificial constraints for applications that manipulate packet video data.