Presented at Multipoint Workshop held in conjunction with ACM Multimedia 1994.
The Berkeley Plateau Multimedia Research Group is doing research on systems infrastructure and applications of continuous media (i.e., digital video and audio). The group is working on a variety of topics including: software toolkits to develop continuous media applications, MPEG video compression and decompression, video-on-demand (VOD) systems, desktop video conferencing, and hypermedia courseware.
This abstract will briefly describe two multipoint continuous media applications that we have developed: 1) a distributed video-on-demand system and 2) a desktop video conferencing system. Both applications were developed using the Berkeley Continuous Media Toolkit, which provides support for developing distributed client/server applications including continuous media network protocols, continuous media device abstractions, distributed client/server programming, and large object storage services. The current status of the toolkit and planned extensions will be summarized.
Our view is that multipoint services have not been widely adopted because high-quality, low-cost applications that actually solve users' problems have not yet been developed. Many interesting and potentially useful applications have been developed, but current technology and application restrictions have reduced the effectiveness of these applications.
We are developing a distributed video-on-demand (VOD) system that will be capable of storing hundreds of hours of video material [Federighi94, Rowe94]. Figure 1 shows the architecture of the Berkeley Distributed VOD System. The system is composed of a database, one or more video file servers (VFS), and one or more archive servers that manage tertiary storage devices (e.g., optical disk or tape jukeboxes). The database contains metadata and indexes about the videos stored in the system and tracks the location of video material cached on one of the VFS's. It is also used to schedule the movement of data between tertiary storage and the servers.
The system is accessed using an ad hoc query interface, called the Video Database Browser (VDB), that allows a user to query the database to locate interesting video material and then, if necessary, schedule it to be retrieved from the tertiary store. The system will support heterogeneous platforms and networks, which means that it might have to transcode a video stream so that the stream can be transported to, decoded, and played on the user's platform. Numerous other services are required including: 1) protecting copyrighted material (i.e., encryption), 2) electronic billing, and 3) cache management on the VFS's (e.g., replicating popular videos and choosing the appropriate objects to keep in the cache). The user should view the system as a large electronic library that allows videos to be played without having to know where a video is stored or how it will be delivered.
Our ultimate goal is to build a distributed VOD system that will support archives at different geographic locations and use local VFS's to cache videos for playback.
Multipoint services are an integral part of the system. First, users querying metadata databases at different locations are essentially querying a distributed database. A distributed database access protocol such as SQL Access uses multicast communication and distributed transaction processing services. Second, the management of the secondary storage caches on the VFS's is essentially distributed cache management. We envision these algorithms optimizing both local access patterns and wider area accesses. For example, since loading a video off a tertiary storage device is a slow process, it makes sense to maintain copies of frequently accessed videos on-line. Suppose you have a video stored on a tertiary storage device at a remote location. You might want to keep the video on secondary storage to service requests for the video at your site and requests from other sites. If the only accesses came from your site, you might not choose to keep it on-line.
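The retention decision described above can be sketched as follows. This is a minimal illustration, not the system's cache management algorithm: the `AccessStats` record, the `keep_online` function, and the threshold value are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class AccessStats:
    """Hypothetical request counts for one video over an observation window."""
    local_requests: int
    remote_requests: int


def keep_online(stats: AccessStats, threshold: int) -> bool:
    """Keep the video on secondary storage when combined demand meets the threshold."""
    return stats.local_requests + stats.remote_requests >= threshold


# Assumed threshold of 10 requests per window (illustrative only).
local_only = AccessStats(local_requests=4, remote_requests=0)
with_remote = AccessStats(local_requests=4, remote_requests=9)

print(keep_online(local_only, 10))   # local demand alone does not justify keeping it
print(keep_online(with_remote, 10))  # demand from other sites tips the decision
```

The point of the sketch is only that requests from other sites count toward the decision; a real policy would also have to weigh video size, reload cost from tertiary storage, and available disk space.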
We have implemented a desktop video conferencing system that will be used to experiment with one-to-one and small group conferences. The system supports a variety of connection management services (e.g., access lists, caller-ID, glances, and dynamically adding a user to a one-to-one call, thereby turning it into a conference), directory services (e.g., phone book, users running the system, and speed dial), and application services (e.g., participation in multiple calls/conferences at the same time, recording a call/conference to a file, and playing a stored video into a conference). Figure 2 shows the session manager panel and the user-specific connection management properties. The system currently runs on Sun SPARCstations using a Parallax motion JPEG board, but it is being ported to other platforms and video boards including DEC Alphas, HP Snakes, and IBM PCs.
Our plan is to make the system available to many users so that we can experiment with the use of desktop video conferencing in day-to-day use. We are also planning on building vertical applications using the system such as a distance learning application that allows users to indicate to the speaker a request to ask a question without disturbing other viewers and that will show the questioner to all viewers. We also hope to experiment with features that let a student interact with a software package under guidance and/or control of a teaching assistant during a presentation. A help desk application is another example of a vertical application. A person seeking help might interact with a support representative who can diagnose the problem and direct the person to supplementary material (e.g., hypermedia courseware) that will guide them to a solution. Another example is a network office hours application that supports asking questions of an instructor, including the ability to watch other students' questions and the instructor's responses.

Considerable research is required to discover, implement, and test the operations required by these applications. For example, the help desk and network office hours applications must allow the technical support representative or instructor, call him or her the expert, to play a video or traverse hypermedia courseware on his or her screen while at the same time it is being shown to the person seeking help. The expert needs to find a particular segment in the video to discuss it with the person seeking help. Then, the expert wants to pass control of the video or courseware to the person so he or she can continue to watch it. Later, the person may want to ask the expert another question, which would require showing the expert some material the person is looking at in the video or courseware. Few systems support this degree of dynamic interaction.
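One way to picture the interaction described above is a shared playback session in which every participant sees the same position but only one holds control at a time. The sketch below is an assumption made for illustration; the `SharedPlayback` class and its methods are hypothetical and do not describe an implemented design.

```python
class SharedPlayback:
    """Hypothetical shared session: one position visible to all, one controller."""

    def __init__(self, participants, controller):
        if controller not in participants:
            raise ValueError("controller must be a participant")
        self.participants = set(participants)
        self.controller = controller
        self.position = 0.0  # logical time in seconds, seen by every participant

    def seek(self, who, seconds):
        """Move the shared position; only the current controller may do this."""
        if who != self.controller:
            raise PermissionError(f"{who} does not hold control")
        self.position = seconds

    def pass_control(self, who, to):
        """Hand control from the current controller to another participant."""
        if who != self.controller:
            raise PermissionError(f"{who} does not hold control")
        if to not in self.participants:
            raise ValueError(f"{to} is not in the session")
        self.controller = to


session = SharedPlayback({"expert", "person"}, controller="expert")
session.seek("expert", 95.0)               # expert finds the segment to discuss
session.pass_control("expert", "person")   # person continues watching on their own
session.seek("person", 120.0)              # person later shows the expert other material
```

A single explicit controller keeps the two screens consistent; whether control should be requested, granted, or revoked is one of the open questions such applications raise.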
The Continuous Media Toolkit (CMT) provides support for a variety of audio and video devices, media-dependent resources, media-independent audio/video sources and destinations, and protocols for creating them, connecting them together, and controlling them.
Devices represent the audio and video hardware that can be used by an application. Examples of devices are graphics framebuffers, video capture and playback boards with and without compression/decompression capability, audio capture and playback boards, disk files, and analog video switchers and digital video effects systems.
Resources represent application-independent but media-specific continuous media abstractions built on devices. Examples are live cameras that can be used to capture a motion JPEG stream, windows that can display an MPEG video stream, microphones that can be used to capture CD quality stereo audio streams, and speakers that can play 16-bit linear audio data. Another example of a resource is a clipfile that stores a sequence of blocks or frames of a particular type of media. A script is a file that specifies how clips from different clipfiles are organized into synchronized streams.
Sources and destinations are application-specific abstractions that connect resources. For example, a script source is an object that can send audio and video packets to a destination and a script destination can record audio and video packets for later playback.
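The relationship between a script source and a script destination can be sketched as follows. This is a minimal illustration and not the CMT API: the `Packet`, `ScriptSource`, and `RecordingDestination` names, their fields, and the `send_to` and `receive` methods are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    media: str        # "audio" or "video"
    timestamp: float  # logical time in seconds
    payload: bytes


@dataclass
class RecordingDestination:
    """Hypothetical script destination that records packets for later playback."""
    recorded: list = field(default_factory=list)

    def receive(self, packet: Packet) -> None:
        self.recorded.append(packet)


class ScriptSource:
    """Hypothetical script source that sends its packets in logical-time order."""

    def __init__(self, packets):
        self.packets = sorted(packets, key=lambda p: p.timestamp)

    def send_to(self, destination) -> None:
        for packet in self.packets:
            destination.receive(packet)


source = ScriptSource([
    Packet("video", 0.04, b"frame-1"),
    Packet("audio", 0.00, b"block-0"),
    Packet("video", 0.00, b"frame-0"),
])
destination = RecordingDestination()
source.send_to(destination)
```

Because the destination only needs a `receive` method, the same source could feed a recorder, a network connection, or a display without change.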
CMT provides many other abstractions and services including CM packet transport (e.g., a best-effort network protocol built on TCP/IP and a real-time protocol such as RTIP), synchronized clocks and timers, audio and video buffers, and numerous glue objects (e.g., splitters, multiplexers, and audio mixers). In addition, it provides distributed programming support including a name server for locating, starting, and stopping services, a simple distributed object system, blocking and non-blocking RPC, and failure recovery protocols (e.g., automatically re-starting failed servers).
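As an illustration of what a glue object such as an audio mixer does, the sketch below mixes 16-bit linear audio by summing samples and clipping the result to the 16-bit range. The `mix_16bit` function is a hypothetical example of this common technique and is not taken from the toolkit.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix_16bit(*streams):
    """Mix streams of 16-bit linear samples by summing and clipping.

    Streams are truncated to the shortest one; this is a simplifying
    assumption made for the example.
    """
    if not streams:
        return []
    length = min(len(s) for s in streams)
    mixed = []
    for i in range(length):
        total = sum(s[i] for s in streams)
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


speaker_a = [1000, 30000, -30000]
speaker_b = [2000, 10000, -10000]
print(mix_16bit(speaker_a, speaker_b))  # [3000, 32767, -32768]
```

Clipping avoids integer wraparound when several participants speak at once, at the cost of some distortion on loud passages.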
An application connects to processes on hosts with continuous media devices and creates sources, destinations, and resources and connects them together. Then, the application issues commands to activate the objects (e.g., begin video capture, reposition script to logical time 10 seconds, play video, etc.).
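The create, connect, and command sequence described above might look like the following sketch. The `ScriptSource` class here is a hypothetical stand-in written for this example; the method names and the destination name are assumptions, not CMT calls.

```python
class ScriptSource:
    """Hypothetical stand-in for a script source; not the CMT API."""

    def __init__(self):
        self.destination = None
        self.logical_time = 0.0  # seconds
        self.playing = False

    def connect(self, destination):
        self.destination = destination

    def reposition(self, seconds):
        if seconds < 0:
            raise ValueError("logical time cannot be negative")
        self.logical_time = seconds

    def play(self):
        if self.destination is None:
            raise RuntimeError("connect a destination before playing")
        self.playing = True


source = ScriptSource()               # create the object
source.connect("window-on-client")    # connect it to a destination (name is illustrative)
source.reposition(10.0)               # reposition script to logical time 10 seconds
source.play()                         # play video
```

Separating connection from activation lets an application build the whole graph of objects first and start the media flowing only once everything is in place.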
CMT version 1.1 was used to implement the CMPlayer application that supports network playback of synchronized audio (8 kHz, 8-bit µ-law) and video (hardware-assisted motion JPEG and software-only decoded MPEG) [Rowe92]. CMT version 2.0 was used to implement the desktop video conferencing system. We are currently working on a new version that will modify the basic continuous media abstractions (i.e., devices, resources, sources, and destinations), improve the reliability of the system, and support new platforms and devices. We plan a beta release of CMT version 3.0 before the end of 1994.