They attribute this failure to a cultural problem: the fear of embarrassment from appearing in front of the videophone in your pajamas, hair all unkempt, looking like a slob. On an audio phone, people can hide every aspect of their physical appearance, and it takes far less work to make your voice sound good than to fix your hair. The videophone felt like an invasion of privacy.
We thought another problem was that most Americans are conservative and Luddites at heart. New technology scares people to the point where they won't give something a second try if it failed before; it has to be presented in a new way. AT&T's videophone was simply a better version of the videophone that everyone had heard so much about (and never seen) in the 1960s. People remembered all the bad things about the videophone and never even considered the good qualities that AT&T tried to stress (showing the grandkids to their grandparents by video was one popular image).
In addition to perceptual problems, there was a problem of distribution. How many people do you know who have a videophone? Yes, exactly the problem. Without a critical mass of videophone purchasers, consumers basically got a really expensive phone whose video display remained blank most of the time.
Michael Shilman thought about the difference between American culture and Japanese culture. The Japanese tend to be gadgetophiles: they buy any new electronic toy that comes on the market. Take the Sony MiniDisc. The technology is smaller, lightweight, recordable, and better than tape. Why do most Japanese have really small Sony minidiscmans, while Americans are still clunking around with huge full-size discmans? Americans didn't see enough of an improvement over the full-size disc to justify buying it. Most people remember the big debate when 8-track gave way to the cassette tape. They also remember LPs giving way to tapes and CDs. They had no intention of throwing out their amassed collections just to buy them all over again with no appreciable gain in quality, especially since minidiscs don't hold as much as full-size discs.
The paper also addresses why people want video conferencing in the first place. It cites results from several studies (with questionable methodologies) over a number of years, which suggest that the debate is almost over. In the beginning, people were excited about the technology and predicted that 85% of future meetings would be held over video. Later studies showed this number dropping over time, to about 4% today.
Michael Shilman also came up with a graph of communication styles from IRC talk to face-to-face conversation that can be found on the Media Board archives.
Perhaps the most interesting line in this paper was, "It was just like being there." We found this amusing at the time, since we were presenting these papers live over the MBone to the class. Staring into a video camera, instead of looking at the monitor where the pictures of the people were, was not "just like being there." In addition, since we were the focus of the conversation, conversation tended to die unless we kept it going. There was none of the subtle eye contact or body language that lets other people know that you want them to talk now.
In Rapport, the key metaphor is a "virtual meeting room" in which a user enters the room and calls other people to join in. It provides a mixture of A/V and document communication, in which the A/V takes a back seat to other modes of discourse (specifically the document sharing). Only one person can use the document editing software at a time, and users "raise" a virtual hand (cursor) to gain editing privileges. The system also has a "store and forward" function which allows meetings to be saved and passed on to other people.
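The one-editor-at-a-time rule with hand raising is a form of floor control. Here is a minimal sketch of how it might work, not Rapport's actual implementation: the class name `FloorControl`, its methods, and the first-come-first-served ordering of raised hands are all assumptions made for illustration.

```python
from collections import deque


class FloorControl:
    """One-editor-at-a-time floor control (hypothetical, not Rapport's API)."""

    def __init__(self):
        self.editor = None      # user who currently holds editing privileges
        self.waiting = deque()  # users with a raised hand, in FIFO order (assumed)

    def raise_hand(self, user):
        """Ask for the floor; return True if the user now holds it."""
        if self.editor is None:
            self.editor = user
        elif user != self.editor and user not in self.waiting:
            self.waiting.append(user)
        return self.editor == user

    def release(self, user):
        """Give up the floor and pass it to the next raised hand, if any."""
        if user != self.editor:
            return  # only the current editor can release the floor
        self.editor = self.waiting.popleft() if self.waiting else None

    def can_edit(self, user):
        """True only for the single user who currently holds the floor."""
        return self.editor == user
```

In this sketch, a user who raises a hand while someone else is editing simply waits in line, and gets the floor as soon as the current editor releases it.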
Team WorkStation is an "open shared workspace" which uses video as its primary medium. Users' sketches are captured on video and superimposed on one another to achieve this effect. Ishii's current work involves making the UI disappear completely from the collaborative experience.
Media Space views video as the most important aspect of the collaborative design, with the claim that design is more of a social task than a technical one. The ambiguity of being able to talk on video allows for more expressive freedom in the design process. Work here led to VideoDraw, which combines a sketching surface with a video screen. PARC researchers note that users prefer more manual tasks in the collaborative environments, and hypothesize that this is a signal that their system is not quite right yet.
Cruiser is a "casual encounter" based video system. Users can make quick calls into offices to see if people are around. Users typically have only short conversations on the system, and often pack up their stuff and meet in person for anything longer than a couple of minutes. Bellcore has also developed VideoWindow, which is a large virtual window separating parties at different sites. They note that it is often easy to ignore people on the other side of the window.
There were also some interesting MOO applications and suggestions for future applications:
Like all good secure systems, the fail-safe mode is denial-of-access. There is no unintentional video send or receive without the user's explicit go-ahead. There will be no Big Brother watching over you unless you allow it.
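The default-deny rule can be sketched in a few lines. This is only an illustration under stated assumptions: the class `VideoPermissions`, its method names, and the "send"/"receive" direction labels are hypothetical and do not come from the system being described.

```python
class VideoPermissions:
    """Default-deny video permissions (hypothetical sketch)."""

    DIRECTIONS = ("send", "receive")  # assumed labels for the two video directions

    def __init__(self):
        # Starts empty: nothing is allowed until a user explicitly grants it.
        self._granted = set()

    def grant(self, user, direction):
        """Record the user's explicit go-ahead for one direction."""
        if direction not in self.DIRECTIONS:
            raise ValueError(f"unknown direction: {direction!r}")
        self._granted.add((user, direction))

    def revoke(self, user, direction):
        """Withdraw a go-ahead; revoking something never granted is harmless."""
        self._granted.discard((user, direction))

    def is_allowed(self, user, direction):
        """Anything not explicitly granted, including unknown input, is denied."""
        return (user, direction) in self._granted
```

The design choice is that there is no "allow" default anywhere: a fresh user, a revoked grant, and an unrecognized direction all fall through to denial.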
How can this system deal with video archival and playback? Each segment may have a different encryption and security level, and thus may allow people to see only small portions of it. If you have video playback, how can you tell it's not live? Could it be a bot playing back prerecorded video snippets? Is it live or is it Memorex? Does it matter? If you expose the source of the video, is it in violation of the source's privacy or anonymity? What if the source is a bot? Is it a violation of the bot's privacy?