PROBLEM STATEMENT

RESEARCH QUESTION

DESIGN OF AUDIO AND VIDEO SUB SYSTEM

IMPLEMENTATION OF AUDIO AND VIDEO SUB SYSTEM

EVALUATION AND RESULTS


As a service that relies on the Internet, a Web meeting tool is directly affected by underlying networking problems. Real-time audio and video communication both require a steady Internet link with enough available bandwidth. When these conditions are not met, several problems can arise:

- for audio conferencing: unpleasant or even unintelligible sound playback;
- for video conferencing: blocking and jerky video playback.

These problems can seriously degrade communication quality, making an Internet conferencing solution practically useless. That is why most current commercial and open-source Internet meeting solutions cannot reliably be used in the African context.



Because the primary objective of a Web meeting tool is to support human communication, special emphasis needs to be put on user experience and satisfaction. One research objective in this project is to study how the user experience can be improved despite networking problems. Investigations focus on how to reprioritize features and offer the best tradeoff between quality and usability. The main research question is:

- Is it possible to construct an effective audio-video conferencing tool that can cope with low bandwidth and unstable networking conditions?



The system is based on a client-server model. A client can either record and send an audio-video stream or
receive and play it back. The server acts as a central hub: it receives multimedia streams from one client
and forwards them to the rest of the participants. The figure below illustrates the overall system architecture.


The sound is recorded at a rate of 8000 samples per second, with a sample size of 8 bits. The
resulting audio bit rate is 64 kbps, which is compressed with the ZIP format into a 28 kbps audio stream.
For video, each image is captured individually. The difference between two consecutive images is then
calculated, compressed with JPEG and sent across the network. A key frame is sent after a certain
number of iterations, to cope with the image degradation that results from the accumulation of differences.
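The differencing scheme with periodic key frames can be sketched as follows. This is a minimal illustration, not the prototype's code: frames are assumed to be flat lists of 8-bit grayscale pixels, the JPEG compression step is left out, and `KEY_FRAME_INTERVAL`, `encode_stream` and `decode_stream` are hypothetical names with an arbitrarily chosen interval (the text only says "a certain number of iterations").

```python
from typing import Iterable, Iterator, List, Optional, Tuple

# Assumed value: the source does not state how often a key frame is sent.
KEY_FRAME_INTERVAL = 10

Frame = List[int]            # flat list of 8-bit grayscale pixel values (assumption)
Packet = Tuple[str, Frame]   # ("key", pixels) or ("diff", per-pixel deltas)


def encode_stream(frames: Iterable[Frame],
                  key_interval: int = KEY_FRAME_INTERVAL) -> Iterator[Packet]:
    """Yield a key frame every `key_interval` frames and differences otherwise."""
    previous: Optional[Frame] = None
    for index, frame in enumerate(frames):
        if previous is None or index % key_interval == 0:
            yield ("key", list(frame))
        else:
            # Per-pixel difference, wrapped into 0..255 so it still fits in one byte.
            yield ("diff", [(cur - prev) % 256 for cur, prev in zip(frame, previous)])
        previous = frame


def decode_stream(packets: Iterable[Packet]) -> Iterator[Frame]:
    """Rebuild full frames by applying each difference to the last decoded frame."""
    previous: Optional[Frame] = None
    for kind, payload in packets:
        if kind == "key":
            current = list(payload)
        else:
            if previous is None:
                raise ValueError("difference packet received before any key frame")
            current = [(prev + delta) % 256 for prev, delta in zip(previous, payload)]
        yield current
        previous = current
```

Because this sketch skips the lossy JPEG step, decoding is exact. With lossy compression of the differences, small errors would build up from frame to frame, which is the degradation the periodic key frame is meant to reset.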

Congestion detection is based on the average inter-arrival time of audio packets. When the average time
between two consecutive audio packet receptions exceeds 2 seconds, congestion is suspected and actions are
taken to reduce bandwidth usage.
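A detector of this kind might look like the sketch below. The 2-second threshold comes from the text; the sliding window of recent gaps (`WINDOW_SIZE`) and the class and method names are assumptions made for illustration, since the source does not say how many packets the average covers.

```python
from collections import deque
from typing import Deque, Optional

CONGESTION_THRESHOLD_S = 2.0  # threshold stated in the text
WINDOW_SIZE = 5               # assumed: number of recent gaps that are averaged


class CongestionDetector:
    """Flags congestion when the mean gap between audio packets grows too large."""

    def __init__(self, threshold_s: float = CONGESTION_THRESHOLD_S,
                 window: int = WINDOW_SIZE) -> None:
        self.threshold_s = threshold_s
        self.gaps: Deque[float] = deque(maxlen=window)  # oldest gaps drop out
        self.last_arrival: Optional[float] = None

    def on_audio_packet(self, arrival_time_s: float) -> bool:
        """Record one packet arrival and return True if congestion is suspected."""
        if self.last_arrival is not None:
            self.gaps.append(arrival_time_s - self.last_arrival)
        self.last_arrival = arrival_time_s
        return self.is_congested()

    def is_congested(self) -> bool:
        """Compare the average of the recorded gaps with the threshold."""
        if not self.gaps:
            return False
        return sum(self.gaps) / len(self.gaps) > self.threshold_s
```

A caller would pass a monotonic timestamp for every received audio packet and, when the method returns True, apply whatever bandwidth-saving action the application supports, for example lowering the video frame rate.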



The final prototype is a module that provides audio and video communication among the participants of a virtual meeting. This module can be combined with other components: screen sharing, presentation, text chat, floor control and participant list.




The analysis of bandwidth usage shows that half-duplex communication is more efficient than full duplex. This is particularly true in a meeting context, where communication is less interactive than in a phone call, for example. In a meeting with 2 participants, half duplex uses 76% of the bandwidth required by full duplex. The gap widens sharply as the number of participants grows: for 8 users, half duplex uses only 13% of the bandwidth required by full duplex. The graph below compares bandwidth usage between half and full duplex as the number of users increases.


Image differentiation reduces bandwidth usage by sending only the difference between consecutive images across the network. The experiment indicates an average gain of 28% in bandwidth usage when image differentiation is used. This gain increases at higher frame rates, because consecutive images are then more similar to each other.

Combining image differentiation with low frame rates can result in a very light video stream. When the image is updated every 5 seconds, the resulting video stream requires only 15 kbps (~ 2 KB/s), which is roughly half of the audio stream's bandwidth.

The audio compression scheme adopted reduces the sound stream to 44% of its initial size without quality degradation. The resulting stream is evaluated as good or excellent by 86% of users.
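The sketch below illustrates why a lossless scheme causes no quality degradation. It uses Python's `zlib` (DEFLATE, the algorithm commonly used inside ZIP archives) as a stand-in for the prototype's ZIP compression, and a synthetic 440 Hz tone as a stand-in for recorded speech. Both are assumptions, and the compression ratio obtained on such a tone says nothing about the 44% figure measured on real audio.

```python
import math
import zlib

SAMPLE_RATE_HZ = 8000   # sampling rate of the audio stream
SAMPLE_SIZE_BITS = 8    # bits per sample


def raw_bit_rate_bps(sample_rate_hz: int = SAMPLE_RATE_HZ,
                     sample_size_bits: int = SAMPLE_SIZE_BITS) -> int:
    """Uncompressed bit rate: samples per second times bits per sample."""
    return sample_rate_hz * sample_size_bits


def synth_tone(duration_s: float, freq_hz: float = 440.0) -> bytes:
    """Synthetic 8-bit unsigned PCM tone (illustrative input, not real speech)."""
    count = int(duration_s * SAMPLE_RATE_HZ)
    samples = []
    for i in range(count):
        value = 127.5 + 127.5 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ)
        samples.append(min(255, max(0, int(value))))
    return bytes(samples)


def compress_chunk(pcm: bytes) -> bytes:
    """Losslessly compress one chunk of PCM audio before sending it."""
    return zlib.compress(pcm, 9)


def decompress_chunk(data: bytes) -> bytes:
    """Restore the exact PCM bytes on the receiving side."""
    return zlib.decompress(data)


pcm = synth_tone(1.0)
packed = compress_chunk(pcm)
ratio = len(packed) / len(pcm)  # fraction of the original size actually sent
```

The round trip `decompress_chunk(compress_chunk(pcm))` returns the original bytes unchanged, so the only cost of the compression is processing time, not audio quality.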

The video was evaluated in 3 different settings: low frame rate (0.2 FPS), medium frame rate (1 FPS) and high frame rate (3 FPS). For the low frame rate video stream, as expected, the quality of the video is perceived as “not that good” by users. More importantly, the video stream helped to convey a sense of presence ranked as either “good” or “strong” by 75% of users. This demonstrates that even with limited bandwidth, it is possible to supply a minimalistic video stream that can significantly enhance the meeting experience. Users' appreciation of the video's contribution to the meeting naturally increases as the frame rate rises.

The prototype requires only 64 kbps at the server to reliably host an audio meeting with 2 participants. When low frame rate video is used with audio, 96 kbps is needed to host a consistent meeting with 2 participants. A server with 512 kbps of bandwidth can host a meeting with: 18 participants using audio only, 11 participants using audio and low frame rate video, 5 participants using audio and medium frame rate video, or 2 participants using audio and high frame rate video. The table and graph below give the expected number of participants for different bandwidths at the server.
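One simple model that is consistent with the audio-only and low frame rate figures is sketched below. It assumes half-duplex operation in which the server receives one stream and forwards it to the other n - 1 participants, so that total server traffic is n times the stream bit rate. This model is an assumption made for illustration, not the method stated by the authors, and it does not cover the medium and high frame rate cases, whose video bit rates are not given.

```python
AUDIO_KBPS = 28           # compressed audio stream (from the text)
LOW_FPS_VIDEO_KBPS = 15   # video at one image every 5 seconds (from the text)


def max_participants(server_kbps: float, stream_kbps: float) -> int:
    """Largest n such that n * stream_kbps fits in the server bandwidth.

    Assumed model: one inbound stream plus (n - 1) forwarded copies.
    """
    return int(server_kbps // stream_kbps)


audio_only = max_participants(512, AUDIO_KBPS)                           # 18
audio_low_video = max_participants(512, AUDIO_KBPS + LOW_FPS_VIDEO_KBPS)  # 11
```

Under the same assumption, a 2-participant meeting would need 56 kbps for audio only and 86 kbps with low frame rate video, both within the 64 kbps and 96 kbps reported for the prototype.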




 
Hussein Suleman
Antoine Bagula