Low Resource, Post-processed Lecture Recording from 4K Video Streams

Fitzhenry, Charles and Khatieb, Tanweer and Marais, Patrick and Marquard, Stephen (2022) Low Resource, Post-processed Lecture Recording from 4K Video Streams, Proceedings of 43rd Conference of the South African Institute of Computer Scientists and Information Technologists (SAICSIT), 18-20 July 2022, Cape Town, EPiC Series in Computing, 85, 15-35, EPiC.

Abstract

Many universities are using lecture recording technology to expand the reach of their teaching programs, and to continue instruction when face-to-face lectures are not possible. Increasingly, high-resolution 4K cameras are used, since they allow for easy reading of board/screen context. Unfortunately, while 4K cameras are now quite affordable, the back-end computing infrastructure to process and distribute a multitude of recorded 4K streams can be costly. Furthermore, the bandwidth requirements for a 4K stream are exorbitant, running to over 2 GB for a 45-60 minute lecture. These factors militate against the use of such technology in a low-resource environment, and motivated our investigation into methods to reduce resource requirements for both the institution and students. We describe the design and implementation of a low-resource 4K lecture recording solution, which addresses these problems through a computationally efficient video processing pipeline. The pipeline consists of a front-end, which segments presenter motion and writing/board surfaces from the stream, and a back-end, which serves as a virtual cinematographer (VC), combining this contextual information to draw attention to the lecturer and relevant content. The bandwidth saving is realized by defining a smaller fixed-size, context-sensitive ‘cropping window’ and generating a new video from the crop regions. The front-end utilises computationally cheap temporal frame differencing at its core: this does not require expensive GPU hardware and also limits the memory required for processing. The VC receives a small set of motion/content bounding boxes and applies established framing heuristics to determine which region to extract from the full 4K frame. Performance results coupled with a user survey show that the system is fit for purpose: it is able to produce good presenter framing/context over a range of challenging lecture venue layouts and lighting conditions, within a time that is acceptable for lecture video processing.
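
As a rough illustration of the two ideas named in the abstract, temporal frame differencing and a fixed-size cropping window, the following minimal sketch may help. It is not the authors' implementation: the function names `motion_bbox` and `crop_window`, the difference threshold of 25, and the 1920x1080 crop size are all assumptions chosen for the example, and the real virtual cinematographer applies framing heuristics that are not modelled here.

```python
import numpy as np

def motion_bbox(prev_frame, curr_frame, threshold=25):
    """Return (x0, y0, x1, y1) enclosing pixels that changed, or None.

    Frames are 2-D greyscale uint8 arrays; the threshold is an assumed value.
    """
    # Cast to a signed type so the subtraction cannot wrap around in uint8.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    ys, xs = np.nonzero(diff > threshold)
    if xs.size == 0:
        return None  # no motion detected between the two frames
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1

def crop_window(bbox, frame_size, crop_size):
    """Centre a fixed-size crop on bbox and clamp it to the frame bounds."""
    frame_w, frame_h = frame_size
    crop_w, crop_h = crop_size
    cx = (bbox[0] + bbox[2]) // 2
    cy = (bbox[1] + bbox[3]) // 2
    x0 = min(max(cx - crop_w // 2, 0), frame_w - crop_w)
    y0 = min(max(cy - crop_h // 2, 0), frame_h - crop_h)
    return x0, y0, x0 + crop_w, y0 + crop_h

# Synthetic 4K greyscale frames: a bright patch appears near the right edge.
prev = np.zeros((2160, 3840), dtype=np.uint8)
curr = prev.copy()
curr[1000:1200, 3700:3800] = 200

bbox = motion_bbox(prev, curr)
window = crop_window(bbox, frame_size=(3840, 2160), crop_size=(1920, 1080))
print(bbox, window)
```

Because the crop has a fixed size, the clamping step shifts the window back inside the frame when motion occurs near an edge instead of shrinking it, so every output frame has the same resolution.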

Item Type:	Conference paper
Subjects:	Computing methodologies > Computer graphics > Image manipulation > Image processing; Applied computing > Education > Distance learning
Alternate Locations:	https://www.easychair.org/publications/download/V8j2
Date Deposited:	12 Sep 2022 10:55
Last Modified:	12 Sep 2022 10:55
URI:	https://pubs.cs.uct.ac.za/id/eprint/1531
