Fitzhenry, Charles and Khatieb, Tanweer and Marais, Patrick and Marquard, Stephen (2022) Low Resource, Post-processed Lecture Recording from 4K Video Streams, Proceedings of 43rd Conference of the South African Institute of Computer Scientists and Information Technologists (SAICSIT), 18-20 July 2022, Cape Town, EPiC Series in Computing, 85, 15-35, EPiC.
Abstract
Many universities are using lecture recording technology to expand the reach of their teaching programs, and to continue instruction when face to face lectures are not possible. Increasingly, high-resolution 4K cameras are used, since they allow for easy reading of board/screen context. Unfortunately, while 4K cameras are now quite affordable, the back-end computing infrastructure to process and distribute a multitude of recorded 4K streams can be costly. Furthermore, the bandwidth requirements for a 4K stream are exorbitant - running to over 2GB for a 45-60 minute lecture. These factors militate against the use of such technology in a low-resource environment, and motivated our investigation into methods to reduce resource requirements for both the institution and students. We describe the design and implementation of a low resource 4K lecture recording solution, which addresses these problems through a computationally efficient video processing pipeline. The pipeline consists of a front-end, which segments presenter motion and writing/board surfaces from the stream, and a back-end, which serves as a virtual cinematographer (VC), combining this contextual information to draw attention to the lecturer and relevant content. The bandwidth saving is realized by defining a smaller fixed-size, context-sensitive ‘cropping window’ and generating a new video from the crop regions. The front-end utilises computationally cheap temporal frame differencing at its core: this does not require expensive GPU hardware and also limits the memory required for processing. The VC receives a small set of motion/content bounding boxes and applies established framing heuristics to determine which region to extract from the full 4K frame.
Performance results coupled to a user survey show that the system is fit for purpose: it is able to produce good presenter framing/context, over a range of challenging lecture venue layouts and lighting conditions within a time that is acceptable for lecture video processing.
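The two ideas named in the abstract, temporal frame differencing and a fixed-size cropping window, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `motion_box` and `crop_window`, the threshold of 25, and the simple centre-and-clamp rule are assumptions chosen for illustration, and frames are modelled as plain lists of grayscale intensities rather than real 4K video.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]          # grayscale frame as rows of 0-255 intensities
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def motion_box(prev: Frame, curr: Frame, threshold: int = 25) -> Optional[Box]:
    """Bounding box of pixels whose intensity changed by more than `threshold`.

    Hypothetical helper: the threshold value is an assumption, not from the paper.
    """
    xs, ys = [], []
    for y, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for x, (a, b) in enumerate(zip(row_prev, row_curr)):
            if abs(a - b) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # no motion detected between the two frames
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def crop_window(box: Box, frame_w: int, frame_h: int,
                crop_w: int, crop_h: int) -> Box:
    """Fixed-size window centred on `box`, clamped to stay inside the frame.

    Hypothetical stand-in for the framing heuristics, which the abstract
    does not specify.
    """
    cx = (box[0] + box[2]) // 2
    cy = (box[1] + box[3]) // 2
    left = min(max(cx - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(cy - crop_h // 2, 0), frame_h - crop_h)
    return (left, top, left + crop_w, top + crop_h)
```

Only two frames need to be held in memory at a time, which is consistent with the abstract's point that frame differencing keeps compute and memory requirements low.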
Item Type: | Conference paper |
---|---|
Subjects: | Computing methodologies > Computer graphics > Image manipulation > Image processing; Applied computing > Education > Distance learning |
Alternate Locations: | https://www.easychair.org/publications/download/V8j2 |
Date Deposited: | 12 Sep 2022 10:55 |
Last Modified: | 12 Sep 2022 10:55 |
URI: | https://pubs.cs.uct.ac.za/id/eprint/1531 |