Embedded Real-time Deep Learning for a Smart Guitar: A Case Study on Expressive Guitar Technique Recognition

Stefani, Domenico

doi:10.15168/11572_399995

Smart musical instruments are an emerging class of digital musical instruments designed for music creation in an interconnected Internet of Musical Things scenario. These instruments aim to integrate embedded computation, real-time feature extraction, gesture acquisition, and networked communication technologies. As embedded computers become more capable and new embedded audio platforms are developed, new avenues for real-time embedded gesture acquisition open up. Expressive guitar technique recognition is the task of detecting notes and classifying the playing techniques used by the musician on the instrument. Real-time recognition of expressive guitar techniques in a smart guitar would allow players to control sound synthesis or to wirelessly interact with a wide range of interconnected devices and stage equipment during performance. Despite expressive guitar technique recognition being a well-researched topic in the field of Music Information Retrieval, the creation of a lightweight real-time recognition system that can be deployed on an embedded platform still remains an open problem. In this thesis, expressive guitar technique recognition is investigated by focusing on real-time execution, and the execution of deep learning inference on resource-constrained embedded computers. Initial efforts have focused on clearly defining the challenges of embedded real-time music information retrieval, and on the creation of a first, fully embedded, real-time expressive guitar technique recognition system. The insight gained, led to the refinement of the various steps of the proposed recognition pipeline. As a first refinement step, a novel procedure for the optimization of onset detectors was developed. The proposed procedure adopts an evolutionary algorithm to find parameter configurations that are optimal both in terms of detection accuracy and latency. A subsequent study is devoted to shedding light on the performance of generic deep learning inference engines for embedded real-time audio classification. This consisted of a comparison of four common inferencing libraries, which focus on the applicability of each library to real-time audio inference, and their performance in terms of execution time and several additional metrics. Different insights from these studies supported the development of a new expressive guitar technique classifier, which is accompanied by an in-depth analysis of different aspects of the recognition problem. Finally, the experience collected during these studies culminated in the definition of a procedure to deploy deep learning inference to a prominent embedded platform. These investigations have been shown to improve the state-of-the-art by proposing approaches that surpass previous alternatives and providing new knowledge on problems and tools that can aid the creation of a smart guitar. The new knowledge provided was also adopted for embedded audio tasks that differ from real-time expressive guitar technique recognition.

Embedded Real-time Deep Learning for a Smart Guitar: A Case Study on Expressive Guitar Technique Recognition / Stefani, Domenico. - (2024 Jan 11), pp. 1-205. [10.15168/11572_399995]