
Music signal processing for automatic extraction of harmonic and rhythmic information / Khadkevich, Maksim. - (2011), pp. 1-143.

Music signal processing for automatic extraction of harmonic and rhythmic information

Khadkevich, Maksim
2011-01-01

Abstract

This thesis addresses the automatic extraction of harmonic and rhythmic information from music audio signals, using a statistical framework and advanced signal processing methods. Among the different research directions, automatic extraction of chords and key has always been of great interest to the Music Information Retrieval (MIR) community. Chord progressions and key information can serve as a robust mid-level representation for a variety of MIR tasks. We propose statistical approaches to the automatic extraction of chord progressions within a framework based on Hidden Markov Models (HMMs). The general ideas we rely on have already proved effective in speech recognition. We propose novel probabilistic approaches that include an acoustic modeling layer and a language modeling layer. We investigate the use of standard N-grams and Factored Language Models (FLMs) for automatic chord recognition. Feature extraction is another central topic of this work. We develop a set of new features that belong to the chroma family. We introduce a set of novel chroma features based on a Pseudo-Quadrature Mirror Filter (PQMF) bank. We show the advantage of using the Time-Frequency Reassignment (TFR) technique to derive better acoustic features. Tempo estimation and beat structure extraction are among the most challenging tasks in MIR. We develop a novel method for beat/downbeat estimation from audio. It is based on the same statistical approach and consists of two hierarchical levels: acoustic modeling and beat sequence modeling. We propose a very specific beat duration model that exploits an HMM structure without self-transitions. We introduce a new feature set that exploits the advantages of the harmonic-impulsive component separation technique. The proposed methods are compared with numerous state-of-the-art approaches through participation in the MIREX competition, which currently offers the best impartial assessment of MIR systems.
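To make the two-layer idea concrete, the following is a minimal illustrative sketch and not the method developed in the thesis. It assumes binary major/minor triad templates in place of trained acoustic models, and a single fixed self-transition probability in place of the N-gram or factored language models. The names `chord_templates`, `decode_chords`, `stay_prob` and `sharpness` are hypothetical and chosen only for this example.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chord_templates():
    """Binary 12-bin chroma templates for the 24 major and minor triads."""
    templates = {}
    for root in range(12):
        for suffix, third in (("", 4), ("m", 3)):
            bins = [0.0] * 12
            for interval in (0, third, 7):
                bins[(root + interval) % 12] = 1.0
            templates[NOTE_NAMES[root] + suffix] = bins
    return templates


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def decode_chords(chroma_frames, stay_prob=0.9, sharpness=10.0):
    """Viterbi decoding of one chord label per 12-bin chroma frame.

    Assumed stand-ins: the emission score is a scaled cosine similarity to a
    binary template (acoustic layer), and the transition model is a fixed
    probability of staying on the same chord (sequence layer).
    """
    templates = chord_templates()
    labels = list(templates)
    n = len(labels)
    if not chroma_frames:
        return []

    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (n - 1))

    def log_emission(frame):
        return [sharpness * cosine(frame, templates[label]) for label in labels]

    # Uniform prior over chords, so the first frame contributes emissions only.
    score = log_emission(chroma_frames[0])
    back_pointers = []
    for frame in chroma_frames[1:]:
        emit = log_emission(frame)
        new_score, pointers = [], []
        for j in range(n):
            best_i, best_val = 0, -math.inf
            for i in range(n):
                val = score[i] + (log_stay if i == j else log_move)
                if val > best_val:
                    best_i, best_val = i, val
            new_score.append(best_val + emit[j])
            pointers.append(best_i)
        score = new_score
        back_pointers.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [labels[s] for s in path]
```

In this sketch, a high `stay_prob` smooths over isolated frames that briefly resemble another chord, which is the role a sequence model plays on top of frame-level acoustic scores.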
2011
XXIII
2011-2012
Ingegneria e Scienza dell'Informaz (cess.4/11/12)
Information and Communication Technology
Omologo, Maurizio
no
English
Sector INF/01 - Computer Science
Files in this item:
File: Maksim_Khadkevich_PhD_Thesis.pdf
Access: open access
Type: Doctoral Thesis (Tesi di dottorato)
License: All rights reserved
Size: 2.28 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/367673