TY - JOUR
T1 - Normalized cuts for predominant melodic source separation
AU - Lagrange, Mathieu
AU - Martins, Luis Gustavo
AU - Murdoch, Jennifer
AU - Tzanetakis, George
N1 - Funding Information:
Manuscript received December 21, 2006; revised August 22, 2007. This work was supported by the National Science and Research Council of Canada (NSERC), the Canada Foundation for Innovation (CFI), the Portuguese Foundation for Science and Technology (FCT), and the Fundação Calouste Gulbenkian (Portugal). The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Michael M. Goodwin.
PY - 2008/2
Y1 - 2008/2
N2 - The predominant melodic source, frequently the singing voice, is an important component of musical signals. In this paper, we describe a method for extracting the predominant source and corresponding melody from "real-world" polyphonic music. The proposed method is inspired by ideas from computational auditory scene analysis. We formulate predominant melodic source tracking and formation as a graph partitioning problem and solve it using the normalized cut which is a global criterion for segmenting graphs that has been used in computer vision. Sinusoidal modeling is used as the underlying representation. A novel harmonicity cue which we term harmonically wrapped peak similarity is introduced. Experimental results supporting the use of this cue are presented. In addition, we show results for automatic melody extraction using the proposed approach.
AB - The predominant melodic source, frequently the singing voice, is an important component of musical signals. In this paper, we describe a method for extracting the predominant source and corresponding melody from "real-world" polyphonic music. The proposed method is inspired by ideas from computational auditory scene analysis. We formulate predominant melodic source tracking and formation as a graph partitioning problem and solve it using the normalized cut which is a global criterion for segmenting graphs that has been used in computer vision. Sinusoidal modeling is used as the underlying representation. A novel harmonicity cue which we term harmonically wrapped peak similarity is introduced. Experimental results supporting the use of this cue are presented. In addition, we show results for automatic melody extraction using the proposed approach.
KW - Computational auditory scene analysis (CASA)
KW - Music information retrieval (MIR)
KW - Normalized cut
KW - Sinusoidal modeling
KW - Spectral clustering
UR - http://www.scopus.com/inward/record.url?scp=64849087459&partnerID=8YFLogxK
U2 - 10.1109/TASL.2007.909260
DO - 10.1109/TASL.2007.909260
M3 - Article
AN - SCOPUS:64849087459
SN - 1558-7916
VL - 16
SP - 278
EP - 290
JO - IEEE Transactions on Audio, Speech and Language Processing
JF - IEEE Transactions on Audio, Speech and Language Processing
IS - 2
M1 - 4432646
ER -