Auditory Sparse Coding
Venue
Music Data Mining, CRC Press/Chapman Hall (2011)
Publication Year
2011
Authors
Steven R. Ness, Thomas Walters, Richard F. Lyon
Abstract
The concept of sparsity has attracted considerable interest in the field of machine
learning in the past few years. Sparse feature vectors consist mostly of zeros,
with only one or a few non-zero values. Although these feature vectors can be
classified by traditional machine learning algorithms, such as support vector
machines (SVMs), there are
various recently-developed algorithms that explicitly take advantage of the sparse
nature of the data, leading to massive speedups, as well as improved
performance. Some fields that have benefited from the use of sparse algorithms are
finance, bioinformatics, text mining, and image classification. Because of their
speed, these algorithms perform well on very large collections of data; large
collections are becoming increasingly relevant given the huge amounts of data
collected and warehoused by Internet businesses. We discuss the application of
sparse feature vectors in the field of audio analysis, and specifically their use
in conjunction with preprocessing systems that model the human auditory system. We
present results that demonstrate the applicability of the combination of
auditory-based processing and sparse coding to content-based audio analysis tasks:
a search task in which ranked lists of sound effects are retrieved from text
queries, and a music information retrieval (MIR) task dealing with the
classification of music into genres.
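
As a rough illustration of why sparsity can yield speedups, the following minimal sketch stores only the non-zero entries of a feature vector and scores it against a dense weight vector. It is not the authors' implementation: the names `to_sparse` and `sparse_dot`, the vector length of 1000, and the weight values are all assumptions chosen for the example.

```python
from typing import Dict, List

# A sparse vector is represented here as {index: value}, keeping non-zeros only.
SparseVector = Dict[int, float]


def to_sparse(dense: List[float]) -> SparseVector:
    """Drop the zero entries of a dense vector, keeping index -> value pairs."""
    return {i: v for i, v in enumerate(dense) if v != 0.0}


def sparse_dot(x: SparseVector, weights: List[float]) -> float:
    """Dot product that visits only the non-zero entries of x."""
    return sum(v * weights[i] for i, v in x.items())


# Hypothetical 1000-dimensional feature vector with two active features.
dense_features = [0.0] * 1000
dense_features[3] = 1.0
dense_features[417] = 2.0

# Hypothetical linear-model weights (all 0.5, purely for illustration).
weights = [0.5] * 1000

sparse_features = to_sparse(dense_features)
score = sparse_dot(sparse_features, weights)
print(len(sparse_features), score)
```

Here the dot product touches two entries rather than 1000, so the cost of scoring scales with the number of non-zero features rather than with the dimensionality of the vector.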
