Transfer Learning In MIR: Sharing Learned Latent Representations For Music Audio Classification And Similarity
Venue
14th International Conference on Music Information Retrieval (ISMIR '13) (2013)
Publication Year
2013
Authors
Philippe Hamel, Matthew E. P. Davies, Kazuyoshi Yoshii, Masataka Goto
BibTeX
Abstract
This paper discusses the concept of transfer learning and its potential
applications to MIR tasks such as music audio classification and similarity. In a
traditional supervised machine learning setting, a system can only use labeled data
from a single dataset to solve a given task. The labels associated with the dataset
define the nature of the task to solve. A key advantage of transfer learning lies in
leveraging knowledge from related tasks to improve performance on a given target
task. One way to transfer knowledge is to learn a shared latent representation
across related tasks. This method has been shown to be beneficial in many domains of
machine learning, but has yet to be explored in MIR. Many MIR datasets for audio
classification present a semantic overlap in their labels. Furthermore, these
datasets often contain relatively few songs. Thus, there is a strong case for
exploring methods to share knowledge between these datasets towards a more general
and robust understanding of high-level musical concepts such as genre and
similarity. Our results show that shared representations can improve classification
accuracy. We also show how transfer learning can improve performance for music
similarity.
