Large Scale Content-Based Audio Retrieval from Text Queries

Gal Chechik

Eugene Ie

Martin Rehn

Samy Bengio

Richard F. Lyon

ACM International Conference on Multimedia Information Retrieval (MIR), ACM (2008)


Abstract

In content-based audio retrieval, the goal is to find sound recordings (audio documents) based on their acoustic features. This content-based approach differs from retrieval approaches that index media files using metadata such as file names and user tags.

In this paper, we propose a machine learning approach for retrieving sounds that is novel in that it (1) uses free-form text queries rather than sound-sample-based queries, (2) searches by audio content rather than via textual metadata, and (3) can scale to a very large number of audio documents and a very rich query vocabulary. We handle generic sounds, including a wide variety of sound effects, animal vocalizations and natural scenes. We test a scalable approach based on a passive-aggressive model for image retrieval (PAMIR), and compare it to two state-of-the-art approaches: Gaussian mixture models (GMM) and support vector machines (SVM).
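
The abstract does not spell out the PAMIR update itself, so the following is only a minimal sketch of a generic passive-aggressive ranking step, assuming a bilinear score between a text-query vector and an audio feature vector and a training triplet of (query, relevant document, irrelevant document). The function names, the aggressiveness cap `C`, and the toy vectors are illustrative assumptions, not details taken from the paper.

```python
def score(W, q, d):
    """Bilinear relevance score q^T W d (assumed form) for query q and document d."""
    return sum(q[i] * W[i][j] * d[j] for i in range(len(q)) for j in range(len(d)))


def pa_rank_update(W, q, d_pos, d_neg, C=1.0):
    """One passive-aggressive step on a (query, relevant, irrelevant) triplet.

    Modifies W in place and returns the hinge loss measured before the update.
    """
    loss = max(0.0, 1.0 - score(W, q, d_pos) + score(W, q, d_neg))
    if loss == 0.0:
        return loss  # passive: the relevant document already wins by the margin
    diff = [p - n for p, n in zip(d_pos, d_neg)]
    # Squared Frobenius norm of the rank-one direction q * diff^T.
    v_norm_sq = sum(x * x for x in q) * sum(x * x for x in diff)
    if v_norm_sq == 0.0:
        return loss  # no direction to move in
    tau = min(C, loss / v_norm_sq)  # aggressive: step size capped by C
    for i, qi in enumerate(q):
        for j, dj in enumerate(diff):
            W[i][j] += tau * qi * dj
    return loss


# Toy example: 2 query terms, 2 acoustic features (hypothetical values).
W = [[0.0, 0.0], [0.0, 0.0]]
q, d_pos, d_neg = [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]
first_loss = pa_rank_update(W, q, d_pos, d_neg)
second_loss = pa_rank_update(W, q, d_pos, d_neg)
```

Each step touches only one triplet and costs time proportional to the size of `W`, which is the kind of property that lets an online method of this family scale to large collections.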

We test our approach on two large real-world datasets: a collection of
short sound effects, and a noisier and larger collection of
user-contributed, user-labeled recordings (25K files, 2000-term
vocabulary). We find that all three methods achieved very good
retrieval performance. For instance, a positive document is retrieved
in the first position of the ranking more than half the time, and on
average there are more than 4 positive documents in the first 10
retrieved, for both datasets. PAMIR completed both training and
retrieval of all data in less than 6 hours for both datasets, on a
single machine. It was one to three orders of magnitude faster than
the competing approaches. This approach should therefore scale to much
larger datasets in the future.
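
The two figures quoted above (a positive document in first position, and the number of positives among the first 10 retrieved) correspond to precision at rank 1 and rank 10. As a hedged illustration of how such numbers can be computed, here is a small sketch; the function names and the toy ranking are assumptions made for the example, not the paper's evaluation code.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are relevant to the query."""
    top_k = ranked_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k


def mean_precision_at_k(rankings, relevance, k):
    """Average precision@k over queries.

    rankings:  dict mapping query -> list of document ids, best first
    relevance: dict mapping query -> set of relevant document ids
    """
    scores = [precision_at_k(rankings[q], relevance[q], k) for q in rankings]
    return sum(scores) / len(scores)


# Toy data with hypothetical document ids.
rankings = {"dog bark": ["a", "b", "c", "d"], "rain": ["x", "y", "z", "w"]}
relevance = {"dog bark": {"a", "c"}, "rain": {"y"}}
p_at_1 = mean_precision_at_k(rankings, relevance, 1)
p_at_4 = mean_precision_at_k(rankings, relevance, 4)
```

Under this definition, "more than 4 positive documents in the first 10 retrieved" corresponds to a mean precision at 10 above 0.4.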

Research Areas

Machine Intelligence
