General sound classification and similarity in MPEG-7

Published online by Cambridge University Press: 15 February 2002

Michael Casey
Affiliation:
MERL Cambridge Research Laboratory, Cambridge, USA E-mail: casey@merl.com

Abstract

We introduce a system for generalised sound classification and similarity using a machine-learning framework. Applications of the system include automatic classification of environmental sounds, musical instruments, music genre and human speakers. In addition to classification, the system may also be used for computing similarity metrics between a target sound and other sounds in a database. We discuss the use of hidden Markov models for representing the temporal evolution of audio spectra and present results of testing the system on classification and retrieval tasks. The system has been incorporated into the MPEG-7 international standard for multimedia content description and is therefore publicly available in the form of a set of standardised interfaces and software reference tools for developers and researchers.
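
As an illustration of the classification idea mentioned in the abstract, the following minimal sketch scores an observation sequence against one hidden Markov model per sound class and picks the class with the highest likelihood. It is not the MPEG-7 reference software: the discrete symbols (standing in for quantised spectral frames), the two toy class models `steady` and `alternating`, and all function names are assumptions made for this example.

```python
import math


def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    if peak == -math.inf:
        return -math.inf
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm: log P(obs | model) for a discrete-observation HMM.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability that state i emits symbol k
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def classify(obs, models):
    """Return (best class name, per-class log-likelihoods)."""
    scores = {name: log_likelihood(obs, *params) for name, params in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy models: symbol 0 = "low-band frame", 1 = "high-band frame".
EMIT = [[0.9, 0.1], [0.1, 0.9]]
MODELS = {
    "steady": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], EMIT),
    "alternating": ([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], EMIT),
}

label, scores = classify([0, 1, 0, 1, 0, 1, 0, 1], MODELS)
print(label, scores)
```

The per-class log-likelihoods returned by `classify` can also be sorted to rank class models against a query, which is one simple way to think about likelihood-based retrieval; a real system would work on continuous spectral features rather than the two-symbol alphabet assumed here.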

Type
Research Article
Copyright
© 2001 Cambridge University Press