Category: Conference

Feature Learning with Deep Scattering for Urban Sound Analysis

10/6/2015

In this paper we evaluate the scattering transform as an alternative signal representation to the mel-spectrogram in the context of unsupervised feature learning for urban sound classification. We show that we can obtain comparable (or better) performance using the scattering transform whilst reducing both the amount of training data required for feature learning and the size of the learned codebook by an order of magnitude. In both cases the improvement is attributed to the local phase invariance of the representation. We also observe improved classification of sources in the background of the auditory scene, a result that provides further support for the importance of temporal modulation in sound segregation.

For further details please see our paper:

J. Salamon and J. P. Bello. "Feature Learning with Deep Scattering for Urban Sound Analysis", 2015 European Signal Processing Conference (EUSIPCO), Nice, France, August 2015.
[EURASIP][PDF][BibTex]

Tony: A New Tool for Transcribing Melodies

3/4/2015

We present Tony, a software tool for the interactive annotation of melodies from monophonic audio recordings, and evaluate its usability and the accuracy of its note extraction method. The scientific study of acoustic performances of melodies, whether sung or played, requires the accurate transcription of notes and pitches. To achieve the desired transcription accuracy for a particular application, researchers manually correct results obtained by automatic methods. Tony is an interactive tool directly aimed at making this correction task efficient. It provides (a) state-of-the-art algorithms for pitch and note estimation, (b) visual and auditory feedback for easy error-spotting, (c) an intelligent graphical user interface through which the user can rapidly correct estimation errors, and (d) extensive export functions enabling further processing in other applications. We show that Tony’s built-in automatic note transcription method compares favourably with existing tools. We report how long it takes to annotate recordings on a set of 96 solo vocal recordings and study the effect of piece, the number of edits made and the annotator’s increasing mastery of the software. Tony is Open Source software, with source code and compiled binaries for Windows, Mac OS X and Linux available from:
https://code.soundsoftware.ac.uk/projects/tony/

Screenshot of the Tony interface on OSX.

For further details please check out our paper:

M. Mauch, C. Cannam, R. Bittner, G. Fazekas, J. Salamon, J. Dai, J. P. Bello, and S. Dixon. Computer-aided melody note transcription using the Tony software: Accuracy and efficiency. In First International Conference on Technologies for Music Notation and Representation (TENOR), Paris, France, May 2015.
[TENOR][PDF][BibTex]

Unsupervised Feature Learning for Urban Sound Classification

5/2/2015

Recent studies have demonstrated the potential of unsupervised feature learning for sound classification. In this paper we further explore the application of the spherical k-means algorithm for feature learning from audio signals, here in the domain of urban sound classification. Spherical k-means is a relatively simple technique that has recently been shown to be competitive with other more complex and time-consuming approaches. We study how different parts of the processing pipeline influence performance, taking into account the specificities of the urban sonic environment. We evaluate our approach on the largest public dataset of urban sound sources available for research, and compare it to a baseline system based on MFCCs. We show that feature learning can outperform the baseline approach by configuring it to capture the temporal dynamics of urban sources. The results are complemented with error analysis and some proposals for future research.
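For readers unfamiliar with spherical k-means, here is a minimal NumPy sketch of the generic textbook form of the algorithm: every input vector is scaled to unit length, vectors are assigned to the centroid with the highest cosine similarity, and each centroid is re-estimated as the normalised sum of its members. This is an illustration only and not necessarily the exact variant used in the paper. The function name, the parameters (`n_iter`, `seed`) and the random initialisation are assumptions, and the rest of the pipeline described in the abstract (spectrogram patches, any preprocessing, the classifier) is left out.

```python
import numpy as np


def spherical_kmeans(X, k, n_iter=20, seed=0):
    """Illustrative spherical k-means (hypothetical helper, not the paper's code).

    X is an (n_samples, n_features) array; returns (centroids, labels).
    """
    rng = np.random.default_rng(seed)

    # Project every sample onto the unit sphere (guarding against zero vectors).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.maximum(norms, 1e-12)

    # Assumed initialisation: k distinct samples chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()

    for _ in range(n_iter):
        # For unit vectors the dot product equals the cosine similarity.
        labels = (X @ centroids.T).argmax(axis=1)

        for c in range(k):
            members = X[labels == c]
            if len(members) == 0:
                continue  # keep the previous centroid if a cluster is empty
            total = members.sum(axis=0)
            norm = np.linalg.norm(total)
            if norm > 0:
                # Re-normalise so the centroid stays on the unit sphere.
                centroids[c] = total / norm

    labels = (X @ centroids.T).argmax(axis=1)
    return centroids, labels
```

In a feature learning setting the learned centroids act as a codebook: a new frame or patch can be encoded by its similarities to the centroids, for example with `frame @ centroids.T` after normalising the frame in the same way.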

J. Salamon and J. P. Bello. "Unsupervised Feature Learning for Urban Sound Classification", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, April 2015.
[IEEE][DOI][PDF][BibTeX][Copyright]

mir_eval wins best poster presentation at ISMIR 2014

30/10/2014

Our paper "mir_eval: A Transparent Implementation of Common MIR Metrics", led and presented by the fearless Colin Raffel, has won the Best Poster Presentation Award at the ISMIR 2014 conference!

Here's the paper's abstract:
Central to the field of MIR research is the evaluation of algorithms used to extract information from music data. We present mir_eval, an open source software library which provides a transparent and easy-to-use implementation of the most common metrics used to measure the performance of MIR algorithms. In this paper, we enumerate the metrics implemented by mir_eval and quantitatively compare each to existing implementations. When the scores reported by mir_eval differ substantially from the reference, we detail the differences in implementation. We also provide a brief overview of mir_eval’s architecture, design, and intended use.

A massive congratulations to comrades Colin, Brian, Eric, Oriol, Dawen and Dan for creating this awesome project, and in particular to Colin for leading this initiative and doing a fantastic job of presenting it at ISMIR today!

You can check out mir_eval here: https://github.com/craffel/mir_eval
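To give a flavour of the kind of metric the abstract refers to, here is a small standalone sketch of an onset detection F-measure: an estimated onset counts as correct if it lies within a tolerance window of a reference onset that has not already been matched. This does not use or reproduce mir_eval's code or API. The function name, the greedy matching strategy and the 50 ms default tolerance are assumptions made for illustration, so for real evaluations please use the library itself.

```python
def onset_f_measure(reference, estimated, window=0.05):
    """Illustrative onset F-measure (hypothetical helper, not mir_eval's API).

    reference and estimated are sequences of onset times in seconds.
    Returns (f_measure, precision, recall).
    """
    ref = sorted(reference)
    est = sorted(estimated)
    if not ref or not est:
        return 0.0, 0.0, 0.0

    # Greedy two-pointer matching over the sorted onset lists:
    # each reference onset can be matched to at most one estimate.
    i = j = matched = 0
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= window:
            matched += 1
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1  # estimate is too early to match this or any later reference
        else:
            i += 1  # reference onset was missed

    precision = matched / len(est)
    recall = matched / len(ref)
    if matched == 0:
        return 0.0, precision, recall
    f_measure = 2 * precision * recall / (precision + recall)
    return f_measure, precision, recall
```

For example, calling `onset_f_measure([0.1, 0.5, 1.0, 1.5], [0.12, 0.7, 1.02])` matches two of the three estimates against two of the four reference onsets.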

ESSENTIA wins ACM Multimedia '13 Best Open Source Software Award

31/12/2013

ESSENTIA is an audio analysis software library developed at the MTG over the past eight years, to which I am proud to have made my small contribution too (through the great effort of Dmitry Bogdanov).

Recently ESSENTIA was released as open source software, and shortly after won the ACM Multimedia 2013 Best Open Source Software Award! A massive congratulations to everyone at the MTG who has worked on the library over the years, and especially to Dmitry Bogdanov and Nicolas Wack.

ESSENTIA's first open source release is accompanied by two papers:

D. Bogdanov, N. Wack, E. Gómez, S. Gulati, P. Herrera, O. Mayor, G. Roma, J. Salamon, J. Zapata and X. Serra, "ESSENTIA: an Audio Analysis Library for Music Information Retrieval", in Proc. 14th International Society for Music Information Retrieval Conference (ISMIR 2013), Curitiba, Brazil, November 2013.

[ISMIR][PDF][BibTex]

D. Bogdanov, N. Wack, E. Gómez, S. Gulati, P. Herrera, O. Mayor, G. Roma, J. Salamon, J. Zapata and X. Serra, "ESSENTIA: an Open-Source Library for Sound and Music Analysis", in 21st ACM Int. Conf. on Multimedia, Barcelona, Spain, Oct. 2013.

[ACM][PDF][BibTex]

Back from ISMIR 2012

22/10/2012

The ISMIR 2012 conference is now over and it was a very productive experience, and quite intense too!

The week started off with the tutorials on Monday, of which I'd like to highlight the tutorial on Reusable software and reproducibility in music informatics research, which discussed important software development issues that are often overlooked by the research community.

On Tuesday I didn't have to present anything and so could enjoy a relaxed day of interesting presentations, posters, and conversations with the members of the MIR community, who with each conference become more and more like an extended "academic family" :)

Wednesday was definitely the most intense: it started off with my oral presentation which was the first of the day, describing our work in:

J. Salamon, G. Peeters and A. Röbel, "Statistical Characterisation of Melodic Pitch Contours and its Application for Melody Extraction", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.

The talk was well received, although being in the first session of the day meant the audience didn't attack with the usual set of feisty questions. Thanks to Kazuyoshi Yoshii for the interesting question and suggestion!

Next up was our poster presentation with Julián:

J. Salamon and J. Urbano, "Current Challenges in the Evaluation of Predominant Melody Extraction Algorithms", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.

The poster managed to attract considerable interest, which is good, because it means the Audio Melody Extraction Annotation Initiative might become a reality after all! Stay posted on this topic...

On Thursday we presented two posters, one on tonic detection in Indian classical music and another on tracking melodic patterns in Flamenco music:

J. Salamon, S. Gulati and X. Serra, "A Multipitch Approach to Tonic Identification in Indian Classical Music", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.

A. Pikrakis, F. Gómez, S. Oramas, J. M. D. Báñez, J. Mora, F. Escobar, E. Gómez and J. Salamon, "Tracking Melodic Patterns in Flamenco Singing by Analyzing Polyphonic Music Recordings", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.

On Friday Emilia presented our work on Flamenco transcription:

E. Gómez, F. Cañadas, J. Salamon, J. Bonada, P. Vera and P. Cabañas, "Predominant Fundamental Frequency Estimation vs Singing Voice Separation for the Automatic Transcription of Accompanied Flamenco Singing", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.

And in the afternoon during the Late Breaking / Demo Session I demo'ed MELODIA, our recently released Melody Extraction vamp plug-in, which was a great opportunity to get some feedback from the community regarding the plug-in.

I should also mention that this year there was massive MTG participation, which you can read more about here. To get an idea, here's a photo we took of all MTG current and ex-members who attended ISMIR 2012, and even here there are a couple of people missing because they didn't make it in time for the photo:

Finally, ISMIR was of course also a good chance to enjoy Portuguese food, wine (vino de porto) and beer together with fellow researchers.

On a more serious note though, it was an excellent forum to get feedback for my work, and I highly recommend ISMIR for PhD students working in a related area.

Next year... Brazil!

Two More ISMIR Papers - 5 In Total!

3/7/2012

I'm glad to confirm that two more papers which I have co-authored on computational analysis of Flamenco music have also been accepted to the ISMIR 2012 conference, upping the total of accepted papers to 5!

Here's the list of all five accepted papers (also on my Publications page):

J. Salamon, G. Peeters and A. Röbel, "Statistical Characterisation of Melodic Pitch Contours and its Application for Melody Extraction", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.
J. Salamon and J. Urbano, "Current Challenges in the Evaluation of Predominant Melody Extraction Algorithms", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.
J. Salamon, S. Gulati and X. Serra, "A Multipitch Approach to Tonic Identification in Indian Classical Music", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.
E. Gómez, F. Cañadas, J. Salamon, J. Bonada, P. Vera and P. Cabañas, "Predominant Fundamental Frequency Estimation vs Singing Voice Separation for the Automatic Transcription of Accompanied Flamenco Singing", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.
A. Pikrakis, F. Gómez, S. Oramas, J. M. D. Báñez, J. Mora, F. Escobar, E. Gómez and J. Salamon, "Tracking Melodic Patterns in Flamenco Singing by Analyzing Polyphonic Music Recordings", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.

Three papers accepted for ISMIR 2012

9/6/2012

I'm glad to announce that three papers for which I am the first author have been accepted for publication at the 2012 International Society for Music Information Retrieval (ISMIR) Conference. The papers are titled:

Statistical Characterisation of Melodic Pitch Contours and its Application for Melody Extraction (J. Salamon, G. Peeters and A. Röbel)
Current Challenges in the Evaluation of Predominant Melody Extraction Algorithms (J. Salamon and J. Urbano)
A Multipitch Approach to Tonic Identification in Indian Classical Music (J. Salamon, S. Gulati and X. Serra)

I would also like to thank and congratulate my co-authors on these papers! PDFs of the papers will be uploaded to my publications page shortly.

Porto here we come!
