For further details please see our paper:
J. Salamon and J. P. Bello. "Feature Learning with Deep Scattering for Urban Sound Analysis", 2015 European Signal Processing Conference (EUSIPCO), Nice, France, August 2015.
[EURASIP][PDF][BibTex]
In this paper we evaluate the scattering transform as an alternative signal representation to the mel-spectrogram in the context of unsupervised feature learning for urban sound classification. We show that we can obtain comparable (or better) performance using the scattering transform whilst reducing both the amount of training data required for feature learning and the size of the learned codebook by an order of magnitude. In both cases the improvement is attributed to the local phase invariance of the representation. We also observe improved classification of sources in the background of the auditory scene, a result that provides further support for the importance of temporal modulation in sound segregation.
We present Tony, a software tool for the interactive annotation of melodies from monophonic audio recordings, and evaluate its usability and the accuracy of its note extraction method. The scientific study of acoustic performances of melodies, whether sung or played, requires the accurate transcription of notes and pitches. To achieve the desired transcription accuracy for a particular application, researchers manually correct results obtained by automatic methods. Tony is an interactive tool directly aimed at making this correction task efficient. It provides (a) state-of-the-art algorithms for pitch and note estimation, (b) visual and auditory feedback for easy error-spotting, (c) an intelligent graphical user interface through which the user can rapidly correct estimation errors, and (d) extensive export functions enabling further processing in other applications. We show that Tony’s built-in automatic note transcription method compares favourably with existing tools. We report how long it takes to annotate recordings on a set of 96 solo vocal recordings and study the effect of piece, the number of edits made and the annotator’s increasing mastery of the software.
Tony is Open Source software, with source code and compiled binaries for Windows, Mac OS X and Linux available from: https://code.soundsoftware.ac.uk/projects/tony/
For further details please check out our paper:
M. Mauch, C. Cannam, R. Bittner, G. Fazekas, J. Salamon, J. Dai, J. P. Bello, and S. Dixon. Computer-aided melody note transcription using the Tony software: Accuracy and efficiency. In First International Conference on Technologies for Music Notation and Representation (TENOR), Paris, France, May 2015.
[TENOR][PDF][BibTex]

Recent studies have demonstrated the potential of unsupervised feature learning for sound classification. In this paper we further explore the application of the spherical k-means algorithm for feature learning from audio signals, here in the domain of urban sound classification. Spherical k-means is a relatively simple technique that has recently been shown to be competitive with other more complex and time-consuming approaches. We study how different parts of the processing pipeline influence performance, taking into account the specificities of the urban sonic environment. We evaluate our approach on the largest public dataset of urban sound sources available for research, and compare it to a baseline system based on MFCCs. We show that feature learning can outperform the baseline approach by configuring it to capture the temporal dynamics of urban sources. The results are complemented with error analysis and some proposals for future research.
J. Salamon and J. P. Bello. "Unsupervised Feature Learning for Urban Sound Classification", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, April 2015.
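For readers unfamiliar with spherical k-means, here is a minimal NumPy sketch of the clustering step mentioned in the abstract above. It is an illustration only, not the pipeline from the paper: the function names `spherical_kmeans` and `encode`, the random initialisation, the fixed iteration count and the toy random input are all assumptions made for this example. Preprocessing of the audio features is left out entirely.

```python
import numpy as np


def _unit_rows(X):
    """Scale each row of X to unit Euclidean norm (guarding against zero rows)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, 1e-12)


def spherical_kmeans(X, k, n_iter=20, seed=0):
    """Hypothetical helper: cluster rows of X by cosine similarity.

    Returns (centroids, labels); every centroid has unit norm.
    """
    rng = np.random.default_rng(seed)
    X = _unit_rows(np.asarray(X, dtype=float))
    # Initialise the codebook with k distinct samples (an assumption of this sketch).
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assignment: pick the centroid with the largest dot product.
        labels = np.argmax(X @ centroids.T, axis=1)
        # Update: sum the members of each cluster and project back onto the sphere.
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue  # empty cluster: keep the previous centroid
            total = members.sum(axis=0)
            norm = np.linalg.norm(total)
            if norm > 0:
                centroids[j] = total / norm
    labels = np.argmax(X @ centroids.T, axis=1)
    return centroids, labels


def encode(X, centroids):
    """Hypothetical helper: represent each sample by its similarity to every centroid."""
    return _unit_rows(np.asarray(X, dtype=float)) @ centroids.T


# Toy usage on random vectors standing in for audio feature frames.
toy = np.random.default_rng(1).normal(size=(200, 8))
codebook, assignments = spherical_kmeans(toy, k=4)
features = encode(toy, codebook)
```

The only difference from ordinary k-means in this sketch is that samples and centroids are kept on the unit sphere, so the assignment step reduces to a matrix product followed by an argmax.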
[IEEE][DOI][PDF][BibTeX][Copyright]

Our paper "mir_eval: A Transparent Implementation of Common MIR Metrics", led and presented by the fearless Colin Raffel, has won the Best Poster Presentation Award at the ISMIR 2014 conference! Here's the paper's abstract:
Central to the field of MIR research is the evaluation of algorithms used to extract information from music data. We present mir_eval, an open source software library which provides a transparent and easy-to-use implementation of the most common metrics used to measure the performance of MIR algorithms. In this paper, we enumerate the metrics implemented by mir_eval and quantitatively compare each to existing implementations. When the scores reported by mir_eval differ substantially from the reference, we detail the differences in implementation. We also provide a brief overview of mir_eval’s architecture, design, and intended use.
A massive congratulations to comrades Colin, Brian, Eric, Oriol, Dawen and Dan for creating this awesome project, and in particular to Colin for leading this initiative and doing a fantastic job of presenting it at ISMIR today! You can check out mir_eval here: https://github.com/craffel/mir_eval

ESSENTIA is an audio analysis software library developed at the MTG over the past eight years, to which I am proud to have made my small contribution too (through the great effort of Dmitry Bogdanov). Recently ESSENTIA was released as open source software, and shortly after won the ACM Multimedia 2013 Best Open Source Award! A massive congratulations to everyone at the MTG who has worked on the library over the years, and especially to Dmitry Bogdanov and Nicolas Wack. ESSENTIA's first open source release is accompanied by two papers:

The ISMIR 2012 conference is now over and it was a very productive experience, and quite intense too! The week started off with the tutorials on Monday, of which I'd like to highlight the tutorial on Reusable software and reproducibility in music informatics research, which discussed important issues in software development that are often overlooked by the research community. On Tuesday I didn't have to present anything and so could enjoy a relaxed day of interesting presentations, posters, and conversations with the members of the MIR community, who with each conference become more and more like an extended "academic family" :)
Wednesday was definitely the most intense: it started off with my oral presentation, which was the first of the day, describing our work in:
J. Salamon, G. Peeters and A. Röbel, "Statistical Characterisation of Melodic Pitch Contours and its Application for Melody Extraction", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.
The talk was well received, although being in the first session of the day meant the audience didn't attack with the usual set of feisty questions. Thanks to Kazuyoshi Yoshii for the interesting question and suggestion! Next up was our poster presentation with Julián:
J. Salamon and J. Urbano, "Current Challenges in the Evaluation of Predominant Melody Extraction Algorithms", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.
The poster managed to attract considerable interest, which is good, because it means the Audio Melody Extraction Annotation Initiative might become a reality after all! Stay posted on this topic...
On Thursday we presented two posters, one on tonic detection in Indian classical music and another on tracking melodic patterns in Flamenco music:
J. Salamon, S. Gulati and X. Serra, "A Multipitch Approach to Tonic Identification in Indian Classical Music", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.
A. Pikrakis, F. Gómez, S. Oramas, J. M. D. Báñez, J. Mora, F. Escobar, E. Gómez and J. Salamon, "Tracking Melodic Patterns in Flamenco Singing by Analyzing Polyphonic Music Recordings", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.
On Friday Emilia presented our work on Flamenco transcription:
E. Gómez, F. Cañadas, J. Salamon, J. Bonada, P. Vera and P. Cabañas, "Predominant Fundamental Frequency Estimation vs Singing Voice Separation for the Automatic Transcription of Accompanied Flamenco Singing", in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, October 2012.
And in the afternoon, during the Late Breaking / Demo Session, I demo'ed MELODIA, our recently released Melody Extraction vamp plug-in, which was a great opportunity to get some feedback from the community regarding the plug-in.
I should also mention that this year there was massive MTG participation, which you can read more about here. To get an idea, here's a photo we took of all MTG current and ex-members who attended ISMIR 2012, and even here there are a couple of people missing because they didn't make it in time for the photo:
Finally, ISMIR was of course also a good chance for enjoying Portuguese food, wine (vino de porto) and beer together with fellow researchers.
On a more serious note though, it was an excellent forum to get feedback for my work, and I highly recommend ISMIR for PhD students working in a related area. Next year... Brazil!

I'm glad to confirm that two more papers which I have co-authored on computational analysis of Flamenco music have also been accepted to the ISMIR 2012 conference, upping the total of accepted papers to 5!
Here's the list of all five accepted papers (also on my Publications page):
I'm glad to announce that three papers for which I am the first author have been accepted for publication at the 2012 International Society for Music Information Retrieval (ISMIR) Conference. The papers are titled:
I would also like to thank and congratulate my co-authors on these papers! PDFs of the papers will be uploaded to my publications page shortly. Porto here we come!