Abstract
This paper describes the methods used in our submission
to the LifeCLEF 2014 Bird task. A segmentation algorithm was developed
to segment the audio files of the Bird task dataset.
The resulting segments are used to select relevant Mel-Frequency Cepstral Coefficient
(MFCC) frames from the MFCC dataset. Three datasets are
created: 48, containing only the mean MFCC per segment; 96, containing
the mean and variance of the MFCCs in a segment; and 240, containing
the mean, the variance, and the means of three sections. These datasets are
shuffled and split into a test set and a training set to train Deep Neural Networks
with several topologies, which classify the segments of the
datasets. The best network correctly
classified 73% of the segments. A run from our system
placed us 6th out of 10 participating teams. Follow-up research
found that shuffling the data before splitting introduces overfitting,
which can be reduced by not shuffling the datasets prior to splitting and
by using dropout networks.
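
The three feature sets described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and it rests on assumptions the abstract does not confirm: that each MFCC frame has 48 coefficients (inferred from the dataset names, since 96 = 2 × 48 and 240 = 5 × 48), and that the "three sections" are three consecutive, roughly equal parts of a segment. The function name `segment_features` and the dictionary keys are hypothetical.

```python
import numpy as np


def segment_features(frames: np.ndarray, n_sections: int = 3) -> dict:
    """Build the three per-segment feature vectors from MFCC frames.

    frames: array of shape (n_frames, n_coeffs) holding the MFCC frames
    of one segment. n_coeffs = 48 is an assumption, not stated in the source.
    """
    if frames.ndim != 2 or frames.shape[0] < n_sections:
        raise ValueError("need a 2-D array with at least n_sections frames")

    mean = frames.mean(axis=0)  # one mean per coefficient
    var = frames.var(axis=0)    # one variance per coefficient

    # Assumed reading of "three sections": consecutive parts of the segment.
    # np.array_split also handles frame counts not divisible by n_sections.
    sections = np.array_split(frames, n_sections, axis=0)
    section_means = np.concatenate([s.mean(axis=0) for s in sections])

    return {
        "mean": mean,
        "mean_var": np.concatenate([mean, var]),
        "mean_var_sections": np.concatenate([mean, var, section_means]),
    }


# Usage with synthetic data standing in for one segment of 30 frames.
rng = np.random.default_rng(0)
features = segment_features(rng.normal(size=(30, 48)))
for name, vector in features.items():
    print(name, vector.shape)
```

With 48 coefficients per frame, the three vectors have 48, 96, and 240 entries, matching the dataset names in the abstract.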
Original language | English |
---|---|
Title of host publication | CLEF2014 Working Notes |
Subtitle of host publication | Sheffield, UK, September 15-18, 2014 |
Editors | Linda Cappellato, Nicola Ferro, Martin Halvey, Wessel Kraaij |
Pages | 634-642 |
Publication status | Published - 2014 |
Publication series
Name | CEUR Workshop Proceedings |
---|---|
Volume | 1180 |
ISSN (Electronic) | 1613-0073 |
Keywords
- Deep Learning
- Neural Networks
- Feature Learning
- Birdsong Recognition
- Bioacoustics