[aubio-user] Onset detection in Python

Lukasz Tracewski lukasz.tracewski at outlook.com
Wed Mar 19 14:47:02 CET 2014

Hi all,

> Interesting. I will try to give it a go later this week. The default method works well on percussive sounds. You might want to try the "phase" method.

I did try all the methods and found "energy" to be the best suited to my application. It seems they all have their uses: methods considered "poor" for music annotation do perfectly well in other disciplines, such as bird call detection. I am glad there is such a selection in aubio!

> Interesting, I just made some tests, and even with a 45 minutes long,
> scipy is still 3 or 4 times slower. wow, memory is cheap nowadays, and
> file access is slow. still, loading the entire file in memory is usually
> *not* the best approach, at least not the one i prefer. :-)

Sure, scipy is slower at loading files. What is faster in my case is loading the file into memory and then running scipy filtering on the whole sample rather than frame by frame. I have yet to try the filtering in aubio; I had no clue such a thing existed. Even if I get high-pass filtering working your way, I am afraid I still need the whole sample in memory for other reasons.

One of the crucial steps in my case is noise reduction by spectral subtraction. I use aubio onsets to find potential animal calls. Once they are identified, I can also tell which regions are noise-only, analyse their spectral content and then subtract the noise from the whole sample. As you can see, having the complete file in memory cannot be avoided in this case (or rather, trying to avoid it would increase complexity and reduce the effectiveness of the method: what if the only noise-only region is at the end of the sample?).
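
To make the idea concrete, here is a minimal sketch of magnitude spectral subtraction using only numpy. It is a simplification under stated assumptions, not the code used in the project: the function name, the boolean noise mask (which would be derived from the detected onsets), the non-overlapping rectangular frames and the frame size are all illustrative choices.

```python
import numpy as np

def spectral_subtract(samples, noise_mask, frame_size=512):
    """Subtract the mean noise magnitude spectrum from every frame.

    samples    : 1-D float array holding the whole recording
    noise_mask : boolean array of the same length, True where a sample
                 is judged noise-only (hypothetical input, e.g. every
                 region in which no onset was found)
    """
    n_frames = len(samples) // frame_size
    usable = n_frames * frame_size  # a trailing partial frame is dropped
    frames = samples[:usable].reshape(n_frames, frame_size)
    mask_frames = noise_mask[:usable].reshape(n_frames, frame_size)

    spectra = np.fft.rfft(frames, axis=1)
    magnitude = np.abs(spectra)
    phase = np.angle(spectra)

    # Only frames that are noise-only throughout feed the noise profile.
    is_noise = mask_frames.all(axis=1)
    if not is_noise.any():
        raise ValueError("no noise-only frame available")
    noise_profile = magnitude[is_noise].mean(axis=0)

    # Subtract the profile from every frame and clip negatives to zero.
    cleaned = np.maximum(magnitude - noise_profile, 0.0)
    restored = np.fft.irfft(cleaned * np.exp(1j * phase), n=frame_size, axis=1)
    return restored.reshape(-1)
```

Because the noise profile is averaged over every noise-only frame wherever it occurs, the whole recording has to be available before any frame can be cleaned. A practical version would more likely use overlapping, windowed frames to avoid artefacts at frame boundaries.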

> why do you need this hstack line? you could use aubio.digital_filter and
> use the highpassed_samples directly.

The hstack is there to get the complete sample into memory by concatenating the frames read by "source". I have just explained the reason above. Is there a smarter way of doing this?
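
One possible alternative, sketched below, is to collect the frames in a list and concatenate them once at the end, rather than calling hstack on a growing array for every frame. The helper name `read_all` is hypothetical, and the sketch assumes only that the reader is a callable returning a `(samples, n_read)` pair, which is the convention an aubio "source" object follows.

```python
import numpy as np

def read_all(reader, hop_size):
    """Collect every frame produced by `reader` into one 1-D array.

    reader   : callable returning (samples, n_read) on each call,
               as an aubio source object does (assumed convention)
    hop_size : number of samples requested per call
    """
    chunks = []
    while True:
        samples, n_read = reader()
        # Copy the valid part in case the reader reuses its output buffer.
        chunks.append(np.array(samples[:n_read], copy=True))
        if n_read < hop_size:  # a short read marks the end of the file
            break
    # A single concatenation avoids reallocating the array for each frame.
    return np.concatenate(chunks)
```

With aubio this would be called along the lines of `read_all(src, src.hop_size)`, where `src` is an already opened source.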

Anyway, thanks for the tip about using filters! An example in Python would indeed be welcome, to make people more aware of this nice feature. The filter coefficients can easily be produced with the scipy.signal module.
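
As an illustration of producing such coefficients, the sketch below designs a second-order Butterworth high-pass filter with scipy.signal and applies it to a whole signal in one call. The function name, the filter order and the cutoff are assumptions chosen for the example, not values from this thread.

```python
import numpy as np
from scipy import signal

def highpass_whole(samples, samplerate, cutoff_hz):
    """High-pass filter a complete signal in a single call.

    Returns the filtered samples together with the (b, a) coefficients
    of a second-order Butterworth design.
    """
    nyquist = samplerate / 2.0
    # butter() expects the cutoff as a fraction of the Nyquist frequency.
    b, a = signal.butter(2, cutoff_hz / nyquist, btype="highpass")
    return signal.lfilter(b, a, samples), b, a
```

A second-order design yields three `b` and three `a` coefficients with `a[0]` equal to 1, which is the biquad form. As far as I know, aubio.digital_filter has a set_biquad method taking b0, b1, b2, a1, a2, but the exact call is worth checking against the aubio documentation.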

> that's great, i love thinking these lines of code can help a bit!

The project supports efforts to protect the kiwi, the famous flightless bird from New Zealand. We count their calls (and also determine gender) in each file and use this information to understand ecosystem health and how well protection efforts are working. As you can imagine, finding that there is *something* of interest in a recording is crucial: if there are too many false positives, noise reduction by spectral subtraction becomes inefficient. The opposite risk is even greater: if a bird call goes undetected, it will be included in the noise-only regions and then subtracted from the whole sample, effectively eliminating many possible candidates.

Processing time becomes important when there are truly many samples. As mentioned before, there are 10000 hours of recordings. To get them identified we will probably buy time on an Amazon Web Services EC2 compute-optimized instance (something with 32 virtual CPUs). Time is money in this case, so keeping it short is crucial.
