Database Errors & Question RE: Species with different call bandwidths (#148) · Issues · public_projects / ketos

Hi all, my question is twofold: the first part concerns errors when creating a database, and the second concerns recommendations for a network with calls from species in vastly different frequency bands. I am attempting to create a network that covers low-frequency calls from humpbacks (~200 Hz to 1500 Hz) and delphinids that predominantly produce burst pulses (~4-30 kHz), in an acoustic environment with a lot of ambient and anthropogenic noise. I attached my spectrogram config file and training/validation datasets below. I didn't have any problems with the steps leading up to database creation; the errors appeared when adding the training dataset.

Using the framework provided in the tutorial (thank you SOO much for that!), I got to the point of creating a database (step 8 of the tutorial) using a spectrogram config file with a sampling rate of 70 kHz (to match the wav files) and a freq_max of 35000. I'm guessing this may be what causes the errors below, which I encountered while adding the training data to the database (only 974 of ~9700 were added to the database):

Error 1:
28%|██▊ | 2599/9310 [37:22<2:22:58, 1.28s/it]
Warning: while loading 5353.210412031504.wav, Message: invalid number of data points (0) specified

Error 2:
55%|█████▍ | 5078/9310 [1:21:36<04:41, 15.04it/s]
Warning: while writing 5353.210403101502.wav, Message: could not broadcast input array from shape (110,8961) into shape (94,8961)
Warning: while writing 5353.210403101502.wav, Message: could not broadcast input array from shape (110,8961) into shape (94,8961)
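
One way to narrow these down before rebuilding the database might be to check the selection table against the audio files. The first warning (0 data points) would be consistent with a selection that has zero length or starts beyond the end of its file, and the second (110 vs. 94 rows) with selections of unequal length, though both are guesses on my part. Below is a minimal sketch using only the Python standard library. The column names `filename`, `start`, `end`, the 3.0 s target length, and the example rows are all assumptions for illustration, not taken from my actual CSV files, and `wav_duration` only handles plain PCM wav files.

```python
import wave

def wav_duration(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()

def check_selection(start, end, file_duration, target_length=None, tol=1e-6):
    """Return a list of problems found for one selection (times in seconds)."""
    problems = []
    if end <= start:
        problems.append("zero or negative length")
    if start >= file_duration:
        problems.append("starts after end of file")
    elif end > file_duration + tol:
        problems.append("extends past end of file")
    if target_length is not None and abs((end - start) - target_length) > tol:
        problems.append("length differs from target")
    return problems

# Hypothetical rows; in practice they would be read from the selections CSV
# with csv.DictReader, using whatever column names the file actually has.
selections = [
    {"filename": "a.wav", "start": 1.0, "end": 4.0},
    {"filename": "a.wav", "start": 58.5, "end": 61.5},
    {"filename": "a.wav", "start": 70.0, "end": 73.0},
    {"filename": "a.wav", "start": 10.0, "end": 13.5},
]
# Hypothetical duration; in practice: {name: wav_duration(path_to(name))}
durations = {"a.wav": 60.0}

report = {
    i: check_selection(s["start"], s["end"], durations[s["filename"]], target_length=3.0)
    for i, s in enumerate(selections)
}
for i, problems in report.items():
    if problems:
        print(i, selections[i]["filename"], problems)
```

Rows flagged as starting after the end of the file or having zero length would be candidates for the first warning; rows whose length differs from the target would be candidates for the shape mismatch in the second.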

Thoughts appreciated!

In general, I'm wondering whether using a single spectrogram configuration file for calls in very different frequency bands might cause issues. If all spectrogram clips are given the same maximum frequency, then the spectrograms containing humpback calls will include a large band of noise above the calls, which could hurt performance. Am I correct to assume this would be problematic for humpback whale detection? Ideally, I would like to set a different maximum frequency for each of the two call types. Is this possible, or are there problems with this approach?
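
To make the idea concrete, here is a rough sketch of what per-band settings could look like, together with a simple Nyquist check (freq_max should not exceed half the sampling rate). The key names `rate`, `freq_min`, `freq_max`, the humpback rate of 4000 Hz, and the freq_min of 0 for my current config are assumptions chosen for illustration, not values from my spec_config.json. This is plain Python and does not use any ketos functions; whether a single network can take inputs from two differently configured spectrograms is exactly what I am asking.

```python
def check_band(rate, freq_min, freq_max):
    """Return a list of problems for one band setting (all values in Hz)."""
    problems = []
    nyquist = rate / 2.0
    if freq_min >= freq_max:
        problems.append("freq_min is not below freq_max")
    if freq_max > nyquist:
        problems.append("freq_max above Nyquist")
    elif freq_max == nyquist:
        problems.append("freq_max equals Nyquist")
    return problems

# Illustrative settings only (hypothetical values, see the note above).
bands = {
    # Humpback calls ~200-1500 Hz: audio could be resampled to a lower rate.
    "humpback": {"rate": 4000, "freq_min": 200, "freq_max": 1500},
    # Delphinid burst pulses ~4-30 kHz at the native 70 kHz rate.
    "delphinid": {"rate": 70000, "freq_min": 4000, "freq_max": 30000},
    # Single shared config as described above: 70 kHz rate, freq_max 35000.
    "single_config": {"rate": 70000, "freq_min": 0, "freq_max": 35000},
}

results = {name: check_band(**cfg) for name, cfg in bands.items()}
for name, problems in results.items():
    print(name, problems if problems else "ok")
```

The shared setting is flagged only because freq_max sits exactly at the Nyquist frequency; that is a boundary case worth a look, and I am not certain it is related to the errors above.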

spec_config.json

TrainingSelections.csv

ValSelections.csv
