Chaining together sound segments from separate wav files (#125) · Issues · public_projects / ketos

Chaining together sound segments from separate wav files

The audio_loader module in ketos provides useful functionality for loading audio segments from wav files. For example, the segments may be provided in the form of a selection table with filename, start time, and end time columns.

However, the current implementation does not allow segments to span multiple files. This has not been a limitation so far, since we have been working primarily with short segments (a few seconds) and long audio files (several minutes), but it could become a serious issue if we want to extract longer segments whose length is comparable to, or even exceeds, the audio file length.

Therefore, we should consider generalizing the audio_loader module to be able to handle segments that span multiple files. We can probably restrict our attention to situations in which there are no temporal gaps between audio files.
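To make the idea concrete, here is a minimal sketch of how a segment could be split into per-file pieces when the files are contiguous (no temporal gaps). This is not existing ketos code: the function name `split_segment`, its arguments, and the `(filename, duration)` input format are all hypothetical, and the file durations are assumed to be known in advance.

```python
from typing import List, Tuple


def split_segment(
    files: List[Tuple[str, float]],
    start_file: str,
    start: float,
    duration: float,
) -> List[Tuple[str, float, float]]:
    """Split one segment into per-file pieces (hypothetical helper).

    files: ordered list of (filename, file_duration_in_seconds); the files
        are assumed to follow each other with no temporal gaps.
    start_file: file in which the segment starts.
    start: start time in seconds, measured from the beginning of start_file.
    duration: total length of the segment in seconds.

    Returns a list of (filename, start, end) tuples, with times measured
    from the beginning of each file.
    """
    names = [name for name, _ in files]
    idx = names.index(start_file)  # raises ValueError if the file is unknown
    pieces = []
    offset = start
    remaining = duration
    while remaining > 1e-9:
        if idx >= len(files):
            raise ValueError("segment extends past the end of the last file")
        name, file_len = files[idx]
        if offset >= file_len:
            # The start lies beyond this file; carry the offset to the next one.
            offset -= file_len
            idx += 1
            continue
        take = min(remaining, file_len - offset)
        pieces.append((name, offset, offset + take))
        remaining -= take
        offset = 0.0  # subsequent files are read from their beginning
        idx += 1
    return pieces


# Three contiguous 1-minute files and a 90 s segment starting 50 s into the first
files = [("a.wav", 60.0), ("b.wav", 60.0), ("c.wav", 60.0)]
pieces = split_segment(files, "a.wav", start=50.0, duration=90.0)
```

Each piece could then be loaded with the existing single-file logic and the resulting arrays concatenated, provided all files share the same sampling rate.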

Note that this will also require changes to ketos' selection table format and potentially also the annotation table format. In particular, selections (and annotations) will need to be associated with not one, but two (or more) filenames.

We should strive to implement these changes in a manner that ensures backward compatibility. In particular, it should still be possible to specify only a single filename for selections and annotations if these never span multiple files.
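One possible way to keep the table format backward compatible is to let the filename field hold either a single string or an ordered list of strings, and to normalize it on load. The sketch below only illustrates that idea; the helper `normalize_filenames` and the convention that start and end times are measured from the beginning of the first listed file are assumptions, not part of the current ketos format.

```python
from typing import List, Sequence, Union


def normalize_filenames(value: Union[str, Sequence[str]]) -> List[str]:
    """Return the filename field as a list (hypothetical helper).

    A plain string (the existing single-file format) becomes a one-element
    list; an ordered sequence of filenames is returned as a list.
    """
    if isinstance(value, str):
        return [value]
    names = list(value)
    if not names or not all(isinstance(n, str) for n in names):
        raise ValueError("filename must be a string or a non-empty list of strings")
    return names


# Old-style and new-style rows can coexist in the same table.
# Assumption: start/end are in seconds from the beginning of the first file.
selections = [
    {"filename": "a.wav", "start": 5.0, "end": 8.0},
    {"filename": ["a.wav", "b.wav"], "start": 50.0, "end": 75.0},
]
normalized = [
    dict(row, filename=normalize_filenames(row["filename"])) for row in selections
]
```

With this approach, existing tables that list a single filename per row would continue to work unchanged.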

@fsfrazao, your thoughts on this? In today's HALLO meeting (2021.07.30), Kaitlin Palmer from SMRU mentioned that their audio files are only 1 minute long. If we want to extract long segments from these files, e.g., for computing a running mean of the ambient noise levels for the PCEN frontend, we may find ourselves in a situation where it is necessary to chain together audio files ...

Edited Jul 30, 2021 by Oliver Kirsebom
