Storing attributes in HDF5 table when working with multiple audio representations (#170) · Issues · public_projects / ketos

Storing attributes in HDF5 table when working with multiple audio representations

Ketos provides the option to store multiple audio representations within the same HDF5 table. That is, a single row in the table can contain multiple 'data' fields, for example, a spectrogram and the original waveform.

However, the current implementation only saves the attributes (filename, label, offset, etc.) pertaining to the first representation: https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/blob/master/ketos/data_handling/database_interface.py#L496

This is fine if the various audio representations being saved share the same attributes, but not if they differ. In particular, I've run into a case in which the representations have different offsets.

One possible solution is to store the audio representations in different tables. However, in its current form the batch generator cannot easily be configured to load from several tables at once. Instead, we could simply add extra offset fields to the table when multiple representations are stored, e.g., offset, offset1, offset2, etc.
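
To make the proposed naming scheme concrete, here is a rough sketch in plain Python. It is not taken from the Ketos code base: the helper names `offset_field_names` and `build_row_attributes` are hypothetical, and the sketch assumes each representation's attributes are available as a dict with an `offset` key. All other attributes are still taken from the first representation, as in the current behaviour.

```python
from typing import Any, Dict, List


def offset_field_names(num_representations: int) -> List[str]:
    """Return the offset column names: offset, offset1, offset2, ...

    Hypothetical helper, not part of Ketos.
    """
    return ["offset" if i == 0 else f"offset{i}" for i in range(num_representations)]


def build_row_attributes(attrs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Flatten the attributes of several representations into one table row.

    Shared attributes (filename, label, ...) come from the first
    representation; each representation contributes its own offset field.
    """
    if not attrs:
        raise ValueError("at least one representation is required")
    # Copy every attribute except the offset from the first representation.
    row = {key: value for key, value in attrs[0].items() if key != "offset"}
    # Add one offset field per representation.
    for name, rep_attrs in zip(offset_field_names(len(attrs)), attrs):
        row[name] = rep_attrs["offset"]
    return row


# Example: a spectrogram and a waveform cut from the same file
# but starting at different offsets (values are made up).
spec_attrs = {"filename": "rec.wav", "label": 1, "offset": 0.5}
wave_attrs = {"filename": "rec.wav", "label": 1, "offset": 0.0}
row = build_row_attributes([spec_attrs, wave_attrs])
```

With a single representation the row keeps only the plain `offset` field, so existing tables would be unaffected by this scheme.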
