ketos issues
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues

Issue #188: New format for ketos models (Fabio Frazao, 2022-09-29)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/188

The current format for saved ketos models is an archive containing information on:
- pre-processing audio files: audio representation config
- the model: recipe + weights
I think it would be helpful to expand it to include more instructions on pre-processing (what to do with the data before it reaches the network) and post-processing (what to do with the network outputs).
This added information, combined with corresponding functions to read and apply these instructions, would allow for more general versions of some scripts in the ketos-scripts repo.
I would add:
- pre-processing audio files: audio representation + custom audio representation code + transform function + custom transform code
- the model: recipe + weights + custom architecture code
- post-processing: post-processing recipe + custom post-processing code
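To make this concrete, here is a sketch of what the expanded archive's manifest could look like. Every key and file name below is illustrative only, not an existing or proposed ketos format:

```python
# Hypothetical manifest for the expanded model archive.
# All keys and file names are illustrative, not the actual ketos format.
manifest = {
    "audio_representation": {
        "config": "audio_repr.json",          # existing audio representation config
        "custom_code": "custom_repr.py",      # optional: code not covered by the config
    },
    "transform": {
        "config": "transform.json",           # pre-processing transform function
        "custom_code": "custom_transform.py",
    },
    "model": {
        "recipe": "recipe.json",              # existing model recipe
        "weights": "weights",                 # existing model weights
        "custom_code": "custom_architecture.py",
    },
    "post_processing": {
        "recipe": "post_processing.json",
        "custom_code": "custom_post_processing.py",
    },
}
```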
In summary, each option could use configuration files (such as the audio representation and model recipe already in use) and, optionally, code that allows operations not covered by the configuration files (i.e., things that are not available as a ketos function/class).

Issue #179: Tensorflow protobuf package version issue (Oliver Kirsebom, 2022-07-06)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/179

Upon installing ketos and all dependencies in a fresh Python environment, I get the following error:
```
>>> import tensorflow
2022-07-05 15:13:16.039574: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-07-05 15:13:16.039611: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/oliskir/envs/env-ketos-py39/lib/python3.9/site-packages/tensorflow/__init__.py", line 37, in <module>
from tensorflow.python.tools import module_util as _module_util
File "/home/oliskir/envs/env-ketos-py39/lib/python3.9/site-packages/tensorflow/python/__init__.py", line 37, in <module>
from tensorflow.python.eager import context
File "/home/oliskir/envs/env-ketos-py39/lib/python3.9/site-packages/tensorflow/python/eager/context.py", line 29, in <module>
from tensorflow.core.framework import function_pb2
File "/home/oliskir/envs/env-ketos-py39/lib/python3.9/site-packages/tensorflow/core/framework/function_pb2.py", line 16, in <module>
from tensorflow.core.framework import attr_value_pb2 as tensorflow_dot_core_dot_framework_dot_attr__value__pb2
File "/home/oliskir/envs/env-ketos-py39/lib/python3.9/site-packages/tensorflow/core/framework/attr_value_pb2.py", line 16, in <module>
from tensorflow.core.framework import tensor_pb2 as tensorflow_dot_core_dot_framework_dot_tensor__pb2
File "/home/oliskir/envs/env-ketos-py39/lib/python3.9/site-packages/tensorflow/core/framework/tensor_pb2.py", line 16, in <module>
from tensorflow.core.framework import resource_handle_pb2 as tensorflow_dot_core_dot_framework_dot_resource__handle__pb2
File "/home/oliskir/envs/env-ketos-py39/lib/python3.9/site-packages/tensorflow/core/framework/resource_handle_pb2.py", line 16, in <module>
from tensorflow.core.framework import tensor_shape_pb2 as tensorflow_dot_core_dot_framework_dot_tensor__shape__pb2
File "/home/oliskir/envs/env-ketos-py39/lib/python3.9/site-packages/tensorflow/core/framework/tensor_shape_pb2.py", line 36, in <module>
_descriptor.FieldDescriptor(
File "/home/oliskir/envs/env-ketos-py39/lib/python3.9/site-packages/google/protobuf/descriptor.py", line 560, in __new__
_message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
```
Some contextual information:
* Ubuntu 22.04
* Python 3.9.13
* Ketos 2.6.1
* Tensorflow 2.8.0
* Pip 22.1.2
Installation method:
```
python setup.py sdist
pip install dist/ketos-2.6.1.tar.gz
```
The solution is very simple:
```
pip install protobuf==3.20.*
```
Not sure if this requires any updates to the docs?

Assignee: Bruno Padovese

Issue #176: AudioLoader and process function behavior (Bruno Padovese, 2022-07-06)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/176

The ketos tutorial create_a_narw_detector is not working in the current ketos version.
The issue comes from running an audio loader multiple times without resetting it. In past ketos versions, the audio loader was reset by default at the end (the `stop` argument was set to `False` as default). But sometime between ketos 2.4.2 and ketos 2.6, this was changed (`stop` is now set to `True`).
We can, of course, just update the tutorial and call reset when needed, with text similar to the following: "Now, to reuse the audio loader, we first have to reset it by calling the xxx function."
But could you give an explanation as to why you changed `stop` to `True` by default? It seems to me that the most common use case would be to reset the audio loader by default so that it can be reused again if necessary.
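To illustrate the behaviour difference, here is a toy stand-in (not the actual ketos AudioLoader; class and method names are mine):

```python
class ToyAudioLoader:
    """Toy stand-in for an audio loader, illustrating the `stop` semantics.

    stop=True (current default): the loader stays exhausted after one pass
    and must be reset explicitly before it can be iterated again.
    stop=False (old default): the loader resets itself at the end of a pass.
    """
    def __init__(self, selections, stop=True):
        self.selections = selections
        self.stop = stop
        self.pos = 0

    def reset(self):
        self.pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.pos >= len(self.selections):
            if not self.stop:
                self.pos = 0  # old behaviour: reset automatically at the end
            raise StopIteration
        item = self.selections[self.pos]
        self.pos += 1
        return item
```

With stop=True, a second pass yields nothing until reset() is called; with stop=False, the loader can be iterated again immediately.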
Do you remember why this change was made? @kirsebom

Issue #169: Remove dropout from ResNet? (And other ketos architectures) (Oliver Kirsebom, 2022-04-02)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/169

In my own experience, training the ketos ResNet architecture with dropout > 0 results in (far) worse performance than dropout = 0. This appears to be consistent with observations made by others; see for example https://www.kdnuggets.com/2018/09/dropout-convolutional-networks.html . I'm wondering if we should simply remove the dropout argument from the definition of the ResNetBlock and ResNetArch in ketos: https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/blob/master/ketos/neural_networks/resnet.py#L100 ?
@bpadovese , @fsfrazao , your thoughts/experiences?
I'm not sure if the dropout has a similar negative impact on the other ketos architectures (cnn, densenet, inception) ...

Issue #139: Importing Ketos occupying GPU memory (Sadman Sakib, 2022-02-03)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/139

Currently, importing any ketos package, or just importing ketos itself, occupies GPU memory to some extent. So even if the script has no other code apart from importing ketos or any ketos module, it still occupies the GPU. I checked this on two machines and got the same issue.

Issue #133: Possible problem with implementation of group_detections (Oliver Kirsebom, 2022-08-12)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/133

In the current implementation of [group_detections](https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/blob/master/ketos/neural_networks/dev_utils/detection.py#L120), the duration of a grouped detection event can be shorter than the length of a single spectrogram sample if the samples overlap in time. For example, if a spectrogram with start=5.0s and duration=3.0s triggers a detection, but the subsequent spectrogram starting at start=5.5s does not trigger a detection, the grouped detection event is assumed to begin at start=5.0s and have a duration of only 0.5s. This is reflected in this unit test:
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/blob/master/ketos/tests/neural_networks/test_detection.py#L198
However, as far as I can tell, this constraining effect is only implemented for later samples, not earlier ones. In the above unit test, for example, the spectrogram sample starting at 4.5s and ending at 7.5s does not trigger a detection, yet the grouped detection event returned by the group_detection function still starts at 5.0s. Isn't this an inconsistent treatment of future and past, so to speak?
In fact, if we were to treat earlier negatives on par with later ones, the grouped detection event in the unit test would have to be discarded altogether.
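A toy version of the grouping logic as described above (not the actual ketos implementation; the function name is mine) makes the truncation explicit:

```python
def group_overlapping_detections(starts, step):
    """Toy model of the grouping behaviour described above (not the ketos code).

    starts: sorted start times of spectrogram windows that triggered a detection;
    step: hop between consecutive windows (seconds).
    A group ends at the start of the first window that fails to trigger, which is
    why a grouped event can end up shorter than a single window.
    Returns a list of (start, duration) tuples.
    """
    groups, i = [], 0
    while i < len(starts):
        j = i
        # extend the group while consecutive windows keep triggering
        while j + 1 < len(starts) and starts[j + 1] - starts[j] <= step:
            j += 1
        # the event is cut off at the start of the next (non-triggering) window
        groups.append((starts[i], starts[j] + step - starts[i]))
        i = j + 1
    return groups
```

With a hop of 0.5 s, a lone detection at 5.0 s yields the event (5.0, 0.5) from the example above, even though the window itself is 3.0 s long.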
@fsfrazao , your thoughts?

Issue #131: Medium-term objectives for Ketos (Oliver Kirsebom, 2021-12-15)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/131

Suggested list of medium-term (6-9 months) objectives for Ketos:
1. Publish a peer-reviewed paper on Ketos
2. Transfer learning module
3. Active learning module
4. Tensorflow graph computation of spectrograms
5. New neural network architectures:
* Sequence model
* Similarity model (e.g. Siamese network)
6. Revamp/improve the HTML docs (style and structure)
7. Review all tutorials, and add new ones as needed (Ketos has many new functionalities not covered in the existing tutorials)
8. Neural net calibration (i.e. calibrate output scores so they correspond more closely to probabilities/confidences)
@fsfrazao , @bpadovese , @sadman , please chime in with your thoughts (no rush). The above list is probably too ambitious, so it would be helpful to assign priorities.
I have made this issue confidential (only visible to team members with at least Reporter access), but we could consider opening it up to more people to solicit user input?
Have a nice weekend everyone,
Oliver

Issue #125: Chaining together sound segments from separate wav files (Oliver Kirsebom, 2022-08-26)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/125

The [audio_loader](ketos/audio/audio_loader.py) module in ketos provides useful functionalities for loading audio segments from wav files. For example, the segments may be provided in the form of a [selection table](ketos/data_handling/selection_table.py) with filename, start, and end times.
However, the current implementation does not allow segments to span multiple files. This has not been a limitation so far, since we have been working primarily with short segments (a few seconds) and long audio files (several minutes), but it could become a serious issue if we want to extract longer segments with length comparable to, or even exceeding, the audio file length.
Therefore, we should consider generalizing the audio_loader module to be able to handle segments that span multiple files. We can probably restrict our attention to situations in which there are no temporal gaps between audio files.
Note that this will also require changes to ketos' selection table format and potentially also the annotation table format. In particular, selections (and annotations) will need to be associated with not one, but two (or more) filenames.
We should strive to implement these changes in a manner that ensures backward compatibility. In particular, it should still be possible to specify only a single filename for selections and annotations if these never span multiple files.
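The core of the generalization can be sketched as a helper that maps a segment on a continuous timeline onto per-file chunks. Function and argument names here are hypothetical, and gapless, ordered files are assumed, as discussed above:

```python
def split_across_files(files, seg_start, seg_len):
    """Map a segment [seg_start, seg_start + seg_len) on a continuous timeline
    onto per-file (filename, offset, length) chunks.

    files: list of (filename, duration) in playback order, assumed gapless.
    All names are hypothetical; this is a sketch, not the ketos API.
    """
    chunks, t = [], 0.0
    seg_end = seg_start + seg_len
    for name, dur in files:
        # overlap of the segment with this file's span [t, t + dur)
        lo, hi = max(seg_start, t), min(seg_end, t + dur)
        if hi > lo:
            chunks.append((name, lo - t, hi - lo))
        t += dur
    return chunks
```

For example, a 20 s segment starting at 50 s in a recording made of 60 s files would resolve to the last 10 s of the first file plus the first 10 s of the second.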
@fsfrazao , your thoughts on this? In today's HALLO meeting (2021.07.30) Kaitlin Palmer from SMRU mentioned that their audio files are only 1 minute long. If we want to extract long segments from these files, e.g., for computing a running mean of the ambient noise levels for the PCEN frontend, we may find ourselves in a situation where it is necessary to chain together audio files ...

Issue #124: abstraction layer for joining neural networks (Oliver Kirsebom, 2022-08-26)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/124

It would be useful to have methods for building composite neural networks, e.g., by joining neural networks serially or in parallel.
For example, one could envision attaching a spectrogram computation frontend or a PCEN frontend to a ResNet classification network.
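At its simplest, serial composition is just function chaining; a minimal sketch of what such an abstraction might look like (the class name is hypothetical, and real networks would also need to propagate shapes, training flags, etc.):

```python
class SerialModel:
    """Joins a sequence of networks/frontends serially: the output of each
    stage is fed as input to the next (e.g. PCEN frontend -> ResNet).
    Sketch only; stages are just callables here."""
    def __init__(self, *stages):
        self.stages = stages

    def __call__(self, x):
        for stage in self.stages:
            x = stage(x)
        return x
```

Usage would be along the lines of SerialModel(frontend, classifier)(waveform).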
Some thought would have to be put into the design of such an abstraction layer.

Issue #118: Behaviour of freeze_block and unfreeze_block (Oliver Kirsebom, 2021-05-25)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/118

I was using the ResNet freeze_block method as follows:
```python
resnet.model.freeze_block([0,1,2,3,4,5]) #freeze all 6 blocks
# now train the model for a while ...
resnet.model.freeze_block([0,1,2,3,4,5]) #freeze only the first 5 blocks
# continue training ...
```
However, eventually I realized that this was the wrong use of this method.
To achieve the desired result, I had to do as follows:
```python
resnet.model.freeze_block([0,1,2,3,4,5]) #freeze all 6 blocks
# now train the model for a while ...
resnet.model.unfreeze_block([5]) #unfreeze the last block
# continue training ...
```
I am wondering if other users might be fooled by this too?
Would it be more intuitive to implement `freeze_block` in a manner such that the past history of freezing layers is forgotten?
Or perhaps we could add an argument to enable such use?
Or at least highlight this in the doc string as a potential pitfall, e.g., via an example.
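The "forget history" semantics could be sketched like this (a toy helper over objects with a `trainable` flag; the function name and `Block` class are mine, not the ketos API):

```python
class Block:
    """Minimal stand-in for a network block with a trainable flag."""
    def __init__(self):
        self.trainable = True

def set_frozen_blocks(blocks, freeze_ids):
    """Freeze exactly the listed blocks and make all others trainable,
    so each call fully replaces any previous freeze/unfreeze history."""
    freeze_ids = set(freeze_ids)
    for i, block in enumerate(blocks):
        block.trainable = i not in freeze_ids
```

With these semantics, calling set_frozen_blocks(blocks, [0, 1, 2, 3, 4]) after freezing everything leaves the last block trainable, which is what the usage in the first snippet apparently expected.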
@fsfrazao what do you think?

Issue #114: transforms not available through Waveform or Spectrogram classes (Fabio Frazao, 2021-04-13)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/114

Some transforms are not available through the Waveform or Spectrogram classes.
These are the ones I identified:
* apply_preemphasis
- https://docs.meridian.cs.dal.ca/ketos/modules/audio/utils/filter.html#ketos.audio.utils.filter.apply_preemphasis
* apply_median_filter
- https://docs.meridian.cs.dal.ca/ketos/modules/audio/utils/filter.html#ketos.audio.utils.filter.apply_median_filter
* filter_isolated_spots
- https://docs.meridian.cs.dal.ca/ketos/modules/audio/utils/filter.html#ketos.audio.utils.filter.filter_isolated_spots
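For illustration, here is a sketch of how one of them could be exposed as a method that records itself in the transform log. The attribute names and log format below are illustrative, not the actual ketos ones; only the pre-emphasis formula itself (y[n] = x[n] - coeff * x[n-1]) is standard:

```python
import numpy as np

class Waveform:
    """Minimal stand-in used to sketch the suggestion: expose the filter
    functions as methods that append themselves to the object's transform log.
    Attribute names and log format are illustrative, not the ketos ones."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)
        self.transform_log = []

    def apply_preemphasis(self, coeff=0.97):
        # standard pre-emphasis filter: y[n] = x[n] - coeff * x[n-1]
        self.data = np.append(self.data[0], self.data[1:] - coeff * self.data[:-1])
        # record the operation so it can be replayed, like other transforms
        self.transform_log.append({"name": "apply_preemphasis", "coeff": coeff})
        return self
```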
Should they be included as methods in the appropriate classes, so they can use the same transform logs as other methods?

Assignee: Oliver Kirsebom

Issue #112: Option to specify relative sample occurrence in batch generator (Oliver Kirsebom, 2021-03-30)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/112

In an active-learning scenario where a human validates model outputs and identifies certain samples as particularly important, it would be useful to have a mechanism for over-sampling individual samples in the training process. One way to achieve this could be to assign to each sample an integer indicating how many times that sample should appear in a training epoch. By default, the value would be 1, i.e., each sample appears once per epoch. This integer (which we could call `multiplicity`) could be saved to a column in the hdf5 table. Then we could point the BatchGenerator to this column in the hdf5 table using an argument like `mult_field` or something like that.
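Reduced to its core, the proposed behaviour is just index expansion; a sketch (the function name is mine, and this is not the BatchGenerator API):

```python
import numpy as np

def epoch_indices(multiplicity, shuffle=True, seed=None):
    """Repeat each sample index according to its multiplicity, so a sample
    with multiplicity m appears m times per training epoch (sketch of the
    proposed `mult_field` behaviour, not the actual BatchGenerator code)."""
    idx = np.repeat(np.arange(len(multiplicity)), multiplicity)
    if shuffle:
        rng = np.random.default_rng(seed)
        rng.shuffle(idx)
    return idx
```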
@fsfrazao , your thoughts?

Issue #110: improved tracking of annot id (Oliver Kirsebom, 2021-04-12)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/110

When the `select` method is used to generate selections with the `keep_id` argument set to `True`, the resulting selection table includes an extra column with the id of the original annotation. However, when subsequently using the `create_database` method to create a database of the selections, the id of the annotation is lost. It would be nice if this id could also be saved to the database.
In fact, perhaps we could implement a solution to ensure that any additional fields in the selection table are saved to the database? In addition to the annotation id, another common field might be a confidence indicator.

Issue #109: Behaviour of load_audio_representation (Fabio Frazao, 2021-11-23)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/109

The current implementation requires a parameter 'name' to be passed.
If that is not specified, the default None is used and an empty dictionary is returned.
In addition, only parameters recognized by the 'parse_audio_representation' function are returned. If the specification has any extra parameters, these are ignored.
I suggest the following changes:
1) load everything that's in the json file, so if you have a 'spectrogram' and a 'waveform' representation, both would be loaded by default unless you specify which one you want.
2) return any parameters not expected by the parser as they come (i.e., in whatever format the default translation from json to python gives)
There could be options to control these behaviours in the interface, like a boolean 'return_unparsed', and set name='all' if you want to return everything.

Assignee: Oliver Kirsebom

Issue #108: Behaviour of JointBatchGen (Oliver Kirsebom, 2021-03-29)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/108

Hey @fsfrazao
I have a question about the `JointBatchGen` class:
If the input generators have different numbers of batches and have been initialized with `refresh_on_epoch=True`, while the joint batch generator is initialized with `n_batches=min` and `reset_generators=True`, will the individual generators still be refreshed at the end of each epoch? Having studied the implementation of the `JointBatchGen` class, it seems to me that only the generator with the smallest number of batches will be refreshed at the end of the epoch ...

Assignee: Fabio Frazao

Issue #100: Neural net calibration (Oliver Kirsebom, 2021-11-05)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/100

[This paper](/uploads/e04565ff79df21019e3abf0127e50ad1/guo2017_calibrating_NN.pdf) describes a straightforward approach (Platt/temperature scaling) to "calibrating" neural networks so that the output scores correspond more closely to probabilities/confidences.
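Temperature scaling itself is only a few lines; here is a minimal numpy sketch (using a grid search where the paper fits the temperature by gradient descent; function names are mine, not ketos's):

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(logits, labels):
    """Find the temperature T minimizing the negative log-likelihood of
    softmax(logits / T) on held-out validation data (simple grid search).
    T > 1 softens overconfident outputs; calibrated scores are then
    softmax(logits / T). Sketch only, not a ketos API."""
    grid = np.linspace(0.25, 10.0, 400)
    def nll(T):
        p = softmax(logits / T)
        return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))
    return grid[np.argmin([nll(T) for T in grid])]
```

At inference time one would simply divide the logits by the fitted T before the softmax, leaving the predicted class unchanged.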
It would be desirable to have this approach implemented in the next release of Ketos!

Issue #96: Reduce repo size (Oliver Kirsebom, 2021-08-11)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/96

Hi @fsfrazao ,
It appears the ketos repository still takes up 2.7 GB.
Should we try this to reduce the size:
https://stackoverflow.com/questions/13716658/how-to-delete-all-commit-history-in-github
(will remove the entire commit history)
Oliver

Issue #78: Real and imaginary components of spectrogram (Oliver Kirsebom, 2020-03-18)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/78

Currently, we work with the magnitude spectrogram (real^2 + imag^2). It would be interesting to retain the full information encoded in the complex spectrogram, to see if this can improve learning. For example, one could work with a double spectrogram representation, using the usual magnitude, real^2 + imag^2, plus the imaginary part, imag.

Issue #76: Improving the design of the Spectrogram class (Oliver Kirsebom, 2020-03-18)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/76

The current implementation of the Spectrogram class has several shortcomings on the developer side, most notably:
1. The handling of annotations and time/file data is clumsy
2. The conversion from physical values (time, frequency) to bin numbers is a constant source of trouble
3. Repeated application of the same operation (e.g. cropping) to a list of spectrograms is rather slow
4. The spectrogram shares several methods with the audio class, but these currently have separate implementations
To solve these issues, I suggest the following changes:
1. Store annotations and time/file data as numpy arrays with same 0th dimension as the spectrogram image
2. Create a separate Axis class to handle conversion from physical values to bin numbers
3. Add an extra dimension to all numpy arrays to allow for multiple spectrograms and vectorized operations
4. Implement the Audio class as a special instance of the more general Spectrogram class, with 1st dimension equal to 1
Thus, the new Spectrogram class would have the following attributes:
* image: 3D numpy array (L x M x N) of type float
* annotation_matrix: 3D numpy array (L x K x N) of type bool
* time_vector: 2D numpy array (L x N) of type float
* file_vector: 2D numpy array (L x N) of type int
* taxis: instance of the Axis class to handle conversion of time to bin numbers
* faxis: instance of the Axis class to handle conversion of frequency to bin numbers
where,
* L = number of time bins
* M = number of frequency bins
* N = number of spectrograms
* K = number of labels
The current Spectrogram class would then correspond to the special case N = 1, and the Audio class would correspond to the special case M = 1.
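For point 2, the Axis class could be as simple as the following sketch (linear bins assumed; only the class name comes from the proposal above, the method names are mine):

```python
class Axis:
    """Linear mapping between physical values (time or frequency) and bin
    numbers, as proposed for the taxis/faxis attributes. Sketch only."""
    def __init__(self, num_bins, v_min, v_max):
        self.num_bins = num_bins
        self.v_min = v_min
        self.dv = (v_max - v_min) / num_bins  # bin width

    def bin(self, value):
        """Bin number containing the given physical value, clamped to range."""
        b = int((value - self.v_min) / self.dv)
        return min(max(b, 0), self.num_bins - 1)

    def value(self, b):
        """Physical value at the lower edge of bin b."""
        return self.v_min + b * self.dv
```

Centralizing the value-to-bin conversion in one place like this is what would remove the "constant source of trouble" noted in point 2.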
@fsfrazao , any thoughts?

Issue #23: Constant Q Transform (CQT) (Oliver Kirsebom, 2020-03-18)
https://gitlab.meridian.cs.dal.ca/public_projects/ketos/-/issues/23

It would be interesting to implement a method for generating CQT spectrograms, in addition to the standard FFT spectrogram.
CQT is an alternative frequency representation of acoustic signals in which frequency bins become wider with increasing frequency (while the time resolution increases). CQT is described in this paper: [https://asa.scitation.org/doi/10.1121/1.400476](https://asa.scitation.org/doi/10.1121/1.400476)
Librosa has an implementation of CQT that we could use: [https://librosa.github.io/librosa/generated/librosa.core.cqt.html](https://librosa.github.io/librosa/generated/librosa.core.cqt.html)
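The constant-Q idea reduces to geometrically spaced frequency bins; a small numpy sketch of the centre-frequency layout (the function name is mine; librosa's cqt would handle the actual transform):

```python
import numpy as np

def cqt_frequencies(f_min, n_bins, bins_per_octave=12):
    """Centre frequencies of CQT bins: geometrically spaced, so each bin's
    bandwidth grows in proportion to its frequency and Q = f / bandwidth
    stays constant (hence the widening bins described above). Sketch only."""
    return f_min * 2.0 ** (np.arange(n_bins) / bins_per_octave)
```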