We load the training/validation dataset from the same HDF5 database that contains the test dataset we created previously.
%% Cell type:code id: tags:
``` python
import numpy as np
from ketos.data_handling.data_feeding import BatchGenerator  # A helper class to read data from disk in batches
```
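%% Cell type:markdown id: tags:
As an aside, a Ketos database is an ordinary HDF5 file, so its layout can be inspected with plain `h5py`. The sketch below builds and reads back a small file with a *hypothetical* layout (the group name `train` and dataset names `data`/`label` are illustrative, not necessarily those used by Ketos):
%% Cell type:code id: tags:
``` python
import os
import tempfile
import numpy as np
import h5py  # the library underlying HDF5 databases

# Hypothetical layout: a group "train" holding spectrograms and labels
path = os.path.join(tempfile.mkdtemp(), "database.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("train/data", data=np.zeros((40, 94, 129)))
    f.create_dataset("train/label", data=np.array([1] * 20 + [0] * 20))

# Read the arrays back into memory
with h5py.File(path, "r") as f:
    x = f["train/data"][:]
    y = f["train/label"][:]

print(x.shape, y.shape)  # (40, 94, 129) (40,)
```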
%% Cell type:markdown id: tags:
We'll split this dataset into a training set of 30 (randomly selected) samples and a validation set of the remaining 10, using a stratified sampling algorithm ([scikit-learn's StratifiedShuffleSplit](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.StratifiedShuffleSplit.html)). This yields training and validation sets with the same proportions of positive (upcall) and negative (no upcall) examples.
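%% Cell type:markdown id: tags:
A minimal sketch of such a stratified split, using hypothetical labels rather than the tutorial's actual data:
%% Cell type:code id: tags:
``` python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical labels: 20 positive (upcall) and 20 negative (no upcall)
labels = np.array([1] * 20 + [0] * 20)

# One split: 30 training samples, 10 validation samples
sss = StratifiedShuffleSplit(n_splits=1, train_size=30, test_size=10, random_state=42)
train_idx, val_idx = next(sss.split(np.zeros((len(labels), 1)), labels))

print(len(train_idx), len(val_idx))  # 30 10
# Class proportions are preserved in both subsets
print(labels[train_idx].mean(), labels[val_idx].mean())  # 0.5 0.5
```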
Now we have configured two batch generators, which will load 10 spectrograms and their associated labels at a time during the training process. After attaching these generators to the `new_resnet_model`, we can run the training loop for a couple of epochs.
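%% Cell type:markdown id: tags:
Conceptually, a batch generator yields fixed-size chunks of data and labels, reshuffling between epochs so the model does not see the samples in the same order twice. The sketch below illustrates this idea with plain NumPy; it is not Ketos's `BatchGenerator` API, and the spectrogram shape is hypothetical:
%% Cell type:code id: tags:
``` python
import numpy as np

def batch_generator(x, y, batch_size, rng):
    """Yield (x_batch, y_batch) pairs indefinitely, reshuffling each epoch."""
    while True:
        order = rng.permutation(len(x))
        for start in range(0, len(x), batch_size):
            idx = order[start:start + batch_size]
            yield x[idx], y[idx]

# 30 hypothetical spectrograms (shape chosen for illustration) with binary labels
rng = np.random.default_rng(0)
specs = rng.normal(size=(30, 94, 129))
labels = rng.integers(0, 2, size=30)

gen = batch_generator(specs, labels, batch_size=10, rng=rng)
x_batch, y_batch = next(gen)
print(x_batch.shape, y_batch.shape)  # (10, 94, 129) (10,)
```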