{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# NARW Tutorial"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this tutorial we will use a pre-trained deep neural network to detect upcalls made by the North Atlantic right whale (NARW). The model is based on a ResNet architecture and was trained on audio recordings from the Gulf of Saint Lawrence. The training data set consisted of approximately 6,000 spectrograms, each 3 seconds long and covering the frequency range 0-500 Hz, with about half of the spectrograms containing an upcall. These calls are characterized by an upsweep in frequency from about 50 Hz to 350 Hz and a duration of about 1 second.\n",
    "\n",
    "Below you see an example of a NARW upcall recorded by DFO scientists in the Emerald Basin off the coast of Nova Scotia a few years ago. Superimposed on the upcall you also see broadband impulsive noise from nearby pile driving, which makes the detection of the upcall more challenging.\n",
    "\n",
    "![Example of NARW upcall](assets/upcall_example_dfo.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***\n",
    "<a id='main_menu'></a>\n",
    "We will start by loading the pre-trained binary classifier and using it as a detector on a 30-minute audio file. Time permitting, we will also go over the data pre-processing and training steps. The tutorial is organized as follows:\n",
    "\n",
    "### Part I: Loading and using the pre-trained model\n",
    "\n",
    " 1. [Import python modules](#step_1)\n",
    " 2. [Load pre-trained neural network](#step_2)\n",
    " 3. [Load test data](#step_3)\n",
    " 4. [Classify a single spectrogram](#step_4)\n",
    " 5. [Run the model through a 30 min file](#step_5)\n",
    " \n",
    "### Part II: Building the training data set\n",
    "\n",
    " 6. [Load and view annotations](#step_6)\n",
    " 7. [Compute spectrograms and store them in an HDF5 database](#step_7)\n",
    " 8. [Inspect the HDF5 database](#step_8)\n",
    " \n",
    "### Part III: Training the model\n",
    "\n",
    " 9. [Select ResNet architecture](#step_9)\n",
    " 10. [Configure batch generator](#step_10)\n",
    " 11. [Train the network](#step_11)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***\n",
    "# Part I: Loading and using the pre-trained model\n",
    "\n",
    "We will use a pre-trained Ketos model to build a NARW upcall detector. The model is based on a ResNet and was trained as a binary classifier to distinguish spectrograms that contain a NARW upcall from those that contain only background noise. As stated at the beginning, the spectrograms are 3 s long and cover the frequency range 0-500 Hz.\n",
    "\n",
    "Below is a diagram of the ResNet architecture.\n",
    "\n",
    "![model Architecture](assets/architecture.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='step_1'></a>\n",
    "## 1. Import python modules"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We begin by importing the modules we need. These include the Ketos implementation of ResNet and several functions and classes for handling spectrograms."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from ketos.neural_networks.resnet import ResNetInterface\n",
    "from ketos.data_handling.parsing import load_spectrogram_configuration\n",
    "from ketos.data_handling.data_handling import SpecProvider\n",
    "import matplotlib.pyplot as plt\n",
    "from tqdm import tqdm\n",
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='step_2'></a>\n",
    "## 2. Load pre-trained neural network"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ketos models are packaged in .kt files. These contain the model weights and also a recipe to build the model. In addition to the .kt file, we need to specify a folder that will be created and used to store any changes to the model. (This is not actually relevant in our case because we will not be retraining the model.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "resnet_model = ResNetInterface.load_model(model_file=\"assets/narw.kt\", new_model_folder=\"new_model/\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='step_3'></a>\n",
    "## 3. Load test data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our model works with 3-s spectrograms, but our input data consists of a 30-min long .wav file. Therefore, some data processing will be required before we can feed the input data to the model.\n",
    "\n",
    "In the 'assets' folder you received with this tutorial, you will find a file named 'spec_config.json'. This file contains the spectrogram parameters that were used to build the training data set. We can load this configuration file as shown below. If you are used to working with spectrograms, most of the parameters will be familiar to you."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SpectrogramConfiguration(rate=1000, window_size=0.256, step_size=0.032, bins_per_octave=None, window_function=<WinFun.HAMMING: 1>, low_frequency_cut=0, high_frequency_cut=500, length=3.0, overlap=0.0, type='Mag')\n"
     ]
    }
   ],
   "source": [
    "# load spectrogram configuration\n",
    "cfg = load_spectrogram_configuration('assets/spec_config.json')\n",
    "\n",
    "# view parameters\n",
    "print(cfg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ketos has a useful class called 'SpecProvider' that helps us generate spectrograms from the raw .wav file.\n",
    "\n",
    "Below, we create an instance of the SpecProvider class that loads audio data from the file 'assets/data/audio_30min.wav' and uses the parameters stored in the object 'cfg' for computing the spectrograms."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "provider = SpecProvider(path='assets/data/audio_30min.wav', spec_config=cfg, pad=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that in the 'cfg' object, we have specified that each spectrogram should be 3.0 seconds long (length=3.0) and that there should be no overlap between successive spectrograms (overlap=0.0). This is a very simple way of processing the 30-min file. We could increase the overlap, which would increase the amount of data that the model has to process, but also increase our chances of getting a good snapshot of every upcall in the file. For the purposes of this tutorial, we will keep things simple and use zero overlap.\n",
    "\n",
    "We can check the number of segments (i.e., spectrograms) that the provider will create. Note that each spectrogram is only computed when requested, one at a time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "599"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "provider.num_segs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the assets folder, you will also find a .csv file containing the annotations for the 30-minute audio file. The time stamps are in seconds from the beginning of the file, and each time stamp designates approximately the center of one upcall. For this file, there are 30 annotated calls."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "annotations_30min = pd.read_csv(\"assets/data/annotations_30min.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>filename</th>\n",
       "      <th>timestamp</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>870.504</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>913.295</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>969.021</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1002.460</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1022.432</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1082.304</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1098.727</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1113.589</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1131.041</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1156.156</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1165.508</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1186.670</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1204.007</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1241.110</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1264.618</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1289.622</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1356.996</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1373.933</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1407.758</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1446.546</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1492.776</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1504.553</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1518.534</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1550.912</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1590.568</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1627.414</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1690.724</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1734.094</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1760.430</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>15653951.WAV</td>\n",
       "      <td>1791.097</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        filename  timestamp\n",
       "0   15653951.WAV    870.504\n",
       "1   15653951.WAV    913.295\n",
       "2   15653951.WAV    969.021\n",
       "3   15653951.WAV   1002.460\n",
       "4   15653951.WAV   1022.432\n",
       "5   15653951.WAV   1082.304\n",
       "6   15653951.WAV   1098.727\n",
       "7   15653951.WAV   1113.589\n",
       "8   15653951.WAV   1131.041\n",
       "9   15653951.WAV   1156.156\n",
       "10  15653951.WAV   1165.508\n",
       "11  15653951.WAV   1186.670\n",
       "12  15653951.WAV   1204.007\n",
       "13  15653951.WAV   1241.110\n",
       "14  15653951.WAV   1264.618\n",
       "15  15653951.WAV   1289.622\n",
       "16  15653951.WAV   1356.996\n",
       "17  15653951.WAV   1373.933\n",
       "18  15653951.WAV   1407.758\n",
       "19  15653951.WAV   1446.546\n",
       "20  15653951.WAV   1492.776\n",
       "21  15653951.WAV   1504.553\n",
       "22  15653951.WAV   1518.534\n",
       "23  15653951.WAV   1550.912\n",
       "24  15653951.WAV   1590.568\n",
       "25  15653951.WAV   1627.414\n",
       "26  15653951.WAV   1690.724\n",
       "27  15653951.WAV   1734.094\n",
       "28  15653951.WAV   1760.430\n",
       "29  15653951.WAV   1791.097"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "annotations_30min"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='step_4'></a>\n",
    "## 4. Classify a single spectrogram"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us now use the provider object to generate a spectrogram and pass it through the model.\n",
    "\n",
    "To this end, we will use the SpecProvider's 'get' method, which creates a spectrogram starting from a specified time (in seconds from the beginning of the file). Let us try to generate a spectrogram containing an upcall. According to the annotations table, there is, for example, an upcall at 1590.568 s. To capture this upcall, we will specify the time as 1589; this will create a spectrogram starting at 1589 s and ending at 1592 s."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x288 with 2 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "spec = provider.get(1589) # get spectrogram starting at t = 1589 sec\n",
    "spec.plot() # plot the spectrogram\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The upcall is clearly visible, centered in the spectrogram. (The structure at the top is just an artifact produced by the resampling algorithm.)\n",
    "\n",
    "Next, we pass the spectrogram to the model, which attempts to determine if an upcall is present or not."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([1]), array([0.9999114], dtype=float32))"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "resnet_model.run(spec.image, return_raw_output=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The model output has 2 elements. The first is the class (in this case 1 for upcall) and the second is the model score for that class (0.9999114).\n",
    "\n",
    "This is the output format we get with the parameter 'return_raw_output' set to False. If the parameter is set to True, we still get an array with 2 elements, but the first is the score for the negative class (0 or background noise) and the second is the score for the positive class (1 or upcall). Try changing this option to see the difference. This will be useful in our next step, when we process the whole file."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also try changing the time stamp so that the upcall appears in different positions in the spectrogram. See if the model can still find the upcall. Here are a few suggestions: 1590, 1590.2, 1590.3, 1590.35, 1590.4, 1591.\n",
    "\n",
    "You will see that the model can still detect the upcall even if part of it is cut off, but if too large a fraction of the call is missing, the model fails to identify it."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='step_5'></a>\n",
    "## 5. Run the model through a 30 min file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To run the model through the entire 30-min file, we simply need to repeat the previous step many times, starting from the beginning of the file and each time feeding a new spectrogram to the model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, however, we will create a vector to store the detection scores. Each entry in this vector will hold the score for the corresponding spectrogram. Since we are only interested in the scores for the positive class, we will set the 'return_raw_output' argument to True and use the 2nd element of the output array."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "detection_scores = np.zeros(provider.num_segs) # create an array full of zeros with the desired length"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To run through the 30-min file, we use the SpecProvider's 'next' method. Every time this method is called, a new 3-s spectrogram is computed for us."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 599/599 [02:37<00:00,  3.80it/s]\n"
     ]
    }
   ],
   "source": [
    "for seg in tqdm(range(provider.num_segs)): # loop over all 3-s segments\n",
    "    spec = next(provider) # compute spectrogram for the next segment\n",
    "    detection_scores[seg] = resnet_model.run(spec.image, return_raw_output=True)[0,1] # pass the spectrogram to the model and save the score"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(Note that this sequential processing of spectrograms is computationally far from optimal. If we were actually going to put this detector in production, we would take several steps to optimize it. For example, we would take advantage of TensorFlow's capability to process multiple spectrograms simultaneously.)\n",
    "\n",
    "Let us take a look at the contents of the detection vector."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([3.25103174e-05, 1.84721756e-03, 2.88015799e-05, 1.31746550e-04,\n",
       "       1.45053826e-04, 4.45676233e-05, 1.39017648e-04, 5.85941089e-05,\n",
       "       5.39475237e-04, 1.17518452e-04, 1.11683109e-03, 5.96933300e-04,\n",
       "       4.30527070e-05, 1.11627007e-04, 1.96311017e-03, 7.29096573e-05,\n",
       "       2.20240036e-04, 6.05283894e-05, 9.70350593e-05, 3.23398563e-04,\n",
       "       1.10562960e-05, 1.07290589e-05, 9.38101730e-04, 2.41774064e-03,\n",
       "       9.33470583e-05, 1.87409496e-05, 2.87891671e-05, 6.10046009e-05,\n",
       "       3.10339819e-05, 3.08630624e-05, 8.46288283e-04, 1.06648041e-03,\n",
       "       5.21671376e-04, 3.50971852e-04, 1.12637339e-04, 1.28330186e-03,\n",
       "       5.10433316e-02, 2.34265681e-05, 1.74073828e-03, 4.84236225e-05,\n",
       "       3.06919741e-04, 9.25731874e-05, 1.30788423e-03, 2.06160275e-05,\n",
       "       1.09795707e-04, 6.02453132e-04, 2.33001847e-04, 8.31094861e-04,\n",
       "       2.77195883e-04, 1.21523944e-05, 1.06875632e-04, 1.12388061e-05,\n",
       "       1.03376195e-04, 1.27490321e-05, 4.06417566e-05, 1.96307028e-05,\n",
       "       1.43483121e-04, 3.78430763e-04, 1.01296515e-04, 6.42022542e-06,\n",
       "       2.60388246e-03, 1.30819928e-04, 3.90775494e-05, 7.93217987e-05,\n",
       "       1.12979840e-04, 2.96198472e-04, 3.15768179e-04, 1.20500976e-04,\n",
       "       2.55604973e-04, 3.25581379e-04, 6.15778554e-05, 3.67265056e-05,\n",
       "       2.84981761e-05, 5.22261007e-06, 1.29925611e-04, 1.48251507e-04,\n",
       "       8.53832971e-05, 4.70594474e-04, 1.95290166e-04, 1.90779872e-04,\n",
       "       4.28174186e-04, 9.04456610e-05, 6.46805929e-05, 8.35912433e-05,\n",
       "       1.31688270e-04, 5.50853329e-05, 2.05333985e-04, 1.23381280e-04,\n",
       "       1.28205575e-03, 3.68389930e-03, 1.20396682e-04, 6.10195566e-04,\n",
       "       1.18694291e-03, 5.70171542e-05, 4.77496287e-05, 2.47372640e-03,\n",
       "       1.00226935e-05, 9.91477136e-05, 1.01329279e-04, 1.01349387e-05,\n",
       "       1.20536285e-03, 2.36890791e-03, 8.28645076e-04, 2.76947813e-03,\n",
       "       2.97997642e-04, 6.42887375e-04, 1.71205815e-04, 7.51076557e-04,\n",
       "       3.16585065e-04, 1.04156358e-03, 6.00215397e-04, 3.47295892e-03,\n",
       "       2.10167817e-03, 3.00393743e-03, 2.31780857e-03, 7.36259390e-04,\n",
       "       2.82386813e-04, 8.41878296e-04, 3.54635238e-04, 5.51663339e-04,\n",
       "       7.33824345e-05, 5.02019405e-01, 2.31279456e-03, 2.18642977e-04,\n",
       "       6.60793250e-03, 7.84891832e-04, 5.19934343e-04, 2.71450001e-04,\n",
       "       1.88300008e-04, 2.42896890e-03, 2.68089026e-03, 3.43711535e-03,\n",
       "       1.72872620e-03, 3.62673216e-03, 1.36304752e-03, 8.56923929e-04,\n",
       "       9.12186457e-04, 5.17411623e-03, 3.04099661e-03, 8.50503973e-04,\n",
       "       5.04211173e-04, 3.23298248e-03, 2.38127308e-03, 5.46457130e-04,\n",
       "       6.88770146e-04, 1.19325484e-03, 3.93416150e-04, 4.61159245e-04,\n",
       "       1.44737225e-03, 4.59187105e-03, 6.80104783e-03, 3.41807189e-03,\n",
       "       1.20683794e-03, 1.61665922e-03, 1.08367475e-02, 4.32794797e-04,\n",
       "       5.24699641e-03, 7.43154436e-04, 2.39992561e-03, 9.91834677e-04,\n",
       "       1.35366071e-03, 6.61350277e-05, 6.06107176e-04, 1.02957478e-02,\n",
       "       2.48123863e-04, 2.48100172e-04, 1.70190807e-03, 1.89172348e-03,\n",
       "       5.07948548e-03, 8.35916668e-04, 9.58997814e-04, 1.38205208e-03,\n",
       "       5.30109974e-04, 7.40365125e-03, 1.96435532e-04, 8.53275356e-04,\n",
       "       1.65310223e-03, 4.76198364e-03, 1.89344387e-03, 2.76626716e-03,\n",
       "       2.67864059e-04, 1.97357134e-04, 1.65467069e-03, 1.01839995e-03,\n",
       "       1.21228013e-03, 1.16496719e-03, 1.20456517e-03, 2.47212942e-03,\n",
       "       9.89748514e-04, 2.65464233e-03, 9.93123511e-04, 1.60007869e-04,\n",
       "       2.82608526e-05, 5.94544737e-03, 1.96615607e-03, 4.53210901e-03,\n",
       "       5.57572488e-03, 1.46142312e-03, 4.29390045e-03, 1.66376354e-03,\n",
       "       7.69333623e-04, 3.67129804e-03, 9.92190391e-02, 3.39614903e-03,\n",
       "       3.22407461e-03, 3.52989118e-05, 6.07729750e-03, 2.60140304e-03,\n",
       "       9.37574171e-03, 1.08546694e-04, 1.52242463e-03, 8.34668579e-04,\n",
       "       2.35513668e-03, 7.18702038e-04, 3.74765787e-03, 3.93469440e-04,\n",
       "       3.56605015e-04, 9.20232385e-03, 1.15799811e-03, 4.22665337e-03,\n",
       "       5.05987601e-03, 4.36083088e-03, 1.45586627e-03, 1.28951005e-03,\n",
       "       1.09926229e-02, 5.20045112e-04, 1.45204272e-03, 2.01423280e-03,\n",
       "       3.61742405e-03, 1.61840732e-03, 6.40645041e-04, 3.45500298e-02,\n",
       "       4.32052836e-03, 6.83876546e-03, 7.47086480e-04, 2.08280093e-04,\n",
       "       3.13037308e-03, 3.73432122e-04, 2.33037514e-03, 3.77662422e-04,\n",
       "       2.26686685e-03, 4.07038908e-03, 3.06437514e-03, 1.84523556e-02,\n",
       "       8.36690815e-05, 1.75605982e-03, 5.16396540e-04, 3.72483069e-03,\n",
       "       3.53988889e-03, 7.71495746e-04, 3.17341153e-04, 2.46578420e-04,\n",
       "       1.34027004e-03, 3.14177596e-04, 4.72312066e-04, 6.67768996e-03,\n",
       "       3.96426348e-03, 2.35654283e-04, 1.57541619e-03, 1.90858194e-03,\n",
       "       1.09549938e-03, 5.08184265e-03, 1.02478906e-03, 6.19210629e-03,\n",
       "       7.27466686e-05, 1.83896534e-02, 6.58159668e-04, 1.13107241e-03,\n",
       "       5.02813840e-04, 1.54812110e-03, 3.49797425e-04, 1.07937949e-02,\n",
       "       2.25283951e-03, 1.01938273e-03, 2.62287399e-03, 8.42762832e-03,\n",
       "       3.71785718e-04, 1.00618824e-02, 2.87088263e-03, 1.36690112e-04,\n",
       "       3.21222120e-04, 8.22853472e-05, 1.44724338e-03, 1.16190175e-03,\n",
       "       8.93276464e-03, 3.28288274e-03, 1.71411448e-04, 3.28343798e-04,\n",
       "       1.28542655e-03, 9.85203385e-01, 1.68702509e-02, 1.49116653e-03,\n",
       "       1.57765124e-03, 3.14994901e-03, 5.16088167e-03, 1.76458806e-03,\n",
       "       1.22391470e-02, 7.19536096e-04, 1.00929267e-03, 8.03771487e-04,\n",
       "       1.79993287e-02, 1.41347398e-03, 9.95830866e-04, 9.64375079e-01,\n",
       "       1.13931121e-04, 1.73668296e-03, 1.87991100e-04, 5.57288062e-04,\n",
       "       7.34192843e-04, 6.91757814e-05, 1.77041849e-03, 7.77320063e-04,\n",
       "       1.10276276e-04, 6.80161815e-04, 2.52705021e-03, 2.60275006e-02,\n",
       "       3.56302015e-03, 3.99484561e-04, 3.08445713e-04, 3.15393903e-04,\n",
       "       1.89432804e-03, 9.21585597e-03, 8.99248481e-01, 2.10528262e-04,\n",
       "       3.44644475e-04, 1.11099864e-04, 2.76138162e-04, 3.45527031e-03,\n",
       "       3.77202581e-04, 4.09887591e-03, 1.13641319e-04, 5.62758185e-04,\n",
       "       1.02769036e-03, 9.99094009e-01, 1.30365475e-03, 2.99079489e-04,\n",
       "       5.06809272e-04, 2.98013852e-04, 1.11349544e-03, 9.97581244e-01,\n",
       "       8.34947452e-03, 4.37650946e-04, 3.28251772e-04, 1.72936905e-03,\n",
       "       6.53808063e-04, 4.41020995e-04, 1.66539894e-03, 3.27593647e-04,\n",
       "       2.38479624e-04, 1.69718335e-03, 1.34771457e-03, 5.24162268e-03,\n",
       "       6.41926192e-04, 1.58140392e-04, 6.94694230e-03, 1.47970894e-03,\n",
       "       5.29872545e-04, 3.06873373e-03, 4.56265901e-04, 9.99975085e-01,\n",
       "       3.91731970e-03, 1.64610695e-03, 3.34267766e-04, 9.33761708e-04,\n",
       "       2.28558504e-03, 9.99981642e-01, 1.44882550e-04, 3.54703236e-03,\n",
       "       2.15269905e-03, 1.38260802e-04, 9.99994874e-01, 4.88582591e-04,\n",
       "       8.58857180e-04, 1.93152076e-03, 1.30157394e-03, 1.45285636e-01,\n",
       "       1.46602199e-03, 1.97790607e-04, 7.02150632e-04, 2.91838776e-03,\n",
       "       5.80750580e-04, 3.04506300e-03, 6.48182002e-04, 2.32121279e-03,\n",
       "       9.99787271e-01, 1.16765805e-04, 5.00834612e-05, 9.99968052e-01,\n",
       "       8.74640245e-04, 5.47275133e-03, 2.33719195e-03, 5.82921144e-04,\n",
       "       8.83747544e-03, 6.41526049e-03, 9.99989271e-01, 8.57791820e-05,\n",
       "       5.96568733e-03, 5.42560010e-04, 7.54281320e-03, 4.98460140e-03,\n",
       "       6.80191994e-01, 9.30126105e-03, 5.35080093e-04, 5.20486273e-02,\n",
       "       4.90235980e-04, 1.19846174e-03, 1.33237208e-03, 9.91337700e-04,\n",
       "       2.76439693e-02, 2.67452240e-04, 1.97887566e-04, 1.08666508e-03,\n",
       "       1.00000000e+00, 1.45023305e-03, 5.08472847e-04, 5.56964241e-03,\n",
       "       1.44385628e-03, 7.21750199e-04, 4.78984817e-04, 5.30433003e-03,\n",
       "       9.99979734e-01, 6.52495632e-03, 1.10743614e-03, 1.39092561e-04,\n",
       "       2.90530570e-05, 5.31956437e-04, 8.79578438e-05, 2.32769598e-04,\n",
       "       9.99996901e-01, 4.90721257e-04, 6.54685078e-04, 2.09235688e-04,\n",
       "       1.35119725e-02, 4.70292201e-04, 3.01735965e-03, 1.27779739e-03,\n",
       "       1.27885193e-02, 1.12024986e-03, 5.01318509e-03, 2.32865947e-04,\n",
       "       1.97823495e-02, 5.02030691e-03, 4.13244998e-04, 3.59443366e-04,\n",
       "       5.40453882e-04, 2.37396313e-03, 4.73770808e-04, 3.96521250e-03,\n",
       "       3.86433647e-04, 2.66597053e-04, 1.20833446e-03, 9.93458211e-01,\n",
       "       7.42740347e-04, 5.08015743e-04, 1.98365026e-03, 3.91423004e-04,\n",
       "       9.99998093e-01, 2.36575701e-03, 4.07542626e-04, 1.44156637e-02,\n",
       "       1.96205860e-04, 4.84340737e-04, 1.31302774e-02, 1.48697868e-02,\n",
       "       6.96613220e-04, 3.54965404e-03, 1.23294874e-03, 2.15344899e-03,\n",
       "       4.34671529e-03, 1.23240185e-04, 2.80178909e-04, 4.59261530e-04,\n",
       "       7.07147177e-04, 7.15203467e-04, 2.59218563e-04, 6.62643276e-03,\n",
       "       6.39486650e-04, 5.13037958e-04, 9.18830931e-03, 1.33649760e-03,\n",
       "       5.41154444e-01, 3.48580210e-03, 7.15837639e-04, 1.14072708e-03,\n",
       "       9.59440693e-03, 4.34432887e-02, 2.43427767e-03, 4.50734282e-04,\n",
       "       5.83332148e-04, 5.42793982e-03, 7.63710006e-04, 5.79391664e-04,\n",
       "       1.56200863e-03, 1.22847408e-03, 5.70420059e-04, 1.10402482e-03,\n",
       "       8.62342179e-01, 1.24865837e-04, 3.87420738e-03, 6.60500009e-05,\n",
       "       6.78751618e-02, 7.05460028e-04, 3.96517292e-02, 5.20470378e-04,\n",
       "       9.98917341e-01, 2.59362045e-03, 7.64121243e-04, 2.20116763e-03,\n",
       "       8.97013396e-03, 1.16171854e-04, 4.23348043e-04, 1.22599592e-02,\n",
       "       8.43642512e-04, 1.09476550e-03, 2.73547135e-04, 9.99999285e-01,\n",
       "       1.75327118e-02, 2.06261594e-03, 1.58684573e-03, 2.04785122e-03,\n",
       "       2.84299604e-03, 6.69307046e-05, 9.30183160e-04, 2.21232578e-04,\n",
       "       2.48298049e-04, 5.29422006e-03, 9.67081869e-04, 5.04438358e-04,\n",
       "       9.99238491e-01, 2.06735334e-03, 1.75012305e-04, 2.57918867e-03,\n",
       "       6.30914466e-04, 2.16959254e-03, 4.76346497e-04, 2.46973638e-03,\n",
       "       1.01947319e-03, 1.80755754e-03, 1.78129901e-03, 4.07406682e-04,\n",
       "       3.65069926e-01, 9.04327631e-01, 7.35878129e-04, 3.55693093e-03,\n",
       "       1.32218498e-04, 4.01167141e-04, 5.77535102e-05, 5.22539718e-04,\n",
       "       1.16324075e-03, 1.50150183e-04, 2.86751566e-03, 7.71471474e-04,\n",
       "       1.83998037e-03, 3.78185771e-02, 2.70077930e-04, 2.34913532e-04,\n",
       "       4.93836764e-04, 8.34441464e-03, 2.61657555e-02, 1.26036629e-03,\n",
       "       3.71488131e-04, 1.25312901e-04, 2.53790524e-03, 6.68659050e-04,\n",
       "       1.53577188e-04, 8.75511952e-03, 1.45516545e-03, 4.95519140e-04,\n",
       "       9.26809199e-03, 1.34035619e-03, 5.00587281e-04, 8.13135295e-04,\n",
       "       2.53237342e-03, 6.76614512e-03, 5.83549612e-04, 9.55684518e-04,\n",
       "       1.00000000e+00, 4.03505005e-03, 3.09835625e-04, 1.12562459e-02,\n",
       "       5.08364523e-04, 2.90895958e-04, 8.04855605e-04, 5.94655110e-04,\n",
       "       1.60420939e-04, 9.99984264e-01, 6.01116160e-04, 1.16055785e-03,\n",
       "       4.76010377e-03, 2.16048094e-03, 2.69407989e-04, 7.79609720e-04,\n",
       "       1.05636974e-03, 5.18143177e-04, 2.29002660e-04, 9.99998569e-01,\n",
       "       4.16913361e-04, 1.60444505e-03, 3.74676427e-03])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "detection_scores"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Just a bunch of numbers! To gain a better understanding of what these numbers mean, let us make a simple plot that shows the detection scores versus time, and compares them to the annotations. The following lines of code generate a plot in which the annotations are shown as green dots and the detections as a blue line."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 864x288 with 2 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# prepare the canvas\n",
    "fig, axes = plt.subplots(2,1, sharex=False, figsize=(12,4))\n",
    "\n",
    "# draw the annotations as green dots\n",
    "axes[0].scatter(annotations_30min['timestamp'], np.ones(len(annotations_30min)), color='green', s=15)\n",
    "axes[0].set_xlim(0, 1800)\n",
    "axes[0].set_xticks(np.linspace(0, 1800, 31))  # one tick per minute\n",
    "axes[0].set_xticklabels([str(t) for t in np.arange(0, 31, 1)])\n",
    "\n",
    "# draw the detection scores as a solid, blue line\n",
    "axes[1].plot(detection_scores)\n",
    "axes[1].set_xticks(np.linspace(0, 600, 31))  # one tick per minute, assuming 600 scores over 30 minutes\n",
    "axes[1].set_xticklabels([str(t) for t in np.arange(0, 31, 1)])\n",
    "axes[1].margins(x=0)\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This gives us a good visual idea of how the model is performing. \n",
    "\n",
    "This concludes the first part of the tutorial. However, there are many things we could still explore. The natural next step would be to compute metrics such as precision and recall. This raises some interesting questions: What should count as a true positive? If a detection is displaced by 0.5 seconds with respect to the annotation timestamp, should it still count as a detection? What if it is 5 seconds off? Or 1 minute? Maybe we should define a buffer around each annotation and count any detection that falls within that extended time window? But should we do the same when counting false positives? And how should we take the score value into consideration?\n",
    "\n",
    "The answer to these questions probably depends on your application and what you want to do with the model. We will talk a little more about these choices when we present our case study after this hands-on session."
   ]
  },
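  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of the buffer idea raised above, the following sketch counts true positives, false positives and false negatives for a given tolerance. It is not part of the original analysis: the function name `match_detections`, the 1-second buffer and the toy timestamps are assumptions made for this example. The matching rule used here (a detection is correct if it falls within the buffer of any annotation, and several detections near one annotation count as a single true positive) is only one of several reasonable choices."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Hypothetical helper for illustration; not part of Ketos.\n",
    "def match_detections(det_times, ann_times, buffer=1.0):\n",
    "    # Returns (tp, fp, fn) for detection and annotation times given in seconds.\n",
    "    det_times = np.asarray(det_times, dtype=float)\n",
    "    ann_times = np.asarray(ann_times, dtype=float)\n",
    "    # absolute time difference for every (detection, annotation) pair\n",
    "    diff = np.abs(det_times[:, None] - ann_times[None, :])\n",
    "    within = diff <= buffer\n",
    "    tp = int(within.any(axis=0).sum())     # annotations with a detection inside the buffer\n",
    "    fn = len(ann_times) - tp               # annotations that were missed\n",
    "    fp = int((~within.any(axis=1)).sum())  # detections far from every annotation\n",
    "    return tp, fp, fn\n",
    "\n",
    "# toy timestamps in seconds (made up for illustration)\n",
    "ann = [10.0, 50.0, 120.0]\n",
    "det = [10.4, 51.5, 80.0]\n",
    "tp, fp, fn = match_detections(det, ann, buffer=1.0)\n",
    "print('precision:', tp / (tp + fp), 'recall:', tp / (tp + fn))"
   ]
  },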
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***\n",
    "# Part II: Building the training data set"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this part we will illustrate the main steps taken to create the training database that was used to train the ResNet model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='step_6'></a>\n",
    "## 6. Load and view annotations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As usual, we begin by importing necessary packages and functions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "from ketos.data_handling.parsing import load_spectrogram_configuration\n",
    "from ketos.data_handling.database_interface import create_spec_database, load_specs\n",
    "import pandas as pd\n",
    "import tables\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To save space, we have only included 40 of the more than 6,000 audio segments that were used to create the original training data set. Below, we specify the paths to (1) the folder containing these 40 audio files, (2) the annotations table, and (3) the spectrogram configuration file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "path_to_audio = 'assets/data/wav_3s/'  # (1) audio data folder\n",
    "path_to_ann = 'assets/data/annotations_3s.csv'  # (2) annotations table\n",
    "path_to_spec = 'assets/spec_config.json'  # (3) spectrogram configuration"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us briefly inspect the contents of the annotations table:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                                filename  label  start  end\n",
      "0    IML-BA_Sample_2019-06-02_025209.wav      1    0.0  3.0\n",
      "1    IML-BA_Sample_2019-06-02_065235.wav      1    0.0  3.0\n",
      "2    IML-BA_Sample_2019-06-08_012119.wav      1    0.0  3.0\n",
      "3    IML-BA_Sample_2019-06-21_201729.wav      1    0.0  3.0\n",
      "4    IML-BA_Sample_2019-06-29_124902.wav      1    0.0  3.0\n",
      "5       VAS_Sample_2019-06-06_052058.wav      1    0.0  3.0\n",
      "6       VAS_Sample_2019-06-06_052147.wav      1    0.0  3.0\n",
      "7       VAS_Sample_2019-06-06_052545.wav      1    0.0  3.0\n",
      "8       VAS_Sample_2019-06-06_052735.wav      1    0.0  3.0\n",
      "9       VAS_Sample_2019-06-06_053040.wav      1    0.0  3.0\n",
      "10      VAS_Sample_2019-06-06_053236.wav      1    0.0  3.0\n",
      "11      VAS_Sample_2019-06-06_054134.wav      1    0.0  3.0\n",
      "12      VAS_Sample_2019-06-06_055829.wav      1    0.0  3.0\n",
      "13      VAS_Sample_2019-06-06_062147.wav      1    0.0  3.0\n",
      "14      VAS_Sample_2019-06-29_012008.wav      1    0.0  3.0\n",
      "15      VAS_Sample_2019-06-29_081309.wav      1    0.0  3.0\n",
      "16      VAS_Sample_2019-06-29_120121.wav      1    0.0  3.0\n",
      "17      VAS_Sample_2019-06-30_141022.wav      1    0.0  3.0\n",
      "18      VAS_Sample_2019-06-30_141128.wav      1    0.0  3.0\n",
      "19      VAS_Sample_2019-06-30_141718.wav      1    0.0  3.0\n",
      "20      VAS_Sample_2019-06-30_143503.wav      1    0.0  3.0\n",
      "21      VAS_Sample_2019-07-02_230526.wav      1    0.0  3.0\n",
      "22      VAS_Sample_2019-07-02_231034.wav      1    0.0  3.0\n",
      "23      VAS_Sample_2019-07-02_233525.wav      1    0.0  3.0\n",
      "24      VAS_Sample_2019-07-03_134346.wav      1    0.0  3.0\n",
      "25      VAS_Sample_2019-07-04_170440.wav      1    0.0  3.0\n",
      "26      VAS_Sample_2019-07-05_070118.wav      1    0.0  3.0\n",
      "27      VAS_Sample_2019-07-05_070511.wav      1    0.0  3.0\n",
      "28      VAS_Sample_2019-07-05_073450.wav      1    0.0  3.0\n",
      "29      VAS_Sample_2019-07-05_121637.wav      1    0.0  3.0\n",
      "..                                   ...    ...    ...  ...\n",
      "775     VAS_Sample_2019-08-19_051816.wav      1    0.0  3.0\n",
      "776     VAS_Sample_2019-08-20_031409.wav      1    0.0  3.0\n",
      "777     VAS_Sample_2019-08-20_032612.wav      1    0.0  3.0\n",
      "778     VAS_Sample_2019-08-20_032632.wav      1    0.0  3.0\n",
      "779     VAS_Sample_2019-08-20_033450.wav      1    0.0  3.0\n",
      "780     VAS_Sample_2019-08-20_033744.wav      1    0.0  3.0\n",
      "781     VAS_Sample_2019-08-20_033806.wav      1    0.0  3.0\n",
      "782     VAS_Sample_2019-08-20_034951.wav      1    0.0  3.0\n",
      "783     VAS_Sample_2019-08-20_035010.wav      1    0.0  3.0\n",
      "784     VAS_Sample_2019-08-20_035434.wav      1    0.0  3.0\n",
      "785     VAS_Sample_2019-08-20_035515.wav      1    0.0  3.0\n",
      "786     VAS_Sample_2019-08-20_040953.wav      1    0.0  3.0\n",
      "787     VAS_Sample_2019-08-20_070757.wav      1    0.0  3.0\n",
      "788     VAS_Sample_2019-08-20_131106.wav      1    0.0  3.0\n",
      "789     VAS_Sample_2019-08-20_131235.wav      1    0.0  3.0\n",
      "790     VAS_Sample_2019-08-20_132646.wav      1    0.0  3.0\n",
      "791     VAS_Sample_2019-08-20_154533.wav      1    0.0  3.0\n",
      "792     VAS_Sample_2019-08-20_161003.wav      1    0.0  3.0\n",
      "793     VAS_Sample_2019-08-20_162215.wav      1    0.0  3.0\n",
      "794     VAS_Sample_2019-08-20_163426.wav      1    0.0  3.0\n",
      "795     VAS_Sample_2019-08-20_164353.wav      1    0.0  3.0\n",
      "796     VAS_Sample_2019-08-20_172823.wav      1    0.0  3.0\n",
      "797     VAS_Sample_2019-08-20_173659.wav      1    0.0  3.0\n",
      "798     VAS_Sample_2019-08-20_174028.wav      1    0.0  3.0\n",
      "799     VAS_Sample_2019-08-20_174038.wav      1    0.0  3.0\n",
      "800     VAS_Sample_2019-08-20_174947.wav      1    0.0  3.0\n",
      "801     VAS_Sample_2019-08-20_195026.wav      1    0.0  3.0\n",
      "802     VAS_Sample_2019-08-20_195349.wav      1    0.0  3.0\n",
      "803     VAS_Sample_2019-08-20_195357.wav      1    0.0  3.0\n",
      "804     VAS_Sample_2019-08-20_202745.wav      1    0.0  3.0\n",
      "\n",
      "[805 rows x 4 columns]\n"
     ]
    }
   ],
   "source": [
    "print(pd.read_csv(path_to_ann))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We see that the annotations file has four named columns: filename, label, start, end. The label indicates which sound is present. In this case, we are only concerned with one sound, namely, the NARW upcall, so all entries have label 1. The columns 'start' and 'end' give the start and end time of the call, in seconds from the beginning of the file. In the present case, the files are all 3 seconds long and the entire file is designated as an upcall, even though the upcall is usually only 1 second long."
   ]
  },
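  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the column layout described above concrete, the following sketch builds a two-row table with the same four columns and derives the duration of each annotated segment. The two file names are copied from the listing above. Constructing the table by hand, rather than reading the CSV file, is a choice made only to keep the example self-contained, and the extra `duration` column is not part of the original annotations file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# small stand-in for the annotations table, with the same four columns\n",
    "ann = pd.DataFrame({\n",
    "    'filename': ['IML-BA_Sample_2019-06-02_025209.wav', 'VAS_Sample_2019-06-06_052058.wav'],\n",
    "    'label': [1, 1],\n",
    "    'start': [0.0, 0.0],\n",
    "    'end': [3.0, 3.0],\n",
    "})\n",
    "\n",
    "ann['duration'] = ann['end'] - ann['start']  # length of each annotated segment in seconds\n",
    "print(ann['duration'].unique())              # distinct segment lengths\n",
    "print(ann['label'].value_counts())           # number of annotations per label"
   ]
  },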
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='step_7'></a>\n",
    "## 7. Compute spectrograms and store them in an HDF5 database\n",
    "\n",
    "We use the same spectrogram parameters as in the first part of the tutorial:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SpectrogramConfiguration(rate=1000, window_size=0.256, step_size=0.032, bins_per_octave=None, window_function=<WinFun.HAMMING: 1>, low_frequency_cut=0, high_frequency_cut=500, length=3.0, overlap=0.0, type='Mag')\n"
     ]
    }
   ],
   "source": [
    "cfg = load_spectrogram_configuration(path_to_spec) # load parameters\n",
    "print(cfg) # print them"
   ]
  },
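  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get a feeling for what the parameters printed above imply, the following sketch works out the window length, step size and approximate spectrogram dimensions by hand. This is plain short-time Fourier transform arithmetic and not a Ketos call. Ketos may pad or trim the signal, or handle the highest frequency bin differently, so the actual array shape may differ slightly from these numbers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# values copied from the printed SpectrogramConfiguration\n",
    "rate = 1000          # sampling rate in Hz\n",
    "window_size = 0.256  # window length in seconds\n",
    "step_size = 0.032    # step between successive windows in seconds\n",
    "length = 3.0         # segment length in seconds\n",
    "\n",
    "win_len = int(round(window_size * rate))  # samples per window\n",
    "hop = int(round(step_size * rate))        # samples between successive windows\n",
    "freq_res = rate / win_len                 # width of one frequency bin in Hz\n",
    "n_freq = win_len // 2 + 1                 # bins from 0 Hz up to the Nyquist frequency\n",
    "n_frames = int(length / step_size)        # approximate number of time bins\n",
    "\n",
    "print(win_len, hop, freq_res, n_freq, n_frames)"
   ]
  },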
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ketos has a handy function called 'create_spec_database', which allows us to create the training database in just one step:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "40 spectrograms saved to train.h5\n"
     ]
    }
   ],
   "source": [
    "create_spec_database(output_file='train.h5', input_dir=path_to_audio, annotations_file=path_to_ann, spec_config=cfg, pad=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The spectrograms have now been stored in an HDF5-format database along with the annotation information."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='step_8'></a>\n",
    "## 8. Inspect the HDF5 database"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before moving on to the third and last part of the tutorial, where we will be training the network, let us spend a moment inspecting the contents of the HDF5 database file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "train.h5 (File) ''\n",
      "Last modif.: 'Mon Nov 18 19:21:24 2019'\n",
      "Object Tree: \n",
      "/ (RootGroup) ''\n",
      "/spec (Table(40,), fletcher32, shuffle, zlib(1)) ''\n",
      "\n"
     ]
    }
   ],
   "source": [
    "db = tables.open_file('train.h5', 'r')  # open train.h5 in read ('r') mode\n",
    "print(db)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The information generated by the print command is a little cryptic, but essentially tells us that the database file contains one table (called 'spec') with 40 elements, i.e., spectrograms. Ketos has a handy function called 'load_specs' that allows us to easily load spectrograms from HDF5 databases:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "table = db.get_node(\"/spec\")  # handle for the table\n",
    "specs = load_specs(table, [17,37])  # load spectrograms 17 and 37"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, let us plot the two spectrograms we've just loaded."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x396 with 4 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x396 with 4 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# plot the 1st spectrogram\n",
    "specs[0].plot(1) \n",
    "plt.show()\n",
    "\n",
    "# plot the 2nd spectrogram\n",
    "specs[1].plot(1)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We see that the first spectrogram contains an upcall, whereas the second spectrogram only has background noise."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***\n",
    "# Part III: Training the model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='step_9'></a>\n",
    "## 9. Select ResNet architecture"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We start by creating a new ResNet model based on the same recipe that we used in the first part of the tutorial. However, this time we will not load any weights."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "recipe = ResNetInterface.read_recipe_file(\"assets/recipe.json\")\n",
    "new_resnet_model = ResNetInterface.build_from_recipe(recipe)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='step_10'></a>\n",
    "## 10. Configure batch generator"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We load the training/validation dataset from the HDF5 table that we created previously."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from ketos.data_handling.data_feeding import BatchGenerator  # helper class that reads data from disk in batches"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We split the 40-sample dataset into a training set of 30 randomly selected samples and a validation set of the remaining 10 samples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[24  2 17 15 39 25  9 22 28 16 30  5 14  1 32  6  3 35 13  8 27 33 19 10\n",
      " 34 31  7 26 36 21]\n",
      "[0, 4, 11, 12, 18, 20, 23, 29, 37, 38]\n"
     ]
    }
   ],
   "source": [
    "train_indices = np.random.choice(np.arange(40), 30, replace=False) # select 30 random indices out of 0...39\n",
    "val_indices = [i for i in range(40) if i not in train_indices]     # 10 indices that were not selected\n",
    "\n",
    "print(train_indices)\n",
    "print(val_indices)"
   ]
  },
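  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The split above draws different indices every time the cell is run. As a minimal sketch of a reproducible alternative (not part of the original tutorial), one could shuffle the 40 indices with a seeded NumPy generator and slice the result. The seed value and the names `rng`, `shuffled`, `train_idx_seeded` and `val_idx_seeded` are illustrative assumptions; they are chosen so that `train_indices` and `val_indices` are left untouched."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(seed=0)      # assumed seed; any fixed integer works\n",
    "shuffled = rng.permutation(40)           # random ordering of 0...39\n",
    "train_idx_seeded = shuffled[:30]         # first 30 shuffled indices for training\n",
    "val_idx_seeded = np.sort(shuffled[30:])  # remaining 10 indices for validation\n",
    "\n",
    "# sanity check: the two sets are disjoint and together cover all 40 samples\n",
    "assert len(np.intersect1d(train_idx_seeded, val_idx_seeded)) == 0\n",
    "assert len(train_idx_seeded) + len(val_idx_seeded) == 40"
   ]
  },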
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we create two batch generators. These are convenient objects that read data from disk in batches and serve it to the neural network in the required format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [],
   "source": [
    "from ketos.data_hand