Geophony output format
As we begin implementing functionality for storing/loading geophony modelling results, it will be important to align our efforts with what has already been done by the UQAR team in relation to the Ocean Soundscape Atlas. Below is a description of their format, provided by Pierre:
We are using the netCDF-4 classic format. It is based on HDF5 and allows files larger than 2 GB.
In terms of metadata that we use in those files, we are following the CF and ACDD conventions. Like I said in yesterday’s meeting, we have some in-depth documentation and a summary spreadsheet which are already available here:
The next thing we will have to decide is what kind of data we will have to transfer between us so that we can use it for the Soundscape Atlas. I don’t know exactly what kind of outputs you will be able to produce. I would guess that you can make something like 3D maps of the geophony sounds (in dB) for a particular time and frequency (like we do here for the shipping noise). In our case, after checking with the team here (that I’ve also added in CC), what we would ultimately need on our side is what we call the risk maps. Those are the risks of exceeding a certain sound pressure level threshold, which is basically 1 minus the CDF (the cumulative distribution function) of the sound pressure levels over all the time steps you have over a day or another period of time (computed for each day, 3D point and frequency). The periods of time we use are daily, monthly and annual. We usually make one file per day for the daily risks, one file per month for the monthly risks, etc., even if we have a time dimension in our files. When we compute the CDF, we use that opportunity to add an error distribution over our results, such as a Gaussian distribution or a chi-square distribution. From those risks/CDFs we will be able to compute the quantile maps and the impact risk maps that require the geophony noise.
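To make the risk definition above concrete, here is a minimal Python sketch of "1 minus the CDF" for a single grid point, depth and frequency. The function name, the sample values and the way the Gaussian error is applied (each time step treated as the mean of a Gaussian with standard deviation sigma_db) are assumptions for illustration only, not the UQAR implementation.

```python
import math

def exceedance_risk(spl_db, thresholds_db, sigma_db=0.0):
    """Return, for each threshold, the risk that the SPL exceeds it.

    spl_db        : SPL values (dB), one per time step in the period
    thresholds_db : SPL thresholds (dB), i.e. the x axis of the CDF
    sigma_db      : optional Gaussian error std (dB); 0 gives the
                    plain empirical result (assumed error model)
    """
    if not spl_db:
        raise ValueError("need at least one time step")
    risks = []
    for x in thresholds_db:
        if sigma_db > 0.0:
            # P(sample + error > x) for a zero-mean Gaussian error
            scale = sigma_db * math.sqrt(2.0)
            probs = [0.5 * (1.0 - math.erf((x - s) / scale)) for s in spl_db]
        else:
            # Empirical 1 - CDF: fraction of time steps above the threshold
            probs = [1.0 if s > x else 0.0 for s in spl_db]
        risks.append(sum(probs) / len(probs))
    return risks

# Hypothetical SPL values (dB) for four time steps of one day
spl_day = [92.0, 95.5, 101.0, 98.0]
thresholds = [90.0, 96.0, 100.0, 110.0]
print(exceedance_risk(spl_day, thresholds))  # [1.0, 0.5, 0.25, 0.0]
```

Monthly or annual risks would follow from the same function by passing all the time steps of the longer period.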
We understand that making those files is an extra step and that they can take a good amount of storage space (in our case, the shipping noise alone takes about 12 TB). So if you prefer, we could make those files from your sound pressure level outputs (we have a very similar netCDF format for them) once they are completed. This may also allow us to reduce the size of the data to transfer.
Finally, it might be good to share the same coordinates/dimensions (or some of them), such as the frequencies that we use, the depth scale and the 2D grid. Note that this is not at all a necessity, since you might have already decided on your coordinates/dimensions and might have different needs than us. We can always interpolate the data if needed. Here is the list of the dimensions that we are currently using:
Frequencies: 16, 20, 40, 63, 125, 200, 1000, 10000 Hz
Depth scale: 0.5, 1.5, 2.5, 5, 7.5, 10, 15, 20, 25, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 535 meters
2D coordinates: We use a GEBCO grid, regular in lat-lon, with a step of 30 arc seconds (or 1/120 degree). It makes “pixels” of about 0.6 km by 0.9 km. I’ve uploaded to the Google Drive an empty netCDF file which contains the grid that we are using (along with the other dimensions and the metadata fields).
Sound pressure level thresholds (the x axis of the CDFs): -15 to 250 dB with a step of 0.5 dB
Time scale: We use 30 minutes between time steps for the shipping noise since we have a lot of variability, but it doesn’t really matter what you choose on your side since it will be converted into daily CDFs anyway. We will need the data for the year 2013 if possible, since we have computed the shipping noise for that same year (actually we are about to finish this; we still have to complete the high frequencies with Bellhop).
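As a quick sanity check of the dimensions listed above, the following Python sketch builds the threshold axis and estimates the grid cell size. The constant and function names are hypothetical, the cell size uses a simple spherical-Earth approximation, and the latitude of 48 degrees N is an assumed example value, not something stated in the message.

```python
import math

# Values quoted in the message above
FREQUENCIES_HZ = [16, 20, 40, 63, 125, 200, 1000, 10000]
DEPTHS_M = [0.5, 1.5, 2.5, 5, 7.5, 10, 15, 20, 25, 50, 75, 100,
            150, 200, 250, 300, 350, 400, 450, 500, 535]

def threshold_axis(start_db=-15.0, stop_db=250.0, step_db=0.5):
    """SPL thresholds from start_db to stop_db inclusive."""
    n = int(round((stop_db - start_db) / step_db)) + 1
    # Multiply instead of accumulating to avoid floating-point drift
    return [start_db + i * step_db for i in range(n)]

def cell_size_km(lat_deg, step_deg=1.0 / 120.0, earth_radius_km=6371.0):
    """Approximate (east-west, north-south) cell size on a sphere."""
    north_south = math.radians(step_deg) * earth_radius_km
    east_west = north_south * math.cos(math.radians(lat_deg))
    return east_west, north_south

thresholds = threshold_axis()
print(len(FREQUENCIES_HZ), len(DEPTHS_M), len(thresholds))  # 8 21 531

# Assumed example latitude; the cell narrows east-west towards the poles
ew, ns = cell_size_km(48.0)
print(round(ew, 2), round(ns, 2))  # 0.62 0.93
```

At that assumed latitude the result agrees with the quoted "about 0.6 km by 0.9 km", and a 30-minute time step gives 48 samples per daily CDF.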