The audio data files are large in size and long in duration. 10-second samples at 192kHz are 3.5MB in size and and each ACD generates 1.26GB every 6 hours. It is difficult for humans to listen to these files. All of the sounds are scrambled below 7kHz to obscure human voice which makes the sound inherently noisy and disorienting. To spot trends and shifts over long periods of time requires an attentive form of listening that is also very difficult and time consuming.
We have developed an experimental application that allowed us to extract the spectral energy across each 10-second segment and save these aggregated snapshots of spectra alongside one another. With this application, is possible to rapidly ‘scrub’ through the spectral content but still use our ears to pick out areas of interest very quickly. When a spectral snapshot seems interesting or surprising, we can listen to the original sound file that made the spectra.
Another approach we have explored involves concatenating the files into 6-hour chunks and generating spectrograms of that entire period of time, as shown in the following figure [click image to expand]:
Long-duration spectrograms like this give an instant overview of a period that a researcher may find interesting. Whole days and different ACDs can quickly be compared while trends, anomalies and other features (such as sounds in the ultrasonic band) noticed at a glance. Researchers interested to explore more closely can then home in on these specific files and generate further close-up spectrograms of particular areas of interest. With several of these images lined up researchers can look longitudinally across the season or seasons, laterally across the day and spatially across however many boxes are making recordings.
Upstairs in St Cecilia’s concert hall are six boxes on wheels. Each box contains a small computer, amplifier, speakers, battery and a compass sensor. Playing through these boxes are sounds recorded in and around the Meadows in Edinburgh. As you move the boxes around, the sound changes. When you’ve discovered positions that you think you like, sit back, lie down, relax and wait for others to shift things. If you want to make a change, intervene…
I am in the thick of developing the sound installation for this project which will reveal some of the concepts behind our work and show some of the sounds that will be captured by the Simon Chapple’s sensor network. I’ll explain more about that in another post soon. Meanwhile, I’m taking a break from thinking about gel loudspeakers, compass and heading settings on mobile phones in order to say a little about my experience working with Simon’s Rasberry Pi, wireless, 192kHz audio beacon prototype earlier this year.
Simon lent me his prototype in order for me to hear what’s going on in my garden in late January and to run some noise profiling tests. I was keen to see if the small mammals that must live in the garden are interested and active around our compost heap. I dutifully positioned the sensor box where I hoped I’d hear mice and other mammals fighting over leftover potato peelings but sadly — as far as I can tell at least — nothing of the sort: no fights or mating rituals at this time of year. The absence of activity is useful, since it suggests that there has been plentiful food for small mammals to find earlier in the year/day and they’re not risking the wind, rain, snow and something new in the garden to get a late night snack. However, a largely quiet night means that the few moments of sonic event are all the more interesting and easy to spot.
A word on the tools I’m using
I’ve been using SoX to generate spectrograms of the 10-second audio clips collected. It’s a good way to quickly inspect if there is something of interest to listen to. With over 9 hours of material though, it’s not interesting to listen to the whole night again. Instead, I first joined all of the files together using a single command line from SoX in the terminal window on OS X:
sox *.wav joined.wav
I then generated a spectrogram of that very long .wav file. However, the resolution of a 9 hour file needs to be massive to give any interesting detail. Instead, I decided to group the files into blocks of an hour and then rendered a 2500 pixel spectrogram of each 10-second burst. It’s very quick to then scroll down through the images until something interesting appears. Here’s the .sh script I used:
### Generate for WAV
for file in *.wav; do
sox "$file" -n spectrogram -t "$title_in_pic" -o "$outfile" -x 2500
From the rendered spectrograms, I can quickly see that there were some interesting things happening at a few points in the night and can zoom in further. For example this moment at 1:43 am looks interesting:
It sounds like this:
I suspect that this is a mouse, cat or rat. Anyone reading this got a suggestion as to what it might be?
As the night drew on, the expected happened and birds began to appear — the first really obvious call was collected at 6:23 am. It looks like this:
And it sounds like this:
If you’re able to hear this audio clip, then you’ll be aware of the noise that is encoded into the file. One of our challenges going forwards is how to reduce and remove this. I’ve tried noise profiling and attempted to reduce the noise from the spectra, but this has affected the sounds we want to hear and interpret. Bascially by removing the noise, you also remove parts of the sound that are useful. I’m reflecting on this and think that there are ways to improve how electricity is distributed to the Rasberry Pi in Simon’s box from the battery and whether we need some USB cables with capacitors built in to stop the noise. However, noise reduction may not be as important to others as it is to me. My speciality is sound itself, in particular, how things sound, I want to get rid of as much unnecessary noise as possible so that when I’m noisy, it’s intentional. However, for an IoT project, the listener isn’t going to be a human but a computer. Computers will analyse the files, computers will detect if something interesting is happening and computers will attempt to work out what that interesting thing is based on things it’s been told to look for. It’s highly likely that the noise, which very quickly makes these sounds seem irritating and hard to engage with for a human, may well be perfectly fine for a computer to deal with. Let’s see.