Environmental Monitoring with Low-Processing Embedded AI through Sound Event Classification

Authors

Junqueira, B. F., Vieira, R. G., Alves, E. C., Karaziack, B. B., dos Santos, M. V.

DOI:

https://doi.org/10.5753/jbcs.2025.4210

Keywords:

Embedded AI, Environmental Monitoring, Sound Event Classification, Machine Learning, Wavelet Packet Transform, Acoustics

Abstract

In this work, we propose a low-processing embedded Machine Learning solution to assist in environmental acoustic monitoring. The pre-processing stage applies the Wavelet Packet Transform to generate low-dimensional features, which serve as inputs to a Gradient Boosting model for near-real-time classification of relevant sound events. We also introduce an event filter that checks whether a relevant sound event is occurring before the features are sent to the model; when no event is detected, the features are discarded. This filter makes the solution more robust to noise and wind-contaminated samples while reducing memory, battery, and computational power usage. We then converted the processing pipeline and the trained model to the C programming language and embedded them in the Nordic Thingy:53, a low-power hardware device equipped with a built-in digital Pulse Density Modulation microphone (VM3011 from Vesper). To evaluate the proposed method, we compared it with a convolutional neural network approach based on Mel-frequency cepstral coefficients, using audio recordings of bird species found in forests in the central and western regions of Brazil, as well as samples of sounds related to human activity. The favorable classification scores obtained, together with the long battery life of the embedded solution, have the potential to greatly reduce the need for extensive environmental monitoring field surveys.
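The pipeline described in the abstract (wavelet packet features, an event filter, then a classifier) can be illustrated with a minimal Python sketch. This is not the authors' C implementation: the Haar wavelet, the decomposition level, the energy-based filter criterion, the threshold value, and all function names (`haar_step`, `wpt_energies`, `has_event`, `process`) are assumptions chosen for illustration, and the classifier is passed in as a placeholder callable standing in for the trained Gradient Boosting model.

```python
import math


def haar_step(x):
    """Split a signal into Haar approximation and detail coefficients."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wpt_energies(frame, level=3):
    """Return the energy of each of the 2**level wavelet packet leaf nodes.

    Unlike the plain wavelet transform, every node (approximation and
    detail) is split again at each level. Leaves are in natural order,
    not sorted by frequency. Assumes len(frame) is a multiple of 2**level.
    """
    nodes = [list(frame)]
    for _ in range(level):
        nodes = [half for node in nodes for half in haar_step(node)]
    return [sum(c * c for c in node) for node in nodes]


def has_event(features, threshold=1e-3):
    """Hypothetical event filter: total band energy must exceed a threshold."""
    return sum(features) > threshold


def process(frame, classify, level=3, threshold=1e-3):
    """Extract features and call the classifier only when an event is present."""
    features = wpt_energies(frame, level)
    if not has_event(features, threshold):
        return None  # frame ignored: no classifier call, saving computation
    return classify(features)


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
    silence = [0.0] * 64
    dummy_model = lambda feats: "event"  # placeholder for a trained model
    print(process(tone, dummy_model))
    print(process(silence, dummy_model))
```

With a frame of 64 samples and three levels, each frame is reduced to eight energy values, which shows how the transform yields a low-dimensional input. Skipping the classifier on quiet frames is one plausible way an event filter could save computation on a battery-powered device.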

Published

2025-08-05

How to Cite

Junqueira, B. F., Vieira, R. G., Alves, E. C., Karaziack, B. B., & dos Santos, M. V. (2025). Environmental Monitoring with Low-Processing Embedded AI through Sound Event Classification. Journal of the Brazilian Computer Society, 31(1), 523–544. https://doi.org/10.5753/jbcs.2025.4210

Section

Articles