A two-stage architecture for soundscape classification and preservation
Di Loreto, Samantha; Montelpare, Sergio
2025-01-01
Abstract
Sound classification is a fundamental task in audio signal processing with applications ranging from environmental monitoring to urban planning. We present a novel two-stage system for open-set audio classification that integrates deep learning techniques with Schafer’s soundscape theory. The two-stage architecture consists of: (1) a variational autoencoder (VAE) that first learns compressed representations of acoustic features and identifies distinctive sounds through reconstruction-error analysis, and (2) a convolutional neural network (CNN) that then classifies these sounds into Schafer’s theoretical categories (keynotes, sound signals, and soundmarks) using the learned features. This sequential approach allows the system to first understand what makes sounds unique before categorizing their role in the soundscape. The VAE effectively learns the latent representation of acoustic features from mel-spectrograms, while the CNN leverages these representations for classification. We evaluated the system on standard datasets including UrbanSound8K, ESC-50, URBAN-SED, and TUT Urban Acoustic Scenes, achieving an average accuracy of 80.7% across all datasets. Additionally, we tested our approach on a dataset of binaural recordings collected in the university neighborhood of Pescara, Italy. Our research provides empirical validation of Schafer’s theoretical framework through quantitative metrics, demonstrating strong alignment between computational classifications and theoretical descriptions. The proposed methodology advances soundscape analysis by objectively quantifying previously qualitative categories, enabling automated classification while maintaining theoretical fidelity, and creating a foundation for soundscape-preservation policies that protect acoustic identity as a form of intangible cultural heritage.
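
To make the two-stage pipeline concrete, below is a minimal PyTorch sketch of the idea the abstract describes: a VAE that compresses mel-spectrogram patches and scores distinctiveness by per-sample reconstruction error, followed by a small CNN that assigns one of Schafer’s three categories. All layer sizes, the 32-dimensional latent space, the 128×128 input patches, and the late fusion of the latent code into the classifier are illustrative assumptions, not the authors’ published configuration.

```python
# Illustrative sketch only; assumes 128x128 mel-spectrogram patches in [0, 1].
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectrogramVAE(nn.Module):
    """Stage 1: compress mel-spectrograms; high reconstruction error
    flags distinctive sounds (assumed design, not the paper's)."""
    def __init__(self, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),   # 128 -> 64
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # 64 -> 32
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(32 * 32 * 32, latent_dim)
        self.fc_logvar = nn.Linear(32 * 32 * 32, latent_dim)
        self.fc_dec = nn.Linear(latent_dim, 32 * 32 * 32)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        recon = self.dec(self.fc_dec(z).view(-1, 32, 32, 32))
        return recon, mu, logvar

class SchaferCNN(nn.Module):
    """Stage 2: CNN over the spectrogram, fused with the VAE latent,
    predicting keynote / sound signal / soundmark (3 classes)."""
    def __init__(self, latent_dim=32, n_classes=3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 128 -> 64
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 64 -> 32
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),                 # -> (N, 32)
        )
        self.head = nn.Linear(32 + latent_dim, n_classes)

    def forward(self, x, z):
        return self.head(torch.cat([self.conv(x), z], dim=1))

# Usage: score distinctiveness via per-sample reconstruction error,
# then classify flagged sounds into Schafer's categories.
vae, clf = SpectrogramVAE(), SchaferCNN()
x = torch.rand(4, 1, 128, 128)                          # batch of mel patches
recon, mu, logvar = vae(x)
err = F.mse_loss(recon, x, reduction="none").mean(dim=(1, 2, 3))  # (N,)
logits = clf(x, mu)                                     # category scores
```

In this sketch the classifier sees both the raw spectrogram and the VAE latent; the threshold on `err` separating ordinary from distinctive sounds would be tuned on held-out data.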


