Summary: | This study proposed a multimodal vehicle classification approach that uses a hybrid shallow CNN to integrate radar and acoustic sensor data, overcoming the limitations of single-sensor systems. The shallow CNN consists of convolutional layers with max-pooling followed by fully connected layers, and fuses the two modalities with either a concatenation or a summation operator at an early or a late stage. The method was evaluated on a dataset of 3300 paired samples across five vehicle classes, with spectrogram images as input. Late fusion with concatenation outperformed both early fusion and summation. The highest F1-scores were 99%-100% for the lorry, motorcycle, bus, and no-traffic classes, and 97% for cars. These results demonstrate the potential of the multimodal approach for intelligent transportation systems and traffic monitoring applications. © 2024 IEEE.
|
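
The two fusion operators named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `fuse_concat` and `fuse_sum`, the 4-dimensional feature vectors, and their values are all hypothetical, chosen only to show how the operators differ. In a late-fusion setting the inputs would be the feature vectors produced by the separate radar and acoustic branches; in early fusion the same operators would be applied closer to the input spectrograms.

```python
from typing import List

Vector = List[float]


def fuse_concat(radar: Vector, acoustic: Vector) -> Vector:
    """Concatenation fusion: place the two feature vectors end to end."""
    return list(radar) + list(acoustic)


def fuse_sum(radar: Vector, acoustic: Vector) -> Vector:
    """Summation fusion: add element-wise; requires equal-length vectors."""
    if len(radar) != len(acoustic):
        raise ValueError("summation fusion requires equal-length feature vectors")
    return [r + a for r, a in zip(radar, acoustic)]


# Hypothetical features (assumed values), as if each came from one
# sensor-specific CNN branch before the fully connected layers.
radar_features = [0.5, 1.0, 0.0, 2.0]
acoustic_features = [1.5, 0.0, 3.0, 1.0]

concat_fused = fuse_concat(radar_features, acoustic_features)
sum_fused = fuse_sum(radar_features, acoustic_features)

print(len(concat_fused), len(sum_fused))
```

Concatenation doubles the feature dimension and keeps each modality's values separate, whereas summation keeps the dimension fixed but merges the two modalities into a single value per position. This may be one reason the two operators can behave differently, although the summary itself does not give an explanation.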