Abstract: Deep learning-based object detectors have become increasingly critical in spectrogram-based wideband multi-signal detection, recognition, and time-frequency localization. Current methods ...
Abstract: Considering the power-hungry nature of speech processing, a keyword spotting (KWS) unit, used to detect multiple spoken words, is often integrated as a front-end layer. KWS systems are ...
This study proposes a novel heterogeneous stacking ensemble learning model for the fusion of phonocardiogram (PCG) spectrogram texture and deep features to detect heart failure with preserved ejection ...
Diffusion Speech is a diffusion-based text-to-speech model with a simple synthesis pipeline. We use a diffusion transformer model (DiT) to predict the duration of each phoneme. Then we ...
The development of machine learning for cardiac care is severely hampered by privacy restrictions on sharing real patient electrocardiogram (ECG) data. Although generative AI offers a promising ...