Publications

Modulation Spectrum of Speech from Linear Predictive Models: Applications in Speech Recognition (link)

Sadhu, Samik, and Hynek Hermansky. "Self-supervised Learning with Speech Modulation Dropout." arXiv preprint arXiv:2303.12908 (2023). (link)
Sustek, Martin, et al. "Stabilized training of joint energy-based models and their practical applications." arXiv preprint arXiv:2303.04187 (2023). (link)
Sadhu, Samik, and Hynek Hermansky. "Importance of Different Temporal Modulations of Speech: a Tale of two Perspectives." ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023. (link)
Sadhu, Samik, and Hynek Hermansky. "Blind Signal Dereverberation for Machine Speech Recognition." arXiv preprint arXiv:2210.00117 (2022). (link)
Sustek, Martin, Samik Sadhu, and Hynek Hermansky. "Dealing with Unknowns in Continual Learning for End-to-end Automatic Speech Recognition." Proc. Interspeech 2022. (link)
Sadhu, Samik, and Hynek Hermansky. "Complex Frequency Domain Linear Prediction: A Tool to Compute Modulation Spectrum of Speech." arXiv preprint arXiv:2203.13216 (2022). (link)
Sadhu, S., Hermansky, H. (2021) Radically Old Way of Computing Spectra: Applications in End-to-End ASR. Proc. Interspeech 2021, 1424-1428, doi: 10.21437/Interspeech.2021-643 (link)
Sadhu, S., He, D., Huang, C.-W., Mallidi, S.H., Wu, M., Rastrow, A., Stolcke, A., Droppo, J., Maas, R. (2021) wav2vec-C: A Self-Supervised Model for Speech Representation Learning. Proc. Interspeech 2021, 711-715, doi: 10.21437/Interspeech.2021-717 (link)
Sadhu, S., Hermansky, H. (2020) Continual Learning in Automatic Speech Recognition. Proc. Interspeech 2020, 1246-1250, doi: 10.21437/Interspeech.2020-2962 (link)
S. Sadhu, R. Li and H. Hermansky, "M-vectors: Sub-band Based Energy Modulation Features for Multi-stream Automatic Speech Recognition," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019.
Wang, X., Yang, J., Li, R., Sadhu, S., Hermansky, H. (2019) Exploring Methods for the Automatic Detection of Errors in Manual Transcription. Proc. Interspeech 2019, 3003-3007, doi: 10.21437/Interspeech.2019-1343 (link)
Sadhu, S., Hermansky, H. (2019) Modulation Vectors as Robust Feature Representation for ASR in Domain Mismatched Conditions. Proc. Interspeech 2019, 3441-3445, doi: 10.21437/Interspeech.2019-2723 (link)
Sadhu, Samik, and Prasanta Kumar Ghosh. "Low resource point process models for keyword spotting using unsupervised online learning." 2017 25th European Signal Processing Conference (EUSIPCO). IEEE, 2017.