Junaid Masood, Sheeraz Ahmed, Asim Ali, Ubaid Ullah, Said-ul-Abrar, Muhammad Tayyab, Samhita Priyadarsini Gundala



Filters, Speech Signal, Signal to Noise Ratio, Mean Square Error, Scaling Factor


Segmental framing of the speech signal and selection of a scaling factor form the first step of the speech recognition process. The next step is to reduce the noise present in the recognized speech signal in order to improve its quality. In this work, noise reduction is performed with a newly proposed adaptive median-based filter. Comparing its results with those of Minimum Mean-Square Error Short-Time Spectral Amplitude (MMSE-STSA) and Minimum Mean-Square Error (MMSE) based noise reduction yields several noteworthy observations. The conclusion also summarizes the possible contributions of the proposed adaptive median-based filtering technique. Signal-to-noise ratio (SNR) is the primary metric used to collect observations and analyze the proposed technique.
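
To make the idea of adaptive median filtering and its SNR-based evaluation concrete, the following is a minimal sketch in Python. It is not the filter proposed in this work, whose details are not given here. It assumes the textbook adaptive median scheme, in which the window around each sample grows until the local median is no longer an extreme value. The function names `adaptive_median_filter`, `snr_db`, and `mse`, the `max_window` parameter, and the synthetic test tone with impulsive noise are all illustrative assumptions.

```python
import numpy as np


def adaptive_median_filter(x, max_window=9):
    """Textbook 1-D adaptive median filter (an assumption, not the paper's method)."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    n = len(x)
    for i in range(n):
        w = 3  # start with the smallest odd window
        while w <= max_window:
            half = w // 2
            seg = x[max(0, i - half): min(n, i + half + 1)]
            lo, med, hi = seg.min(), np.median(seg), seg.max()
            if lo < med < hi:
                # The median is not an extreme, so it is trusted.
                # Replace the sample only if it is itself an extreme.
                if not (lo < x[i] < hi):
                    y[i] = med
                break
            w += 2  # median looks like an outlier: widen the window
        else:
            # Window limit reached without a trusted median: fall back to it.
            y[i] = np.median(seg)
    return y


def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB of an estimate against a clean reference."""
    clean = np.asarray(clean, dtype=float)
    noise = clean - np.asarray(estimate, dtype=float)
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))


def mse(clean, estimate):
    """Mean square error between a clean reference and an estimate."""
    diff = np.asarray(clean, dtype=float) - np.asarray(estimate, dtype=float)
    return float(np.mean(diff ** 2))


# Synthetic stand-in for a speech frame: a 5 Hz tone with isolated impulses.
t = np.linspace(0.0, 1.0, 400, endpoint=False)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean.copy()
noisy[10::25] = 3.0  # hypothetical impulsive noise

denoised = adaptive_median_filter(noisy)
print("SNR before (dB):", snr_db(clean, noisy))
print("SNR after  (dB):", snr_db(clean, denoised))
print("MSE before:", mse(clean, noisy), "MSE after:", mse(clean, denoised))
```

A median-type filter of this kind is best suited to impulsive noise. A comparison with MMSE or MMSE-STSA estimators would apply the same `snr_db` and `mse` measurements to their outputs.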


