A Low Bit Rate Speech Coder using Segmental Sinusoidal Model for Disaster and Emergency Telemedicine

Setiawan, Florentinus Budi and Soegijoko, Soegijardjo and Sugihartono, Sugihartono and Tjondronegoro, Suhartono (2008) A Low Bit Rate Speech Coder using Segmental Sinusoidal Model for Disaster and Emergency Telemedicine. Journal of eHealth Technology and Application, 6 (2). pp. 97-104. ISSN 1881-4581

Text (JETA Journal)
Jeta Paper.pdf - Published Version

Official URL: https://www.u-tokai.ac.jp

Abstract

In general, a communication system used during and after a disaster must work properly at a very low data rate. The same requirement applies to emergency telemedicine, where channel capacity is limited. The limited infrastructure available during and after disasters, together with emergency conditions, reduces the communication system to its minimum capacity. Establishing a connection under these conditions requires a speech coder that functions properly at a low bit rate. The proposed speech coder is a low-complexity coder designed to meet this requirement, so that a large number of connections can be handled over the limited transmission channel. A low-complexity, low-bit-rate speech coder can be realized using a segmental sinusoidal model, in which the speech signal is represented as a combination of sinusoidal signals with infinitely many possible combinations of amplitude, frequency, and phase. The segmental sinusoidal model is extracted from the periods and the peaks of the speech signal within one frame. The model is based on peak-to-peak quantization, which detects the positive and negative peaks, so the time distance and magnitude difference between consecutive peaks can be easily extracted. In this paper, we describe the proposed method, called the segmental sinusoidal model, for encoding a speech signal. A low bit rate is obtained by sending only the period and peak information. The coder is also combined with waveform interpolation and look-up tables. The maximum mean opinion score (MOS) obtained for the synthesized speech signal is 3.8, which indicates that listeners perceive the synthesized signal as fairly good. The bit rate of the coded signal is 4 kbps at a complexity of less than 10 MIPS. The proposed segmental sinusoidal model and 4 kbps coder are therefore expected to be suitable for disaster and emergency telemedicine applications.
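
The peak-to-peak idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' coder: the function names (`find_peaks`, `encode`, `decode`) are hypothetical, and joining consecutive peaks with a half-cosine segment is an assumption made here for illustration. Quantization, look-up tables, and waveform interpolation, which the abstract mentions, are left out.

```python
import math


def find_peaks(samples):
    """Return (index, value) for every local maximum or minimum."""
    peaks = []
    for i in range(1, len(samples) - 1):
        prev, cur, nxt = samples[i - 1], samples[i], samples[i + 1]
        is_max = cur > prev and cur >= nxt  # positive peak
        is_min = cur < prev and cur <= nxt  # negative peak
        if is_max or is_min:
            peaks.append((i, cur))
    return peaks


def encode(peaks):
    """Describe each pair of consecutive peaks by
    (time distance in samples, magnitude difference)."""
    return [(i1 - i0, v1 - v0) for (i0, v0), (i1, v1) in zip(peaks, peaks[1:])]


def decode(first_peak, segments):
    """Rebuild samples from the first peak onward.

    Assumption (not from the paper): each segment between two peaks
    is drawn as half a cosine period.
    """
    _, value = first_peak
    out = [value]
    for distance, diff in segments:
        for k in range(1, distance + 1):
            out.append(value + diff * (1 - math.cos(math.pi * k / distance)) / 2)
        value += diff  # move to the next peak
    return out


if __name__ == "__main__":
    # Test tone: a sine wave with a period of 16 samples.
    signal = [math.sin(2 * math.pi * n / 16) for n in range(64)]
    peaks = find_peaks(signal)
    segments = encode(peaks)
    rebuilt = decode(peaks[0], segments)
    print(len(peaks), "peaks,", len(segments), "segments,", len(rebuilt), "samples rebuilt")
```

For a pure sine wave the half-cosine segments reproduce the input between the first and last detected peaks. Real speech would need quantization of the distance and difference values before any bit rate could be stated.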

Item Type: Article
Subjects: 600 Technology (Applied sciences) > 620 Engineering > 621 Electrical engineering
Divisions: Faculty of Engineering > Department of Electrical Engineering
Depositing User: Mr F. Budi Setiawan
Date Deposited: 09 Apr 2020 10:02
Last Modified: 09 Apr 2020 10:02
URI: http://repository.unika.ac.id/id/eprint/21095
