Automatic emotion recognition from the speech signal has become a major research topic in the field of Human-Computer Interaction (HCI) in recent times due to its many potential applications. It is being applied to a growing number of areas such as humanoid robots, the car industry, call centers, mobile communication, computer tutoring applications, etc. [1]. In this paper we focus on emotion recognition from the acoustic properties of speech. The most commonly used acoustic features in the literature are MFCCs and prosody features such as pitch, intensity, and speaking rate. In addition to these features, we exploit a set of loudness features extracted using the Bark filter bank as described by Zwicker [2]. Rhythm-based features, which have been widely used in music emotion recognition and in automatic speech assessment applications based on the linguistic properties of speech, are exploited in this paper on the basis of acoustic knowledge only. Speech rhythm can be understood as a measure describing the regularity of occurrence of certain perceptually similar language elements in speech, e.g., sequences of stressed syllables. We investigate different metrics of speech rhythm with the aim of studying their relevance for the characterization of emotion from speech. We also focus on temporal features related to the fluency of speech, such as the ratio of pauses to voiced parts, the ratio of pauses to unvoiced parts, and the ratio of voiced to unvoiced parts. All these features are explained in detail in Section 3.

The overall system is shown in Fig. 1. In the first step of the system, we implement a segmentation algorithm to separate the voiced, unvoiced, and pause parts of the speech signal. This step is crucial, since features are extracted on voiced and unvoiced segments separately, and the temporal features depend directly on proper segmentation. Next is the feature extraction unit, where we extract a set of 487 features.
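The segmentation and temporal-ratio ideas above can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: the frame length, energy threshold, and zero-crossing-rate threshold are assumed values chosen for demonstration, and real systems typically use more robust voicing decisions (e.g., pitch tracking).

```python
import numpy as np

def segment_frames(signal, sr, frame_ms=25, energy_thr=0.01, zcr_thr=0.15):
    """Label each frame 'voiced', 'unvoiced', or 'pause' using short-time
    energy and zero-crossing rate (ZCR). Thresholds are illustrative."""
    frame_len = int(sr * frame_ms / 1000)
    labels = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = np.mean(frame ** 2)
        # ZCR: fraction of sample-to-sample sign changes in the frame
        zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2
        if energy < energy_thr:
            labels.append("pause")      # low energy -> silence/pause
        elif zcr > zcr_thr:
            labels.append("unvoiced")   # noise-like, many sign changes
        else:
            labels.append("voiced")     # periodic, few sign changes
    return labels

def temporal_ratios(labels):
    """Fluency features: pause/voiced, pause/unvoiced, voiced/unvoiced
    frame-count ratios (counts clamped to >= 1 to avoid division by zero)."""
    n = {k: max(labels.count(k), 1) for k in ("voiced", "unvoiced", "pause")}
    return (n["pause"] / n["voiced"],
            n["pause"] / n["unvoiced"],
            n["voiced"] / n["unvoiced"])
```

On a synthetic signal (a sine tone followed by noise and then silence), the three segment types fall out of the two simple thresholds; the ratio features are then just frame counts.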
In the feature selection stage, we use an Information Gain Ratio (IGR) filter to select the best features for classification.
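As a sketch of how an IGR filter scores one feature, the gain ratio is the information gain of the (discretized) feature with respect to the class labels, divided by the intrinsic entropy of the split. The equal-width binning below is an assumption for illustration; the paper's discretization may differ.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a discrete label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain_ratio(feature, labels, bins=4):
    """IGR of one continuous feature w.r.t. class labels.
    The feature is discretized into equal-width bins (illustrative choice)."""
    edges = np.histogram_bin_edges(feature, bins=bins)
    binned = np.digitize(feature, edges[1:-1])
    h_y = entropy(labels)
    # Conditional entropy H(labels | binned feature)
    h_cond = 0.0
    for b in np.unique(binned):
        mask = binned == b
        h_cond += mask.mean() * entropy(labels[mask])
    gain = h_y - h_cond                 # information gain
    split_info = entropy(binned)        # intrinsic information of the split
    return gain / split_info if split_info > 0 else 0.0
```

Ranking all 487 features by this score and keeping the top-ranked ones is the essence of such a filter: a feature that cleanly separates the emotion classes gets a high ratio, while an uninformative feature scores near zero.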
