Why do most researchers consider 13 MFCC coefficients for the analysis of speech or images?

Amir Hossein Poorjam

An intuition about the cepstral features can help to figure out what we should look for when we use them in a speech-based system.

- Because cepstral features are computed by taking the Fourier transform of the warped logarithmic spectrum, they contain information about the rate of change across the different spectral bands. Cepstral features are favorable because they can separate the effects of the source and the filter in a speech signal. In other words, the influence of the vocal cords (source) and of the vocal tract (filter) can be separated, since the low-frequency excitation and the formant filtering of the vocal tract are located in different regions of the cepstral domain.

- If a cepstral coefficient has a positive value, it represents a sonorant sound, since the majority of the spectral energy in sonorant sounds is concentrated in the low-frequency regions.

- On the other hand, if a cepstral coefficient has a negative value, it represents a fricative sound, since most of the spectral energy in fricative sounds is concentrated at high frequencies.

- The lower order coefficients contain most of the information about the overall spectral shape of the source-filter transfer function.

- The zero-order coefficient indicates the average power of the input signal.

- The first-order coefficient represents the distribution of spectral energy between low and high frequencies.

- Even though higher-order coefficients represent increasing levels of spectral detail, 12 to 20 cepstral coefficients are typically optimal for speech analysis, depending on the sampling rate and the estimation method. Selecting a large number of cepstral coefficients makes the models more complex. For example, if we model a speech signal with a Gaussian mixture model (GMM) and use a large number of cepstral coefficients, we typically need more data to estimate the parameters of the GMM accurately.
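The points above can be tried out with a small sketch. This is not code from the answer; it is a minimal NumPy-only illustration under assumptions of my own: a 16 kHz sampling rate, a 512-point FFT, a 10 ms hop, 26 triangular mel filters, an unnormalised DCT-II as the final transform, and two pure tones standing in for "energy at low frequencies" and "energy at high frequencies". The function names (`mfcc`, `mel_filterbank`, `dct_ii`) are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale (assumed design)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, centre, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, centre):      # rising edge
            fb[i, k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge
            fb[i, k] = (right - k) / (right - centre)
    return fb

def dct_ii(x, n_out):
    """Unnormalised DCT-II along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_filters=26, n_ceps=13):
    """Return an array of shape (n_frames, n_ceps)."""
    window = np.hamming(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / n_fft
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sr).T
    log_mel = np.log(mel_energy + 1e-10)   # small floor avoids log(0)
    return dct_ii(log_mel, n_ceps)         # keep only the first 13 by default

sr = 16000
t = np.arange(sr) / sr                      # one second of samples
low = np.sin(2 * np.pi * 200 * t)           # energy at low frequencies
high = np.sin(2 * np.pi * 6000 * t)         # energy at high frequencies

c_low, c_high = mfcc(low, sr), mfcc(high, sr)
print("shape:", c_low.shape)
print("mean c1, low tone :", c_low[:, 1].mean())    # sign should be positive
print("mean c1, high tone:", c_high[:, 1].mean())   # sign should be negative
print("mean c0, quiet vs loud:", c_low[:, 0].mean(), mfcc(10 * low, sr)[:, 0].mean())
```

In this sketch the first-order coefficient changes sign with the low/high balance of spectral energy, and the zero-order coefficient grows with signal level, which matches the intuition in the list. Real speech is far less clean than a pure tone, so the signs should be read as a tendency rather than a classifier.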

Mustaqeem Khan

There are many reasons for choosing this number of features, and they depend on the system. One main trend is to reduce the number of features so that the model is feasible for real-time implementation, while the lower-order coefficients already carry most of the cues about the overall spectral shape of the source.
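To put the feature-reduction argument in rough numbers, here is a small sketch that counts the free parameters of a Gaussian mixture model as the feature dimension grows. The helper name `gmm_param_count`, the choice of 32 mixture components, and the 39-dimensional comparison case are assumptions made for illustration only; they do not come from the answers in this thread.

```python
def gmm_param_count(n_components, dim, covariance="diag"):
    """Free parameters of a GMM: mixture weights + means + covariances."""
    weights = n_components - 1              # weights sum to one
    means = n_components * dim
    if covariance == "diag":
        cov = n_components * dim
    elif covariance == "full":
        cov = n_components * dim * (dim + 1) // 2   # symmetric matrices
    else:
        raise ValueError("covariance must be 'diag' or 'full'")
    return weights + means + cov

for dim in (13, 39):
    for kind in ("diag", "full"):
        print(dim, kind, gmm_param_count(32, dim, kind))
# 13 diag 863
# 13 full 3359
# 39 diag 2527
# 39 full 26239
```

With full covariances the count grows quadratically in the feature dimension, which is one reason a compact feature vector is easier to train and cheaper to evaluate.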

Yes, try 13 coefficients; I think that will be sufficient.