I need some guidance on building a system that can recognize a user by their speech. Is there a tool or Python package for this?

Arseniy Gorin

I think your problem formulation is a little unclear...

1) It's a bit confusing when you say "and convert it into text". If you want to actually do speech-to-text, it is not quite related to speaker recognition...

2) In the title you say you want to recognize the user, but then you say that you want to find where 2 or more people speak. To find a target speaker in an utterance, you can start with conventional GMM or i-vector based approaches (have a look at https://pypi.python.org/pypi/bob.spear/1.1.2 to start with). However, classifying overlapping speech (from 2 or more users) needs different techniques and is in fact a bit more complicated.
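To give a rough idea of what the GMM-style approach does, here is a minimal sketch. It is not the bob.spear API. It assumes each speaker is modelled by a single diagonal Gaussian (effectively a one-component GMM) and uses made-up 2-D feature vectors in place of real acoustic features such as MFCCs. The names `fit_diag_gaussian`, `avg_log_likelihood` and `identify_speaker` are hypothetical helpers for illustration only.

```python
import math

def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    # A small variance floor keeps the log-likelihood finite on tiny samples.
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-3)
        for d in range(dims)
    ]
    return means, variances

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of the frames under one speaker model."""
    means, variances = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, means, variances):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def identify_speaker(frames, models):
    """Return the enrolled speaker whose model scores the utterance highest."""
    return max(models, key=lambda name: avg_log_likelihood(frames, models[name]))

# Hypothetical enrollment data: a few feature vectors per known speaker.
enrollment = {
    "alice": [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1], [1.1, 2.2]],
    "bob": [[4.0, 0.5], [3.8, 0.7], [4.2, 0.4], [4.1, 0.6]],
}
models = {name: fit_diag_gaussian(frames) for name, frames in enrollment.items()}

test_utterance = [[3.9, 0.5], [4.0, 0.8]]
print(identify_speaker(test_utterance, models))
```

A real system would use many mixture components per speaker and usually a background model for normalization, so treat this only as an illustration of the scoring idea.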

Feel free to provide more details and reformulate your question to get more relevant feedback

Nikhar Bhatia

Thank you for pointing those out.

I'll reformulate the question, but meanwhile I'll clarify it here...

1) The idea is to extract context from a conversation about a topic. For instance, if two people are talking (assuming their voices do not overlap), the system should be able to differentiate between the two voices, either by simply telling the two voices apart or by recognizing the users (which would require training on their voices, so I would work on that once the rest of the project is done).

2) After separating the contents of the conversation by who said what, I would analyze the contents further.

OK, now it is clear. Technically, what you want to do is called speaker diarization followed by speech recognition with further content analysis. Diarization itself consists of several steps: voice activity detection, finding homogeneous segments, speaker clustering and re-alignment of speech boundaries.
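To make the diarization steps a bit more concrete, here is a toy sketch of two of them: energy-based voice activity detection and a greedy speaker clustering. Everything in it is an assumption for illustration: the function names, the energy threshold, and the use of a single scalar per segment (for example a mean pitch in Hz) as a stand-in for a real speaker representation. The re-alignment step is left out.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def speech_segments(energies, threshold):
    """Voice activity detection: runs of frames whose energy exceeds threshold.

    Returns (start_frame, end_frame) pairs, end exclusive.
    """
    segments, start = [], None
    for i, e in enumerate(energies):
        if e > threshold and start is None:
            start = i
        elif e <= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(energies)))
    return segments

def cluster_segments(features, max_distance):
    """Greedy clustering: join the nearest existing cluster or open a new one.

    features holds one scalar per segment (a stand-in for a speaker embedding).
    """
    centroids, counts, labels = [], [], []
    for x in features:
        best = min(range(len(centroids)),
                   key=lambda k: abs(x - centroids[k]), default=None)
        if best is not None and abs(x - centroids[best]) <= max_distance:
            counts[best] += 1
            # Running mean update of the cluster centroid.
            centroids[best] += (x - centroids[best]) / counts[best]
            labels.append(best)
        else:
            centroids.append(x)
            counts.append(1)
            labels.append(len(centroids) - 1)
    return labels

# Synthetic signal: silence, speech, silence, speech.
samples = [0.0] * 4 + [0.5] * 8 + [0.0] * 4 + [0.8] * 4
segments = speech_segments(frame_energies(samples, frame_len=4), threshold=0.01)
print(segments)

# Hypothetical per-segment features, e.g. mean pitch in Hz.
labels = cluster_segments([120.0, 210.0, 125.0], max_distance=30.0)
print(labels)
```

The dedicated tools do all of this with proper acoustic features and statistical models, which is why I would not try to build the whole thing from scratch.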

First I have to say that both diarization and recognition are challenging, especially if your data are conversational, and even more so if these are telephone quality recordings. Be prepared for quite a few errors at various stages. In practice, for further NLP processing you will likely need to work with N-best transcription hypotheses or with word lattices to mitigate the errors. As for tools, I do not think what you want to do can be easily done in pure Python scripts. There are tools, though, that work quite well for your task:

1) Speaker segmentation and clustering (aka diarization). One tool is LIUM diarization, written in Java ( http://liumtools.univ-lemans.fr//index.php?option=com_content&task=blogcategory&id=32&Itemid=60 ). Another one is the Idiap tool, written in C++ ( https://www.idiap.ch/scientific-research/resources/speaker-diarization-toolkit ). For my tasks the second one worked better and faster, but the first one is already integrated in some speech recognition systems (see point 2).

2) Speech recognition. Here you have CMU Sphinx, with already trained acoustic models for several languages and quite easy tutorials. Here you can find an example of how to use it together with LIUM diarization ( http://cmusphinx.sourceforge.net/wiki/speakerdiarization ). Another tool is Kaldi, which allows you to get better accuracy, but is heavier and a bit less documented. The author of this script also integrated it with LIUM diarization ( https://github.com/alumae/kaldi-offline-transcriber/blob/master/scripts/diarization.sh ). There is a good English model for Kaldi, too.
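Once you have both outputs, the "who said what" part can be done in Python. Below is a minimal sketch, assuming the diarization gives you segments as (start, end, speaker) and the recognizer gives you words as (start, end, text), all in seconds. These tuple layouts and the function names are my assumptions, not the actual output formats of LIUM, Sphinx or Kaldi, so you would need to parse their files into this shape first.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def assign_speakers(words, segments):
    """Label each word with the speaker whose segment overlaps it most.

    words:    list of (start, end, text)
    segments: list of (start, end, speaker)
    Words that overlap no segment are labelled None.
    """
    labelled = []
    for w_start, w_end, text in words:
        best_speaker, best_overlap = None, 0.0
        for s_start, s_end, speaker in segments:
            ov = overlap(w_start, w_end, s_start, s_end)
            if ov > best_overlap:
                best_speaker, best_overlap = speaker, ov
        labelled.append((best_speaker, text))
    return labelled

def group_turns(labelled):
    """Merge consecutive words from the same speaker into turns."""
    turns = []
    for speaker, text in labelled:
        if turns and turns[-1][0] == speaker:
            turns[-1] = (speaker, turns[-1][1] + " " + text)
        else:
            turns.append((speaker, text))
    return turns

# Hypothetical diarization and recognition outputs.
segments = [(0.0, 2.0, "S0"), (2.0, 4.5, "S1")]
words = [
    (0.2, 0.6, "hello"), (0.7, 1.1, "there"),
    (1.9, 2.4, "hi"), (2.5, 3.0, "again"),
]
turns = group_turns(assign_speakers(words, segments))
for speaker, text in turns:
    print(speaker, text)
```

Note that a word straddling a speaker boundary (like "hi" in this example) goes to the speaker with the larger overlap, which is a simple heuristic and will sometimes be wrong.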

Sorry for quite a lot of text, but you should keep in mind that the task is hard, even if you have just 2 non-overlapping speakers.
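As for working with N-best hypotheses instead of only the top transcription, one simple option is to weight each hypothesis by its score and accumulate expected word counts, so that a word that appears in most hypotheses counts more than one that appears in a single uncertain one. The sketch below assumes each hypothesis comes with a comparable log score. Real toolkits combine acoustic and language model scores with scaling factors, so the scores and the function names here are purely illustrative.

```python
import math
from collections import defaultdict

def hypothesis_posteriors(log_scores):
    """Turn N-best log scores into normalized weights (a softmax)."""
    top = max(log_scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]

def expected_word_counts(nbest):
    """nbest: list of (transcript, log_score). Returns word -> expected count."""
    weights = hypothesis_posteriors([score for _, score in nbest])
    counts = defaultdict(float)
    for (text, _), w in zip(nbest, weights):
        for word in text.lower().split():
            counts[word] += w
    return dict(counts)

# Hypothetical 3-best list for one utterance.
nbest = [
    ("book a flight to boston", -10.0),
    ("book a fight to boston", -10.7),
    ("look a flight to austin", -12.0),
]
counts = expected_word_counts(nbest)
for word, value in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {value:.2f}")
```

Here "flight" ends up with a higher expected count than "fight" because it is supported by two of the three hypotheses, which is the kind of robustness you want before doing content analysis.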

I will follow this thread in case you have more questions. There are also forums for both Sphinx and Kaldi for further support. Have fun!