What is the fastest way to build a corpus of Telugu dialects?

Ian Kennedy

You can start by searching Google for "how to build a linguistic corpus". Once you have digested that, run the same query on Google Scholar. Then repeat the process with "linguistic corpus software". The fastest way is to shovel all the digital texts you can find into a program that discards duplicate words and maintains a sorted dictionary with useful add-ons such as frequency of occurrence and your own tags. Data cleaning will be necessary.
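As a rough illustration of the kind of program described above, here is a minimal Python sketch using only the standard library. The function name `build_lexicon`, the `tags` argument, and the choice to treat any run of characters from the Telugu Unicode block (U+0C00 to U+0C7F) as a word are all assumptions made for the example, not an established tokenizer for Telugu.

```python
import re
import unicodedata
from collections import Counter

# Assumption: a "word" is any run of characters from the Telugu Unicode block.
TELUGU_WORD = re.compile(r"[\u0C00-\u0C7F]+")

def build_lexicon(texts, tags=None):
    """Collapse duplicate words into one sorted entry each, with a frequency count.

    texts: iterable of raw text strings.
    tags:  optional dict mapping a word to a list of your own labels
           (hypothetical, e.g. a dialect name).
    """
    counts = Counter()
    for text in texts:
        # Normalize so visually identical strings compare equal.
        text = unicodedata.normalize("NFC", text)
        counts.update(TELUGU_WORD.findall(text))
    tags = tags or {}
    # Dicts keep insertion order, so inserting in sorted order gives a sorted dictionary.
    return {w: {"freq": counts[w], "tags": tags.get(w, [])} for w in sorted(counts)}

if __name__ == "__main__":
    sample = ["తెలుగు భాష", "తెలుగు corpus 123"]
    for word, info in build_lexicon(sample).items():
        print(word, info["freq"], info["tags"])
```

Non-Telugu tokens such as Latin words and ASCII digits are dropped by the pattern, which is a crude first pass at the data cleaning mentioned above.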

Sudheer Kolachina

Hi! Great to know that you are working on corpus building for Telugu. Unfortunately, there is not enough web content for the different dialects of Telugu. You can, however, find large amounts of Standard Telugu text, which you can crawl and clean using the Natural Language Toolkit, NLTK (http://nltk.org/).
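To give an idea of what the cleaning step for crawled pages might look like, here is a small sketch. It deliberately uses Python's standard library rather than NLTK so that it runs without extra installs. The names `TextExtractor` and `clean_page` and the 0.5 threshold are assumptions chosen for illustration: the code strips markup, drops script and style content, keeps only lines that are mostly Telugu script, and removes repeated lines.

```python
import re
from html.parser import HTMLParser

TELUGU_CHAR = re.compile(r"[\u0C00-\u0C7F]")

class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)

def clean_page(html, min_telugu_ratio=0.5):
    """Return unique text lines whose non-space characters are mostly Telugu."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    seen, kept = set(), []
    for line in "\n".join(parser.chunks).splitlines():
        line = " ".join(line.split())  # collapse runs of whitespace
        chars = [c for c in line if not c.isspace()]
        if not chars:
            continue
        ratio = sum(1 for c in chars if TELUGU_CHAR.match(c)) / len(chars)
        if ratio >= min_telugu_ratio and line not in seen:
            seen.add(line)
            kept.append(line)
    return kept
```

The ratio filter is a blunt way to discard navigation menus and other non-Telugu residue. It cannot tell dialects apart, so dialect labels would still have to come from the source of each page or from manual tagging.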

Kaushik Thallapally

Has anyone built corpus datasets for any Indian languages?

Ge Lan

I would first check existing corpora that are open to the public, and you might want to look at Python NLTK to see whether a corpus is available for this language. If not, the Internet would be a good source for building a corpus quickly.
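A quick way to run that check is sketched below. To my knowledge, NLTK's `indian` corpus (a small POS-tagged sample) includes a Telugu file, but treat that as an assumption to verify on your own installation. The helper name `telugu_files_in_nltk` is made up for this example, and the function simply returns an empty list when NLTK or the corpus data is not present.

```python
def telugu_files_in_nltk():
    """Return Telugu file IDs from NLTK's `indian` corpus, or [] if unavailable."""
    try:
        from nltk.corpus import indian
        return [f for f in indian.fileids() if f.startswith("telugu")]
    except (ImportError, LookupError):
        # NLTK is not installed, or the corpus data has not been downloaded yet
        # (it can be fetched once with nltk.download("indian")).
        return []

if __name__ == "__main__":
    files = telugu_files_in_nltk()
    print(files if files else "No Telugu corpus found in the local NLTK data.")
```

Even if a file turns up, it would most likely be Standard Telugu rather than dialect data, so it may serve better as a reference for tooling than as the corpus itself.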