I want to represent characters or morphemes from words into vectors (like td idf or similar) so that when I cluster them they are together and some measure of distance is able to detect similarity.
If you want to represent morphemes or words as vectors, I recommend using Word2Vec.
Word2Vec
Word2vec is a family of algorithms commonly used to learn word embeddings.
How does it work? These are good starting points:
- https://www.quora.com/How-does-word2vec-work
- https://en.wikipedia.org/wiki/Word2vec
There is also a good paper on this by Mikolov et al., published at NIPS 2013 (https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf).
There are also several really nice libraries for this:
- Using DeepLearning4J (https://deeplearning4j.org/word2vec)
- Using TensorFlow (https://www.tensorflow.org/versions/r0.11/tutorials/word2vec/index.html)
- Original Google version (https://code.google.com/archive/p/word2vec/)
Roughly speaking, Word2vec comes in two main model architectures: Skip-Gram and Continuous Bag of Words (CBOW).
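To give an intuition for Skip-Gram: it trains on (center word, context word) pairs taken from a sliding window over the text. This is just a sketch of the pair-extraction step, not the full training algorithm; the function name `skipgram_pairs` and the window size are my own choices for illustration:

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs, Skip-Gram style.

    For each position i, every other token within `window`
    positions of i becomes a context word for tokens[i].
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs(["the", "cat", "sat", "on", "the", "mat"], window=1)
# With window=1, "cat" pairs with its immediate neighbours "the" and "sat"
```

CBOW is the mirror image: it predicts the center word from the surrounding context words rather than the other way around.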
In addition to TF and IDF, several other weighting schemes can be applied, such as binary word occurrence in the document (i.e., the value is 1 if the word appears in the document and 0 otherwise). Another method uses the raw word counts in the document.
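Both of these simpler schemes are easy to sketch with the standard library; the helper names `binary_vector` and `count_vector` and the toy vocabulary here are just for illustration:

```python
from collections import Counter

def count_vector(doc_tokens, vocab):
    """Raw count of each vocabulary word in the document."""
    counts = Counter(doc_tokens)
    return [counts[w] for w in vocab]

def binary_vector(doc_tokens, vocab):
    """1 if the vocabulary word appears in the document, else 0."""
    present = set(doc_tokens)
    return [1 if w in present else 0 for w in vocab]

vocab = ["cat", "dog", "sat"]
doc = ["cat", "sat", "cat"]
print(count_vector(doc, vocab))   # [2, 0, 1]
print(binary_vector(doc, vocab))  # [1, 0, 1]
```

Once every document is mapped to such a vector over a shared vocabulary, you can cluster them with any distance measure (e.g., cosine or Euclidean), which is what you asked for.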