Can anyone suggest approaches that can help me to aggregate keywords extracted from a corpus of documents?

Mariana Soffer

4.1. Feature extraction

The sklearn.feature_extraction module can be used to extract features in a format supported by machine learning algorithms from datasets consisting of formats such as text and image.

Note

Feature extraction is very different from Feature selection: the former consists in transforming arbitrary data, such as text or images, into numerical features usable for machine learning. The latter is a machine learning technique applied on these features.

http://scikit-learn.org/stable/modules/feature_extraction.html
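
To make the quoted idea of "transforming text into numerical features" concrete, here is a minimal bag-of-words sketch in plain Python. It is not the scikit-learn implementation; the linked module offers ready-made vectorizers such as CountVectorizer for this purpose. The function names `tokenize` and `bag_of_words` and the toy corpus are assumptions made up for illustration.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is the sorted union of all tokens seen in the corpus.
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Toy corpus, invented for this example.
docs = ["TCP and IP address", "wifi IP address", "OLAP cube"]
vocabulary, matrix = bag_of_words(docs)
```

Each row of `matrix` is one document expressed as term counts over `vocabulary`, which is the kind of numerical representation a keyword extraction or aggregation step could then work on.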

Andrey Guskov

What do you mean by "aggregation"? Maybe tools like RapidMiner would be useful for you?

http://www.youtube.com/watch?v=EjD2M4r4mBM&list=PL7669FFBBA1825900&index=2

Mustapha Bouakkaz

Thanks, Andrey, but that is not what I'm looking for. Let me give you an example:

For a numerical operation: aggregation = (5 + 8 + 4 + 3) / 4 = 5

For textual aggregation: (TCP + address + IP + wifi) = network

Another example: (data set + OLAP + cube + ...) = data warehousing

I have proposed two keyword-based approaches for automatic textual aggregation, which will be published soon.

It seems to be a hard task, because I don't know of any way to formally define your "textual aggregation" operation. To start, I see two possibilities. The first is to define it a priori by a set of rules, but that looks like hard-coding and isn't agile. The second is to use AI methods to derive rules from the texts. Or maybe there is a third option...

I'm curious: which one did you choose?
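
The first possibility, a rule set defined a priori, could look roughly like the sketch below. The `RULES` table and the overlap-based scoring are assumptions chosen for illustration, built from the two examples given earlier in the thread; they are not anyone's published method.

```python
# Hypothetical rule set: each label is defined a priori by a set of keywords.
RULES = {
    "network": {"tcp", "address", "ip", "wifi"},
    "data warehousing": {"data set", "olap", "cube"},
}


def aggregate_by_rules(keywords, rules=RULES):
    """Return the label whose keyword set overlaps most with `keywords`.

    Returns None when no rule shares any keyword with the input.
    """
    normalized = {k.lower() for k in keywords}
    best_label, best_overlap = None, 0
    for label, members in rules.items():
        overlap = len(normalized & members)
        if overlap > best_overlap:
            best_label, best_overlap = label, overlap
    return best_label


label = aggregate_by_rules(["TCP", "address", "IP", "wifi"])
```

This also shows the drawback mentioned above: any keyword or topic missing from `RULES` simply cannot be aggregated, so the table has to be maintained by hand.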

It's an easy task if I have the ontology of the domain, because I can select the last common ancestor. If I don't have it, I will propose a new approach that selects the most representative keywords in the corpus.
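
Both cases can be sketched in a few lines. The toy ontology `PARENT` is invented for this example and assumed to be a tree with a single root and a non-empty list of input terms. Ranking by document frequency in `most_representative` is only a simple stand-in for "most representative"; it is not the approach announced above, whose details are not given in this thread.

```python
from collections import Counter

# Hypothetical toy ontology as child -> parent links; "computing" is the root.
PARENT = {
    "tcp": "protocol",
    "ip": "protocol",
    "protocol": "network",
    "wifi": "network",
    "network": "computing",
    "olap": "data warehousing",
    "cube": "data warehousing",
    "data warehousing": "computing",
}


def path_to_root(term, parent=PARENT):
    """Return the list [term, parent, grandparent, ..., root]."""
    path = [term]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def last_common_ancestor(terms, parent=PARENT):
    """Return the deepest concept that is an ancestor of (or equal to) every term."""
    paths = [path_to_root(t, parent) for t in terms]
    common = set(paths[0]).intersection(*paths[1:])
    # Walk upward from the first term; the first shared node is the deepest one.
    for node in paths[0]:
        if node in common:
            return node
    return None


def most_representative(doc_keywords, top_n=3):
    """Rank keywords by the number of documents they occur in (ties: alphabetical)."""
    doc_freq = Counter(k for kws in doc_keywords for k in set(kws))
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [k for k, _ in ranked[:top_n]]


with_ontology = last_common_ancestor(["tcp", "ip", "wifi"])
without_ontology = most_representative(
    [["tcp", "ip"], ["ip", "wifi"], ["ip", "tcp", "cube"]], top_n=2
)
```

With the ontology, the aggregate is a single concept that covers all keywords; without it, the result is a short list of keywords taken from the corpus itself.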