How to measure information loss when converting categorical data to numerical?

10 October 2018

Assume that a dataset has a mix of categorical and numerical attributes. The dataset has to undergo numeric processing, which requires converting the categorical attributes to a numeric (quantified) form.

But whatever strategy we employ for this conversion (dummy variables, probability weights, etc.), do we always stand to lose information? And how does one measure this information loss?
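To make the question concrete, here is a minimal sketch of one possible measure, not an established standard for this problem. It assumes the encoding is a deterministic function f of the category X, in which case the Shannon entropy lost is H(X) - H(f(X)), which equals H(X | f(X)) and is zero exactly when no two observed categories share a code. The toy `colors` column, the helper names `entropy` and `information_loss`, and the reading of "probability weights" as frequency encoding are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Empirical Shannon entropy (in bits) of a sequence of hashable values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_loss(original, encoded):
    """H(X) - H(f(X)) for a deterministic encoding f applied row by row.

    For a deterministic f this equals H(X | f(X)): the uncertainty about the
    original category that remains once the code is known.
    """
    return entropy(original) - entropy(encoded)


# Hypothetical categorical column (illustrative data only).
colors = ["red", "green", "blue", "red", "green", "blue", "red", "red"]

# Dummy-variable (one-hot) encoding: each category gets a distinct 0/1 vector.
categories = sorted(set(colors))
one_hot = [tuple(int(c == k) for k in categories) for c in colors]

# Frequency encoding (one reading of "probability weights"): each category is
# replaced by its relative frequency, so equally frequent categories collide.
freq = Counter(colors)
frequency = [freq[c] / len(colors) for c in colors]

loss_one_hot = information_loss(colors, one_hot)
loss_frequency = information_loss(colors, frequency)

print(f"H(original)         = {entropy(colors):.3f} bits")
print(f"loss, one-hot       = {loss_one_hot:.3f} bits")
print(f"loss, frequency enc = {loss_frequency:.3f} bits")
```

In this toy example the one-hot encoding loses nothing, while the frequency encoding loses 0.5 bits because "green" and "blue" receive the same code. Note that this measure only captures whether category identity can be recovered. It says nothing about structure an encoding may add, such as a spurious ordering or distance between categories.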

Any references or resource links would be most helpful!

Thanks in advance

Asked by Siddhartha Bhuyan