How correct are the synthetic data and corresponding labels generated by SMOTE or ADASYN?

Srinivas Talasila

Machine learning models need a lot of training data, but collecting real field data under varied conditions is difficult and sometimes impossible. From this point of view, techniques like SMOTE and ADASYN help to increase the amount of available data. However, a model trained on synthetic data may give good results when tested on images from the same dataset, yet its accuracy may be lower when it is tested in the real field.

Aki Koivu

How well your synthetic data corresponds to real data depends on the dataset you are synthesizing from. Just as in classification, generalizability is determined by how well your data represents the real-world problem. Methods like SMOTE and ADASYN try to learn patterns from your data and generate observations based on them; this systematic generation can be seen as patterns in the synthetic data (see the image, provided by Machine Learning Mastery).
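To make the "systematic generation" concrete, here is a minimal pure-Python sketch of the interpolation step that SMOTE is built on. It is not the imbalanced-learn implementation; the function name `smote_sketch`, the toy points, and the parameter defaults are assumptions for illustration only. Each synthetic point lies on the line segment between a minority sample and one of its nearest minority neighbours, and it is simply given the minority label, so the label is only as correct as the assumption that this segment stays inside the minority region.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration; every returned point is assumed
    to carry the minority label.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class (made-up values)
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=5)
```

Because the new points never leave the segments between existing minority samples, they show up as lines or clusters in a scatter plot, and any noise or mislabelled minority sample in the original data is carried over into the synthetic data.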

How you are going to use the synthetic data is also important. The right way is to add it to the training data only and then test with real data only, so that the synthetic data does not affect the performance evaluation.
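A minimal sketch of that workflow, in pure Python: split the real data first, then oversample only the training part. The helper names `stratified_split` and `oversample_minority`, the toy data, and the 25% test fraction are assumptions for illustration. The oversampling here interpolates between random pairs of minority samples as a simplified stand-in for SMOTE, not the actual algorithm.

```python
import random


def stratified_split(samples, labels, test_fraction, rng):
    """Split real data per class so both sets contain every class."""
    train, test = [], []
    for label in sorted(set(labels)):
        group = [s for s, y in zip(samples, labels) if y == label]
        rng.shuffle(group)
        n_test = max(1, round(len(group) * test_fraction))
        test += [(s, label) for s in group[:n_test]]
        train += [(s, label) for s in group[n_test:]]
    return train, test


def oversample_minority(train, minority_label, rng):
    """Add interpolated minority points until the two classes are balanced.

    Simplified stand-in for SMOTE: needs at least two minority samples
    in the training set whenever extra points are required.
    """
    minority = [s for s, y in train if y == minority_label]
    n_needed = (len(train) - len(minority)) - len(minority)
    synthetic = []
    for _ in range(max(0, n_needed)):
        a, b = rng.sample(minority, 2)
        gap = rng.random()
        point = tuple(p + gap * (q - p) for p, q in zip(a, b))
        synthetic.append((point, minority_label))
    return train + synthetic


# Toy imbalanced dataset (made-up values): 12 majority, 4 minority samples
samples = [(float(i), float(i % 3)) for i in range(12)]
samples += [(10.0 + 0.1 * i, 5.0) for i in range(4)]
labels = [0] * 12 + [1] * 4

rng = random.Random(0)
train_real, test_set = stratified_split(samples, labels, 0.25, rng)
train_balanced = oversample_minority(train_real, minority_label=1, rng=rng)
# Fit the model on train_balanced; report metrics on test_set (real data only).
```

Oversampling before the split would let synthetic points derived from test samples leak into training, which tends to make the reported performance look better than it is.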

Toyosi Ademujimi

There are several synthetic data generation methods, and each method has its own underlying assumptions. If the assumptions are correct, the synthetic data will be representative of the real system. The quality of the synthetic data therefore depends on the assumptions of the generating method. However, there is no straightforward way to evaluate these assumptions; domain knowledge and experience may help.