I am tackling an industrial research problem in which massive-scale data, mostly arriving as a stream, has to be processed for outlier detection. The catch is that although some labels for the outliers of interest exist in the data, they are not reliable, so we have to discard them.

My approach to the problem revolves mainly around unsupervised techniques, whereas my employer insists on a trainable supervised technique, which would require an outlier label for every individual data point. In other words, he does not trust unsupervised techniques.
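To make the unsupervised direction concrete, this is roughly what I have in mind for the streaming case: score each point as it arrives with an online detector and keep only the score, without ever producing a hard label. The sketch below uses River's HalfSpaceTrees purely as an illustration; the library choice, feature names, parameter values, and the assumption that features are pre-scaled to [0, 1] are all my own placeholders, not a fixed part of the setup.

# Rough sketch only: streaming outlier scoring with an online detector.
# River / HalfSpaceTrees is an assumed tooling choice for illustration.
from river import anomaly

detector = anomaly.HalfSpaceTrees(
    n_trees=25,       # assumed values, would need tuning on the real stream
    height=8,
    window_size=250,
    seed=42,
)

def stream():
    # Placeholder for the real data source; yields one feature dict per record.
    # HalfSpaceTrees expects bounded features, so values are assumed scaled to [0, 1].
    yield {"temperature": 0.42, "pressure": 0.51, "flow_rate": 0.37}

for x in stream():
    score = detector.score_one(x)   # higher score = more anomalous
    detector.learn_one(x)           # update the model after scoring (prequential)
    # Downstream we would only use the score; no hard outlier label is produced here.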

My question is therefore: is there any established, valid approach to generating outlier labels, at least to some meaningful extent, especially for massive-scale data? I have done some research on this and also have experience in outlier/anomaly detection; nevertheless, it would be an honor to learn from other scholars here.
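To show the kind of label-generation compromise I am currently imagining (and asking whether anything more principled exists), here is a rough sketch: score the data with an unsupervised detector, turn the most extreme scores into pseudo-labels, and train the supervised model my employer wants on those pseudo-labels. Everything here (IsolationForest, the 1% contamination threshold, the gradient-boosting classifier, the random placeholder data) is an assumption for illustration, not an established recipe.

# Rough sketch: unsupervised scores -> pseudo-labels -> supervised model.
# All model and threshold choices below are assumptions for illustration only.
import numpy as np
from sklearn.ensemble import IsolationForest, HistGradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 20))          # placeholder for a large feature matrix

# 1) Unsupervised scoring (fit on a manageable subsample for massive data).
iso = IsolationForest(n_estimators=100, random_state=0)
iso.fit(X[rng.choice(len(X), 20_000, replace=False)])
scores = -iso.score_samples(X)              # higher = more anomalous

# 2) Pseudo-labels: flag the top 1% most anomalous points (assumed outlier rate).
threshold = np.quantile(scores, 0.99)
pseudo_y = (scores >= threshold).astype(int)

# 3) Train the supervised model my employer asks for on the pseudo-labels.
clf = HistGradientBoostingClassifier()
clf.fit(X, pseudo_y)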

Much appreciated
