Different tasks require different data quality measures. However, many data scientists and researchers agree on a few dimensions of high-quality datasets that they consider for big data projects. First and foremost, the dataset itself matters. The balance and variety of data points within it indicate how well the algorithm will be able to generalize to similar points and patterns. As an example, consider an autonomous vehicle training dataset intended to train an AI to differentiate between moving and motionless vehicles. If it contains 90% images of moving cars but only 10% of parked ones, it is considered imbalanced, which naturally leads to a high chance of error. To address this issue, techniques such as oversampling, downsampling or weight balancing are used.
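As a minimal sketch of one of these techniques, here is random oversampling in plain NumPy: minority-class samples are duplicated (sampled with replacement) until every class matches the largest one. The 90/10 "moving vs. parked" split and all variable names are illustrative assumptions, not from a specific library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical imbalanced dataset: 90 "moving" (label 1) vs 10 "parked" (label 0).
X = rng.normal(size=(100, 4))
y = np.array([1] * 90 + [0] * 10)

def random_oversample(X, y, rng):
    """Duplicate minority-class samples until all classes are balanced."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [], []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Sample with replacement only for classes smaller than the target.
        picked = rng.choice(idx, size=target, replace=count < target)
        X_parts.append(X[picked])
        y_parts.append(y[picked])
    return np.concatenate(X_parts), np.concatenate(y_parts)

X_bal, y_bal = random_oversample(X, y, rng)
print(np.bincount(y_bal))  # both classes now have 90 samples
```

In practice a library such as imbalanced-learn offers more refined variants (e.g. SMOTE), but the idea is the same: equalize class counts before training.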
1) Is there a criterion for minimum class label participation? If so, how can it be satisfied?
With machine learning in general it is best to have balanced classes in your dataset. As Shafagat Mahmudova mentioned, you can use techniques such as weight balancing, downsampling and oversampling. In your case, with a 90%+ dominant class, I would recommend a mixture of all three techniques for the best performance: downsample your largest class, oversample your smallest class (but not too much, as this leads to overfitting on those classes), and then apply weight balancing.
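For the weight-balancing part, a common heuristic is to weight each class inversely to its frequency. The sketch below uses the same formula as scikit-learn's `class_weight='balanced'` (`n_samples / (n_classes * count_per_class)`); the 90/10 label vector is an assumed example.

```python
import numpy as np

def balanced_class_weights(y):
    """Per-class weights inversely proportional to class frequency:
    n_samples / (n_classes * count_per_class)."""
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

# 90 majority samples (label 1) vs 10 minority samples (label 0).
y = np.array([1] * 90 + [0] * 10)
print(balanced_class_weights(y))  # minority class 0 gets weight 5.0
```

Most frameworks accept such a mapping directly, e.g. as `class_weight` in scikit-learn estimators or per-class loss weights in deep learning libraries.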
2) What algorithms and benchmarks exist for validating labelling quality?
My research has been about using tiny datasets for deep learning, so my first question is: how big is your dataset? In my research I have found an inverse correlation between the number of images you have per class and the accuracy of ground-truth labels needed.
If you have fewer than 500 images I would recommend very accurate annotations, and with fewer than 100 images, near pixel-perfect annotations. Above 1,000 images the annotations can fit more loosely around your objects, because the models can learn the boundaries themselves.
I want to let you know how grateful I am to Shafagat Mahmudova for those oversampling, downsampling and weight-balancing methods, and to Thomas Smith for recommending that annotation accuracy be adapted to the size of the dataset. I would also like to thank Abdelhameed Ibrahim and Aravinda C V for the helpful articles.