How is data quality being tested?

Chuck A Arize

Data quality testing is the practice of making assertions about your data and then testing whether those assertions hold. The same approach can be used both to test the quality of your raw source data and to validate that the code in your data transformations works as intended. According to Gartner, bad data costs organizations an estimated $12.9 million per year on average.
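As a rough illustration of "making assertions about your data", here is a minimal sketch in plain Python. The sample rows, the column names (`order_id`, `customer_id`, `amount`), and the three helper functions are hypothetical and chosen only for this example; real projects usually express such checks in a testing framework or in SQL.

```python
# Hypothetical sample rows standing in for a raw source table.
rows = [
    {"order_id": 1, "customer_id": "C1", "amount": 25.0},
    {"order_id": 2, "customer_id": None, "amount": 40.0},
    {"order_id": 2, "customer_id": "C3", "amount": -5.0},
]

def check_not_null(rows, column):
    """Return the indices of rows where `column` is missing."""
    return [i for i, r in enumerate(rows) if r.get(column) is None]

def check_unique(rows, column):
    """Return the values of `column` that appear more than once."""
    seen, dupes = set(), set()
    for r in rows:
        value = r.get(column)
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return sorted(dupes)

def check_min(rows, column, minimum):
    """Return the indices of rows whose `column` value is below `minimum`."""
    return [
        i for i, r in enumerate(rows)
        if r.get(column) is not None and r[column] < minimum
    ]

# Each assertion maps to the list of failures it found (empty means it holds).
results = {
    "customer_id is not null": check_not_null(rows, "customer_id"),
    "order_id is unique": check_unique(rows, "order_id"),
    "amount >= 0": check_min(rows, "amount", 0),
}

for assertion, failures in results.items():
    status = "PASS" if not failures else f"FAIL {failures}"
    print(f"{assertion}: {status}")
```

Each check returns the offending rows or values rather than a bare True/False, so a failed assertion also tells you where to look.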

Sundus F Hantoosh

Dear doctor,

I quote the following from the web, hoping it is helpful:

"Traditionally, data engineers write data quality rules using SQL. This manual method works well when there are dozens, or even hundreds of tables, but not when there are ten thousand or more. In a modern, data-driven organization, data engineers can never keep up with demand for data quality scripts.

New data quality automation (DQA) tools replace manual methods with ML models. You can view this product segment as a subset of data observability, which addresses both data quality and data pipeline performance. There are three different approaches to DQA: automated checks, automated rules, and automated monitoring. Each approach has its pros and cons, but collectively they represent the future of data quality management.

The following three vendors embody these approaches, respectively.

Ataccama employs the automated checks approach, which uses ML to classify incoming data at the row and column level and automatically apply data quality rules written by data engineers.
First Eigen uses the automated rules approach, which uses ML to generate data quality rules, which consist of standard quality checks (nulls, duplicates, etc.) and complex correlations between data columns and values.
BigEye uses the automated monitoring approach, which uses ML to detect anomalies in the data rather than apply rules. The tool monitors changes to tables at fixed intervals, triggering alerts if it detects an uncharacteristic shift in the profile of the data."
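To make the "automated monitoring" idea in the quote more concrete, here is a small sketch that profiles a table snapshot and flags an uncharacteristic shift in one metric. It is not any vendor's method: a simple z-score test stands in for the ML models the quote mentions, and the function names, the threshold of 3 standard deviations, and the daily row counts are all assumptions made for illustration.

```python
from statistics import mean, stdev

def profile(rows, column):
    """Summarize one snapshot of a table: row count and null rate of `column`."""
    nulls = sum(1 for r in rows if r.get(column) is None)
    null_rate = nulls / len(rows) if rows else 0.0
    return {"row_count": len(rows), "null_rate": null_rate}

def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it lies more than `threshold` sample standard
    deviations away from the mean of the earlier values in `history`."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

# Hypothetical row counts collected at fixed intervals (for example, daily).
history = [1000, 1020, 990, 1010, 1005]

print(is_anomalous(history, 1015))  # close to the usual level
print(is_anomalous(history, 400))   # a sharp drop in volume
```

The same test can be applied to any profile metric, such as the `null_rate` returned by `profile`, so a monitor can raise an alert without anyone writing a rule per table.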

Dr. Sundus Fadhil Hantoosh

Mahender Singh

A very good question. In this era, every field generates, stores, and uses data for future planning. The most important point is the quality of that data: if the quality is poor, the results will be poor. Data in a health management system should be of high quality for present and future planning but, unfortunately, its quality often remains poor because data managers pay minimal attention to it. Data managers should therefore be well qualified, trained, and adept at handling data, and should keep a continuous watch on it, i.e., check that the data arrives on time, correct, complete, and regularly, and give feedback to those who generated it. This calls for continuous monitoring, comparison with previous information, cross-checking, and on-the-spot evaluation of data where it is generated. Maintaining data quality is really a huge task, and more focus is necessary.
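The checks described above (timely, complete, and consistent with previous information) could be sketched roughly as follows. The facility reports, field names, and the 50% change threshold are hypothetical and only illustrate the idea; a real health information system would define its own rules.

```python
from datetime import date

# Hypothetical monthly reports submitted by health facilities.
reports = [
    {"facility": "A", "due": date(2024, 7, 5), "received": date(2024, 7, 3),
     "cases": 120, "previous_cases": 110},
    {"facility": "B", "due": date(2024, 7, 5), "received": date(2024, 7, 9),
     "cases": 95, "previous_cases": 100},
    {"facility": "C", "due": date(2024, 7, 5), "received": date(2024, 7, 4),
     "cases": None, "previous_cases": 80},
    {"facility": "D", "due": date(2024, 7, 5), "received": date(2024, 7, 2),
     "cases": 400, "previous_cases": 90},
]

def review(report, max_change=0.5):
    """Return feedback messages for one report (an empty list means no issues)."""
    issues = []
    # Timeliness: did the report arrive by its due date?
    if report["received"] > report["due"]:
        issues.append("late")
    # Completeness: is the key value present?
    if report["cases"] is None:
        issues.append("incomplete")
    # Consistency: compare with the previous period's value.
    elif report["previous_cases"]:
        change = abs(report["cases"] - report["previous_cases"]) / report["previous_cases"]
        if change > max_change:
            issues.append("large change from previous period")
    return issues

# Feedback to send back to whoever generated each report.
feedback = {r["facility"]: review(r) for r in reports}
for facility, issues in feedback.items():
    print(facility, issues or "ok")
```

A large change is not necessarily an error; it marks the report for the cross-check or on-the-spot evaluation mentioned above.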