I have an assessment vendor who claims that his AI scoring works almost as well as scoring by trained assessors on test-taker responses in a language assessment.

We have some open-ended questions and recordings of test takers' answers to those questions. Each recording is scored both by trained human evaluators (2-3 evaluators rating each recording independently) and by the machine.
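To make the setup concrete, here is a minimal sketch (Python, with invented data) of one possible way to reduce the 2-3 independent human ratings to a single reference score per recording. The median rule, the recording IDs, and the variable names are my own assumptions for illustration, not the vendor's actual method.

```python
from statistics import median

# Hypothetical example: each recording has 2-3 independent human ratings
# on a 6-point scale, plus one machine score.
human_ratings = {
    "rec_001": [4, 5, 4],
    "rec_002": [2, 3],
    "rec_003": [6, 5, 6],
}
machine_scores = {"rec_001": 4, "rec_002": 4, "rec_003": 5}

# One common way to get a single human reference score per recording
# is the median of the independent ratings (robust to a single outlier).
human_consensus = {rec: median(scores) for rec, scores in human_ratings.items()}
print(human_consensus)  # {'rec_001': 4, 'rec_002': 2.5, 'rec_003': 6}
```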

The vendor says that a 40% mismatch rate between human and machine evaluation is acceptable, 30% is good, 20% is very good, and 10% is better still (mismatch = a difference of more than 1 point on a 6-point scale). For me, 10% is the maximum mismatch we can accept; beyond that it seems problematic.
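To make my concern measurable, here is a hedged sketch of how that mismatch rate can be computed alongside the agreement statistics usually reported in automated-scoring studies (exact agreement, adjacent agreement, quadratic weighted kappa). The scores below are invented, and using scikit-learn's cohen_kappa_score is just one convenient option, not the vendor's procedure.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical consensus human scores and machine scores on a 6-point scale.
human = np.array([4, 2, 6, 3, 5, 1, 4, 2, 5, 3])
machine = np.array([4, 3, 5, 3, 3, 1, 5, 2, 6, 4])

# Mismatch as defined above: a difference of more than 1 point.
mismatch_rate = np.mean(np.abs(human - machine) > 1)

# Exact and adjacent (within 1 point) agreement, the rates most often
# reported in human-machine scoring comparisons.
exact_agreement = np.mean(human == machine)
adjacent_agreement = np.mean(np.abs(human - machine) <= 1)

# Quadratic weighted kappa, a chance-corrected agreement statistic widely
# used to compare automated and human scores on ordinal scales.
qwk = cohen_kappa_score(human, machine, weights="quadratic",
                        labels=[1, 2, 3, 4, 5, 6])

print(f"mismatch rate          : {mismatch_rate:.0%}")
print(f"exact agreement        : {exact_agreement:.0%}")
print(f"adjacent agreement     : {adjacent_agreement:.0%}")
print(f"quadratic weighted kappa: {qwk:.2f}")
```

My point is that the raw mismatch percentage alone does not correct for chance agreement on a 6-point scale, which is why I would also want a chance-corrected statistic like weighted kappa before judging the vendor's thresholds.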

I am looking for research references along these lines. Any help or reference is most welcome and will be highly appreciated.

Thanks in advance.
