According to Pym (1992), translation errors can be divided into two types, binary and non-binary, and both can be used to evaluate translations. A binary error is a rendering that is simply wrong, where the choice is between a right and a wrong answer, such as a spelling, grammar, word choice, or syntax error. A non-binary error is a rendering that is not outright wrong but is less appropriate than an available alternative and could be improved. Errors of style, register, tone, or collocation are typical non-binary errors.
To classify the problems of machine translation according to this model, one possible approach is to compare the machine translation output with a reference human translation and identify any differences that affect the quality of the translation. For each difference, one can decide whether it is a binary or a non-binary error based on the following criteria:
- A binary error is a difference that changes the meaning of the source text or makes the target text ungrammatical or incomprehensible.
- A non-binary error is a difference that does not change the meaning of the source text or make the target text ungrammatical or incomprehensible, but may affect the naturalness, fluency, or stylistic appropriateness of the target text.
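The two criteria can be written down as a small decision rule. The sketch below is only an illustration: the `Difference` record and its three boolean fields are hypothetical names, and the judgments they hold are assumed to come from a human annotator, not from the code.

```python
from dataclasses import dataclass


@dataclass
class Difference:
    """One difference between the MT output and the reference translation.

    The boolean fields are hypothetical annotator judgments, not computed values.
    """
    mt_segment: str
    ref_segment: str
    changes_meaning: bool
    ungrammatical: bool
    incomprehensible: bool


def classify(diff: Difference) -> str:
    """Apply the two criteria: any of the three flags makes the error binary."""
    if diff.changes_meaning or diff.ungrammatical or diff.incomprehensible:
        return "binary"
    # Meaning and grammar are intact, so only naturalness or style is affected.
    return "non-binary"
```

The rule is deliberately strict: a single flag is enough to make a difference binary, which mirrors the "or" in the first criterion.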
For a simplified illustration in which all three sentences are given in English, suppose the source text is:
- I like this book very much.
And the machine translation output is:
- I like this book a lot.
And the reference human translation is:
- I really like this book.
The difference between "a lot" and "really" is a non-binary error, because it does not change the meaning or grammaticality of the sentence, but it may affect the style or emphasis of the expression.
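There is no binary error in this example, because both the output and the reference keep "this book". If the machine translation output had instead read "I like the book a lot", the difference between "this" and "the" would be a binary error, because it drops the demonstrative reference, changes the meaning of the sentence, and makes it inconsistent with the source text.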
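The differences between an output and a reference can be located automatically before they are judged. The following is a minimal sketch using Python's standard `difflib` module. The `tokenize` and `word_differences` helpers are hypothetical names, and a word-level diff is only one possible way to align the two sentences.

```python
import re
from difflib import SequenceMatcher


def tokenize(text: str) -> list[str]:
    """Split into words and punctuation marks (a deliberately simple tokenizer)."""
    return re.findall(r"\w+|[^\w\s]", text)


def word_differences(mt_output: str, reference: str) -> list[tuple[str, str, str]]:
    """Return (operation, mt_segment, ref_segment) for every non-matching span."""
    mt_tokens = tokenize(mt_output)
    ref_tokens = tokenize(reference)
    matcher = SequenceMatcher(a=mt_tokens, b=ref_tokens, autojunk=False)
    differences = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            differences.append(
                (op, " ".join(mt_tokens[i1:i2]), " ".join(ref_tokens[j1:j2]))
            )
    return differences


diffs = word_differences("I like this book a lot.", "I really like this book.")
for op, mt_segment, ref_segment in diffs:
    print(op, repr(mt_segment), repr(ref_segment))
```

For the two sentences above, the diff reports "really" as present only in the reference and "a lot" as present only in the output. Deciding whether each reported span is a binary or a non-binary error still requires a human judgment.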
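This approach can be applied to any machine translation output for which a reference human translation is available. Counting the binary and non-binary errors that each system produces helps to identify the strengths and weaknesses of different machine translation systems or methods. It also helps to prioritize which errors to correct or avoid in future development: binary errors usually come first, since they distort meaning or break grammar, while non-binary errors affect how polished the output is.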
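One way to compare systems is to tally the labels assigned to each system's output. The sketch below assumes the labels have already been produced by annotators. The `error_profile` function and the sample labels are hypothetical and serve only to show the shape of the computation.

```python
from collections import Counter


def error_profile(labels: list[str]) -> dict[str, float]:
    """Return the share of binary and non-binary errors in a list of labels."""
    counts = Counter(labels)
    total = counts["binary"] + counts["non-binary"]
    if total == 0:
        return {"binary": 0.0, "non-binary": 0.0}
    return {
        "binary": counts["binary"] / total,
        "non-binary": counts["non-binary"] / total,
    }


# Made-up labels for an imaginary system, for illustration only.
sample_labels = ["binary", "non-binary", "non-binary", "non-binary"]
profile = error_profile(sample_labels)
print(profile)
```

A system with a high share of binary errors would need work on basic accuracy first, while one dominated by non-binary errors is mostly a matter of style and fluency.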