Relationship between Holistic and Atomistic Quality Issues
Let me start by restating several points that might otherwise cause a lot of confusion.
- As human beings, we primarily react to the holistic quality of the text as a whole rather than to the quality of its constituent units, such as sentences, strings, etc. (This holds as long as the concept of the “text as a whole” is applicable, i.e. we are not dealing with standalone, unrelated segments of text.)
Our minds very efficiently reconstruct missing, vague or slightly incorrect details, whether they stem from an imperfect translation, a poorly written source, or both. That is why, as argued in Part I of the article, any quality model needs to start with holistic quality factors. This, of course, only applies to areas where holistic factors are relevant and does not apply to extreme examples, like instruction manuals, where each particular instruction needs to be both clear and adequate, while holistic quality is simply not on the radar.
- The number of holistic quality factors is limited.
- Two primary ones are readability and adequacy (formally defined in the next section) that apply universally to all cases where holistic quality is relevant.
- Others, like suitability for a certain target audience or adherence to a certain tone/voice, are less universal and need to be considered only for specific areas or applications, for instance, computer games, fiction, advertising, etc. The overall number in each case is still relatively modest. Holistic quality extensions are discussed in more detail in my next paper.
- The overall number of potential generic atomistic quality issues is already relatively big, and each special case or new subject area can result in even more custom issue types. That is why having a proper, hierarchical issue classification and a good catalogue of atomistic quality issues is so important.
Working with a “flat” list of issues containing hundreds of unrelated entries would be a nightmare. To make the whole thing practical and flexible we need to use a hierarchical structure with at least two levels, i.e. several major categories, each comprising one or more subcategories.
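The two-level structure described above can be sketched as a simple mapping from major categories to subcategories. This is only a minimal illustration: the category and subtype names are borrowed from examples mentioned elsewhere in this paper, while the helper names `find_category` and `count_by_category` are hypothetical.

```python
from collections import Counter

# Two-level catalogue: major category -> list of subcategories.
# The entries are illustrative examples only, not a complete catalogue.
CATALOGUE = {
    "Accuracy": ["Mistranslation", "Omission", "Addition", "Untranslated"],
    "Fluency": ["Spelling", "Grammar"],
}


def find_category(catalogue, subtype):
    """Return the major category that contains the subtype, or None."""
    for category, subtypes in catalogue.items():
        if subtype in subtypes:
            return category
    return None


def count_by_category(catalogue, logged_subtypes):
    """Roll a flat list of logged subtypes up to the major-category level."""
    counts = Counter()
    for subtype in logged_subtypes:
        category = find_category(catalogue, subtype)
        if category is not None:
            counts[category] += 1
    return counts


logged = ["Omission", "Spelling", "Omission", "Grammar"]
summary = count_by_category(CATALOGUE, logged)
```

With a hierarchy like this, a reviewer can log issues at the subtype level and still report them at the level of a few major categories.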
- As expected, publicly available quality frameworks and models use slightly (or completely) different terminology, which makes complete “uniformity for all” impossible. The ramifications are as follows:
- Each quality issue or factor can appear under different names depending on the chosen framework. For instance, both adequacy and accuracy may essentially refer to the same issue type.
- Similarity of issue type names across frameworks does not completely guarantee that one and the same thing is implied in all cases. For instance, different authors/frameworks might provide dissimilar definitions for fluency.
As a result, to find any parallels or distinctions between quality frameworks or models one has to go not only through issue names, but also through the whole catalogue of their definitions.
- Most, if not all, publicly available quality metrics and models only deal with atomistic quality, i.e. only with quality measured at the level of particular text units like sentences, paragraphs, strings, etc. They do not suggest an approach to measure holistic quality and/or combine holistic and atomistic quality.
This is one of the most important factors differentiating the hybrid holistic/atomistic quality approach I am suggesting and advocating from most others. The models and approaches mentioned above either do not cover holistic factors at all or pay insufficient attention to the limitations imposed by, and the special treatment required for, partially subjective holistic factors.
- Most references to readability or adequacy (or their equivalents) in other approaches also relate to atomistic quality factors, not holistic ones.
This means that we can easily combine the holistic criteria I proposed and atomistic issue catalogues as defined within each framework without confusion, even if terminology is not completely consistent. It is sufficient to remember that holistic and atomistic criteria apply at different levels, and use appropriate definitions in each case.
One set of definitions would apply to holistic factors (see my definitions for primary holistic factors, i.e. readability and adequacy, in the next section), and another one would come from the chosen atomistic quality framework and apply to atomistic quality issues only.
There is no inconsistency or contradiction here, even if we end up measuring the same thing twice: first on the holistic, then on the atomistic level. For example, a translated text can contain a limited number of sentences that do not convey the meaning/message of the original adequately, but this does not necessarily mean the translated text as a whole is not adequate. Unless we are facing an extreme case with a very large share of bad units, our minds will often automatically “repair” the text and reconstruct its meaning correctly, especially given that most real-life, contiguous texts have a certain level of redundancy.
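As a rough sketch of how the two levels can coexist in one evaluation record, the holistic scores and the atomistic issues can simply be stored side by side. The 0-10 scale, the class names and the field names below are assumptions made for this example; the paper itself does not prescribe them.

```python
from dataclasses import dataclass, field


@dataclass
class UnitIssue:
    unit_id: int      # index of the sentence or string within the text
    issue_type: str   # issue name taken from the chosen atomistic framework


@dataclass
class QualityReport:
    # Holistic scores for the text as a whole (assumed scale: 0 to 10).
    holistic_readability: int
    holistic_adequacy: int
    total_units: int
    issues: list = field(default_factory=list)

    def share_of_units_with(self, issue_type):
        """Fraction of units that have at least one issue of the given type."""
        affected = {i.unit_id for i in self.issues if i.issue_type == issue_type}
        return len(affected) / self.total_units


# A text of 200 units where two units have accuracy issues can still
# receive a high holistic adequacy score.
report = QualityReport(
    holistic_readability=9,
    holistic_adequacy=9,
    total_units=200,
    issues=[UnitIssue(3, "Accuracy"), UnitIssue(17, "Accuracy"), UnitIssue(17, "Accuracy")],
)
share = report.share_of_units_with("Accuracy")
```

The two measurements answer different questions, so keeping both in the same record does not create a contradiction.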
Three potential scenarios are as follows:
- Ideal scenario (1): Terminology used in this paper and coming from the chosen atomistic quality framework is close enough.
In this case, we will have no problems at all using the same term across the board. For instance, holistic adequacy would indicate how well the overall meaning and message of the source text is conveyed through translation, while atomistic adequacy would describe the same thing for each particular sentence or string.
- Almost ideal scenario (2): Terminology that appears in this paper and that of the chosen atomistic quality framework is essentially dissimilar.
Different terms will have to be used for similar holistic and atomistic factors. This would somewhat complicate terminology and create redundancies, but would not result in any inconsistencies or misunderstanding. For instance, adequacy on the holistic level would be more or less equivalent to accuracy on the atomistic level.
- Worst-case scenario (3): Similar terms have different definitions on the holistic and atomistic levels. For instance, adequacy is part of the atomistic quality framework, but its definition seriously deviates from the one given below for holistic adequacy.
This is not a hopeless case either, as long as we are careful to distinguish between the holistic and atomistic definitions, although doing so would definitely pose a challenge. Luckily, this scenario is rare.
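The three scenarios can be summarized as a small decision rule. In this sketch, whether two definitions describe the same thing is a human judgment passed in as a flag; the names `Scenario` and `classify_scenario` are hypothetical and introduced only for illustration.

```python
from enum import Enum


class Scenario(Enum):
    IDEAL = 1          # same term, same meaning
    ALMOST_IDEAL = 2   # different terms, same meaning
    WORST_CASE = 3     # same term, different meaning


def classify_scenario(holistic_term, atomistic_term, same_meaning):
    """Classify a pair of terms into one of the three scenarios.

    same_meaning is decided by a person who has compared both definitions.
    Returns None when both the names and the meanings differ, i.e. the two
    terms are simply unrelated and no scenario applies.
    """
    same_name = holistic_term.strip().lower() == atomistic_term.strip().lower()
    if same_name and same_meaning:
        return Scenario.IDEAL
    if not same_name and same_meaning:
        return Scenario.ALMOST_IDEAL
    if same_name and not same_meaning:
        return Scenario.WORST_CASE
    return None


# Holistic "adequacy" versus atomistic "accuracy" with matching definitions.
result = classify_scenario("adequacy", "accuracy", same_meaning=True)
```

The rule makes explicit that only scenario (3) requires extra care: in the other two, the definitions agree and only the labels may differ.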
Defining Readability and Adequacy
Without pretending to present something that is carved in stone or perfect, I would like to define both fundamental notions of readability and adequacy based on their intuitive meaning, and keep these definitions as concise and simple as possible. We will need them to build bridges between LQA methodology and quality frameworks.
Readability of translation defines how easy it is to read and understand the target text (sentence, string, etc.). In other words, readability measures how clear, unambiguous and well-written the target text is. Zero readability means that the text is unreadable or incomprehensible, i.e. it represents a senseless sequence of words. Perfect readability means that you can easily read the text without stumbling over words or phrases, the meaning of the text is absolutely clear and unambiguous, and no additional pondering is required to get to this meaning.
Readability goes far beyond translation quality as such, and formally applies to any text, including monolingual text in any language.
Important! The mere fact that you can easily read and comprehend the text does not mean that its meaning is correct. That is why we also need to measure adequacy.
Adequacy of translation defines how closely the translation follows the meaning of and the message conveyed by the source text (sentence, string, etc.). In other words, adequacy measures whether the translation process resulted in any discrepancies between source and target texts (except for intended ones), including plain translation mistakes, omissions or additions. Zero adequacy means that after translation the meaning of the source text was distorted beyond recognition. Perfect adequacy requires that both the meaning of and message conveyed by the source text were preserved in their entirety, without any deviations.
Adequacy only applies to a combination of both source and target texts (units, strings, sentences, etc.). You will need a bilingual resource to assess it. Analyzing the translation alone would reveal a certain number of potential issues, but they cannot be verified without access to the source, and such an analysis would completely miss cases where the translated text reads smoothly but is incorrect.
Important! It is worth noting that neither of the fundamental quality concepts described above comprises sub-concepts. Each is a standalone, “elementary” semi-objective quality factor.
Drawing Parallels between Holistic Quality Issues and the MQM Quality Framework
Below I attempt to illustrate some analogies and differences in terminology and concepts worth noting when building a bridge between the quality approach suggested in this paper and the MQM framework, which is my favorite among current and developing quality frameworks. (Regrettably, there is no practical way of providing detailed information for all existing quality issue frameworks.)
Overall, MQM presents a commendable catalogue of well-defined atomistic-level translation issues, covering a wide range of text imperfections from typos and grammar to formatting to locale-specific content and so on, and I would strongly recommend using its ground-level issue categories over any other publicly available issue catalogue. At the same time, there is a need to analyze and explain parallels between readability and adequacy as defined in this paper and high-level MQM issue categories in more detail to avoid confusion.
Within the MQM framework, the concept closest to readability is called fluency, and the one closest to adequacy is called accuracy. Both are high-level categories comprising multiple subtypes, meaning that here we have scenario (2), classified earlier as almost ideal.
Below is a brief analysis of the similarities, differences and finer details related to the two term pairs in question: adequacy/accuracy and readability/fluency.
Adequacy vs. Accuracy
The definition of accuracy in MQM is essentially very close to the general definition of adequacy provided in this paper, except for the fact that accuracy can be further divided into subtypes, including the familiar cases of mistranslation, omission, addition, and untranslated units. This finer division is relevant when dealing with a complete catalogue of atomistic quality issues. The differences between the approaches are minor and are listed below:
- A couple of accuracy subtypes seem to be missing from the MQM framework, and these are “under-translation” and its antonym “over-translation”. The first subtype applies in cases when the unit was partially translated, and some parts of it were erroneously left without translation. This happens, for example, when the translator misses the word “and” located between two huge groups of tags, but translates everything else within the string. Theoretically, under-translated units can be treated as untranslated ones, even though it is not completely correct. Over-translation means the opposite, i.e. the unit contains parts that needed to be left untranslated for a certain reason (copyright, left in the source language universally, etc.), but were translated instead.
- In my humble opinion, terminology errors should belong to the same high-level category as style guide errors and violations of instructions provided by the client. Placing them under “mistranslation” is not optimal. Also, while some terminology errors might result in distorted meaning (i.e. reduced adequacy), others, like using a completely recognizable but obsolete term (like “directory” instead of “folder”), would look clumsy but have no effect on translation adequacy.
Readability vs. Fluency
While the term “fluency” from MQM sounds a lot like “readability”, its real meaning is significantly broader. Within MQM, fluency covers not only readability-related factors like style or intelligibility problems, but also a lot of purely technical things like spelling, grammar, locale violations, etc.
I suspect that the subtypes under fluency include too many unrelated issues, but that is the decision of the MQM creators. It is simply important to remember that readability in this paper is a much more focused and narrower category, only loosely related to fluency from the MQM framework.
Removing Verity from LQA Metrics
As explained in Part III, I feel strongly that verity is an important and relevant issue type when we are looking at translation quality in general. At the same time, it needs to be addressed as part of the market compliance audit (MCA) and outside of the LQA scope, because the majority of verity violations are caused by process- or project-level problems on the client end rather than by errors made during translation. Violations that do originate with the vendor are already covered by other issue types: if, for instance, the translation vendor has forgotten to substitute standard source copyright text with its trans-created version provided by the client, this automatically falls under the scope of not following special client instructions.
Hence, if a quality metric based on the MQM framework is designed for LQAs, I would strongly recommend avoiding any issue types or subtypes belonging to the Verity branch. On the other hand, if you are designing an MCA process, you should pay a lot of attention to Verity and its issue subtypes.
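In practice, this recommendation amounts to splitting the issue types of a metric by their top-level branch. The sketch below assumes that issue types are written as "Branch/Subtype" paths; that notation, the placeholder subtype under Verity and the function name `split_by_audit` are assumptions for illustration and have not been checked against the MQM issues list.

```python
# Illustrative issue paths in "Branch/Subtype" form (assumed notation).
METRIC_ISSUE_TYPES = [
    "Accuracy/Mistranslation",
    "Accuracy/Omission",
    "Fluency/Spelling",
    "Fluency/Grammar",
    "Verity/Placeholder subtype",  # stands in for any Verity subtype
]


def split_by_audit(issue_types, mca_branch="Verity"):
    """Split issue types into (LQA list, MCA list) by top-level branch."""
    lqa, mca = [], []
    for path in issue_types:
        branch = path.split("/", 1)[0]
        if branch == mca_branch:
            mca.append(path)
        else:
            lqa.append(path)
    return lqa, mca


lqa_types, mca_types = split_by_audit(METRIC_ISSUE_TYPES)
```

The LQA metric would then be built from `lqa_types` only, while `mca_types` would feed the market compliance audit.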
In an Ideal World…
My ideal picture of high-level atomistic issue categories somewhat differs from the one presented by the MQM framework. I would prefer to divide issues by nature and origin rather than placing most real-life issues under Fluency (including ones that hardly belong there). So instead of the Accuracy – Fluency pair I would rather use the following trio: Language errors – Technical errors – Client-specific issues. As already explained, Verity is a separate high-level issue type that needs to be covered by market compliance audit, not LQA, and I would leave it as it is described within MQM.
Within this triad, language errors cover issues that require knowledge of the language and can only be revealed by a native speaker, while technical errors cover issues that can be revealed without knowing the language, like locale-related issues, broken tags, etc. This distinction (based on whether language knowledge is required) makes a lot of sense, because technical errors (unlike language ones) can be addressed in a centralized manner by non-native speakers who are well aware of locale specifics. Technical errors are also the primary target of automated checks. My suggested high-level issue organization would look as follows (please look for subtype definitions in the MQM issues list):
- Language errors
  - Addition (new subtype)
  - Under-translation (new subtype)
  - Over-translation (new subtype)
  - Readability (new subtype replacing Stylistics and Unintelligible that actually reflect one and the same thing)
- Technical errors
  - Locale convention
  - Markup (including tags, placeholders, etc.)
- Client-specific issues
  - Style guide
  - Project-specific instructions
  - Market compliance instructions (new subtype)
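The organization above, together with the distinction based on whether language knowledge is required, can be sketched as a small data structure. The field name `needs_native_speaker` and both helper functions are hypothetical. For client-specific issues the flag is left as `None`, because this paper does not state who should review them.

```python
PROPOSED_TAXONOMY = {
    "Language errors": {
        "needs_native_speaker": True,
        "subtypes": ["Addition", "Under-translation", "Over-translation", "Readability"],
    },
    "Technical errors": {
        "needs_native_speaker": False,
        "subtypes": ["Locale convention", "Markup"],
    },
    "Client-specific issues": {
        "needs_native_speaker": None,  # not specified in the text
        "subtypes": [
            "Style guide",
            "Project-specific instructions",
            "Market compliance instructions",
        ],
    },
}


def category_of(subtype):
    """Return the high-level category of a subtype, or None if unknown."""
    for category, info in PROPOSED_TAXONOMY.items():
        if subtype in info["subtypes"]:
            return category
    return None


def can_be_checked_centrally(subtype):
    """True only for subtypes that explicitly need no language knowledge."""
    category = category_of(subtype)
    if category is None:
        raise ValueError(f"Unknown subtype: {subtype}")
    return PROPOSED_TAXONOMY[category]["needs_native_speaker"] is False


central = [
    s
    for info in PROPOSED_TAXONOMY.values()
    for s in info["subtypes"]
    if can_be_checked_centrally(s)
]
```

Here `central` lists the subtypes that are natural candidates for automated checks or for review by non-native speakers.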
I want to emphasize that the overall issue catalogue presented by the MQM framework is the best currently available publicly, and those not publicly available are hardly usable on a wide scale. My concerns related to the MQM issue structure by no means diminish its value or usability for practical applications.