Machine Translation (MT) is a highly polarizing topic. Over the years, we’ve heard a multitude of concerns, reservations, and strong opinions about MT.
At Logrus IT, we take a pragmatic approach and consider MT technology one of many factors contributing to cost and productivity optimization. MT also provides a solution in cases where human translation is out of the question due to time and/or budget restrictions.
When clients ask us to avoid using MT, their concern usually has less to do with MT itself than with insufficient human editing, inadequate output quality control, and other process gaps that can result in substandard translations being delivered. At Logrus IT, we offset potential MT pitfalls with a comprehensive, multi-layered translation and quality assurance process, and will gladly provide all details about it. Numerous built-in checks and substantial human involvement prevent substandard translations from making it into the final delivered documents.
To dispel any fears our clients might have and demystify the subject, we have decided to publish our equivalent of a code of practice for applying MT.
Optimal results are usually achieved with parallel use of multiple MT engines (different engines may produce the best pre-translations for different document sections).
We regularly use such engines as Microsoft MT, Microsoft Neural MT, LILT (adaptive MT), Google MT, and others. Results vary, often significantly, among engines.
MT efficiency depends on multiple factors, including the language pair (both source and target), the availability of a sufficiently large and clean corpus (cleanliness being the more important of the two), subject area, and document structure.
While the quality of MT suggestions for each new translation project may vary significantly, there are indicators known to produce positive outcomes, a number of which are outlined below.
Applying MT most noticeably reduces cost and increases speed during the initial stages of a new project, when a large TM based on existing translations is unavailable.
MT often boosts translator efficiency. At the same time:
When MT works well, suggestions generated by MT can be treated more or less like traditional low fuzzy matches from the TM. Total savings generally range from 5% to 30% for units where MT was applied.
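The savings model above can be illustrated with a small sketch. All rates and the discount figure below are invented for the example (within the 5–30% range mentioned), not Logrus IT's actual pricing.

```python
# Hypothetical illustration: treating MT-assisted units like low fuzzy
# matches and applying a per-word discount. Figures are invented.

FULL_RATE = 0.10    # per-word rate for translation from scratch (USD)
MT_DISCOUNT = 0.20  # 20% discount for MT-assisted units (within 5-30%)

def project_cost(words_from_scratch, words_mt_assisted):
    """Blend full-rate and MT-discounted word counts into one total."""
    mt_rate = FULL_RATE * (1 - MT_DISCOUNT)
    return words_from_scratch * FULL_RATE + words_mt_assisted * mt_rate

# A 10,000-word job where MT suggestions were usable on 6,000 words:
total = project_cost(4_000, 6_000)
print(f"${total:.2f}")  # 4000 * 0.10 + 6000 * 0.08
```

The same structure applies to productivity: discounted units also take proportionally less editing time, which is where the overall 5–30% range comes from.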
The days of setting up and training proprietary, standalone Moses servers on local networks are long gone.
A modern, efficient translation process requires a cloud-based CAT (computer-assisted translation) system with simultaneous online access for all parties involved, including PMs, translators and editors. All translations are stored in the cloud, along with glossaries and TMs.
Within this paradigm, cloud-based MT engines are seamlessly connected to the cloud CAT system through APIs and used by translators in a manner similar to TM. MT training is also performed from within the CAT system, the most natural approach (all TMs are already stored there), which also saves significant effort. Well-designed cloud CAT systems already offer connectors to most popular MT engines, the selection continues to grow, and adding more engines upon request is neither too time-consuming nor costly.
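Conceptually, a CAT system's MT connectors all implement the same small interface, so engines can be added or swapped without touching the rest of the workflow. The sketch below is a minimal illustration of that pattern; the class and engine names are hypothetical, not any vendor's real API.

```python
# Minimal sketch: several MT engines behind one uniform connector
# interface, as a cloud CAT system might expose them. Hypothetical names.
from abc import ABC, abstractmethod

class MTConnector(ABC):
    """Uniform wrapper the CAT system implements once per engine."""
    name: str

    @abstractmethod
    def translate(self, text: str, src: str, tgt: str) -> str: ...

class EchoConnector(MTConnector):
    # Stand-in for a real engine call (e.g. an HTTPS request to its API).
    def __init__(self, name: str):
        self.name = name

    def translate(self, text: str, src: str, tgt: str) -> str:
        return f"[{self.name}:{src}->{tgt}] {text}"

def suggestions(connectors, text, src, tgt):
    """Collect one suggestion per engine; the translator picks the best."""
    return {c.name: c.translate(text, src, tgt) for c in connectors}

engines = [EchoConnector("engineA"), EchoConnector("engineB")]
print(suggestions(engines, "Hello", "en", "de"))
```

Adding a new engine then means writing one more connector class, which is why the selection of supported engines can grow cheaply.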
As mentioned earlier, using a single MT engine for all projects, or even within a single project, is often not optimal. Quality of MT output varies significantly from engine to engine, depending on the language pair, subject area, and text structure/specifics.
This topic receives extensive coverage, including a recent research study published by our esteemed colleagues from DFKI comparing errors made by neural MT with those of “traditional” MT. For instance, Neural MT (NMT) typically produces translations that sound better but often make less sense than those produced by statistical and hybrid MT engines.
As a result, Logrus IT prefers translators to have access to multiple MT engines at once, giving them the opportunity to select the best suggestion or to discard all MT output and translate the unit from scratch. We regularly use suggestions from Microsoft MT, Microsoft Neural MT, Google MT, LILT (adaptive MT), and others. Free MT options (with translations added to the public domain) are blocked in all cases.
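Blocking free engines that place submitted text in the public domain is ultimately a policy filter over the engine catalog. The sketch below shows one way such a filter could look; the engine names and the `retains_data` flag are invented for illustration.

```python
# Hypothetical policy check: exclude engines whose free tier may retain
# or publish submitted text. All names and flags here are invented.
from dataclasses import dataclass

@dataclass
class Engine:
    name: str
    retains_data: bool  # True if the free tier may reuse client text

def allowed_engines(engines):
    """Keep only engines that do not place client text in the public domain."""
    return [e.name for e in engines if not e.retains_data]

catalog = [
    Engine("paid-nmt", False),
    Engine("free-web-mt", True),
    Engine("adaptive-mt", False),
]
print(allowed_engines(catalog))  # ['paid-nmt', 'adaptive-mt']
```

In practice this gate sits in front of the connector layer, so translators only ever see suggestions from engines that satisfy the confidentiality policy.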
In its entirety, the translation process works as follows: