Translation plagiarism: Human-based translation case study

Translation plagiarism is a particular and subtle type of plagiarism. It can also be described as disguised plagiarism, because the text is deliberately translated from another language into the plagiarist's own language to hide where it came from.

The individual copies someone else's entire work and translates it into another language, hoping that the plagiarism will be difficult or impossible to detect.

How is translated plagiarism classified?

Translated plagiarism takes different forms, each of which is a way of trying to hide the source.

Translation plagiarism types are:

Back-translation
Cross-language

Back-translation

It is a type of translation in which the source is translated from the original language into another language and then back again.

In short, it works like this:

The resource is translated from the original language into another language
Then, the translation is translated back, giving a reworded version that keeps the underlying meaning
Finally, the version that reads best is chosen.

Individuals use this method to obscure the source completely and claim the text as their own. Most plagiarism detectors are not advanced enough to detect text disguised in this way.
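To make the round trip concrete, here is a minimal Python sketch. The two word tables are invented stand-ins for a real translation service, and the functions `translate`, `back_translate`, and `word_overlap` are hypothetical names used only for illustration; this is not how Crossplag or any particular detector works. It shows why a detector that only looks for identical wording can miss back-translated text.

```python
# Toy word tables standing in for a real machine-translation system.
# They are invented for illustration and are not real translation output.
EN_TO_PIVOT = {"the": "die", "bank": "bank", "raised": "erhoehte",
               "rates": "zinsen", "quickly": "schnell"}
PIVOT_TO_EN = {"die": "the", "bank": "bank", "erhoehte": "increased",
               "zinsen": "rates", "schnell": "rapidly"}


def translate(text, table):
    """Translate word by word; unknown words pass through unchanged."""
    return " ".join(table.get(word, word) for word in text.lower().split())


def back_translate(text):
    """Round trip: original language -> pivot language -> original language."""
    return translate(translate(text, EN_TO_PIVOT), PIVOT_TO_EN)


def word_overlap(a, b):
    """Jaccard overlap of the two word sets, between 0.0 and 1.0."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 1.0


original = "the bank raised rates quickly"
disguised = back_translate(original)

print(disguised)                            # the reworded sentence
print(disguised == original)                # an exact-match check fails
print(word_overlap(original, disguised))    # only part of the wording survives
```

In this toy example the round trip swaps two of the five words, so an exact comparison fails even though the meaning is unchanged.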

However, one advanced plagiarism detector, Crossplag, can detect translated text, including back-translation.

Cross-language

Cross-language plagiarism involves taking text written in one or more languages and translating it into a completely different language.

People who use this method in their research start from the assumption that cross-language plagiarism is not detectable. Identifying cross-language, or translation, plagiarism is indeed tricky. Still, the main goal of Crossplag has been to identify it and, in one form or another, reduce its use.

Cross-language is a very complex method of plagiarism, and Crossplag is currently the only plagiarism checker that detects it correctly.

Crossplag’s performance according to the world’s leading academic integrity researchers

The European Network for Academic Integrity (ENAI) is an association of educational organizations and individuals with a stake in upholding academic integrity, and is currently a leader in the field. It collaborates with national and international groups dedicated to promoting scholarly and research integrity and with professionals in the field.

In 2020 ENAI conducted research called “Testing of support tools for plagiarism detection” to address three main concerns:

Similarity detection methods
Text-matching systems
Plagiarism policies

Akademia, built in 2018 as the first tool to identify translated plagiarism, was among the tools evaluated in this research. Since its debut, the platform has been quite effective at identifying translated plagiarism, and it continues to improve.

Crossplag is essentially Akademia under a new name: it uses the same infrastructure and operates on a global scale, and has simply been rebranded.

In the research done by ENAI, one notable quote was, “In translated texts, all the systems were unable to detect translation plagiarism, with the exception of Akademia.”

ENAI claimed that “As has been shown in other investigations (Weber-Wulff et al., 2013), translation plagiarism is seldom picked up by software systems. The worst performance of the systems in this test was indeed the translation plagiarism, with one notable exception—Akademia. This system is the only one that performs semantic analysis and allows users to choose the translation language.”

ENAI also emphasized that Akademia is the only platform that performs semantic analysis and allows the user to choose the translation language.

How Crossplag solved the translation plagiarism issue

Seeing that translation plagiarism was increasing every day and that no one was building a solution to detect it, Crossplag took this step. Although we knew that translated plagiarism is difficult to identify, we decided to bring to market a plagiarism detector that can identify it across 100 language pairs.

Translated plagiarism has increased recently, helped by free translation services such as Google Translate. Google Translate is an advanced translator that covers a large number of languages and offers automatic text translation for free. In most cases, the output needs to be adapted because it is not 100% accurate, but because it is fast, people tend to use it a lot.

Given a document, a link, or the entire text, Google Translate translates it in just a few seconds, making translingual plagiarism very easy. Translation plagiarism is also increasing in its manual form, as many people speak two or more languages. They take a text and translate it themselves, which takes them thirty minutes to an hour per page, thinking that the translation will escape the plagiarism detector.

However, can you get away with plagiarism by translating the text through technology or manual translation?

Not really; whether you translated the text yourself or with an automatic translator, Crossplag will reveal the source from which you translated. Back-translation is not an obstacle for the Crossplag plagiarism detector either. Knowing that Crossplag is the only detector of translated plagiarism, we are offering the world an opportunity to avoid this type of plagiarism.

What is Crossplag?

Crossplag is the best plagiarism detector with translated plagiarism checking built in. It is the only cross-lingual plagiarism detector. It is dedicated to democratizing plagiarism control while allowing users full ownership and control over their data.

Our vision is to support individuals and institutions interested in maintaining academic integrity and ensuring originality in their writings. We support more than 100 languages and have archived more than 300 million documents and more than 70 billion web articles for comparison.

What does Crossplag do?

Crossplag helps you publish original and honest work while keeping your data protected.

Crossplag offers:

Originality Checking
Unique Workflows
Data Protection
Self Provisioning Environment

Originality Checking – Crossplag is the first and only plagiarism tool offering both single-language and translation plagiarism checking in more than 100 languages.

Unique Workflows – Our unique role-based workflows are designed to optimize the efficiency of upholding academic integrity for every type of institution and business.

Data Protection – Crossplag treats intellectual property rights, data privacy, and data control very seriously, and believes you should control how, where, and by whom your data is accessed.

Self Provisioning Environment – Crossplag offers you the most flexible and transparent pricing process to ensure efficiency in the process.

Case study – Manual translation

Manual translation is when a person who knows several languages translates a text without using translation tools.

For our case study, we used the Annual Reports of European central banks, which are usually published in the bank's native language and in English. Because of their sensitivity, Annual Reports are not machine translated and are usually translated professionally.

*Note: These tests have used official Central Bank Annual Reports in both native and English languages to demonstrate the power of Crossplag in identifying similarities even when officially translated.

Here is a sneak peek of what the case study will include:

Case study Dashboard

In each row, you can find the percentage of similarity, as well as a ‘Report’ button containing all detailed information and a ‘Delete’ button allowing you to delete a document from our databases immediately. We pride ourselves on being a privacy-driven company and on letting you decide fully what happens with your document.

Although the original language and the relevant source may be difficult to discover, the Crossplag plagiarism checker identified them. Cross-lingual plagiarism can be challenging to spot, but our primary goal is to do so and provide you with an accurate report in whichever language the copied text appears in.

English-German language pair: Annual Report of the Central Bank of Germany

We start with the German-English pair.

The most significant gap between the two languages is arguably the article system. For an English speaker, it is striking that the single English article "the" has many German equivalents. German has three grammatical genders, compared to the two found in many other European languages.

This was an interesting problem to tackle, and we tackled it. Let us dive deeper.

Here is the screenshot of the Report:

Annual Report of the Central Bank of Germany

This document was officially published by the German central bank in both German and English, with the English version translated professionally.

On the right side of the Report, we can see that our translation plagiarism detector found the document to be 67% similar to other documents in languages other than English.

The first source is listed as 53% similar to our document – mind you, the source is in German, and the document uploaded is in English.

By clicking on the first sentence highlighted in blue with the text “The Supervisory Board performed the tasks assigned to it by law, regulatory requirements, Articles of Association and Terms of Reference.”, we are presented with the source and the sentence as seen below:

Case study – German Bank sentence

The words colored in red are those that also appear in the English document, only translated from German.

We can conclude that our translation plagiarism tool works just as well for the English-German language pair.

You can try out the English-German pair yourself, free of charge, on a 1,000-word document.

English-French language pair: Annual Report of the Central Bank of France

Annual Report of the Central Bank of France

The following Report is for the Bank of France. This document was officially released by the Bank of France in both French and English, with the English version translated professionally.

English is a Germanic language with Latin and French elements, while French is a Romance language that developed from Latin, with German and English influences, so the two differ considerably.

However, does this matter in translation plagiarism detection? Let us dive deeper.

First, on the upper right side of our Report, we will be shown the primary source – the source before translation – with a ~63% similarity. This is a significant sign that the document was translated.

If we click on a rather long sentence such as “such an economic downturn or an increase in interest rates –could weaken the solvency of companies and households and the financial institutions that finance them,” we will be presented with the original sentence in French where it originated first, shown in the image below:

Case study – French Bank sentence

Even though the sentence we were checking for similarity was in plain English, which is different from French, our tool found where it originated first – in French – and showed us proof of where it was first seen.

With this, we can conclude that the English-French language pair is working and does not have an issue.

You can try out the English-French pair yourself, free of charge, on a 1,000-word document.

English-Spanish language pair: Annual Report of the Central Bank of Spain

Next in line is the Spanish-English language pair.

With Spanish, we had to take an interesting approach for several reasons. Depending on regional dialect, the most significant distinction between English and Spanish is that English has more than 14 vowel sounds while Spanish has only five. Because of this, Spanish speakers can struggle to distinguish between vowel phonemes in words like seat and sit.

Let us demonstrate this.

Here is the snapshot of the Report from the Annual Report of the Bank of Spain:

Annual Report of the Central Bank of Spain

This document was officially released by the Bank of Spain in both Spanish and English, with the English version translated professionally.

As seen on the top right of the Report, the translated document, in English, and the official Spanish Report (Source 1) are 82% similar to each other.

By clicking on the red-highlighted sentence “One of the priorities of the Banco de España under its Strategic Plan 2020-2024 is to strengthen its analytical work,” we can see that our translation plagiarism checker found the similarity in Spanish:

Case study – Spanish Bank sentence

Colored in red are all the words that appear in the English sentence as well as in the Spanish one.

This is proof that translation plagiarism detection works for the English-Spanish language pair and can detect cross-language plagiarism in this pair.

You can try out the English-Spanish pair yourself, free of charge, on a 1,000-word document.

English-Italian language pair: Annual Report of the Central Bank of Italy

The next in line is the Italian language.

Annual Report of the Central Bank of Italy

This document was officially released by the Bank of Italy in both Italian and English, with the English version translated professionally.

On the upper right side of the Report, you will see that the similarity between the officially translated document and the source of the Annual Report of ‘Banca d’Italia’ is 55%.

The first source and the most similar one to the translated document is precisely the original document written in Italian before translation.

If we were to click on the first sentence, “GDP slowed last year, posting a growth of 0.3 percent”, we would be presented with the original sentence in Italian, as shown in the image below.

Case study – Italian Bank sentence

As shown in the image, the sentence we were checking was in English, and the platform informed us that it was translated from another language – Italian in this case.

The complexity of the sentence is not a problem at all. Even if you click on a sentence such as “Both the inflation expectations recorded in the euro-area financial markets and Italian firms’ intentions regarding their prices for the next 12 months were revised downward,” you will be presented with the source without any issue.

Countless English words have Italian roots, and Italian and English are similar to each other for a few reasons. Even though English is a Germanic language and Italian is a Romance language, English was significantly influenced by Latin, the ancestor of Italian.

We can conclude that translation plagiarism detection in the Italian-English language pair works well.

You can try out the English-Italian pair yourself, free of charge, on a 1,000-word document.

English-Czech language pair: Annual Report of the Central Bank of Czech Republic

In the full Report of the Czech bank, we see the similarity in cross-language plagiarism between Czech and English. Even though the two languages are different in writing, Crossplag found similar sources.

Annual Report of the Central Bank of Czech Republic

This document was officially released by the Czech central bank in both Czech and English, with the English version translated professionally. The similarity report after the manual translation from Czech to English gave us a result of 59%.

On the right side, the first source displayed is precisely the original document in Czech, which shows that the platform works as intended.

The Report shows similarities – the ‘Exact Match’ and the ‘Possibly altered text.’ The Exact Match will show when a sentence is wholly copy-pasted and translated, while the Possibly Altered Text will display the sentences that are somewhat changed but still have the same meaning.
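The distinction between the two labels can be sketched in Python as a threshold on a similarity score. This is only an illustration under stated assumptions: the thresholds, the function name `label_match`, and the use of word-level `difflib.SequenceMatcher` are ours, and the sketch assumes the source sentence has already been rendered into the language of the checked sentence. The source does not describe how Crossplag computes these labels.

```python
from difflib import SequenceMatcher

# Hypothetical cut-offs chosen for illustration only.
EXACT_THRESHOLD = 0.95
ALTERED_THRESHOLD = 0.60


def label_match(checked, source_rendering):
    """Label a sentence pair by word-level similarity.

    `source_rendering` is assumed to be the source sentence already
    rendered into the language of the checked sentence.
    """
    words_a = checked.lower().split()
    words_b = source_rendering.lower().split()
    score = SequenceMatcher(None, words_a, words_b).ratio()
    if score >= EXACT_THRESHOLD:
        return "Exact Match"
    if score >= ALTERED_THRESHOLD:
        return "Possibly altered text"
    return "No match"


print(label_match("prices rose slightly last year",
                  "prices rose slightly last year"))
print(label_match("prices grew slightly last year",
                  "prices rose slightly last year"))
print(label_match("the weather was sunny today",
                  "prices rose slightly last year"))
```

Comparing word lists rather than characters keeps the score easy to reason about: one changed word in a five-word sentence gives a ratio of 0.8, which falls between the two illustrative thresholds.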

Case study – Czech Bank sentence

If we click on the “There are several reasons why we define price stability as slight growth in prices rather than zero inflation” sentence highlighted in blue – keep in mind that it is in English – we will be presented with the sentence in Czech, where the similarity was detected.

The Czech language may be challenging for English speakers due to its distinct grammatical structure and vocabulary. However, this does not affect the results of the translation plagiarism at all, as we saw above.

With this, the English-Czech translation plagiarism checking is complete.

You can try out the English-Czech pair yourself, free of charge, on a 1,000-word document.

English-Polish language pair: Annual Report of the Central Bank of Poland

For our next case, we will see how well our translation plagiarism tool works on the Polish-English language pair.

Annual Report of the Central Bank of Poland

We have taken the case of the Annual Report of the Polish Bank translated from Polish to English by a professional translator.

On the upper side of the Report, we see that our English-uploaded document (translated by a professional) is 87% similar to the original document, which is in Polish.

If we click on the first sentence highlighted in blue with the text “In 2018, Narodowy Bank Polski (NBP) pursued a monetary policy in accordance with the Monetary Policy Guidelines for 2018.” we can see that Crossplag found out that it originated from a Polish source as shown below.

Case study – Polish Bank sentence

As we see, all the words are colored red to show that our English version of the document originated in another document in the Polish language. This shows that the Annual Report of the Polish bank was first published in Polish and then in English.

We can conclude that even when a document is translated from Polish, our translation plagiarism detection works well for the English-Polish language pair.

You can try out the English-Polish pair yourself, free of charge, on a 1,000-word document.

English-Portuguese language pair: Annual Report of the Central Bank of Portugal

In our next case, we will take the Annual Report of the Bank of Portugal.

Portuguese and English are members of the Indo-European language family but come from different branches. Portuguese is a Romance language, whereas English is a member of the Germanic language family. They also have different grammar in several other respects.

But will grammar affect what our translation plagiarism detector can do?

Annual Report of the Central Bank of Portugal

This document was officially released by the Bank of Portugal in both Portuguese and English, with the English version translated professionally.

On the report page, we were shown the primary source, in Portuguese, from which the text came, with a 77% similarity to the uploaded document.

Clicking on the first sentence, which is shaded in red to mark it as an ‘Exact Match’ and reads “bank, Banco de Portugal shares responsibilities in the design and implementation of the euro area monetary policy.”, shows us the Portuguese version of this sentence, as shown below.

Case study – Portuguese Bank sentence

The similarities between the two sentences are shown in red in the Sentence Information: each red word is present in both the English and the Portuguese sentence. The words in black are either stop words or do not appear in the other sentence at all.
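The red and black marking described above can be sketched in Python as follows. Everything here is illustrative: the glossary, the stop-word list, the example sentences, and the function `highlight` are invented for this sketch and are not taken from the Report or from Crossplag's implementation. A real system would rely on a proper translation or semantic model rather than a four-word lookup table.

```python
import re

# Invented toy glossary and stop-word list, for illustration only.
GLOSSARY_PT_EN = {"banco": "bank", "define": "sets",
                  "taxa": "rate", "juro": "interest"}
STOP_WORDS_PT = {"o", "a", "de"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def highlight(source_pt, checked_en):
    """Mark each source word 'red' if its gloss occurs in the checked
    sentence and it is not a stop word; otherwise mark it 'black'."""
    checked_words = set(tokenize(checked_en))
    marked = []
    for word in tokenize(source_pt):
        gloss = GLOSSARY_PT_EN.get(word)
        shared = word not in STOP_WORDS_PT and gloss in checked_words
        marked.append((word, "red" if shared else "black"))
    return marked


result = highlight("O banco define a taxa de juro.",
                   "The bank raises the interest rate.")
for word, color in result:
    print(f"{word}: {color}")
```

In this toy run, "banco", "taxa", and "juro" come out red, the stop words stay black, and "define" stays black because its gloss "sets" does not occur in the checked sentence.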

Based on the image above, we can see that the English-Portuguese language pair is working perfectly fine, and the translation plagiarism, in this case, is found.

You can try out the English-Portuguese pair yourself, free of charge, on a 1,000-word document.

English-Turkish language pair: Annual Report of the Central Bank of Turkey

For our next case, we will test how our translation plagiarism works in Turkish.

The primary distinction between English and Turkish is that the former is an analytic language while the latter is agglutinative. Analytic languages have few word forms for each lexeme, a fixed word order, and mark subject and object through word order.

Here is the snapshot of the Report:

Annual Report of the Central Bank of Turkey

This document was officially released by the Turkish central bank in both Turkish and English, with the English version translated professionally.

On the upper right side of the Report, we see that the English document, which was translated from Turkish, has a similarity of 65%. In the sources section, you can see the many sources where the text was first seen.

When clicking on a rather complex sentence such as “Maintaining an uninterrupted and healthy functioning of financial markets, the credit channel, and the cash flow is critical to contain the adverse effects of the coronavirus pandemic on the Turkish economy.” we can see that the translation plagiarism tool finds the source despite the difference in languages, as shown below:

Case study – Turkish Bank sentence

Even though we are checking an English sentence, our tool discovered that it was first seen in Turkish on another document.

This means that our translation plagiarism technology has no issues with the language pair English-Turkish.

You can try out the English-Turkish pair yourself, free of charge, on a 1,000-word document.

English-Romanian language pair: Annual Report of the Central Bank of Romania

The next Report is for the Romanian bank. Unlike Russian, Romanian uses the Latin alphabet, which differs from the English one only in a few specific letters.

Nevertheless, it is still not easy to find similarities across a translation, because every language has its own syntax and a text cannot simply be translated word for word.

Annual Report of the Central Bank of Romania

This document was officially released by the Romanian central bank in both Romanian and English, with the English version translated professionally. The cross-language similarity came to 43.43%. We focus on cross-language similarity here because the text was translated from Romanian to English.

Even when the text does not have the same sentence structure as in the original language, the Crossplag platform still manages to detect the source. However, there is room for improvement, which we are working hard to achieve.

Case study – Romanian Bank sentence

Sentence information and sources are displayed once the text has been fully checked. The two play complementary roles: the sentence information shows the source and the matching sentence in its original language for one particular sentence, while the sources section lists the sources for the whole document and the similarity percentage for each.

This is one more way Crossplag surfaces translation-related detail. It is not easy, but the platform does its best to serve its clients in the best possible way.

Using the above images as proof, we can conclude that the English-Romanian language pair works flawlessly in the translation plagiarism department.

You can try out the English-Romanian pair yourself, free of charge, on a 1,000-word document.

English-Russian language pair: Annual Report of the Central Bank of Russia

Annual Report of the Central Bank of Russia

Above, you can find the Report for the “Annual Report of the Bank of Russia,” translated from Russian into English by a professional.

Although the source was in Russian and therefore written in a different alphabet, we discovered it and found similarities.

On the top right of the page, we are presented with two options – MLPlag and Crossplag. The former is for same-language detection (Russian to Russian in this case), while the latter is for cross-language detection (Russian to English in this case).

Above that, you will find the Similarity percentage, showing how similar this document is to another one, either in single-language or cross-language. Furthermore, below the MLPlag and Crossplag, you can find all the sources where the text was previously found.

Shaded in blue are all the sentences marked as ‘Possibly altered text.’ Clicking one of them brings up what we named ‘Sentence information,’ where you can find the source and the sentence. In this case, the checked sentence was in English and the source is in Russian, showing that the translation plagiarism detector works for the English-Russian language pair.

Case study – Russian Bank sentence

Although the translator was a professional, the Crossplag plagiarism detector could detect similarities between the English translation and the Russian original.

You can try out the English-Russian pair yourself, free of charge, on a 1,000-word document.

How did we do?

On average, we found 62.7% similarity in cross-language checks. Mind you, this is on fully human, professional translation, the hardest kind to catch.

We are fully aware that this is not perfect. However, we are continually working on improving this area by adding more complex data to our algorithm and uploading more papers daily.

Right now, 62.7% outperforms everything on the market by a wide margin. No other currently available academic integrity tool comes close to this figure.

Building one of the best cross-language plagiarism detectors has been a challenging path, and we have dedicated ourselves to building the best one on the market. We have achieved this, and now we are aiming for perfection.

Agnesa Nuha

Agnesa is crazy about math and has won lots of prizes. Although her main gig is being a full-stack developer, she also likes to write about topics she knows really well.

But, Agnesa isn’t just about numbers and algorithms.

When she’s not crunching code or weaving words, you’ll find her conquering mountains with her trusty hiking boots!