Comparison of plagiarism checker tools

To find the best plagiarism checker on the market in 2021, we conducted an experiment comparing the performance of 9 checkers.

We focused on a series of factors in our analysis. For each tool, we analyzed the amount of plagiarism it was able to detect, the quality of matches, and its usability and trustworthiness.

This article describes our research process, explaining how we arrived at our findings.

We discuss:

  • Which plagiarism checkers we selected
  • How we prepared test documents
  • How we analyzed quantitative results
  • How we selected criteria for qualitative analysis

To skip to the results and in-depth comparison, check out our article reviewing the best plagiarism checkers of 2021.

Plagiarism checkers analyzed

We kicked off our analysis by searching for the main plagiarism checkers on the market that can be purchased by individual users, excluding enterprise software. We decided to only compare plagiarism checkers that mentioned students and/or academics as one of their target audiences.

Ultimately, we came up with the following list:

  • Scribbr (in partnership with Turnitin)
  • Grammarly
  • PlagiarismDetector
  • PlagScan
  • PrePostSEO
  • ProWritingAid
  • Quetext
  • UniCheck
  • Viper

We also tried four other tools, but could not include them in the comparison for the reasons given:

  • Plagly (failed to generate a report)
  • DupliChecker (unable to process documents of this size)
  • SearchEngineReports (unable to process documents of this size)
  • Writer (did not detect any plagiarism, so we did not proceed)

Test documents

The initial unedited document consisted of 160 plagiarized sections from 160 different sources. These were equally distributed amongst 8 different source types (20 sections each), in order to assess the performance for each source type individually.

The first document was compiled from:

  • 20 Wikipedia articles
  • 20 older news articles
  • 20 recent news articles
  • 20 open access journal articles
  • 20 small website articles
  • 20 big website articles
  • 20 theses and dissertations
  • 20 other PDFs

Generally speaking, open- and restricted-access journal articles, theses and dissertations, large websites, and PDFs are more relevant to students and academics. Writers and marketers might be more interested in news articles and websites. Wikipedia articles can be useful for both.

Taking this into account, it is important to distinguish between the various source types. Different users (e.g. students, academics, marketers, writers) use different sources, and have different needs with respect to plagiarism checkers.

Edited test documents

For the next step in our analysis, the unedited test document underwent three levels of editing: light, moderate, and heavy. We wanted to investigate whether each tool was able to find a source text when it had been edited to varying degrees.

  • Light: the original copy-pasted source text was edited by replacing 1 word in every sentence.
  • Moderate: the original copy-pasted source text was edited by replacing 2 to 3 words in every sentence, provided the sentence was long enough.
  • Heavy: the original copy-pasted source text was edited by replacing 4 to 5 words in every sentence, provided the sentence was long enough.

We limited this part of the analysis to the source texts that every checker had detected in the unedited document. A checker that cannot find a source in its original, copy-pasted form is unlikely to find it once the text has been altered, so we excluded all sources that at least one checker had missed.

The edited test documents consisted of 17 source texts from Wikipedia and 2 from older news articles.
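The selection step can be pictured as a set intersection. The sketch below is only an illustration: the evaluation itself was done manually, the function name and the source labels are hypothetical, and it assumes that "detected" means a source was found by the checker at all.

```python
from typing import Dict, Set


def found_by_all(detected: Dict[str, Set[str]]) -> Set[str]:
    """Return the sources that every checker detected.

    `detected` maps a checker name to the set of source labels it found
    in the unedited document (labels here are made up for illustration).
    """
    source_sets = list(detected.values())
    if not source_sets:
        return set()
    # Keep only the sources present in every checker's set.
    return set.intersection(*source_sets)


# Hypothetical example with three checkers and four sources.
example = {
    "checker_a": {"wiki_01", "wiki_02", "news_01"},
    "checker_b": {"wiki_01", "news_01"},
    "checker_c": {"wiki_01", "news_01", "pdf_03"},
}
common_sources = found_by_all(example)
```

In this made-up example only `wiki_01` and `news_01` would be carried over to the edited documents.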

Unedited
Multilingualism is the use of more than one language, either by an individual speaker or by a group of speakers. It is believed that multilingual speakers outnumber monolingual speakers in the world’s population.

Lightly edited
Multilingualism is the use of more than one language, either by one speaker or by a group of speakers. It is believed that polyglots outnumber monolingual speakers in the world’s population.

Moderately edited
Multilingualism is the use of more than one tongue, either by one speaker or by a group of speakers. It is believed that polyglots outnumber monolingual speakers in the total population.

Heavily edited
Multilingualism is the knowledge of more than one tongue, either by one speaker or by multiple speakers. It is established that polyglots outnumber other speakers in the world’s population.

Procedure

We uploaded each version of the document to each plagiarism checker tool for testing. Subsequently, the documents were manually evaluated by looking at each plagiarized paragraph to determine if the checker:

(0) had not been able to attribute any of the sentences to one or multiple sources
(1) had been able to attribute one or multiple sentences to one or multiple sources
(2) had been able to attribute all sentences to one source

This way, we were able to test whether checkers with a high plagiarism percentage actually had been able to find and fully match the source, or if they simply attributed a few sentences of the source text to multiple sources. This method also helped screen for false positives (e.g. highlighting non-plagiarized parts or common phrases as plagiarism).
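As a rough illustration of the 0/1/2 scale, the rubric can be written as a small function. This is a sketch under assumptions, not the procedure actually used (the evaluation was manual): the function name and input format are hypothetical, and the code does not check whether the reported source is the correct one.

```python
from typing import Optional, Sequence


def score_paragraph(matches: Sequence[Optional[str]]) -> int:
    """Score one plagiarized paragraph on the 0/1/2 scale.

    `matches` holds one entry per sentence: the source the checker
    reported for that sentence (a hypothetical label), or None if the
    checker reported nothing.
    """
    found = [m for m in matches if m is not None]
    if not found:
        return 0  # no sentence attributed to any source
    if len(found) == len(matches) and len(set(found)) == 1:
        return 2  # every sentence attributed to one and the same source
    return 1  # some sentences matched, or matches spread over several sources
```

For example, a paragraph whose two sentences are attributed to two different sources scores 1 rather than 2, which is how a high plagiarism percentage built from scattered partial matches is kept apart from a full match.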

Data analysis

For the original, unedited document, we calculated the total score for each tool. We weighted the source types based on their relevance for students and academics. The more relevant source types received a weight of 2. The less relevant ones, namely recent news articles and small websites, received a weight of 1.

The total possible score for the unedited document was 560: each of the 20 sections per source type could score up to 2 points, multiplied by the weight of its source type. We then calculated the percentage of detected plagiarism for each tool, taking into account the weights.
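The weighting can be reproduced with a few lines of arithmetic. The sketch below assumes the 8 source types listed earlier with 20 sections each and a maximum of 2 points per section; the names `WEIGHTS`, `max_weighted_score`, and `weighted_percentage` are illustrative and not taken from the original analysis.

```python
SECTIONS_PER_TYPE = 20      # sections per source type in the unedited document
MAX_SCORE_PER_SECTION = 2   # best score on the 0/1/2 scale

# Weight 2 for the more relevant source types, weight 1 for the less relevant ones.
WEIGHTS = {
    "Wikipedia articles": 2,
    "older news articles": 2,
    "recent news articles": 1,
    "open access journal articles": 2,
    "small website articles": 1,
    "big website articles": 2,
    "theses and dissertations": 2,
    "other PDFs": 2,
}


def max_weighted_score(weights=WEIGHTS):
    """Highest total a checker could reach across all source types."""
    return sum(w * SECTIONS_PER_TYPE * MAX_SCORE_PER_SECTION for w in weights.values())


def weighted_percentage(raw_scores, weights=WEIGHTS):
    """Percentage of detected plagiarism.

    `raw_scores` maps each source type to the sum of the 0/1/2 scores
    over its 20 sections (so at most 40 per type).
    """
    achieved = sum(weights[source_type] * score for source_type, score in raw_scores.items())
    return 100 * achieved / max_weighted_score(weights)
```

With these weights, `max_weighted_score()` gives 560. As a made-up example, a tool that fully matched every section of the six weight-2 types but none of the two weight-1 types would reach 480 of 560 points, or about 86%.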

For the edited texts, we also calculated the percentage of detected plagiarism for each plagiarism checker. However, this time the total possible score was 38 (19 source texts with a maximum of 2 points each), since we only included the 19 source texts that had been found by all plagiarism checkers. No weights were needed here.
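The same calculation for an edited document is simpler, because no weights are involved. A minimal sketch, assuming one 0/1/2 score per source text; the function name is hypothetical and the example scores are invented, not the raw data behind the reported results.

```python
EDITED_SOURCES = 19         # source texts found by all checkers
MAX_SCORE_PER_SECTION = 2   # best score on the 0/1/2 scale


def edited_percentage(section_scores):
    """Percentage of detected plagiarism for one edited document.

    `section_scores` is a list with one 0/1/2 score per source text.
    """
    if len(section_scores) != EDITED_SOURCES:
        raise ValueError(f"expected {EDITED_SOURCES} scores, got {len(section_scores)}")
    max_total = EDITED_SOURCES * MAX_SCORE_PER_SECTION  # 38
    return 100 * sum(section_scores) / max_total


# Invented example: 18 full matches and one source text missed entirely.
example_scores = [2] * 18 + [0]
example_result = round(edited_percentage(example_scores))
```

In this invented example the checker earns 36 of 38 points, which rounds to 95%.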

Results

The following results indicate how much plagiarism the tools were able to detect in the unedited, lightly edited, moderately edited and heavily edited document.

For the edited categories, we only used sources from the unedited document that were found by all plagiarism checkers during the first round. Therefore, some checkers were able to find 100% of these sources, even though they did not score 100% for the unedited document with more sources.

Plagiarism checker                         Unedited  Lightly edited  Moderately edited  Heavily edited
1. Scribbr (in partnership with Turnitin)  88%       100%            100%               95%
2. Quetext                                 86%       71%             61%                21%
3. Viper                                   80%       63%             47%                18%
4. Grammarly                               80%       47%             37%                18%
5. PlagScan                                52%       76%             84%                37%
6. PrePostSEO                              47%       47%             21%                8%
7. UniCheck                                43%       58%             50%                32%
8. PlagiarismDetector                      24%       37%             39%                29%
9. ProWritingAid                           7%        37%             34%                13%

Evaluating the plagiarism checker tools

Our next step was a qualitative analysis, in which we assessed the quality of matches, usability, and trustworthiness against pre-set criteria. Applying the same criteria to every plagiarism checker made the evaluation more standardized and objective.

The selected criteria cover a great deal of users’ needs:

  • Quality of matches: it is crucial for plagiarism checkers to be able to match the entire plagiarized section to the right source. Partial matches, where the checker matches individual sentences to multiple sources, result in a messy, hard-to-interpret report. False positives, where common phrases are incorrectly marked as plagiarism, are also important to consider, because these skew the plagiarism percentages.
  • Usability: it is essential that plagiarism checkers show a clear overview of potential plagiarism issues. The report should be clear and cohesive, with a clean design. It is also important that users can instantly resolve the issues, for example by adding automatically generated citations.
  • Trustworthiness: it is important for students and academics that their documents are not stored or sold to third parties. This way, they know for sure that the plagiarism check will not result in plagiarism issues when they submit their text to their educational institution or for publication. It is also important that the tool offers customer support if problems occur.

If you are interested in a review of each tool, check out our article about the findings of our research.
Julia Merkus

Julia has a bachelor's degree in Dutch language and culture and two master's degrees, in Linguistics and in Language and Speech Pathology. After a few years as an editor, she now writes articles about everything you'll encounter during your thesis writing process!