
Click on the image to download the thesis (PDF, 10MB)


Doctoral Thesis of Norman Meuschke


 

 

Below you find the data, source code, and other resources for the doctoral thesis:

Analyzing Non-Textual Content Elements to Detect Academic Plagiarism

Norman Meuschke, University of Konstanz, 2021.

PDF  |  DOI  |   BibTeX

 


Doctoral Defense


 

Recording

 

 

Slides

Click on the image to download the slides for the defense talk (PDF, 7MB).

 

 


Data & Source Code


 

Hybrid Plagiarism Detection System HyPlag

 

  • Demo system
    (user: guest@hyplag.org | pw: hybridPD)

  • Source code 
    (log in to GitHub first! user: hyplag-guest | pw: hybridPD20)

 


 

Citation-based Plagiarism Detection

 

  • Source Code: see HyPlag source code above

  • Data:

    • Reference collection: 185,170 documents from the PMC OAS collection, provided as part of the CITREC dataset (5 GB zipped, ~20 GB raw); includes document metadata, citation data, and pre-computed similarity scores
    • User-perceived cases of plagiarism (available upon request)

 


 

Image-based Plagiarism Detection

 

  • Source Code

  • Data: 15 test cases embedded into a reference collection of 10,000 images extracted from PMC OAS documents (547 MB zipped)

 

 


 

Mathematics-based Plagiarism Detection

 

  • Source Code: see HyPlag source code above

  • Data:

    • Test cases: 10 confirmed cases of plagiarism available as PDF and TEI
      (log in to GitHub first! user: hyplag-guest | pw: hybridPD20)
    • Reference collection: 105,120 arXiv documents converted to XHTML

 

 

Last edited on: 28.11.2021