Doctoral Thesis Norman Meuschke
Doctoral Defense
Data & Source Code
Hybrid Plagiarism Detection System HyPlag
- Demo system (user: guest@hyplag.org | pw: hybridPD)
- Source code (login to GitHub first! user: hyplag-guest | pw: hybridPD20)
Citation-based Plagiarism Detection
- Source Code: see HyPlag source code above
- Data:
- Reference collection: 185,170 documents from the PMC OAS collection, provided as part of the CITREC dataset (5 GB zipped, ~20 GB raw); includes document metadata, citation data, and pre-computed similarity scores
- User-perceived cases of plagiarism (available upon request)
Image-based Plagiarism Detection
- Source Code
- Data: 15 test cases embedded into a reference collection of 10,000 images extracted from PMC OAS documents (547 MB zipped)
Mathematics-based Plagiarism Detection
- Source Code: see HyPlag source code above
- Data:
- Test cases: 10 confirmed cases of plagiarism available as PDF and TEI (login to GitHub first! user: hyplag-guest | pw: hybridPD20)
- Reference collection: 105,120 arXiv documents converted to XHTML
Last edited on: 3 December 2021