
Global Liaison Office (GLO) - DAAD


Semantic Document Analysis for Plagiarism Detection

Project by Dr. Bela Gipp between April 2014 and March 2015.


The plagiarism detection systems available in practice today rely exclusively on literal text string comparisons. These systems reliably identify copies, but fail to detect disguised plagiarism, such as paraphrases, translations, or idea plagiarism. Researchers have recognized that identifying disguised plagiarism requires detection approaches that include semantic analysis. Proposed semantic-based detection approaches have consistently outperformed text-based approaches in terms of detection accuracy, yet their computational effort makes them infeasible for most practical plagiarism detection scenarios.

The proposed project will develop a hybrid detection approach. Citation-based Plagiarism Detection serves as a preliminary heuristic that reduces the retrieval space at a reasonably low loss in detection accuracy. Only the documents that remain are then passed to a subsequent, computationally more expensive semantic analysis step. If successful, this approach would make semantic plagiarism detection methods practically feasible for the first time.
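The two-stage idea can be illustrated with a minimal sketch. It is not the project's implementation: all function names, the `min_shared` threshold, and the use of a longest common citation subsequence as the filtering measure are assumptions chosen for illustration, and `toy_semantic_score` is a deliberately simple word-overlap placeholder standing in for a real semantic analysis step.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def lccs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two citation sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def citation_filter(
    query_cites: Sequence[str],
    corpus_cites: Dict[str, Sequence[str]],
    min_shared: int = 2,  # hypothetical threshold, not taken from the project
) -> List[str]:
    """Stage 1: keep only documents whose citation order overlaps the query's."""
    return [
        doc_id
        for doc_id, cites in corpus_cites.items()
        if lccs_length(query_cites, cites) >= min_shared
    ]


def hybrid_detect(
    query_cites: Sequence[str],
    query_text: str,
    corpus_cites: Dict[str, Sequence[str]],
    corpus_texts: Dict[str, str],
    semantic_score: Callable[[str, str], float],
    min_shared: int = 2,
) -> List[Tuple[str, float]]:
    """Stage 2: run the expensive scorer on the reduced candidate set only."""
    candidates = citation_filter(query_cites, corpus_cites, min_shared)
    scored = [
        (doc_id, semantic_score(query_text, corpus_texts[doc_id]))
        for doc_id in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_semantic_score(a: str, b: str) -> float:
    """Placeholder only: Jaccard overlap of word sets, NOT semantic analysis."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0
```

In this sketch the cheap citation comparison runs over the whole corpus, while the expensive scorer runs only on the documents that pass the filter. Any suspicious document that the filter discards is never examined semantically, which corresponds to the loss in detection accuracy that the heuristic accepts.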

Publications at NII


  • J. Beel, S. Langer, M. Genzmehr, and B. Gipp: Utilizing Mind-Maps for Information Retrieval and User Modelling, in Proceedings of the 22nd Conference on User Modeling, Adaptation, and Personalization (UMAP’14), Aalborg, Denmark, 2014
  • B. Gipp: Citation-based Plagiarism Detection – Detecting Disguised and Cross-language Plagiarism using Citation Pattern Analysis, Springer Vieweg Research, 2014
  • B. Gipp, N. Meuschke, and C. Breitinger: Citation-based Plagiarism Detection: Practicability on a Large-scale Scientific Corpus, Journal of the American Society for Information Science and Technology, 2014
  • B. Gipp, N. Meuschke, C. Breitinger, J. Pitman, and A. Nuernberger: Web-based Demonstration of Semantic Similarity Detection using Citation Pattern Visualization for a Cross Language Plagiarism Case, in Special Session on Information Systems Security within Proceedings of the 16th International Conference on Enterprise Information Systems (ICEIS 2014), Lisbon, Portugal, 2014, pp. 677-683