Cosine similarity-based plagiarism detection on electronic documents

Lidia, Permata Sari (2023) Cosine similarity-based plagiarism detection on electronic documents. Journal of Computer Science Application and Engineering, 1 (2): 4. pp. 44-48. ISSN 3031-2272

3031-2272_1_2_2023-4.pdf - Published Version
Available under License Creative Commons Attribution Share Alike.

Abstract

This study addresses the prevalent problem of plagiarism in academic thesis documents, where similarities in various sections of a document can go undetected and escape supervisor oversight. To reduce plagiarism, it proposes a solution based on cosine similarity, a robust technique in natural language processing and document analysis. The method's benefits, such as independence from document length and high accuracy, support its adoption for plagiarism detection. The study describes the Waterfall model used for systematic development, noting that it is structured but inflexible in accommodating evolving software requirements. It also explains the mechanics of cosine similarity and its pivotal role in quantifying textual resemblance between documents. Practical demonstrations using TF-IDF vectorization and cosine similarity computation give a step-by-step understanding of the method's implementation. The system design, illustrated through UML diagrams and depictions of the system interface, reflects the comprehensive approach taken in building a plagiarism detection application. Finally, successful Black Box testing confirms that the application meets its functional criteria, validating its efficiency in identifying potential instances of plagiarism. The study contributes to addressing plagiarism concerns through a robust detection mechanism.
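
The TF-IDF vectorization and cosine similarity computation mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the simple tokenizer, and the smoothed IDF variant are all assumptions made for illustration, and the sample sentences in the usage note are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, split on whitespace, strip common punctuation.
    stripped = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in stripped if w]


def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Smoothed IDF (an assumed variant) keeps the weight positive
        # even for terms that occur in every document.
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors, in [0, 1] here."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty document is treated as similar to nothing.
    return dot / (norm_a * norm_b)


def similarity(doc_a, doc_b):
    # Convenience wrapper: vectorize the pair together, then compare.
    vec_a, vec_b = tfidf_vectors([doc_a, doc_b])
    return cosine_similarity(vec_a, vec_b)
```

Calling `similarity("the cat sat on the mat", "the cat ran on the mat")` yields a score strictly between 0 and 1, while two texts with no shared terms score 0. Because the cosine divides by both vector norms, repeating a document's text does not change its score, which is the length independence the abstract refers to. A detection system would compare such scores against a chosen threshold; the abstract does not state the threshold used.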

Item Type: Article
Additional Information: Validation: ldy
Uncontrolled Keywords: Plagiarism detection, Cosine similarity, Electronic documents, Similarity thresholds, Academic theses, Plagiarism in literature, Dissertations, Academic, Artificial intelligence, Natural language processing (Computer science)
Subjects: Computers, Control & Information Theory > Control Systems & Control Theory
Computers, Control & Information Theory > Data Files
Social and Political Sciences > Education, Law, & Humanities
Depositing User: Djaenudin djae Mohamad
Date Deposited: 05 Nov 2024 00:12
Last Modified: 05 Nov 2024 00:12
URI: https://karya.brin.go.id/id/eprint/36267
