DETAIL DOCUMENT
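Application of Cosine Similarity in Plagiarism Detection for Indonesian Text Documents Based on N-Gram Words Features (original title: Penerapan Cosine Similarity Dalam Deteksi Plagiasi Dokumen Teks Bahasa Indonesia Berdasarkan Fitur N-Gram Words)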
Institution
Institut Teknologi Telkom Purwokerto
Author
Anggyta Hanani, Rusyadi
Subject
T Technology (General) 
Datestamp
2022-07-08 09:18:22 
Abstract:
Lecturers at the Telkom Purwokerto Institute of Technology have no general-purpose tool for detecting plagiarism in student assignments, and the institute's library only provides plagiarism detection for the Final Project. Students sometimes copy assignments from friends in the same class or in other classes. In a survey of 41 lecturers at the institute, 48.8% reported that they often found plagiarism in student assignments, and 73.3% strongly agreed that a tool for detecting plagiarism in student assignments should be available. This study aims to build a plagiarism detection model using the Cosine Similarity method based on N-Gram words. Cosine similarity is used to compute the similarity value between documents, while N-Gram words is a term/word feature that takes sequences of n consecutive words. The data used to build the model consists of 114 Indonesian text documents from research methodology course assignments, in doc, docx, and pdf formats. The results show that Cosine similarity combined with N-Gram words succeeded in detecting similarity between documents, and that the value of N affects the model's performance in detecting document similarity. The level of similarity obtained with N-Grams is also influenced by the documents that are entered.
Keywords: Cosine Similarity, N-Gram, Plagiarism, Text Mining
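The combination of word N-grams and cosine similarity described in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the function names `word_ngrams` and `cosine_similarity`, the lowercase regex tokenization, and the use of raw N-gram counts as vector weights are all assumptions made for illustration, since the abstract does not state the preprocessing or weighting used in the study.

```python
import math
import re
from collections import Counter


def word_ngrams(text, n):
    """Return the list of word n-grams (as space-joined strings) in text.

    Assumption: tokens are lowercase runs of word characters; the study's
    actual preprocessing (stemming, stopword removal, etc.) is not known.
    """
    tokens = re.findall(r"\w+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def cosine_similarity(text_a, text_b, n=2):
    """Cosine similarity between the n-gram count vectors of two texts."""
    vec_a = Counter(word_ngrams(text_a, n))
    vec_b = Counter(word_ngrams(text_b, n))

    # Dot product over the n-grams that occur in both documents.
    dot = sum(vec_a[g] * vec_b[g] for g in vec_a.keys() & vec_b.keys())

    # Euclidean norms of the two count vectors.
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))

    # A text shorter than n words has no n-grams; treat it as dissimilar.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical example sentences, differing in a single word.
doc_a = "mahasiswa mengerjakan tugas metodologi penelitian"
doc_b = "mahasiswa menyalin tugas metodologi penelitian"

for n in (1, 2, 3):
    print(n, round(cosine_similarity(doc_a, doc_b, n), 3))
```

On these two hypothetical sentences the score falls as N grows (roughly 0.8 for N = 1 and 0.5 for N = 2), because a single changed word breaks every N-gram that spans it. This is one way the choice of N can affect the reported similarity, consistent with the abstract's finding.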