Abstract:
Lecturers at the Telkom Purwokerto Institute of Technology have no general tool for detecting plagiarism in student assignments, and the institute's library provides plagiarism detection only for the Final Project. Students sometimes copy assignments from friends in the same class or in other classes. In a survey of 41 lecturers at the Telkom Purwokerto Institute of Technology, 48.8% reported that they often found plagiarism in student assignments, and 73.3% strongly agreed that a tool for detecting plagiarism in student assignments should be available. This study aims to build a plagiarism detection model using the Cosine Similarity method on word N-grams. Cosine similarity is used to compute a similarity value between documents, while a word N-gram is a term feature formed from a sequence of N consecutive words. The data used to build the model are Indonesian text documents from research methodology course assignments, totaling 114 documents in doc, docx, and pdf format. The results show that Cosine similarity combined with N-grams succeeded in detecting similarity between documents, and that the value of N used in the N-gram affects the model's performance in detecting document similarity. The level of similarity obtained with N-grams is also influenced by the documents that are entered.
Keywords: Cosine Similarity, N-Gram, Plagiarism, Text Mining
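
As an illustration of the approach named in the abstract, the following is a minimal Python sketch of cosine similarity over word N-gram counts. It is not the study's implementation: the function names `word_ngrams` and `cosine_similarity`, the whitespace tokenization, the use of raw counts as vector weights, and the two sample sentences are all assumptions made for the example.

```python
import math
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of a text (lowercased, split on whitespace)."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def cosine_similarity(doc_a, doc_b, n=2):
    """Cosine similarity between the word n-gram count vectors of two texts."""
    counts_a = Counter(word_ngrams(doc_a, n))
    counts_b = Counter(word_ngrams(doc_b, n))
    # Only n-grams present in both documents contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[g] * counts_b[g] for g in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        # A text shorter than n words has no n-grams; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample texts that differ in a single word.
a = "the quick brown fox jumps"
b = "the quick brown dog jumps"
for n in (1, 2, 3):
    print(n, round(cosine_similarity(a, b, n), 3))
# 1 0.8
# 2 0.5
# 3 0.333
```

In this toy case a single changed word lowers the score more as N grows, because each changed word breaks every N-gram that contains it. This is one way the choice of N can influence the measured similarity.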