Similarity identification based on word trigrams using exact string matching algorithms

Abdul, Fadlil and Sunardi, Sunardi and Rezki, Ramdhani (2022) Similarity identification based on word trigrams using exact string matching algorithms. Intensif : Jurnal Ilmiah Penelitian Teknologi dan Penerapan Sistem Informasi, 6 (2): 8. pp. 253-270. ISSN 2549-6824 (Online) 2580-409X (Print)

Abstract

Several well-studied exact string matching algorithms can be used to identify similarity, including the Rabin-Karp, Winnowing, and Horspool Boyer-Moore algorithms. To determine similarity, the Rabin-Karp and Winnowing algorithms use fingerprints, while the Horspool Boyer-Moore algorithm uses a bad-character table. Previous research with these algorithms, however, focused on identifying similarity based on character n-grams. Identification based on word n-grams, which captures similarity in linguistic meaning, especially for longer strings, had not yet been covered. This study therefore proposes a word-level trigram approach: similarity is identified from word trigrams using each of the three algorithms, and their performance is compared. In terms of precision, recall, and running time, the Rabin-Karp algorithm achieved 100%, 100%, and 0.19 ms, respectively; the Winnowing algorithm with the smallest window achieved 100%, 56%, and 0.18 ms; and the Horspool Boyer-Moore algorithm achieved 100%, 100%, and 0.06 ms. From these results, it can be concluded that the Horspool Boyer-Moore algorithm performs best overall: it matches the Rabin-Karp algorithm on precision and recall and has the shortest running time.
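As an illustration of the idea described in the abstract, the following is a minimal Python sketch of word-trigram matching with a Horspool-style bad-character table. It is not the authors' implementation: the function names (word_trigrams, horspool_find, trigram_similarity), the whitespace tokenization, the lowercasing, and the similarity score (the fraction of query trigrams found in the source) are all assumptions made for this example.

```python
def word_trigrams(text):
    """Split text on whitespace and return all runs of three consecutive words."""
    words = text.lower().split()
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]


def horspool_find(pattern, text):
    """Return the index of the first exact match of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return -1
    # Bad-character table: for every pattern character except the last,
    # the distance from its rightmost occurrence to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Shift by the table entry of the text character aligned with the
        # last pattern position; characters not in the table shift by m.
        pos += shift.get(text[pos + m - 1], m)
    return -1


def trigram_similarity(query, source):
    """Fraction of the query's word trigrams that occur in source (assumed metric)."""
    grams = word_trigrams(query)
    if not grams:
        return 0.0
    # Normalize whitespace and pad with spaces so that a trigram can only
    # match on whole-word boundaries, not inside a longer word.
    padded = " " + " ".join(source.lower().split()) + " "
    hits = sum(1 for g in grams if horspool_find(" " + g + " ", padded) != -1)
    return hits / len(grams)


query = "the quick brown fox jumps"
source = "A quick brown fox jumps over the lazy dog"
score = trigram_similarity(query, source)  # 2 of the 3 query trigrams occur in source
```

Here the query yields three word trigrams, two of which ("quick brown fox" and "brown fox jumps") appear in the source. A fingerprint-based variant in the style of Rabin-Karp or Winnowing would hash each trigram instead of searching for it directly.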

Item Type: Article
Uncontrolled Keywords: String-matching, Algorithm, Performance, N-Gram, Similarity
Subjects: Computers, Control & Information Theory > Applications Software
Depositing User: - Dina -
Date Deposited: 11 Jul 2023 07:19
Last Modified: 11 Jul 2023 07:19
URI: https://karya.brin.go.id/id/eprint/19147
