Volume 3, Issue 1, 2023
Articles

Advanced Naive Bayes Machine Learning System for Document Similarity Checking

M. Karthica
Ph.D. Research Scholar, Department of Computer Science, Erode Arts and Science College (Autonomous), Erode, Tamilnadu
Dr. K. MeenakshiSundaram
Associate Professor and Head, Department of Computer Science, Erode Arts and Science College (Autonomous), Erode, Tamilnadu

Published 2023-12-31

Keywords

  • Content-based document classification, advanced Naive Bayes method, citation analysis, document similarity, natural language processing

How to Cite

Karthica, M., & MeenakshiSundaram, K. (2023). Advanced Naive Bayes Machine Learning System for Document Similarity Checking. Kristu Jayanti Journal of Computational Sciences (KJCS), 3(1), 81–90. https://doi.org/10.59176/kjcs.v3i1.2315

Abstract

Content-based text analysis is a recent development in machine learning. In the digital age, work is expected to be done quickly, and fast analysis gives business people greater insight. Classification is a crucial step in applying machine learning to education. The appropriateness of a written document can be assessed by training a classifier on topic-specific labeled data and then categorizing the document by topic. A content-based system is a tool that helps operators locate content and cope with the deluge of information. It helps anticipate users' interests and provides recommendations based on a model of those interests. The approach continues the evolution of both collaborative filtering and the early content-based recommender systems, and it requires no manual checking by the user.