Machine Learning Classifiers Based Classification For IRIS Recognition

Bahzad Taha Chicho; Adnan Mohsin Abdulazeez; Diyar Qader Zeebaree; Dilovan Assad Zebari

doi:10.48161/qaj.v1n2a48

Authors

Bahzad Taha Chicho, Duhok Polytechnic University, Duhok, Iraq
Adnan Mohsin Abdulazeez, President of Duhok Polytechnic University, Duhok, Iraq
Diyar Qader Zeebaree, Research Center, Duhok Polytechnic University, Duhok, Iraq
Dilovan Assad Zebari, Research Center, Duhok Polytechnic University, Duhok, Iraq

Keywords:

Data Mining, Classification, Decision Tree, Random Forest, K-nearest neighbors

Abstract

Classification is one of the most widely applied machine learning tasks today, with applications in face recognition, flower classification, clustering, and other fields. The goal of this paper is to organize a set of data objects and identify the class of each. The study applies the K-nearest neighbors, decision tree (J48), and random forest algorithms to the IRIS dataset and compares their performance. In this comparison, K-nearest neighbors outperformed the other classifiers, and random forest performed better than the decision tree (J48). The best result obtained in this study is 100%, with a zero error rate for the best-performing classifier.
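
As a rough illustration of the K-nearest neighbors idea named in the abstract, the following minimal Python sketch labels a sample by majority vote among its k closest training samples (Euclidean distance) and reports accuracy and error rate on a few held-out samples. It is not the authors' implementation or experimental setup: the function names, the choice k = 3, and the small hand-entered samples in the IRIS feature format (sepal length, sepal width, petal length, petal width, in cm) are assumptions made for illustration only, and the toy output says nothing about the results reported above.

```python
import math
from collections import Counter

# Illustrative training samples in the IRIS feature format:
# (sepal length, sepal width, petal length, petal width) in cm.
# Hand-entered for this sketch; not the full 150-sample dataset.
TRAIN = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
    ((7.1, 3.0, 5.9, 2.1), "virginica"),
]

# Hypothetical held-out samples used only to exercise the sketch.
TEST = [
    ((5.0, 3.6, 1.4, 0.2), "setosa"),
    ((6.6, 3.0, 4.6, 1.4), "versicolor"),
    ((6.5, 3.0, 5.8, 2.2), "virginica"),
]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    # Sort training samples by Euclidean distance to the query, keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(knn_predict(train, features, k) == label for features, label in test)
    return correct / len(test)


if __name__ == "__main__":
    acc = accuracy(TRAIN, TEST)
    print(f"accuracy = {acc:.2%}, error rate = {1 - acc:.2%}")
```

The same accuracy function could be reused to compare other classifiers on identical train and test splits, which is the kind of comparison the abstract describes.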
