A Comparison of Classification Algorithms for Predicting Distinctive Characteristics in Fine Aroma Cocoa Flowers Using WEKA Modeler

Authors

  • Daniel Tineo: Yanayacu Experimental Center, Supervision and Monitoring Directorate at Agricultural Experimental Stations, National Institute of Agricultural Innovation (INIA), Jaén San Ignacio Highway KM 23.7, Jaén 06801, Cajamarca, Peru; Institute for Research on Sustainable Development of the Ceja de Selva (INDES-CES), National University Toribio Rodríguez de Mendoza, Chachapoyas 01001, Amazonas, Peru
  • Yuriko S. Murillo: Biology Laboratory, Department of Basic and Applied Sciences, National University of Jaén, Jaén 00000, Peru
  • Mercedes Marín: Yanayacu Experimental Center, Supervision and Monitoring Directorate at Agricultural Experimental Stations, National Institute of Agricultural Innovation (INIA), Jaén San Ignacio Highway KM 23.7, Jaén 06801, Cajamarca, Peru
  • Darwin Gomez: Yanayacu Experimental Center, Supervision and Monitoring Directorate at Agricultural Experimental Stations, National Institute of Agricultural Innovation (INIA), Jaén San Ignacio Highway KM 23.7, Jaén 06801, Cajamarca, Peru
  • Victor H. Taboada: Yanayacu Experimental Center, Supervision and Monitoring Directorate at Agricultural Experimental Stations, National Institute of Agricultural Innovation (INIA), Jaén San Ignacio Highway KM 23.7, Jaén 06801, Cajamarca, Peru
  • Malluri Goñas: Yanayacu Experimental Center, Supervision and Monitoring Directorate at Agricultural Experimental Stations, National Institute of Agricultural Innovation (INIA), Jaén San Ignacio Highway KM 23.7, Jaén 06801, Cajamarca, Peru; Institute for Research on Sustainable Development of the Ceja de Selva (INDES-CES), National University Toribio Rodríguez de Mendoza, Chachapoyas 01001, Amazonas, Peru
  • Lenin Quiñones Huatangari: Institute for Data Science Research, Engineering, National University of Jaén, Jaén 00000, Peru

DOI:

https://doi.org/10.48161/qaj.v4n3a571

Abstract

The expression of crop functional traits is influenced by environmental and management conditions, and this influence is in turn reflected in genetic diversity. This study used a data mining approach to determine which functional traits of flowers influence cocoa diversity. A total of 1,140 flowers from 228 trees were analyzed: 177 fine aroma cocoa trees and 51 trees belonging to other commercial cultivars. Three attribute evaluators (InfoGainAttributeEval, CorrelationAttributeEval and GainRatioAttributeEval) and six algorithms (Naive Bayes, Multinomial Logistic Regression, J48, Random Forest, LMT and Simple Logistic) were compared. The findings indicated that the GainRatioAttributeEval attribute evaluator was the most effective at identifying the functional traits of flowers that distinguish cocoa diversity. The Simple Logistic and LMT algorithms were the most accurate and specific, while Naive Bayes was the most computationally efficient at model building. This research provides an overview of the use of machine learning to analyze the functional traits of flowers that most influence cocoa genetic diversity. It also highlights the need to further improve these models by integrating additional techniques to increase their efficiency, and to extend the data mining approach to other agricultural sectors.
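To illustrate how two of the attribute evaluators named above differ in principle, the following is a minimal sketch of information gain and gain ratio for nominal attributes, written in plain Python rather than WEKA. The attribute names (`petal_colour`, `tree_id`) and the six toy records are hypothetical and are not taken from the study's data; numeric traits would need to be discretized before this calculation applies.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of nominal values."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def info_gain(values, labels):
    """Information gain: H(class) - H(class | attribute)."""
    total = len(labels)
    conditional = 0.0
    for v in set(values):
        # Class labels of the records that take value v for this attribute.
        subset = [c for x, c in zip(values, labels) if x == v]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


def gain_ratio(values, labels):
    """Information gain normalized by the entropy of the attribute itself."""
    split_info = entropy(values)
    return info_gain(values, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy records (not the study's data).
cultivar = ["fine", "fine", "other", "other", "fine", "other"]
petal_colour = ["white", "white", "pink", "pink", "white", "pink"]
tree_id = ["t1", "t2", "t3", "t4", "t5", "t6"]  # unique per record

for name, attr in [("petal_colour", petal_colour), ("tree_id", tree_id)]:
    print(name, round(info_gain(attr, cultivar), 3),
          round(gain_ratio(attr, cultivar), 3))
```

In this toy case both attributes reach the same information gain, but the gain ratio is lower for `tree_id` because it is divided by the larger entropy of a many-valued attribute. This normalization is the general motivation for gain ratio; the sketch does not reproduce the study's rankings.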

Published

2024-09-20

How to Cite

Tineo, D., Murillo, Y. S., Marín, M., Gomez, D., Taboada, V. H., Goñas, M., & Quiñones Huatangari, L. (2024). A Comparison of Classification Algorithms for Predicting Distinctive Characteristics in Fine Aroma Cocoa Flowers Using WEKA Modeler. Qubahan Academic Journal, 4(3), 713–724. https://doi.org/10.48161/qaj.v4n3a571

Section

Articles