TWTFPOS-IDF: Thematic Term Weighting Scheme for Enhanced Question Classification Using Bloom's Taxonomy

Sucipto Sucipto; Didik Dwi Prasetya; Triyanna Widiyaningtyas

doi:10.48161/qaj.v5n1a1569

Authors

Sucipto: Department of Electrical Engineering and Informatics, Universitas Negeri Malang, Malang 65145, Indonesia; Department Information System, Universitas Nusantara PGRI Kediri, Kediri 64112, Indonesia.
Didik Dwi Prasetya: Department of Electrical Engineering and Informatics, Universitas Negeri Malang, Malang 65145, Indonesia.
Triyanna Widiyaningtyas: Department Information System, Universitas Nusantara PGRI Kediri, Kediri 64112, Indonesia.

Abstract

Creating question text using a cognitive approach based on Bloom’s Taxonomy (BT) is essential for maintaining question quality in learning assessment. Various studies have explored term weighting schemes to improve BT-based question classification accuracy. However, achieving higher accuracy in classifying cognitive levels requires more than analyzing verbs; it must also incorporate thematic terms relevant to BT. Existing approaches primarily assign weights to verbs and supporting verbs, often neglecting thematic terms that provide crucial context for classification. This study introduces a novel thematic term weighting scheme, TWTFPOS-IDF, which assigns a higher weight to thematic terms than to verbs and other supporting words. Thematic terms are identified using the BT word database, with feature extraction, selection, and model tuning optimized to enhance classification accuracy. To ensure robustness, the model is evaluated using a newly constructed, larger dataset that includes a diverse set of educational questions across multiple domains. Machine Learning (ML) and Deep Neural Networks (DNN) are employed for classification, with performance assessed using standard metrics and ANOVA statistical testing. The experimental results demonstrate that the proposed model significantly outperforms previous schemes, achieving an average accuracy of 0.905 and a k-fold cross-validation score of 0.886. The highest-performing ML algorithm recorded an accuracy of 0.977 and a k-fold cross-validation score of 0.970. The use of a larger dataset ensures greater generalizability and stability of the model across different question structures. The ANOVA test confirms that model optimization and the expanded dataset significantly improve classification accuracy compared to prior research. This research addresses key challenges in automated question classification, enhancing the precision of cognitive level identification in educational assessment. Future studies will focus on automating weight identification and leveraging deep learning techniques to further refine classification performance and scalability.
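The general idea of weighting thematic terms above verbs and other words can be illustrated with a minimal sketch. This is not the authors' implementation: the word lists, the weight values (3.0, 2.0, 1.0), the helper names `term_weight` and `weighted_tfidf`, and the smoothed IDF formula are all assumptions chosen for illustration, since the abstract does not give the actual BT word database, weights, or formula. A real pipeline would also use a part-of-speech tagger instead of a fixed verb list.

```python
import math
from collections import Counter

# Hypothetical word lists and weights, for illustration only; the actual
# BT word database and weight values are not given in the abstract.
THEMATIC_TERMS = {"difference", "example", "definition"}
BT_VERBS = {"define", "explain", "compare", "design"}
WEIGHTS = {"thematic": 3.0, "verb": 2.0, "other": 1.0}


def term_weight(term):
    """Return the illustrative category weight for a term."""
    if term in THEMATIC_TERMS:
        return WEIGHTS["thematic"]
    if term in BT_VERBS:
        return WEIGHTS["verb"]
    return WEIGHTS["other"]


def weighted_tfidf(questions):
    """Compute a category-weighted TF-IDF vector (as a dict) per question."""
    docs = [q.lower().replace("?", "").split() for q in questions]
    n_docs = len(docs)
    # Document frequency: number of questions that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Weighted term frequency: scale each count by its category weight,
        # then normalize by the weighted total for the question.
        weighted = {t: c * term_weight(t) for t, c in counts.items()}
        total = sum(weighted.values())
        vectors.append({
            # Smoothed IDF (an assumption): ln((1 + N) / (1 + df)) + 1.
            t: (w / total) * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
            for t, w in weighted.items()
        })
    return vectors


questions = [
    "Define the term algorithm",
    "Explain the difference between a stack and a queue",
    "Design a program that sorts a list",
]
vectors = weighted_tfidf(questions)
```

In the second question, "difference" (treated here as thematic), "explain" (a verb), and "between" (other) each occur once and in only one question, so their scores differ only by the assumed category weights. The resulting vectors could then be passed to any ML or DNN classifier.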
