Development and Validation of A Critical Thinking Assessment on Temperature and Heat for Secondary Physics Education
Article Number: e2025264 | Available Online: June 2025 | DOI: 10.22521/edupij.2025.16.264
Bangun Sartono, Widha Sunarno, Baskoro Adi Prayitno, Nurma Yunita Indriyanti
Abstract
Background/purpose. Critical thinking (CT) is fundamental in science education, but instruments that measure CT in specific domains such as physics remain limited. This study develops and validates the Critical Thinking Test in Temperature and Heat (CTTH), an instrument designed to measure critical thinking skills on the physics topic of temperature and heat.
Methods. The CTTH was developed with reference to a common critical thinking test framework. Validation involved review by seven physics education experts, a small-scale pilot test with 33 students, and a final validation with 720 secondary school students, to check the test items' clarity, relevance, and psychometric quality. Reliability was estimated with Cronbach's alpha for internal consistency and Fleiss' kappa for inter-rater reliability.
Results. The CTTH showed adequate internal consistency and inter-rater reliability, supporting its use for measuring critical thinking skills related to temperature and heat in physics learning.
Conclusion. The CTTH offers a resource for further research on integrating critical thinking into physics education. Future research should expand the critical thinking assessment framework and address research questions that support wider application in science education.
Keywords: Critical thinking test, temperature and heat, instrument development, physics education
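The two reliability statistics named in the Methods, Cronbach's alpha and Fleiss' kappa, can be sketched with their textbook formulas. This is a minimal illustration and not the authors' analysis code: the function names and the toy data are hypothetical, and the sketch assumes complete data, at least two items, and the same number of raters for every rated subject.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for internal consistency.

    scores: one list per respondent, each holding that respondent's
    item scores (same number of items for everyone, no missing data).
    Assumes the total scores are not all identical (non-zero variance).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def fleiss_kappa(counts):
    """Fleiss' kappa for agreement among a fixed number of raters.

    counts[i][j]: number of raters who assigned subject i to category j.
    Every row must sum to the same number of raters (at least two).
    Assumes the ratings are not all in one category (chance agreement < 1).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of raters")
    # Observed agreement per subject, then averaged over subjects.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Share of all ratings that fell in each category.
    n_categories = len(counts[0])
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Hypothetical data: 5 students answering 4 items scored 0-3.
    toy_scores = [
        [3, 2, 3, 2],
        [1, 1, 2, 1],
        [2, 2, 2, 3],
        [0, 1, 1, 0],
        [3, 3, 2, 3],
    ]
    # Hypothetical data: 4 items, each rated by 7 raters into
    # 3 categories (for example "not relevant", "partly", "relevant").
    toy_counts = [
        [0, 1, 6],
        [1, 2, 4],
        [0, 0, 7],
        [2, 3, 2],
    ]
    print("Cronbach's alpha:", round(cronbach_alpha(toy_scores), 3))
    print("Fleiss' kappa:", round(fleiss_kappa(toy_counts), 3))
```

Both functions use only the standard library so the arithmetic stays visible. For a real analysis, an established statistics package would be the safer choice.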