Mapping Teacher Produced Tests to a Usefulness Model
Keywords: ELT, Test Usefulness, Measurement and Evaluation, Teacher Produced Tests
Tests are designed as an integral part of the teaching process and necessarily involve stakeholders at every stage, from the onset of preparations through test administration and grade allocation to the interpretation of the results. The process commences with selecting the content to evaluate and deciding upon the skills to be tested in order to meet course objectives (Giraldo & Murcia Quintero, 2019; O’Loughlin, 2013; Vogt & Tsagari, 2014). Several questions arise about how to standardize the development of tests and how to evaluate their usefulness, typically: What is the best test for our context? What does this test actually test? What relevant information does the test provide? How does this test affect teaching and learning behavior? In what ways is the test useful? Although each language program’s particular needs may differ, the answers to these questions provide a basis for institutional decisions. None of the answers is set in stone, and at their root is the critical role testing plays in facilitating what gets learned. The current study developed and analyzed an achievement test specifically designed for a compulsory A1-level English course delivered to all freshman students enrolled in Turkish-medium departments at state universities across Türkiye. The universe of the study comprised 150 students enrolled in several undergraduate programs at the Faculty of Education of a state university. The researcher analyzed the test after administration and mapped its qualities onto a test usefulness model, aiming to address the research gap regarding quality teacher-produced tests.
Abraham, R. G., Chapelle, C. A. (1992). The meaning of cloze test scores: An item difficulty perspective. Modern Language Journal, 76(4), 468–479. DOI: https://doi.org/10.1111/j.1540-4781.1992.tb05394.x
Adams, R. J., Griffin, P. E., Martin, L. (1987). A latent trait method for measuring a dimension in second language proficiency. Language Testing. 4(1), 9–28. DOI: https://doi.org/10.1177/026553228700400102
Ahmad, S., Rao. (2012). A Review of the Pedagogical Implications of Examination Washback. Research on Humanities and Social Sciences, 2(7), 11-20.
Alan, B. (2003). Novice teachers’ perceptions of an in-service teacher training course at Anadolu University. Unpublished master’s thesis. Bilkent University, Ankara.
Alderson, J. C., Hamp-Lyons, L. (1996). TOEFL preparation courses: a study of washback. Language Testing. 13(3), 280–97. DOI: https://doi.org/10.1177/026553229601300304
Aschbacher, P. R. (1991). Performance assessment: state of activity, interest, and concerns. Applied Measurement in Education. 4, 275–88. DOI: https://doi.org/10.1207/s15324818ame0404_2
Atay, D. (2008). Teacher research for professional development. ELT Journal, 62(2), 139-147. DOI: https://doi.org/10.1093/elt/ccl053
Bachman, L. F. (1991). ‘What Does Language Testing Have to Offer?’ TESOL Quarterly. 25: 671-704. DOI: https://doi.org/10.2307/3587082
Bachman, L. F., Lynch, B. K. & Mason, M. (1995). Investigating variability in tasks and rater judgments in a performance test of foreign language speaking. Language Testing 12, 238-257. DOI: https://doi.org/10.1177/026553229501200206
Bachman, L. F. & Palmer, A.S. (1996). Language testing in practice. Oxford: Oxford University Press.
Bachman, L.F. & Eignor, D.R. (1997). Recent advances in quantitative test analysis. In Clapham, C. and Corson, D., editors, Encyclopedia of language and education. Volume 7: Language testing and assessment. Dordrecht: Kluwer Academic, 227–42 DOI: https://doi.org/10.1007/978-1-4020-4489-2_21
Bachman, L.F. (1997). Generalizability theory. In Clapham, C. and Corson, D., editors, Encyclopedia of language and education. Volume 7: Language testing and assessment. Dordrecht: Kluwer Academic Publishers, 255–62. DOI: https://doi.org/10.1007/978-1-4020-4489-2_23
Bachman, L. F. (2000). ‘Modern language testing at the turn of the century: Assuring that what we count counts.’ Language Testing. 17:1-42. DOI: https://doi.org/10.1177/026553220001700101
Bailey, K.M. (1996). Working for washback: a review of the washback concept in language testing. Language Testing. 13(3), 257–79. DOI: https://doi.org/10.1177/026553229601300303
Balbay, S., Pamuk, I., Temir, T., & Doğan, C. (2018). Issues in pre-service and in-service teacher training programs for university English instructors in Turkey. Journal of Language and Linguistic Studies, 14(2), 48-60.
Ballıdağ, S. (2020). Exploring the Language Assessment Literacy of Turkish In-service EFL Teachers. https://doi.org/10.06.2020/
Ballıdağ, S., & Inan Karagül, B. (2021). Exploring The Language Assessment Literacy of Turkish In-service EFL Teachers. Balıkesir Üniversitesi Sosyal Bilimler Enstitüsü Dergisi. DOI: https://doi.org/10.31795/baunsobed.909953
Banerjee, J., Luoma, S. (1997). Qualitative approaches to test validation. In Clapham, C. and Corson, D., editors, Encyclopedia of language and education. Volume 7: Language testing and assessment. Dordrecht: Kluwer Academic, 275–87. DOI: https://doi.org/10.1007/978-1-4020-4489-2_25
Bell, R. T. (1981). An introduction to applied linguistics. London: Batsford Academic Ltd.
Bolt, R.F. (1992). Cross Validation of item response curve models using TOEFL data. Language Testing. 9, 79–95. DOI: https://doi.org/10.1177/026553229200900106
Bonner, S. M., Torres Rivera, C., & Chen, P. P. (2018). Standards and assessment: Coherence from the teacher’s perspective. Educational Assessment, Evaluation and Accountability, 30, 71-92. DOI: https://doi.org/10.1007/s11092-017-9272-2
Brown, A. (1995). The effect of rater variables in the development of an occupation-specific language performance test. Language Testing. 12(1), 1–15. DOI: https://doi.org/10.1177/026553229501200101
Brown, H. D. (2004). Language assessment: Principles and classroom practices. New York: Longman.
Brown, J. D. (2013). My twenty-five years of cloze testing research: So, what? International Journal of Language Studies, 7(1), 1-32.
Brown, H. D., Abeywickrama, P. (2010). Language assessment: Principles and classroom practices. White Plains, NY.
Bull, M. and Yoneda, M. (2012). Designing assessment tools: The principles of language assessment. Humanities and Social Sciences, 60, 41-49.
Canale, M. & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1(1), 1–47. DOI: https://doi.org/10.1093/applin/I.1.1
Cheng, L. (1999). Changing assessment: washback on teacher perceptions and actions. Teaching and Teacher Education. 15, 253–71. DOI: https://doi.org/10.1016/S0742-051X(98)00046-8
Cizek, G. J. (2000). Pockets of resistance in the assessment revolution. Educational Measurement: Issues and Practice, 19(2), 16-23. DOI: https://doi.org/10.1111/j.1745-3992.2000.tb00026.x
Clapham, C. (1996). The development of IELTS: a study of the effect of background knowledge on reading comprehension. Cambridge: University of Cambridge Local Examinations Syndicate/Cambridge University Press.
Cochran, J. L., McCallum, R. S., Bell, S. M. (2010). Three A’s: How Do Attributions, Attitudes, and Aptitude Contribute to Foreign Language Learning? Foreign Language Annals, 43(4), 566–582. DOI: https://doi.org/10.1111/j.1944-9720.2010.01102.x
Coniam, D. (2009). Investigating the quality of teacher-produced tests for EFL students and the effects of training in test development principles and practices on improving test quality. System. 37: 226–242. DOI: https://doi.org/10.1016/j.system.2008.11.008
Coombs, A., DeLuca, C., LaPointe-McEwan, D., & Chalas, A. (2018). Changing approaches to classroom assessment: An empirical study across teacher career stages. Teaching and Teacher Education, 71, 134-144. DOI: https://doi.org/10.1016/j.tate.2017.12.010
Darwesh, A. J. A. (2010). Cloze tests: An integrative approach. Journal of the College of Basic Education, 15(64), 105-116.
Davies, A. (1997). The construction of language tests. London: Oxford University Press.
Douglas, D. (2000). Assessing language for specific purposes: theory and practice. Cambridge: Cambridge University Press.
Ekşi, G. (2010). An assessment of the professional development needs of English language instructors working at a state university. (Unpublished master’s thesis). Middle East Technical University, Ankara.
Fanrong, W., & Bin, S. (2022). Language Assessment Literacy of Teachers. In Frontiers in
Fulcher, G. (2012). Assessment literacy for the language classroom. Language Assessment Quarterly, 9(2), 113-132. DOI: https://doi.org/10.1080/15434303.2011.642041
Galluzzo, G. R. (2005). Performance assessment and renewing teacher education: The possibilities of the NBPTS standards. The Clearing House: A Journal of Educational Strategies, Issues, and Ideas, 78(4), 142-145. DOI: https://doi.org/10.3200/TCHS.78.4.142-145
Ginther, A., Stevens, J. (1998). Language background and ethnicity, and the internal construct validity of the Advanced Placement Spanish Language Examination. In Kunnan, A.J., editor, Validation in language assessment. Mahwah, NJ: Lawrence Erlbaum, 169–94.
Giraldo, F., & Murcia Quintero, D. (2019). Language Assessment Literacy and the Professional Development of Pre-Service Language Teachers. Colombian Applied Linguistics Journal,
Gronlund, N. E., Linn, R. L. (1990). Measurement and Evaluation in Teaching. New York: Macmillan.
Gruba, P. and Corbel, C. (1997). Computer-based testing. In Clapham, C., Corson, D., editors, Language testing and assessment. Dordrecht: Kluwer Academic, 141–49. DOI: https://doi.org/10.1007/978-1-4020-4489-2_14
Gültekin, İ. (2007). The analysis of the perceptions of English language instructors at TOBB University of Economics and Technology regarding inset content. Unpublished master’s thesis. Middle East Technical University, Ankara.
Halliday, M. A. K. (1973). Explorations in the functions of language. London: Edward Arnold.
Heaton, J. B. (1990). Writing English language tests. New York: Longman.
Hoekje, B. & Linnell, K. (1996). Authenticity in language testing: Evaluating language tests
Horwitz, E. (2001). Language anxiety and achievement. Annual review of applied linguistics. 21(1), 112-126. DOI: https://doi.org/10.1017/S0267190501000071
Höl, D. (2023). Standardized testing in Turkey: EFL teachers’ perceptions and experiences on Cambridge Young Learner Exams (YLE). International Journal of Curriculum and Instruction. 15(2), 984-1007.
Hughes, A. (2010). Testing for language teachers. Cambridge: CUP.
Huot, B. (1990). The literature of direct writing assessment: Major concerns and prevailing trends. Review of Educational Research. 60, 237–63. DOI: https://doi.org/10.3102/00346543060002237
Hymes, D. H. (1972). On communicative competence. In J. B. Pride & J. Holmes (Eds.), Sociolinguistics. Harmondsworth, UK: Penguin Books.
Kirschner, M., Wexler, C., Specto, E. (1992). Avoiding obstacles to student comprehension of test questions. TESOL Quarterly. 26: 537-556. DOI: https://doi.org/10.2307/3587177
Kirschner, M., Wexler, C., Specto, E. (1996). A Teacher Education Workshop on the Construction of EFL Tests and Materials. TESOL Quarterly. 30: 85-111. DOI: https://doi.org/10.2307/3587608
Kirkgöz, Y. (2008). Globalization and English Language Policy in Turkey. Educational Policy, 23(5), 663-684. DOI: https://doi.org/10.1177/0895904808316319
Köksal, D., Erten, İ. H., Zehir Topkaya, E., Yavuz, A., Yüksel, G., Aksu, İ. E. A., Şirin, E. (2006). English Course for Young Adults: Campus Life, Nobel Yayın Dağıtım, Ankara.
Krashen, S. D. (1982). Principles and practice in second language acquisition. New York: Pergamon Press.
Lado, R. (1964). Language testing: The construction and use of foreign language tests. New York: McGraw-Hill.
Lan, C., & Fan, S. (2019). Developing classroom-based language assessment literacy for in-service EFL teachers: The gaps. Studies in Educational Evaluation, 61, 112–122.
Lantolf, J., Frawley, W. (1985). Oral proficiency testing: a critical analysis. Modern Language Journal 69, 337–45. DOI: https://doi.org/10.1111/j.1540-4781.1985.tb04801.x
Latif, M. W., & Wasim, A. (2022). Teacher beliefs, personal theories and conceptions of assessment literacy: a tertiary EFL perspective. Language Testing in Asia, 12(1). DOI: https://doi.org/10.1186/s40468-022-00158-5
Lewkowicz, J. (2000). Authenticity in language testing: Some outstanding questions. Language Testing. 17(1), 43‐64. DOI: https://doi.org/10.1177/026553220001700102
Llosa, L. (2011). Standards-based classroom assessments of English proficiency: A review of issues, current developments, and future directions for research. Language Testing. 28 (3), 367-382. DOI: https://doi.org/10.1177/0265532211404188
Lumley, T., McNamara, T. F. (1995). Rater characteristics and rater bias: implications for training. Language Testing. 12(1), 54–71. DOI: https://doi.org/10.1177/026553229501200104
Lynch, B. K., Davidson, F., Henning, G. (1988). Person dimensionality in language test validation. Language Testing. 5(2), 206–19. DOI: https://doi.org/10.1177/026553228800500206
Madsen, H. (1983). Techniques in testing. Oxford: Oxford Pub.
Marzano, R. J. (2000). Transforming classroom grading. Alexandria, VA: Association for Supervision and Curriculum Development.
McNamara, T.F. (1991). Test dimensionality: IRT analysis of an ESP listening test. Language Testing. 8(2), 139–59. DOI: https://doi.org/10.1177/026553229100800204
Mertler, C. A. (2004). Secondary teachers’ assessment literacy: Does classroom experience make a difference? American Secondary Education, 33(1), 49-64. https://www.jstor.org/stable/41064623
Miles, M. B. and Huberman, M. A. (1994). An expanded sourcebook: Qualitative data analysis. Newbury Park, CA: Sage.
O’Loughlin, K. (2013). Developing the assessment literacy of university proficiency test users. Language Testing, 30(3), 363-380. DOI: https://doi.org/10.1177/0265532213480336
Oller, J. W., Jr. (1976). Evidence for a general language proficiency factor. Die Neueren Sprachen, 76, 165–174.
Ölmezer-Öztürk, E., & Aydin, B. (2019). Investigating language assessment knowledge of EFL teachers. Hacettepe Egitim Dergisi, 34(3), 602–620. DOI: https://doi.org/10.16986/HUJE.2018043465
Pill, J. (2016). Drawing on indigenous criteria for more authentic assessment in a specific‐purpose language test: Health professionals interacting with patients. Language Testing, 33(2), 175–193. DOI: https://doi.org/10.1177/0265532215607400
Pollitt, A. (1997). Rasch measurement in latent trait models. In Clapham, C. and Corson, D., editors, Encyclopedia of language and education. Volume 7: Language testing and assessment. Dordrecht: Kluwer Academic, 243–54. DOI: https://doi.org/10.1007/978-1-4020-4489-2_22
Read, J., Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18(1), 1-32. DOI: https://doi.org/10.1191/026553201666879851
Purpura, J. E. (1997). An analysis of the relationships between test takers’ cognitive and metacognitive strategy use and second language test performance. Language Learning. 47, 289–325. DOI: https://doi.org/10.1111/0023-8333.91997009
Richards, J. (2002). 30 Years of TEFL/TESL experience: A personal reflection. RELC Journal, 33:1-35. DOI: https://doi.org/10.1177/003368820203300201
Sahlberg, P. (2006). Education reform for raising economic competitiveness. Journal of Educational Change, 7(4), 259-287. DOI: https://doi.org/10.1007/s10833-005-4884-6
Sariyildiz, G. (2018). Department of Foreign Language Education English Language Teaching A Study Into Language Assessment Literacy Of Preservice English As A Foreign Language Teachers In Turkish Context.
Sasaki, M. (1996). Second language proficiency, foreign language aptitude, and intelligence: quantitative and qualitative analyses. Peter Lang.
Shohamy, E. (2001). The power of tests. London: Longman.
Shohamy, E. (2020). The Power of Tests: A critical perspective on the uses of language tests (1st ed.). Routledge. DOI: https://doi.org/10.4324/9781003062318
Shohamy, E., Donitsa-Schmidt, S., Ferman, I. (1997). Test impact revisited: washback effect over time. Language Testing. 13(3), 298–317. DOI: https://doi.org/10.1177/026553229601300305
Spolsky, B. (1978). Introduction: Linguists and language testers. In Spolsky, B. (ed.) Advances in language testing research: Approaches to language testing. Vol. 2. Washington, DC: Center for Applied Linguistics.
Spolsky, B. (2002). Prospects for the survival of the Navajo language. Anthropology and Education Quarterly, 33(2), 139-162. DOI: https://doi.org/10.1525/aeq.2002.33.2.139
Stansfield, C. W. (2008). Lecture: Where we have been and where we should go. Language Testing. 25(3), 311-326. DOI: https://doi.org/10.1177/0265532208090155
Şentuna, E. (2002). The interests of EFL instructors in Turkey regarding inset content. Unpublished master’s thesis. Bilkent University, Ankara.
Terwilliger, J. (1998). Semantics, psychometrics and assessment reform: a close look at ‘authentic’ assessments. Educational Researcher. 26(8), 24–27. DOI: https://doi.org/10.3102/0013189X026008024
Tomak, B., Karaman, A. C. (2013). Mentoring in a professional development program for novice teachers at a state university in Turkey: a qualitative inquiry. The International Journal of Research in Teacher Education, 4 (2), 1-13.
Ur, P. (1996). A course in language teaching: Practice and theory. Cambridge: CUP.
Vogt, K., Tsagari, D., & Spanoudis, G. (2020). What Do Teachers Think They Want? A Comparative Study of In-Service Language Teachers’ Beliefs on LAL Training Needs. Language Assessment Quarterly, 386–409. DOI: https://doi.org/10.1080/15434303.2020.1781128
Wall, D. (1996). Introducing new tests into traditional systems: insights from general education and from innovation theory. Language Testing. 13(3), 334–54. DOI: https://doi.org/10.1177/026553229601300307
Wall, D., Alderson, J. C. (1993). Examining washback: the Sri Lankan impact study. Language Testing. 10, 41–69. DOI: https://doi.org/10.1177/026553229301000103
Weigle, S. C. (1998). Using FACETS to model rater training effects. Language Testing. 15(2), 263–67. DOI: https://doi.org/10.1177/026553229801500205
Willis, D. (2003). Rules, patterns and words. Cambridge: Cambridge University Press.
Wu, W. M., Stansfield, C. W. (2001). Towards authenticity of task in test development. Language Testing, 18(2), 187-206. DOI: https://doi.org/10.1191/026553201678777077
American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: American Psychological Association.
Yamtim, V., & Wongwanich, S. (2014). A Study of Classroom Assessment Literacy of Primary School Teachers. Procedia - Social and Behavioral Sciences, 116, 2998–3004.
Copyright (c) 2023 Cemile Dogan
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.