Mapping Teacher Produced Tests to a Usefulness Model

Cemile Dogan

doi:10.52380/ijcer.2023.10.3.456

Authors

Cemile Dogan, Necmettin Erbakan University, https://orcid.org/0000-0002-5246-6692

Keywords:

ELT, Test Usefulness, Measurement and Evaluation, Teacher Produced Tests

Abstract

Tests are designed as an integral part of the teaching process and necessarily involve stakeholders at every stage, from the onset of preparations through grade allocation and test administration to the interpretation of the results. The process commences with selecting the content to evaluate, deciding upon the skills to be tested, and aligning both with course objectives (Giraldo & Murcia Quintero, 2019; O’Loughlin, 2013; Vogt & Tsagari, 2014). Several questions arise about how to standardize the test development process and how to evaluate the usefulness of the resulting tests. Typical questions include: What is the best test for our context? What does this test actually test? What relevant information does the test provide? How does this test affect teaching and learning behavior? In what ways is the test useful? Although each language program’s particular needs may differ, the answers to these questions provide a basis for institutional decisions. None of the answers is set in stone, and at their root is the critical role testing plays in facilitating what gets learned. The current study set out to develop and analyze an achievement test specifically designed for a compulsory A1-level English course delivered to all freshman students enrolled in Turkish-medium departments at state universities across Türkiye. A total of 150 students enrolled in several undergraduate programs at the Faculty of Education of a state university constituted the universe of the study. The researcher analyzed the test after administration and mapped its qualities onto a test usefulness model, aiming to address the research gap regarding quality teacher-produced tests.

References

Abraham, R. G., Chapelle, C. A. (1992). The meaning of cloze test scores: An item difficulty perspective. Modern Language Journal, 76(4), 468–479. DOI: https://doi.org/10.1111/j.1540-4781.1992.tb05394.x

Adams, R. J., Griffin, P. E., Martin, L. (1987). A latent trait method for measuring a dimension in second language proficiency. Language Testing. 4(1), 9–28. DOI: https://doi.org/10.1177/026553228700400102

Ahmad, S., Rao. (2012). A Review of the Pedagogical Implications of Examination Washback. Research on Humanities and Social Sciences, 2(7), 11-20.

Alan, B. (2003). Novice teachers’ perceptions of an in-service teacher training course at Anadolu University. Unpublished master’s thesis. Bilkent University, Ankara.

Alderson, J. C., Hamp-Lyons, L. (1996). TOEFL preparation courses: a study of washback. Language Testing. 13(3), 280–97. DOI: https://doi.org/10.1177/026553229601300304

Aschbacher, P. R. (1991). Performance assessment: state of activity, interest, and concerns. Applied Measurement in Education. 4, 275–88. DOI: https://doi.org/10.1207/s15324818ame0404_2

Atay, D. (2008). Teacher research for professional development. ELT Journal, 62(2), 139-147. DOI: https://doi.org/10.1093/elt/ccl053

Bachman, L. F. (1991). ‘What Does Language Testing Have to Offer?’ TESOL Quarterly. 25: 671-704. DOI: https://doi.org/10.2307/3587082

Bachman, L. F., Lynch, B. K. & Mason, M. (1995). Investigating variability in tasks and rater judgments in a performance test of foreign language speaking. Language Testing 12, 238-257. DOI: https://doi.org/10.1177/026553229501200206

Bachman, L. F. & Palmer, A.S. (1996). Language testing in practice. Oxford: Oxford University Press.

Bachman, L.F. & Eignor, D.R. (1997). Recent advances in quantitative test analysis. In Clapham, C. and Corson, D., editors, Encyclopedia of language and education. Volume 7: Language testing and assessment. Dordrecht: Kluwer Academic, 227–42. DOI: https://doi.org/10.1007/978-1-4020-4489-2_21

Bachman, L.F. (1997). Generalizability theory. In Clapham, C. and Corson, D., editors, Encyclopedia of language and education. Volume 7: Language testing and assessment. Dordrecht: Kluwer Academic Publishers, 255–62. DOI: https://doi.org/10.1007/978-1-4020-4489-2_23

Bachman, L. F. (2000). ‘Modern language testing at the turn of the century: Assuring that what we count counts.’ Language Testing. 17:1-42. DOI: https://doi.org/10.1177/026553220001700101

Bailey, K.M. (1996). Working for washback: a review of the washback concept in language testing. Language Testing. 13(3), 257–79. DOI: https://doi.org/10.1177/026553229601300303

Balbay, S., Pamuk, I., Temir, T., & Doğan, C. (2018). Issues in pre-service and in-service teacher training programs for university English instructors in Turkey. Journal of Language and Linguistic Studies, 14(2), 48-60.

Ballıdağ, S. (2020). Exploring the Language Assessment Literacy of Turkish In-service EFL Teachers. https://doi.org/10.06.2020/

Ballıdağ, S., & Inan Karagül, B. (2021). Exploring The Language Assessment Literacy of Turkish In-service EFL Teachers. Balıkesir Üniversitesi Sosyal Bilimler Enstitüsü Dergisi. DOI: https://doi.org/10.31795/baunsobed.909953

Banerjee, J., Luoma, S. (1997). Qualitative approaches to test validation. In Clapham, C. and Corson, D., editors, Encyclopedia of language and education. Volume 7: Language testing and assessment. Dordrecht: Kluwer Academic, 275–87. DOI: https://doi.org/10.1007/978-1-4020-4489-2_25

Bell, R. T. (1981). An introduction to applied linguistics. London: Batsford Academic Ltd.

Bolt, R.F. (1992). Cross Validation of item response curve models using TOEFL data. Language Testing. 9, 79–95. DOI: https://doi.org/10.1177/026553229200900106

Bonner, S. M., Torres Rivera, C., & Chen, P. P. (2018). Standards and assessment: Coherence from the teacher’s perspective. Educational Assessment, Evaluation and Accountability, 30, 71-92. DOI: https://doi.org/10.1007/s11092-017-9272-2

Brown, A. (1995). The effect of rater variables in the development of an occupation-specific language performance test. Language Testing. 12(1), 1–15. DOI: https://doi.org/10.1177/026553229501200101

Brown, H. D. (2004). Language assessment: Principles and classroom practices. New York: Longman.

Brown, J. D. (2013). My twenty-five years of cloze testing research: So, what? International Journal of Language Studies, 7(1), 1-32.

Brown, H. D., Abeywickrama, P. (2010). Language Assessment. Principles and Classroom Practices [2003]. White Plains, NY.

Bull, M. and Yoneda, M. (2012). Designing assessment tools: The principles of language assessment. Humanities and Social Sciences, 60, 41–49.

Canale, M. & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1(1), 1–47. DOI: https://doi.org/10.1093/applin/I.1.1

Cheng, L. (1999). Changing assessment: washback on teacher perceptions and actions. Teaching and Teacher Education. 15, 253–71. DOI: https://doi.org/10.1016/S0742-051X(98)00046-8

Cizek, G. J. (2000). Pockets of resistance in the assessment revolution. Educational Measurement: Issues and Practice, 19(2), 16-23. DOI: https://doi.org/10.1111/j.1745-3992.2000.tb00026.x

Clapham, C. (1996). The development of IELTS: a study of the effect of background knowledge on reading comprehension. Cambridge: University of Cambridge Local Examinations Syndicate/Cambridge University Press.

Cochran, J. L., McCallum, R. S., Bell, S. M. (2010). Three A’s: How Do Attributions, Attitudes, and Aptitude Contribute to Foreign Language Learning? Foreign Language Annals, 43(4), 566–582. DOI: https://doi.org/10.1111/j.1944-9720.2010.01102.x

Coniam, D. (2009). Investigating the quality of teacher-produced tests for EFL students and the effects of training in test development principles and practices on improving test quality. System. 37: 226–242. DOI: https://doi.org/10.1016/j.system.2008.11.008

Coombs, A., DeLuca, C., LaPointe-McEwan, D., & Chalas, A. (2018). Changing approaches to classroom assessment: An empirical study across teacher career stages. Teaching and Teacher Education, 71, 134-144. DOI: https://doi.org/10.1016/j.tate.2017.12.010

Darwesh, A. J. A. (2010). Cloze tests: An integrative approach. Journal of the College of Basic Education, 15(64), 105-116.

Davies, A. (1997). The construction of language tests. London: Oxford University Press.

Douglas, D. (2000). Assessing language for specific purposes: theory and practice. Cambridge: Cambridge University Press.

Ekşi, G. (2010). An assessment of the professional development needs of English language instructors working at a state university. (Unpublished master’s thesis). Middle East Technical University, Ankara.

Fanrong, W., & Bin, S. (2022). Language Assessment Literacy of Teachers. In Frontiers in Psychology (Vol. 13). Frontiers Media S.A. DOI: https://doi.org/10.3389/fpsyg.2022.864582

Fulcher, G. (2012). Assessment literacy for the language classroom. Language Assessment Quarterly, 9(2), 113-132. DOI: https://doi.org/10.1080/15434303.2011.642041

Galluzzo, G. R. (2005). Performance assessment and renewing teacher education: The possibilities of the NBPTS standards. The Clearing House: A Journal of Educational Strategies, Issues, and Ideas, 78(4), 142-145. DOI: https://doi.org/10.3200/TCHS.78.4.142-145

Ginther, A., Stevens, J. (1998). Language background and ethnicity, and the internal construct validity of the Advanced Placement Spanish Language Examination. In Kunnan, A.J., editor, Validation in language assessment. Mahwah, NJ: Lawrence Erlbaum, 169–94.

Giraldo, F., & Murcia Quintero, D. (2019). Language Assessment Literacy and the Professional Development of Pre-Service Language Teachers. Colombian Applied Linguistics Journal, (2), 243–259. DOI: https://doi.org/10.14483/22487085.14514

Gronlund, N. E., Linn, R. L. (1990). Measurement and Evaluation in Teaching. New York: Macmillan.

Gruba, P. and Corbel, C. (1997). Computer-based testing. In Clapham, C. and Corson, D., editors, Encyclopedia of language and education. Volume 7: Language testing and assessment. Dordrecht: Kluwer Academic, 141–49. DOI: https://doi.org/10.1007/978-1-4020-4489-2_14

Gültekin, İ. (2007). The analysis of the perceptions of English language instructors at TOBB University of Economics and Technology regarding inset content. Unpublished master’s thesis. Middle East Technical University, Ankara.

Halliday, M. A. K. (1973). Explorations in the functions of language. London: Edward Arnold.

Heaton, J. B. (1990). Writing English language tests. New York: Longman.

Hoekje, B. & Linnell, K. (1996). Authenticity in language testing: Evaluating language tests for international teaching assistants. TESOL Quarterly, 28(1), 103-126. DOI: https://doi.org/10.2307/3587201

Horwitz, E. (2001). Language anxiety and achievement. Annual review of applied linguistics. 21(1), 112-126. DOI: https://doi.org/10.1017/S0267190501000071

Höl, D. (2023). Standardized testing in Turkey: EFL teachers’ perceptions and experiences on Cambridge Young Learner Exams (YLE). International Journal of Curriculum and Instruction. 15(2), 984-1007.

Hughes, A. (2010). Testing for language teachers. Cambridge: CUP.

Huot, B. (1990). The literature of direct writing assessment: Major concerns and prevailing trends. Review of Educational Research. 60, 237–63. DOI: https://doi.org/10.3102/00346543060002237

Hymes, D. H. (1972). On communicative competence. In J. B. Pride & J. Holmes (Eds.), Sociolinguistics. Harmondsworth, UK: Penguin Books.

Kirschner, M., Wexler, C., Specto, E. (1992). Avoiding obstacles to student comprehension of test questions. TESOL Quarterly. 26: 537-556. DOI: https://doi.org/10.2307/3587177

Kirschner, M., Wexler, C., Specto, E. (1996). A Teacher Education Workshop on the Construction of EFL Tests and Materials. TESOL Quarterly. 30: 85-111. DOI: https://doi.org/10.2307/3587608

Kirkgöz, Y. (2008). Globalization and English Language Policy in Turkey. Educational Policy, 23(5), 663-684. DOI: https://doi.org/10.1177/0895904808316319

Köksal, D., Erten, İ. H., Zehir Topkaya, E., Yavuz, A., Yüksel, G., Aksu, İ. E. A., Şirin, E. (2006). English Course for Young Adults: Campus Life, Nobel Yayın Dağıtım, Ankara.

Krashen, S. D. (1982). Principles and practice in second language acquisition. New York: Pergamon Press.

Lado, R. (1964). Language testing: The construction and use of foreign language tests. New York: McGraw-Hill.

Lan, C., & Fan, S. (2019). Developing classroom-based language assessment literacy for in-service EFL teachers: The gaps. Studies in Educational Evaluation, 61, 112–122. DOI: https://doi.org/10.1016/j.stueduc.2019.03.003

Lantolf, J., Frawley, W. (1985). Oral proficiency testing: a critical analysis. Modern Language Journal 69, 337–45. DOI: https://doi.org/10.1111/j.1540-4781.1985.tb04801.x

Latif, M. W., & Wasim, A. (2022). Teacher beliefs, personal theories and conceptions of assessment literacy: a tertiary EFL perspective. Language Testing in Asia, 12(1). DOI: https://doi.org/10.1186/s40468-022-00158-5

Lewkowicz, J. (2000). Authenticity in language testing: Some outstanding questions. Language Testing. 17(1), 43‐64. DOI: https://doi.org/10.1177/026553220001700102

Llosa, L. (2011). Standards-based classroom assessments of English proficiency: A review of issues, current developments, and future directions for research. Language Testing. 28(3), 367-382. DOI: https://doi.org/10.1177/0265532211404188

Lumley, T., McNamara, T. F. (1995). Rater characteristics and rater bias: implications for training. Language Testing. 12(1), 54–71. DOI: https://doi.org/10.1177/026553229501200104

Luoma, S. (2001). Test review. Language Testing, 18(2), 225–23. DOI: https://doi.org/10.1177/026553220101800207

Lynch, B. K., Davidson, F., Henning, G. (1988). Person dimensionality in language test validation. Language Testing. 5(2), 206–19. DOI: https://doi.org/10.1177/026553228800500206

Madsen, H. (1983). Techniques in testing. Oxford: Oxford Pub.

Marzano, R. J. (2000). Transforming classroom grading. Alexandria, VA: Association for Supervision and Curriculum Development.

McNamara, T.F. (1991). Test dimensionality: IRT analysis of an ESP listening test. Language Testing. 8(2), 139–59. DOI: https://doi.org/10.1177/026553229100800204

Mertler, C. A. (2004). Secondary teachers’ assessment literacy: Does classroom experience make a difference? American Secondary Education, 33(1), 49-64. https://www.jstor.org/stable/41064623

Miles, M. B. and Huberman, M. A. (1994). An expanded sourcebook: Qualitative data analysis. Newbury Park, CA: Sage.

O’Loughlin, K. (2013). Developing the assessment literacy of university proficiency test users. Language Testing, 30(3), 363-380. DOI: https://doi.org/10.1177/0265532213480336

Oller, J. W., Jr. (1976). Evidence for a general language proficiency factor. Die Neueren Sprachen, 76, 165–174.

Ölmezer-Öztürk, E., & Aydin, B. (2019). Investigating language assessment knowledge of EFL teachers. Hacettepe Egitim Dergisi, 34(3), 602–620. DOI: https://doi.org/10.16986/HUJE.2018043465

Pill, J. (2016). Drawing on indigenous criteria for more authentic assessment in a specific‐purpose language test: Health professionals interacting with patients. Language Testing, 33(2), 175–193. DOI: https://doi.org/10.1177/0265532215607400

Pollitt, A. (1997). Rasch measurement in latent trait models. In Clapham, C. and Corson, D., editors, Encyclopedia of language and education. Volume 7: Language testing and assessment. Dordrecht: Kluwer Academic, 243–54. DOI: https://doi.org/10.1007/978-1-4020-4489-2_22

Read, J., Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18(1), 1-32. DOI: https://doi.org/10.1191/026553201666879851

Purpura, J. E. (1997). An analysis of the relationships between test takers’ cognitive and metacognitive strategy use and second language test performance. Language Learning. 47, 289–325. DOI: https://doi.org/10.1111/0023-8333.91997009

Richards, J. (2002). 30 Years of TEFL/TESL experience: A personal reflection. RELC Journal, 33:1-35. DOI: https://doi.org/10.1177/003368820203300201

Sahlberg, P. (2006). Education reform for raising economic competitiveness. Journal of Educational Change, 7(4), 259-287. DOI: https://doi.org/10.1007/s10833-005-4884-6

Sariyildiz, G. (2018). Department of Foreign Language Education English Language Teaching A Study Into Language Assessment Literacy Of Preservice English As A Foreign Language Teachers In Turkish Context.

Sasaki, M. (1996). Second language proficiency, foreign language aptitude, and intelligence: quantitative and qualitative analyses. Peter Lang.

Shohamy, E. (2001). The power of tests. London: Longman.

Shohamy, E. (2020). The Power of Tests: A critical perspective on the uses of language tests (1st ed.). Routledge. DOI: https://doi.org/10.4324/9781003062318

Shohamy, E., Donitsa-Schmidt, S., Ferman, I. (1997). Test impact revisited: washback effect over time. Language Testing 13(3), 298–317. DOI: https://doi.org/10.1177/026553229601300305

Spolsky, B. (1978). Introduction: Linguists and language testers. In Spolsky, B. (ed.) Advances in language testing research: Approaches to language testing. Vol. 2. Washington, DC: Center for Applied Linguistics.

Spolsky, B. (2002). Prospects for the survival of the Navajo language. Anthropology and Education Quarterly, 33(2), 139-162. DOI: https://doi.org/10.1525/aeq.2002.33.2.139

Stansfield, C. W. (2008). Lecture: Where we have been and where we should go. Language Testing. 25(3), 311-326. DOI: https://doi.org/10.1177/0265532208090155

Şentuna, E. (2002). The interests of EFL instructors in Turkey regarding inset content. Unpublished master’s thesis. Bilkent University, Ankara.

Terwilliger, J. (1998). Semantics, psychometrics and assessment reform: a close look at ‘authentic’ assessments. Educational Researcher. 26(8), 24–27. DOI: https://doi.org/10.3102/0013189X026008024

Tomak, B., Karaman, A. C. (2013). Mentoring in a professional development program for novice teachers at a state university in Turkey: a qualitative inquiry. The International Journal of Research in Teacher Education, 4(2), 1-13.

Ur, P. (1996). A course in language teaching: Practice and theory. Cambridge: CUP.

Vogt, K., Tsagari, D., & Spanoudis, G. (2020). What Do Teachers Think They Want? A Comparative Study of In-Service Language Teachers’ Beliefs on LAL Training Needs. Language Assessment Quarterly, 386–409. DOI: https://doi.org/10.1080/15434303.2020.1781128

Wall, D. (1996). Introducing new tests into traditional systems: insights from general education and from innovation theory. Language Testing. 13(3), 334–54. DOI: https://doi.org/10.1177/026553229601300307

Wall, D., Alderson, J. C. (1993). Examining washback: the Sri Lankan impact study. Language Testing. 10, 41–69. DOI: https://doi.org/10.1177/026553229301000103

Weigle, S. C. (1998). Using FACETS to model rater training effects. Language Testing. 15(2), 263–67. DOI: https://doi.org/10.1177/026553229801500205

Willis, D. (2003). Rules, patterns and words. Cambridge: Cambridge University Press.

Wu, W. M., Stansfield, C. W. (2001). Towards authenticity of task in test development. Language Testing, 18(2), 187-206. DOI: https://doi.org/10.1191/026553201678777077

American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: American Psychological Association.

Yamtim, V., & Wongwanich, S. (2014). A Study of Classroom Assessment Literacy of Primary School Teachers. Procedia - Social and Behavioral Sciences, 116, 2998–3004. DOI: https://doi.org/10.1016/j.sbspro.2014.01.696