State-of-the-Art Review of the Corpus Linguistics Field From the Beginning Until the Development of ChatGPT

Authors

  • Yaser M. Altameemi, University of Ha’il

DOI:

https://doi.org/10.17507/tpls.1402.13

Keywords:

corpus linguistics, ChatGPT, systematic review, natural language processing, corpus linguistics journals

Abstract

The present paper highlights the current state of, and recent developments in, the field of corpus linguistics (CL). Although several reviews of CL have been conducted, they either focused on specific areas, such as education, or did not provide a clear overall picture of the field's future implications (Baker et al., 2008; Biber & Reppen, 2020; Biber et al., 1998; G. N. Leech, 1991; McEnery et al., 2019; McEnery & Hardie, 2012). The author begins by providing an overview that can guide new researchers in the field, as well as postgraduates who require a general historical and thematic map of CL. This overview discusses the publications of scholars who have contributed to the field and the central tools that have been applied in CL. To trace the development of the field in more detail, the author analysed 217 articles published over four years (2019–2022) in the three journals with the highest impact factor according to the Web of Science. The findings reveal rapid practical and methodological development in the field, specifically in investigations of language use in different contexts. The paper thus indicates a strong connection between CL and technological developments such as natural language processing (NLP), and shows how this connection could fill the research gap of utilising CL in other areas of linguistics.

Author Biography

Yaser M. Altameemi, University of Ha’il

Department of English

References

Ali, R., Khan, M. A., Ahmad, I., & Ahmad, Z. (2011). A state-of-the-art review of corpus linguistics journals, 13(1), 1-22.

Alkhalil, A., Abdallah, M. A. E., Alogali, A., & Aljaloud, A. (2021). Applying big data analytics in higher education: A systematic mapping study. International Journal of Information and Communication Technology Education, 17(3), 29–51. https://doi.org/10.4018/IJICTE.20210701.oa3

Altameemi, Y., & Altamimi, M. (2023). Thematic Analysis: A Corpus-Based Method for Understanding Themes/Topics of a Corpus through a Classification Process Using Long Short-Term Memory (LSTM). Applied Sciences, 13(5), 3308, 1-12.

Anthony, L., & Young-Scholten, M. (2014). The benefits and limitations of using corpus linguistic techniques in authorship attribution studies. Journal of English Linguistics, 42(2), 119–141.

Baker, P. (2006a). Glossary of corpus linguistics. Edinburgh University Press.

Baker, P. (2006b). Using corpora in discourse analysis. Bloomsbury Academic.

Baker, P., & McEnery, T. (2005). A corpus-based approach to discourses of refugees and asylum seekers in UN and newspaper texts. Journal of Language and Politics, 4(2), 197–226.

Baker, P., Gabrielatos, C., KhosraviNik, M., Krzyżanowski, M., McEnery, T., & Wodak, R. (2008). A useful methodological synergy? Combining critical discourse analysis and corpus linguistics to examine discourses of refugees and asylum seekers in the UK press. Discourse & Society, 19(3), 273–306. https://doi.org/10.1177/0957926508088962

Baker, P., & Egbert, J. (2018). Introduction: The emergence of corpus pragmatics. Journal of Corpus Pragmatics, 2(1), 1-6.

Biber, D. (1993a). Representativeness in corpus design. Literary and Linguistic Computing, 8(4), 243–257.

Biber, D., & Reppen, R. (2020). The Cambridge handbook of English corpus linguistics. Cambridge University Press.

Biber, D. (1991). Variation across speech and writing. Cambridge University Press.

Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating language structure and use. Cambridge University Press.

Brand, S., & Ernestus, M. (2021). Reduction of word-final obstruent-liquid-schwa clusters in Parisian French. Corpus Linguistics and Linguistic Theory, 17(1), 249-285.

Brezina, V., & Flowerdew, L. (Eds.) (2019). Learner corpus research: New perspectives and applications. Bloomsbury Publishing.

Brezina, V., McEnery, T., & Wattam, S. (2015). Collocations in context: A new perspective on collocation networks. International Journal of Corpus Linguistics, 20(2), 139–173. https://doi.org/10.1075/ijcl.20.2.01bre

Brezina, V., & Meyerhoff, M. (2019). Corpora and discourse: Integrating pragmatics into linguistic analysis with computer-assisted methods. Journal of Pragmatics, 145, 1–7.

Conrad, S. (2005). Corpus linguistics and L2 teaching. In Handbook of research in second language teaching and learning (pp. 393–409). Routledge.

Davies, M. (2010). The Corpus of Contemporary American English as the first reliable monitor corpus of English. Literary and Linguistic Computing, 25(4), 447–464.

Davies, M. (2012). Expanding horizons in historical linguistics with the 400-million word Corpus of Historical American English. Corpora, 7(2), 121–157.

Egbert, J., Larsson, T., & Biber, D. (2020). Doing linguistics with a corpus: Methodological considerations for the everyday user. Cambridge University Press.

Gablasova, D., Brezina, V., & McEnery, T. (2017). Collocations in corpus-based language learning research: Identifying, comparing, and interpreting the evidence. Language Learning, 67(S1), 155–179. https://doi.org/10.1111/lang.12225

Golinkoff, R. M., & Hirsh-Pasek, K. (2006). Baby wordsmith: From associationist to social sophisticate. Current Directions in Psychological Science, 15(1), 30–33.

Graham, S., & Milligan, I. (2015). Getting started with text mining using Voyant Tools. In The historian’s macroscope: Big digital history (pp. 293–308). Imperial College Press.

Gries, S. T. (2003). Multifactorial analysis in corpus linguistics: A study of particle placement. A&C Black.

Gries, S. T. (2016). Quantitative corpus linguistics with R: A practical introduction. Taylor & Francis.

Gries, S. Th. (2013). 50-something years of work on collocations: What is or should be next.... International Journal of Corpus Linguistics, 18(1), 137–166. https://doi.org/10.1075/ijcl.18.1.09gri

Groom, C., & Charles, M. (2018). Corpus pragmatics: A decade of research. Journal of Pragmatics, 130, 1–9.

Groom, N. (2019). Construction grammar and the corpus-based analysis of discourses: The case of the way-in-which construction. International Journal of Corpus Linguistics, 24(3), 291–323.

Halliday, M. A. K. (1985a). An introduction to functional grammar (Second, 19). Arnold.

Halliday, M. A. K. (1985b). An introduction to functional grammar (Second, 19). Arnold.

Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English. Longman.

Halliday, M. A. K., & Hasan, R. (1989). Language, context, and text: Aspects of language in a social-semiotic perspective (2nd ed.). Oxford University Press.

Hanks, E., & Egbert, J. (2022). The interplay of laughter and communicative purpose in conversational discourse: A corpus-based study of British English. Corpus Pragmatics, 6(4), 261-290.

Larsson, T., Plonsky, L., & Hancock, G. R. (2020). On the benefits of structural equation modelling for corpus linguists. Corpus Linguistics and Linguistic Theory, 17(3), 683–714.

Hundt, M., Röthlisberger, M., & Seoane, E. (2021). Predicting voice alternation across academic Englishes. Corpus Linguistics and Linguistic Theory, 17(1), 189-222.

Sanders, T. J., Demberg, V., Hoek, J., Scholman, M. C., Asr, F. T., Zufferey, S., & Evers-Vermeul, J. (2021). Unifying dimensions in coherence relations: How various annotation frameworks are related. Corpus Linguistics and Linguistic Theory, 17(1), 1-71.

Schneider, K. P. (2022). Referring to Speech Acts in Communication: Exploring Meta-Illocutionary Expressions in ICE-Ireland. Corpus Pragmatics, 6(2), 155-174.

Leech, G. (2009). Change in contemporary English: A grammatical study. Cambridge University Press.

Leech, G. N. (1991). The state of the art in corpus linguistics. In K. Aijmer & B. Altenberg (Eds.), English corpus linguistics: Studies in honour of Jan Svartvik (pp. 8–29). Longman.

Leech, G., & Rayson, P. (2014). Word frequencies in written and spoken English: Based on the British National Corpus. Routledge.

Ma, Q., & Mei, F. (2021). Review of corpus tools for vocabulary teaching and learning. Journal of China Computer-Assisted Language Learning, 1(1), 177–190. https://doi.org/10.1515/jccall-2021-2008

McEnery, T. (2019). Corpus linguistics. Edinburgh University Press.

McEnery, T., Brezina, V., & Baker, H. (2019). Usage fluctuation analysis: A new way of analysing shifts in historical discourse. International Journal of Corpus Linguistics, 24(4), 413–444. https://doi.org/10.1075/IJCL.18096.MCE

McEnery, T., Brezina, V., Gablasova, D., & Banerjee, J. (2019). Corpus linguistics, learner corpora, and SLA: Employing technology to analyse language use. Annual Review of Applied Linguistics, 39, 74–92. https://doi.org/10.1017/S0267190519000096

McEnery, T., & Hardie, A. (2012). Corpus linguistics: Method, theory, and practice. Cambridge University Press.

McEnery, T., Xiao, R., & Tono, Y. (2006). Corpus-based language studies: An advanced resource book. Taylor & Francis.

Nartey, M., & Mwinlaaru, I. N. (2019). Towards a decade of synergising corpus linguistics and critical discourse analysis: A meta-analysis. Corpora, 14(2), 203–235. https://doi.org/10.3366/cor.2019.0169

Nurdiyani, N., & Nadra, N. (2021). Review of corpus linguistics for education: A guide for research. Corpus Pragmatics, 5(4), 543–547. https://doi.org/10.1007/s41701-021-00111-6

Reppen, R. (2001). Review of MonoConc Pro and WordSmith Tools. Language Learning & Technology, 5(3), 32-36.

Scott, M. (2001). Comparing corpora and identifying key words, collocations, and frequency distributions through the WordSmith Tools suite of computer programs. Small Corpus Studies and ELT, 47–67.

Smith, E. L. (2021). AntConc (Version 3.5.8)/WordSmith Tools (Version 8). Early Modern Digital Review, 4(1), 200–214.

Mehl, S. (2021). What we talk about when we talk about corpus frequency: The example of polysemous verbs with light and concrete senses. Corpus Linguistics and Linguistic Theory, 17(1), 223-247.

Sinclair, J. (1991). Corpus, concordance, collocation. Oxford University Press.

Sinclair, J. McH., & Coulthard, R. M. (1975). Towards an analysis of discourse: The English used by teachers and pupils. Oxford University Press.

Terras, M. (2017). Text mining with Voyant tools: A review. Journal of Digital Humanities, 6(1), 1–10.

Yaylali, A. (2020). Brezina, V., & Flowerdew, L. (Eds.). (2019). Learner Corpus Research: New Perspectives and Applications. Bloomsbury Publishing.

Published

2024-02-01

Section

Articles