Professor Dawn Knight
BA, MA, PhD (Nottingham), FLSW
Professor
School of English, Communication and Philosophy
- Available for postgraduate supervision
Overview
I am a member of the Centre for Language and Communication Research, and have been employed by Cardiff University since 2015. I have been involved, as Principal Investigator (PI) or Co-Investigator (CI), in a range of externally funded research projects (with circa £3.6m of external funding obtained to date). Recent projects (i.e. 2021+) include the following:
- 2022-23: CI, Welsh Government funded ‘ThACC – Thesawrws Ar-lein Cymraeg Cyfoes - Using Word Embeddings to Create a Thesaurus of Contemporary Welsh’ project. Working with colleagues from the Schools of Welsh and Computer Science at Cardiff University and Lancaster University respectively, this project developed an open-access, freely available online thesaurus of the Welsh language, for Welsh speakers and learners alike. We received £90,000 for this project. For more information on this project, see here.
- 2022-23: PI, AHRC funded ‘FreeTxt: supporting bilingual free-text survey and questionnaire data analysis’. Working with colleagues from Lancaster University, and co-designed and co-constructed with partners Cadw and National Trust Wales, this project created an innovative open-source online free-text analysis tool that enables the quick and easy analysis of English and Welsh language data. We received £100,000 for this project. For more information on this project, see here.
- 2022-23: CI, AHRC funded ‘Wild Swimming and Blue Spaces: Mobilising interdisciplinary knowledge and partnerships to combat health inequalities at scale’ project (with Adolphs, Nottingham, as PI). This project aims to develop a new mixed methods approach, drawing on corpus linguistics and narrative analysis, to create effective public health messaging (with a focus on the benefits of wild swimming) that includes content from a range of academic disciplines. Ultimately this project will benefit the many individuals and diverse communities who will be enabled to enjoy wild swimming in a safe way to improve health, and to gain an increased awareness of the nature of blue spaces and their role as a community asset. We received £178,000 for this project. Visit the project website here.
- 2021-24: Co-PI (with Anne O’Keeffe, Mary Immaculate College), AHRC/IRC funded ‘Interactional variation online: harnessing emerging technologies in the digital humanities to analyse online discourse in different workplace contexts’ project. Working with colleagues from Mary Immaculate College, Swansea University, The University of Nottingham, University College Dublin, and University of Aberdeen, the project first aimed to examine virtual workplace communication to gain depth of insight into the potential barriers to effective communication. Our second aim was to propose the next generation of frameworks for analysing online discourse and to make these frameworks available to all arts and humanities research and end-user communities. We received £390,000 from AHRC + €270,000 [circa £620,700] from IRC for this project. Visit the project website here.
From 2016 to 2020, I was also PI on the ‘CorCenCC: Corpws Cenedlaethol Cymraeg Cyfoes (The National Corpus of Contemporary Welsh): A community driven approach to linguistic corpus construction’ project. Funded by the ESRC (Economic and Social Research Council) and AHRC (Arts and Humanities Research Council), this £1.8 million inter-disciplinary and multi-institutional project led to the creation of a large-scale, open-source corpus of contemporary Welsh language. Full details of project outputs, including links to the corpus query interface, full corpus dataset, project report, Y Tiwtiadur pedagogic toolkit, CyTag part-of-speech tagger/tag-set and CySemTag semantic tagger/tag-set, can be found on the CorCenCC project website and via the CorCenCC GitHub page.
Details of my other research activities, and previously funded projects, can be found on the 'research' tab of this page.
Regarding external and professional leadership roles, I was Chair of BAAL (British Association for Applied Linguistics) from 2018-2021. BAAL is a learned society with over 1,300 members internationally, making it the most influential forum for academics and professionals interested in language and applied linguistics within the UK and beyond. For further information see: www.baal.org.uk
I am currently a member of the Economic and Social Research Council’s (ESRC) Strategic Advisory Network (SAN) (2021-2024). The SAN comprises leading experts from the academic and user communities. It helps the ESRC exploit opportunities and access the voice and expertise of its communities. For further details of the SAN, see here. I am also an academic lead of the AHRC (Arts and Humanities Research Council) Peer Review College (2022-2025) and ESRC Peer Review College (2024+), the strategic lead for the ESRC IAA at Cardiff University (2023-2026), and am currently the Director of Research Funding for ENCAP.
I am a Fellow of the Learned Society of Wales (FLSW, 2023+).
Publication
2024
- Knight, D., Khallaf, N., Rayson, P., El-Haj, M., Ezeani, I. and Morris, S. 2024. FreeTxt: A corpus-based bilingual free-text survey and questionnaire data analysis toolkit. Applied Corpus Linguistics 4(3), article number: 100103. (10.1016/j.acorp.2024.100103)
- Vilar Lluch, S., McClaughlin, E., Adolphs, S., Knight, D. and Nichele, E. 2024. The effects of modal value and imperative mood on self-predicted compliance to health guidance: The case of COVID-19. Text & Talk (10.1515/text-2023-0125)
- Knight, D. et al. 2024. Indicating engagement in online workplace meetings: The role of backchannelling head nods. International Journal of Corpus Linguistics (IJCL) (10.1075/ijcl.24060.kni)
- Morris, J., Arfon, E., Khallaf, N., El-Haj, M. and Knight, D. 2024. Datblygu thesawrws y Gymraeg drwy dechnoleg. [Online]. Gwerddon Fach: Golwg Ltd. Available at: https://golwg.360.cymru/gwerddon/2143591-datblygu-thesawrws-gymraeg-drwy-dechnoleg
- Fitzgerald, C. et al. 2024. Multi-modal considerations for social media discourse analysis: A specialised corpus of Twitter commentary on working from home. In: Coats, S. and Laippala, V. eds. Linguistics across Disciplinary Borders - The March of Data. London: Bloomsbury, pp. 187-212.
- Adolphs, S., Chen, Y. and Knight, D. 2024. Towards a speech-gesture profile of discourse markers: The case of "I mean". Lingua
- Arfon, E., Morris, J., Khallaf, N. and Knight, D. 2024. Developing the Welsh thesaurus through technology. [Online]. Golwg 360 Cymru - Gwerddon Fach: Golwg Ltd. Available at: https://golwg.360.cymru/gwerddon/2143591-datblygu-thesawrws-gymraeg-drwy-dechnoleg
- O'Keeffe, A. et al. 2024. “We’ve lost you Ian”: Multi-modal corpus innovations in capturing, processing and analysing professional online spoken interactions. Research in Corpus Linguistics 12(2), pp. 1-23. (10.32714/ricl.12.02.02)
2023
- Vilar Lluch, S., McClaughlin, E., Knight, D., Adolphs, S. and Nichele, E. 2023. The language of vaccination campaigns during COVID-19. Medical Humanities 49(3), pp. 487-496. (10.1136/medhum-2022-012583)
- Knight, D., Fitzpatrick, T., Morris, S., Tovey-Walsh, B., Prosser, H. and Davies, E. 2023. Corpus to curriculum: Developing word lists for adult learners of Welsh. Applied Corpus Linguistics 3(2), article number: 100052. (10.1016/j.acorp.2023.100052)
- Adolphs, S. et al. 2023. Communicating health threats: Linguistic evidence for effective public health messaging during the Covid-19 pandemic. University of Nottingham.
- Khallaf, N. et al. 2023. Open-source thesaurus development for under-resourced languages: a Welsh case study. Presented at: LDK 2023 – 4th Conference on Language, Data and Knowledge, Vienna, Austria, 12-15 September 2023.
2022
- McClaughlin, E. et al. 2022. The reception of public health messages during the COVID-19 pandemic. Applied Corpus Linguistics 3(1), article number: 100037. (10.1016/j.acorp.2022.100037)
- Morris, J., Ezeani, I., Gruffydd, I., Young, K., Davies, L., El-Haj, M. and Knight, D. 2022. Welsh automatic text summarisation. Presented at: Wales Academic Symposium on Language Technologies 2022, Bangor, Wales, 28 January 2022. In: Language and Technology in Wales, Vol. 2. Bangor: Canolfan Bedwyr.
- Clos, J., McClaughlin, E., Barnard, P., Nichele, E., Knight, D., McAuley, D. and Adolphs, S. 2022. PriPA: a tool for privacy-preserving analytics of linguistic data. Presented at: Legal and Ethical Issues in Human Language Technologies 2022, Marseille, France, 24 June 2022.
- El-Haj, M., Ezeani, I., Morris, J. and Knight, D. 2022. Creation of an evaluation corpus and baseline evaluation scores for Welsh text summarisation. Presented at: 4th Celtic Language Technology Workshop (CLTW 2022), Marseille, France, 20 June 2022.
- Ezeani, I., El-Haj, M., Morris, J. and Knight, D. 2022. Introducing the Welsh text summarisation dataset and baseline systems. Presented at: 13th ELRA Language Resources and Evaluation Conference (LREC 2022), Marseille, France, 20-25 June 2022.
2021
- McClaughlin, E. et al. 2021. Privacy preserving corpus linguistics: investigating the trajectories of public health messaging online. University of Nottingham.
- Muralidaran, V., Spasic, I. and Knight, D. 2021. A systematic review of unsupervised approaches to grammar induction. Natural Language Engineering 27(6), pp. 647-689. (10.1017/S1351324920000327)
- Knight, D., Morris, S., Arman, L., Needs, J. and Rees, M. 2021. Building a national corpus: a Welsh language case study. Basingstoke: Palgrave Macmillan.
- Knight, D., Loizides, F., Neale, S., Anthony, L. and Spasic, I. 2021. Developing computational infrastructure for the CorCenCC corpus - the National Corpus of Contemporary Welsh. Language Resources and Evaluation 55, pp. 789-816. (10.1007/s10579-020-09501-9)
- McClaughlin, E. et al. 2021. Public health messaging by political leaders: a corpus linguistic analysis of COVID-19 speeches delivered by Boris Johnson. University of Nottingham. Available at: https://doi.org/10.17639/3fgb-fn44
- Corcoran, P., Palmer, G., Arman, L., Knight, D. and Spasic, I. 2021. Creating Welsh language word embeddings. Applied Sciences 11(15), article number: 6896. (10.3390/app11156896)
- Espinosa-Anke, L., Palmer, G., Filimonov, M., Corcoran, P., Spasic, I. and Knight, D. 2021. English–Welsh cross-lingual embeddings. Applied Sciences 11(14), article number: 6541. (10.3390/app11146541)
- Knight, D., Morris, S. and Fitzpatrick, T. 2021. Corpus design and construction in minoritised language contexts - Cynllunio a chreu corpws mewn cyd-destunau Ieithoedd lleiafrifoledig: The National Corpus of Contemporary Welsh - Corpws Cenedlaethol Cymraeg Cyfoes. Basingstoke: Palgrave Macmillan.
- McClaughlin, E. et al. 2021. Using online news comments to gather fast feedback on issues with public health messaging: The Guardian as a case study. Project Report. [Online]. University of Nottingham. Available at: https://nottingham-repository.worktribe.com/output/5717332
- Muralidaran, V., Palmer, G., Arman, L., O'Hare, K., Knight, D. and Spasic, I. 2021. A practical implementation of a porter stemmer for Welsh. In: Prys, D. ed. Language and Technology in Wales: Volume 1. Bangor: Bangor University, pp. 30-43.
- Palmer, G., Corcoran, P., Arman, L., Knight, D. and Spasic, I. 2021. A closer look at Welsh word embeddings. In: Prys, D. ed. Language and Technology in Wales: Volume 1. Bangor: Bangor University, pp. 21-29.
2020
- Chen, Y., Adolphs, S. and Knight, D. 2020. Multimodal discourse analysis. In: Friginal, E. and Hardy, J. eds. The Routledge Handbook of Corpus Approaches to Discourse Analysis. London: Routledge
- Knight, D. and Adolphs, S. 2020. Multimodal corpora. In: Paquot, M. and Gries, S. T. eds. A Practical Handbook of Corpus Linguistics. Springer International Publishing, pp. 351-369.
- Knight, D., Morris, S., Fitzpatrick, T., Rayson, P., Spasić, I. and Môn Thomas, E. 2020. The national corpus of contemporary Welsh: project report | Y corpws cenedlaethol Cymraeg cyfoes: adroddiad y prosiect. Project Report. CorCenCC.
- Muralidaran, V., Spasic, I. and Knight, D. 2020. A cognitive approach to parsing with neural networks. Presented at: International Conference on Statistical Language and Speech Processing (SLSP), Cardiff, UK, 14–16 Oct 2020. In: Statistical Language and Speech Processing, Vol. 12379. Springer Verlag, pp. 71-84. (10.1007/978-3-030-59430-5_6)
- Adolphs, S., Knight, D., Smith, C. and Price, D. 2020. Crowdsourcing formulaic phrases: towards a new type of spoken corpus. Corpora 15(2), pp. 141-168. (10.3366/COR.2020.0192)
- Adolphs, S. and Knight, D. eds. 2020. The Routledge handbook of English language and digital humanities. Routledge Handbooks in English Language Studies. Abingdon: Routledge.
2019
- Ezeani, I., Piao, S., Neale, S., Rayson, P. and Knight, D. 2019. Leveraging pre-trained embeddings for Welsh Taggers. Presented at: 4th Workshop on Representation Learning for NLP, Florence, Italy, July 2019. In: ACL Anthology: Proceedings of the 4th Workshop on Representation Learning for NLP, Vol. W19-43. Association for Computational Linguistics. (10.18653/v1/W19-4332)
- Spasic, I., Owen, D., Knight, D. and Artemiou, A. 2019. Unsupervised multi-word term recognition in Welsh. Presented at: Celtic Language Technology Workshop 2019, Dublin, Ireland, 19 August 2019. In: Lynn, T. et al. eds. Proceedings of the Celtic Language Technology Workshop. European Association for Machine Translation.
2018
- Neale, S., Donnelly, K., Watkins, G. and Knight, D. 2018. Leveraging lexical resources and constraint grammar for rule-based part-of-speech tagging in Welsh. Presented at: LREC (Language Resources Evaluation) 2018 Conference, Miyazaki, Japan, 7-12 May 2018.
- Piao, S., Rayson, P., Knight, D. and Watkins, G. 2018. Towards a Welsh semantic annotation system. Presented at: LREC (Language Resources Evaluation) 2018 Conference, Miyazaki, Japan, 7-12 May 2018.
2017
- Neale, S. et al. 2017. The CorCenCC crowdsourcing app: a bespoke tool for the user-driven creation of the national corpus of contemporary Welsh. Presented at: The 9th International Corpus Linguistics Conference, Birmingham, UK, 24-28 July 2017.
- Knight, D., Walsh, S. and Papagiannidis, S. 2017. I’m having a spring clear out: a corpus-based analysis of e-transactional discourse. Applied Linguistics 38(2), pp. 234-257. (10.1093/applin/amv019)
2016
- Walsh, S. and Knight, D. 2016. Analysing spoken discourse in University small group teaching. In: Corrigan, K. P. and Mearns, A. eds. Creating and Digitizing Language Corpora: Volume 3: Databases for Public Engagement, Vol. 3. Basingstoke: Palgrave Macmillan, pp. 291-319.
- Knight, D. et al. 2016. Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages. Presented at: LREC 2016, Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), Portorož, Slovenia, 23-28 May 2016.
- Seedhouse, P. and Knight, D. 2016. Applying digital sensor technology: A problem-solving approach. Applied Linguistics 37(1), pp. 7-32. (10.1093/applin/amv065)
2015
- Knight, D. 2015. e-Language: communication in the digital age. In: Baker, P. and McEnery, T. eds. Corpora and Discourse Studies: Integrating Discourse and Corpora. Palgrave Advances in Language and Linguistics. Basingstoke: Palgrave Macmillan, London, pp. 20-40. (10.1057/9781137431738_2)
- Crabtree, A., Tennent, P., Brundell, P. and Knight, D. 2015. Digital records and the digital replay system. In: Halfpenny, P. J. and Proctor, R. eds. Innovations in Digital Research Methods. London: Sage
- Dörk, M. and Knight, D. 2015. WordWanderer: A navigational approach to text visualisation. Corpora 10(1), pp. 83-94. (10.3366/cor.2015.0067)
- Adolphs, S. and Knight, D. 2015. Beyond monomodal spoken corpora. In: Baker, P. and McEnery, T. eds. Corpora and Discourse Studies: Integrating Discourse and Corpora. Palgrave Advances in Language and Linguistics. Houndmills, Basingstoke: Palgrave Macmillan, pp. 41-62.
2014
- Knight, D., Adolphs, S. and Carter, R. 2014. CANELC – constructing an e-language corpus. Corpora 9(1), pp. 29-56. (10.3366/cor.2014.0050)
2013
- Knight, D., Adolphs, S. and Carter, R. 2013. Formality in digital discourse: a study of hedging in CANELC. In: Romero-Trillo, J. ed. Yearbook of corpus linguistics and pragmatics 2013: new domains and methodologies. Yearbook of corpus linguistics and pragmatics, Vol. 1. Springer Netherlands, pp. 131-152. (10.1007/978-94-007-6250-3_7)
- Knight, D. 2013. Corpus linguistics: methods, theory and practice by Tony McEnery and Andrew Hardie [Book Review]. In: Romero-Trillo, J. ed. Yearbook of corpus linguistics and pragmatics 2013: new domains and methodologies. Yearbook of corpus linguistics and pragmatics, Vol. 1. Springer Netherlands, pp. 275-277. (10.1007/978-94-007-6250-3_13)
2011
- Knight, D. 2011. Multimodality and active listenership: a corpus approach. Corpus and discourse. London: Bloomsbury.
- Adolphs, S., Knight, D. and Carter, R. 2011. Capturing context for heterogeneous corpus analysis: some first steps. International journal of corpus linguistics 16(3), pp. 305-324. (10.1075/ijcl.16.3.02ado)
- Knight, D. 2011. The future of multimodal corpora. Revista Brasileira de Linguística Aplicada 11(2), pp. 391-415. (10.1590/S1984-63982011000200006)
2010
- Knight, D., Tennent, P., Adolphs, S. and Carter, R. 2010. Developing heterogeneous corpora using the Digital Replay System (DRS). Presented at: Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, 18 May 2010. In: Kipp, M. et al. eds. Proceedings of the LREC 2010 (Language Resources Evaluation Conference) Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, May 2010, Malta. European Language Resources Association, pp. 16-21.
- Adolphs, S. and Knight, D. 2010. Building a spoken corpus: What are the basics? In: O’Keeffe, A. and McCarthy, M. eds. The Routledge handbook of corpus linguistics. Routledge handbooks in applied linguistics. Oxford: Routledge.
2009
- Knight, D., Evans, D., Carter, R. and Adolphs, S. 2009. HeadTalk, HandTalk and the corpus: towards a framework for multi-modal, multi-media corpus development. Corpora 4(1), pp. 1-32. (10.3366/E1749503209000203)
- Knight, D. 2009. A multi-modal corpus approach to the analysis of backchanneling behaviour. PhD Thesis, University of Nottingham.
2008
- Brundell, P. et al. 2008. Digital Replay System (DRS): a tool for interaction analysis. Presented at: ICLS2008: International Perspectives in the Learning Sciences Cre8ing a learning world, Utrecht, The Netherlands, 23-28 June 2008.
- Brundell, P. et al. 2008. The experience of using Digital Replay System for social science research. Presented at: 4th International Conference on e-Social Science (ICeSS), Manchester, UK, 18-20 June 2008. In: Proceedings of the 4th International Conference on e-Social Science (ICeSS), Manchester, 18-20 June 2008. ICeSS, pp. 1-10.
- Knight, D. and Tennent, P. 2008. Introducing DRS (The Digital Replay System): A tool for the future of corpus linguistic research and analysis. Presented at: Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakesh, Morocco, 26 May - 1 June 2008. In: Calzolari, N. et al. eds. Proceedings of the 6th Language Resources and Evaluation Conference (LREC), Palais des Congrés, Marrakech, Morocco, 28-30th May 2008. European Language Resources Association, pp. 26-31.
- Knight, D., Adolphs, S., Tennent, P. and Carter, R. 2008. The Nottingham Multi-Modal Corpus: a demonstration. Presented at: 6th Language Resources and Evaluation Conference (LREC), Marrakesh, Morocco, 28-30 May 2008. In: Proceedings of the 6th Language Resources and Evaluation Conference (LREC), Palais des Congrés, Marrakech, Morocco, 28-30th May 2008. European Language Resources Association, pp. 1-7.
- Knight, D. and Adolphs, S. 2008. Multi-modal corpus pragmatics: the case of active listenership. In: Romero-Trillo, J. ed. Pragmatics and corpus linguistics: a mutualistic entente. Mouton series in pragmatics Vol. 2. Mouton de Gruyter, pp. 175-190.
2006
- Knight, D., Bayoumi, S., Mills, S., Crabtree, A., Adolphs, S., Pridmore, T. and Carter, R. 2006. Beyond the text: construction and analysis of multi-modal linguistic corpora. Presented at: 2nd International Conference on e-Social Science, Manchester, UK, 28-30 June 2006. In: Proceedings of the 2nd International Conference on e-Social Science, Manchester, 28-30 June 2006. ICeSS.
Adrannau llyfrau
- Fitzgerald, C. et al. 2024. Multi-modal considerations for social media discourse analysis: A specialised corpus of Twitter commentary on working from home. In: Coats, S. and Laippala, V. eds. Linguistics across Disciplinary Borders - The March of Data. London: Bloomsbury, pp. 187-212.
- Muralidaran, V., Palmer, G., Arman, L., O'Hare, K., Knight, D. and Spasic, I. 2021. A practical implementation of a porter stemmer for Welsh. In: Prys, D. ed. Language and Technology in Wales: Volume 1. Bangor: Bangor University, pp. 30-43.
- Palmer, G., Corcoran, P., Arman, L., Knight, D. and Spasic, I. 2021. A closer look at Welsh word embeddings. In: Prys, D. ed. Language and Technology in Wales: Volume 1. Bangor: Bangor University, pp. 21-29.
- Chen, Y., Adolphs, S. and Knight, D. 2020. Multimodal discourse analysis. In: Friginal, E. and Hardy, J. eds. The Routledge Handbook of Corpus Approaches to Discourse Analysis. London: Routledge
- Knight, D. and Adolphs, S. 2020. Multimodal corpora. In: Paquot, M. and Gries, S. T. eds. A Practical Handbook of Corpus Linguistics. Springer International Publishing, pp. 351-369.
- Walsh, S. and Knight, D. 2016. Analysing spoken discourse in University small group teaching. In: Corrigan, K. P. and Mearns, A. eds. Creating and Digitizing Language Corpora: Volume 3: Databases for Public Engagement., Vol. 3. Basingstoke: Palgrave Macmillan, pp. 291-319.
- Knight, D. 2015. e-Language: communication in the digital age. In: Baker, P. and McEnery, T. eds. Corpora and Discourse Studies: Integrating Discourse and Corpora. Palgrave Advances in Language and Linguistics Basingstoke: Palgrave Macmillan, London, pp. 20-40., (10.1057/9781137431738_2)
- Crabtree, A., Tennent, P., Brundell, P. and Knight, D. 2015. Digital records and the digital replay system. In: Halfpenny, P. J. and Proctor, R. eds. Innovations in Digital Research Methods. London: Sage
- Adolphs, S. and Knight, D. 2015. Beyond monomodal spoken corpora. In: Baker, P. and McEnery, T. eds. Corpora and Discourse Studies: Integrating Discourse and Corpora. Palgrave Advances in Language and Linguistics Houndsmill, Basingstoke: Palgrave Macmillan, pp. 41-62.
- Knight, D., Adolphs, S. and Carter, R. 2013. Formality in digital discourse: a study of hedging in CANELC. In: Romero-Trillo, J. ed. Yearbook of corpus linguistics and pragmatics 2013: new domains and methodologies. Yearbook of corpus linguistics and pragmatics Vol. 1. Springer Netherlands, pp. 131-152., (10.1007/978-94-007-6250-3_7)
- Knight, D. 2013. Corpus linguistics: methods, theory and practice by Tony McEnery and Andrew Hardie [Book Review]. In: Romero-Trillo, J. ed. Yearbook of corpus linguistics and pragmatics 2013: new domains and methodologies. Yearbook of corpus linguistics and pragmatics Vol. 1. Springer Netherlands, pp. 275-277., (10.1007/978-94-007-6250-3_13)
- Adolphs, S. and Knight, D. 2010. Building a spoken corpus: What are the basics?. In: O’Keeffe, A. and McCarthy, M. eds. The Routledge handbook of corpus linguistics. Routledge handbooks in applied linguistics Oxford: Routledge
- Knight, D. and Adolphs, S. 2008. Multi-modal corpus pragmatics: the case of active listenership. In: Romero-Trillo, J. ed. Pragmatics and corpus linguistics: a mutualistic entente. Mouton series in pragmatics Vol. 2. Mouton de Gruyter, pp. 175-190.
Cynadleddau
- Khallaf, N. et al. 2023. Open-source thesaurus development for under-resourced languages: a Welsh case study. Presented at: LDK 2023 – 4th Conference on Language, Data and Knowledge, Vienna, Austria, 12-15 September 2023.
- Morris, J., Ezeani, I., Gruffydd, I., Young, K., Davies, L., El-Haj, M. and Knight, D. 2022. Welsh automatic text summarisation. Presented at: Wales Academic Symposium on Language Technologies 2022, Bangor, Wales, 28/01/2022Language and Technology in Wales, Vol. 2. Bangor: Banolfan Bedwyr
- Clos, J., McClaughlin, E., Barnard, P., Nichele, E., Knight, D., McAuley, D. and Adolphs, S. 2022. PriPA: a tool for privacy-preserving analytics of linguistic data. Presented at: Legal and Ethical Issues in Human Language Technologies 2022, Marseille, France, 24 June 2022.
- El-Haj, M., Ezeani, I., Morris, J. and Knight, D. 2022. Creation of an evaluation corpus and baseline evaluation scores for Welsh text summarisation. Presented at: 4th Celtic Language Technology Workshop (CLTW 2022), Marseille, France, 20 June 2022.
- Ezeani, I., El-Haj, M., Morris, J. and Knight, D. 2022. Introducing the Welsh text summarisation dataset and baseline systems. Presented at: 13th ELRA Language Resources and Evaluation Conference (LREC 2022), Marseille, France, 20-25 June 2022.
- Muralidaran, V., Spasic, I. and Knight, D. 2020. A cognitive approach to parsing with neural networks. Presented at: International Conference on Statistical Language and Speech Processing (SLSP), Cardiff, UK, 14–16 Oct 2020Statistical Language and Speech Processing, Vol. 12379. Springer Verlag pp. 71-84., (10.1007/978-3-030-59430-5_6)
- Ezeani, I., Piao, S., Neale, S., Rayson, P. and Knight, D. 2019. Leveraging pre-trained embeddings for Welsh Taggers. Presented at: 4th Workshop on Representation Learning for NLP, Florence, Italy, July 2019ACL Anthology: Proceedings of the 4th Workshop on Representation Learning for NLP, Vol. W19-43. Association for Computational Linguistics pp. -., (10.18653/v1/W19-4332)
- Spasic, I., Owen, D., Knight, D. and Artemiou, A. 2019. Unsupervised multi-word term recognition in Welsh. Presented at: Celtic Language Technology Workshop 2019, Dublin, Ireland, 19 August 2019 Presented at Lynn, T. et al. eds.Proceedings of the Celtic Language Technology Workshop. European Association for Machine Translation
- Neale, S., Donnelly, K., Watkins, G. and Knight, D. 2018. Leveraging lexical resources and constraint grammar for rule-based part-of-speech tagging in Welsh. Presented at: LREC (Language Resources Evaluation) 2018 Conference, Miyazaki, Japan, 7 - 12 May 2018.
- Piao, S., Rayson, P., Knight, D. and Watkins, G. 2018. Towards a Welsh semantic annotation system.. Presented at: LREC (Language Resources Evaluation) 2018 Conference, Miyazaki, Japan., 7 - 12 May 2018.
- Neale, S. et al. 2017. The CorCenCC crowdsourcing app: a bespoke tool for the user-driven creation of the national corpus of contemporary Welsh. Presented at: The 9th International Corpus Linguistics Conference, Birmingham, UK, 24-28 July 2017.
- Knight, D. et al. 2016. Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages. Presented at: LREC 2016, Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), Portoro, Slovenia, 23-28 May 2016.
- Knight, D., Tennent, P., Adolphs, S. and Carter, R. 2010. Developing heterogeneous corpora using the Digital Replay System (DRS).. Presented at: Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, 18 May 2010 Presented at Kipp, M. et al. eds.Proceedings of the LREC 2010 (Language Resources Evaluation Conference) Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, May 2010, Malta.. European Language Resources Association pp. 16-21.
- Brundell, P. et al. 2008. Digital Replay System (DRS): a tool for interaction analysis. Presented at: ICLS2008: International Perspectives in the Learning Sciences Cre8ing a learning world, Utrecht, The Netherlands, 23-28 June 2008.
- Brundell, P. et al. 2008. The experience of using Digital Replay System for social science research. Presented at: 4th International Conference on e-Social Science (ICeSS), Manchester, UK, 18-20 June 2008Proceedings of the 4th International Conference on e-Social Science (ICeSS), Manchester, 18-20 June 2008. ICeSS pp. 1-10.
- Knight, D. and Tennent, P. 2008. Introducing DRS (The Digital Replay System): A tool for the future of corpus linguistic research and analysis. Presented at: Sixth International Conference on Language Resources and Evaluation (LREC'08, Marrakesh, Morocco, 26 May -1 June 2008 Presented at Calzolari, N. et al. eds.Proceedings of the 6th Language Resources and Evaluation Conference (LREC), Palais des Congrés, Marrakech, Morocco, 28-30th May 2008. European Language Resources Association pp. 26-31.
- Knight, D., Adolphs, S., Tennent, P. and Carter, R. 2008. The Nottingham Multi-Modal Corpus: a demonstration. Presented at: 6th Language Resources and Evaluation Conference (LREC), Marrakesh, Morocco, 28-30 May 2008Proceedings of the 6th Language Resources and Evaluation Conference (LREC), Palais des Congrés, Marrakech, Morocco, 28-30th May 2008. European Language Resources Association pp. 1-7.
- Knight, D., Bayoumi, S., Mills, S., Crabtree, A., Adolphs, S., Pridmore, T. and Carter, R. 2006. Beyond the text: construction and analysis of multi-modal linguistic corpora. Presented at: 2nd International Conference on e-Social Science, Manchester, UK, 28-30 June 2006Proceedings of the 2nd International Conference on e-Social Science, Manchester, 28 - 30 June 2006.. ICeSS pp. n/a.
Erthyglau
- Knight, D., Khallaf, N., Rayson, P., El-Haj, M., Ezeani, I. and Morris, S. 2024. FreeTxt: A corpus-based bilingual free-text survey and questionnaire data analysis toolkit. Applied Corpus Linguistics 4(3), article number: 100103. (10.1016/j.acorp.2024.100103)
- Vilar Lluch, S., McClaughlin, E., Adolphs, S., Knight, D. and Nichele, E. 2024. The effects of modal value and imperative mood on self-predicted compliance to health guidance: The case of COVID-19. Text & Talk (10.1515/text-2023-0125)
- Knight, D. et al. 2024. Indicating engagement in online workplace meetings: The role of backchannelling head nods. International Journal of Corpus Linguistics (IJCL) (10.1075/ijcl.24060.kni)
- Adolphs, S., Chen, Y. and Knight, D. 2024. Towards a speech-gesture profile of discourse markers: The case of "I mean". Lingua
- O'Keeffe, A. et al. 2024. “We’ve lost you Ian”: Multi-modal corpus innovations in capturing, processing and analysing professional online spoken interactions. Research in Corpus Linguistics 12(2), pp. 1-23. (10.32714/ricl.12.02.02)
- Vilar Lluch, S., McClaughlin, E., Knight, D., Adolphs, S. and Nichele, E. 2023. The language of vaccination campaigns during COVID-19. Medical Humanities 49(3), pp. 487-496. (10.1136/medhum-2022-012583)
- Knight, D., Fitzpatrick, T., Morris, S., Tovey-Walsh, B., Prosser, H. and Davies, E. 2023. Corpus to curriculum: Developing word lists for adult learners of Welsh. Applied Corpus Linguistic 3(2), article number: 100052. (10.1016/j.acorp.2023.100052)
- McClaughlin, E. et al. 2022. The reception of public health messages during the COVID-19 pandemic. Applied Corpus Linguistics 3(1), article number: 100037. (10.1016/j.acorp.2022.100037)
- Muralidaran, V., Spasic, I. and Knight, D. 2021. A systematic review of unsupervised approaches to grammar induction. Natural Language Engineering 27(6), pp. 647-689. (10.1017/S1351324920000327)
- Knight, D., Loizides, F., Neale, S., Anthony, L. and Spasic, I. 2021. Developing computational infrastructure for the CorCenCC corpus - the National Corpus of Contemporary Welsh. Language Resources and Evaluation 55, pp. 789-816. (10.1007/s10579-020-09501-9)
- Corcoran, P., Palmer, G., Arman, L., Knight, D. and Spasic, I. 2021. Creating Welsh language word embeddings. Applied Sciences 11(15), article number: 6896. (10.3390/app11156896)
- Espinosa-Anke, L., Palmer, G., Filimonov, M., Corcoran, P., Spasic, I. and Knight, D. 2021. English–Welsh cross-lingual embeddings. Applied Sciences 11(14), article number: 6541. (10.3390/app11146541)
- Adolphs, S., Knight, D., Smith, C. and Price, D. 2020. Crowdsourcing formulaic phrases: towards a new type of spoken corpus. Corpora 15(2), pp. 141-168. (10.3366/COR.2020.0192)
- Knight, D., Walsh, S. and Papagiannidis, S. 2017. I’m having a spring clear out: a corpus-based analysis of e-transactional discourse. Applied Linguistics 38(2), pp. 234-257. (10.1093/applin/amv019)
- Seedhouse, P. and Knight, D. 2016. Applying digital sensor technology: A problem-solving approach. Applied Linguistics 37(1), pp. 7-32. (10.1093/applin/amv065)
- Dörk, M. and Knight, D. 2015. WordWanderer: A navigational approach to text visualisation. Corpora 10(1), pp. 83-94. (10.3366/cor.2015.0067)
- Knight, D., Adolphs, S. and Carter, R. 2014. CANELC – constructing an e-language corpus. Corpora 9(1), pp. 29-56. (10.3366/cor.2014.0050)
- Adolphs, S., Knight, D. and Carter, R. 2011. Capturing context for heterogeneous corpus analysis: some first steps. International Journal of Corpus Linguistics 16(3), pp. 305-324. (10.1075/ijcl.16.3.02ado)
- Knight, D. 2011. The future of multimodal corpora. Revista Brasileira de Linguística Aplicada 11(2), pp. 391-415. (10.1590/S1984-63982011000200006)
- Knight, D., Evans, D., Carter, R. and Adolphs, S. 2009. HeadTalk, HandTalk and the corpus: towards a framework for multi-modal, multi-media corpus development. Corpora 4(1), pp. 1-32. (10.3366/E1749503209000203)
Thesis
- Knight, D. 2009. A multi-modal corpus approach to the analysis of backchanneling behaviour. PhD Thesis, University of Nottingham.
Websites
- Morris, J., Arfon, E., Khallaf, N., El-Haj, M. and Knight, D. 2024. Datblygu thesawrws y Gymraeg drwy dechnoleg. [Online]. Gwerddon Fach: Golwg Ltd. Available at: https://golwg.360.cymru/gwerddon/2143591-datblygu-thesawrws-gymraeg-drwy-dechnoleg
- Arfon, E., Morris, J., Khallaf, N. and Knight, D. 2024. Developing the Welsh thesaurus through technology. [Online]. Golwg 360 Cymru - Gwerddon Fach: Golwg Ltd. Available at: https://golwg.360.cymru/gwerddon/2143591-datblygu-thesawrws-gymraeg-drwy-dechnoleg
Books
- Knight, D., Morris, S., Arman, L., Needs, J. and Rees, M. 2021. Building a national corpus: a Welsh language case study. Basingstoke: Palgrave Macmillan.
- Knight, D., Morris, S. and Fitzpatrick, T. 2021. Corpus design and construction in minoritised language contexts - Cynllunio a chreu corpws mewn cyd-destunau Ieithoedd lleiafrifoledig: The National Corpus of Contemporary Welsh - Corpws Cenedlaethol Cymraeg Cyfoes. Basingstoke: Palgrave Macmillan.
- Adolphs, S. and Knight, D. eds. 2020. The Routledge handbook of English language and digital humanities. Routledge Handbooks in English Language Studies. Abingdon: Routledge.
- Knight, D. 2011. Multimodality and active listenership: a corpus approach. Corpus and discourse. London: Bloomsbury.
Monographs
- Adolphs, S. et al. 2023. Communicating health threats: Linguistic evidence for effective public health messaging during the Covid-19 pandemic. University of Nottingham.
- McClaughlin, E. et al. 2021. Privacy preserving corpus linguistics: investigating the trajectories of public health messaging online. University of Nottingham.
- McClaughlin, E. et al. 2021. Public health messaging by political leaders: a corpus linguistic analysis of COVID-19 speeches delivered by Boris Johnson. University of Nottingham. Available at: https://doi.org/10.17639/3fgb-fn44
- McClaughlin, E. et al. 2021. Using online news comments to gather fast feedback on issues with public health messaging: The Guardian as a case study. Project Report. [Online]. University of Nottingham. Available at: https://nottingham-repository.worktribe.com/output/5717332
- Knight, D., Morris, S., Fitzpatrick, T., Rayson, P., Spasić, I. and Môn Thomas, E. 2020. The national corpus of contemporary Welsh: project report | Y corpws cenedlaethol Cymraeg cyfoes: adroddiad y prosiect. Project Report. CorCenCC.
Research
Research interests:
I am an applied linguist whose research interests lie in the areas of corpus linguistics, discourse analysis, and multimodality. I have expertise in conceptualising, theorising and applying innovative interdisciplinary approaches/methodologies for extracting and predicting language patterning within/across social and linguistic contexts (within the broad scope of the aforementioned research areas). While located at its core in the area of Linguistics and Digital Humanities, my research is fundamentally interdisciplinary, and this is reflected in the multi-authored nature of my publications and interdisciplinary research projects.
My work on Welsh language resource development, supported by major AHRC, ESRC and Welsh Government grants (e.g. CorCenCC, also see here for further information), is aiming to change the landscape of minoritised language research and the potential real-world applications of corpora/corpus-based enquiry.
I have (co)presented 104 papers and posters, and delivered 48 keynotes at seminars and conferences since 2006.
Externally funded research projects:
- 2023: £20,000 received from the Welsh Government to create the GDC-WDG Welsh language resource site.
- 2022: £90,000 received from the Welsh Government for the ‘ThACC – Thesawrws Ar-lein Cymraeg Cyfoes - Using Word Embeddings to Create a Thesaurus of Contemporary Welsh’ project. Working with colleagues from WELSH and Computer Science at Cardiff and Lancaster Universities (with Morris as PI - I am one of the CIs), the project developed an open-access, freely available online thesaurus of the Welsh language, for Welsh speakers and learners alike.
- 2022: £178,000 received from the AHRC for the ‘Wild Swimming and Blue Spaces: Mobilising interdisciplinary knowledge and partnerships to combat health inequalities at scale’ project (with Adolphs, Nottingham as PI - I am one of the CIs). This project developed a new mixed methods approach, drawing on corpus linguistics and narrative analysis, for effective public health messaging (with a focus on the benefits of wild swimming) that includes content from a range of academic disciplines. Visit the project website here.
- 2022: £100,000 received from the AHRC for the 'FreeTxt: supporting bilingual free-text survey and questionnaire data analysis’ project. I was PI on this project. Working with colleagues from Lancaster University, and co-designed and co-constructed with partners Cadw and National Trust Wales, this project created an innovative open-source online free-text analysis tool that enables the quick and easy analysis of English and Welsh language data: FreeTxt. Visit the project website here.
- 2021-24: Co-PI (with Anne O’Keeffe, Mary Immaculate College), AHRC/IRC funded ‘Interactional variation online: harnessing emerging technologies in the digital humanities to analyse online discourse in different workplace contexts’ project. Working with colleagues from Mary Immaculate College, Swansea University, The University of Nottingham, University College Dublin, and University of Aberdeen, the project first aimed to examine virtual workplace communication to gain depth of insight into the potential barriers to effective communication. Our second aim was to propose the next generation of frameworks for analysing online discourse and to make these frameworks available to all arts and humanities research and end user communities. We received £390,000 from AHRC + €270,000 [circa £620,700] from IRC for this project. Visit the project website here.
- 2021: £14,988 received from the ESRC Impact Acceleration Account (IAA). This was for a project, working with the National Centre for Learning Welsh, that supported the creation of vocabulary lists, based on data extracted from CorCenCC (National Corpus of Contemporary Welsh).
- 2021: £90,000 received from the Welsh Government for the ‘Welsh Automatic Text Summarisation’ project. Working with colleagues from WELSH and Computer Science at Cardiff and Lancaster Universities, the project team built a summarisation tool that will allow professionals to quickly summarise long documents for efficient presentation. Visit the project website here.
- 2021: £450,000 received from AHRC for the 'Coronavirus Discourses: linguistic evidence for effective public health messaging' project. Developed in partnership with Public Health England, Public Health Wales and NHS Education for Scotland, this project addressed key challenges that the coronavirus pandemic presents in relation to understanding the flow and impact of public health messages as reflected in public and private discourses. Led by Svenja Adolphs (Nottingham - I was CI on this project), this interdisciplinary project carried out the first large scale analysis of the trajectories of public health messages relating to the coronavirus pandemic in the UK [£465,000]. Visit the project website here.
- 2020: £90,000 received from the Welsh Government for the 'Learning English-Welsh bilingual embeddings and applications in text categorisation' project. This was an interdisciplinary project involving Irena Spasić, Padraig Corcoran, Luis Espinosa-Anke (School of Computer Science and Informatics – COMSC) and Geraint Palmer (School of Mathematics) as Co-Investigators (CIs). I was PI on this project. For more information, see here.
- 2019: £90,000 received from the Welsh Government for the ‘Welsh words by numbers: “Wales” + “capital” = “Cardiff”’ project (focusing on word embeddings for Welsh). I am a CI on this project.
- 2019: £2,100 received for the internally funded CUROP project entitled ‘FreeTxt: analysing free-text comments using a corpus-based approach’. I was PI on this project.
- 2019: £20,000 received from the Welsh Government for the Welsh Stemmer project. I was CI on this project with Irena Spasić (Cardiff) as PI.
- 2018: £2,100 received for the internally funded CUROP project entitled ‘Corpws Cenedlaethol Cymraeg Cyfoes: National Corpus of Contemporary Welsh – a focus on spoken data’. I was PI on this project (with Lowri Williams).
- 2018: £2,100 received for the internally funded CUROP project entitled ‘Corpws Cenedlaethol Cymraeg Cyfoes: National Corpus of Contemporary Welsh – semantic tagging and data annotation’. I was PI on this project (with Paul Rayson).
- 2017: £19,964 received from the Grant Cymraeg 2050 fund to automatically construct a WordNet for Welsh, a lexical database in which words are grouped into sets of synonyms (synsets), which are then organised into a network of lexico-semantic relationships. I was CI on this project.
- 2017: £2,000 received (as PI) from the British Council in support of a launch event for the CorCenCC project (held on 28th February 2017).
- 2016-19: £1,800,000 received from the ESRC and AHRC for the CorCenCC project (Corpws Cenedlaethol Cymraeg Cyfoes (The National Corpus of Contemporary Welsh): A community driven approach to linguistic corpus construction). I am PI on this project.
- 2016: £1,600 received for the internally funded CUROP project entitled ‘Analysis on non-verbal communication in construction industry interactions’. I was CI on this project (with Mike Handford).
- 2015: £24,999 received from the AHSS (College of Arts, Humanities and Social Sciences) Network Digital Humanities Initiator Bid. The aim of this network bid was to build significant capacity in Digital Humanities at Cardiff University. I was CI on this project.
- 2014: £3,850 received from the Newcastle University Faculty Research Fund for a project entitled ‘Crowdsourcing data collection for corpus compilation: Scoping methods for the future’ (with Patrick Olivier).
- 2013: £3,900 received from the Newcastle University Faculty Bid Preparation Fund for Corpws Cenedlaethol Cymraeg (CorCenCC) to support the development of the bid application.
- 2013: £17,500 funding received from the British Council Aptis Research Grants for a project entitled ‘Characterising interactional competence in higher education small group talk’. I am a Co-I on this project with Steve Walsh (PI) and Paul Seedhouse.
- 2012: £3,920 received from the Newcastle University Faculty Research Fund for a pilot project entitled ‘Gesture and talk “in the wild”’ (with Professor Olivier).
Research experience/positions:
- Research Fellow on Crowd Sourcing: A Toolkit-based Approach (2010-2011). RCUK Grant EP/G065802/1 Horizon Digital Economy Research. Work carried out at The University of Nottingham.
- Research Associate on DReSS II (Understanding Digital Records for eSocial Science) (2008-2011). ESRC Grant No. RES-149-25-1067. Work carried out at The University of Nottingham.
- Research Assistant on DReSS I (Understanding Digital Records for eSocial Science) (2005-2008), ESRC Grant No. RES-149-25-0035, and on HeadTalk (2005-2006), ESRC Grant No. RES-149-25-1016. Work carried out at The University of Nottingham.
- I have also been involved in work with the Cambridge University Press (CUP) on the English Profile (EP) Project and from 2009-2012 I was involved in the construction of CANELC, the Cambridge and Nottingham e-Language Corpus (working with CUP and staff from the University of Nottingham), the first large-scale corpus of digital discourse.
Biography
- 2015: Certificate in Advanced Studies in Academic Practice, Newcastle University
- 2004 – 2009: PhD in Applied Linguistics, The University of Nottingham
- Thesis title: A multi-modal corpus approach to the analysis of backchanneling behaviour
- Funding: ESRC +3 award winner
- 2003 – 2004: MA in Applied Linguistics, The University of Nottingham
- 2000 – 2003: BA in English Studies, The University of Nottingham
Professional memberships
- Fellow, Learned Society of Wales (FLSW), 2023-present.
- Associate Fellow of the Higher Education Academy (AFHEA), 2013 – present.
- Member, BAAL (British Association for Applied Linguistics).
- Executive Committee member, CRiLLS (Centre for Research in Linguistics and Language Sciences, Newcastle University), 2011 – 2015.
- Member, CRAL (Centre for Research in Applied Linguistics), 2006 – 2011.
- Member, IVACS (Inter-Varietal Applied Corpus Studies), 2004 – present
- Member, AILA (International Association of Applied Linguistics), 2004 – present
- Member, Language Teaching and Technology; Language Learning and Teaching and iLaB (ICT) research clusters in ECLS, 2012 – 2015.
Academic positions
- 2016 – present: Reader in Applied Linguistics, Cardiff University
- 2015 – 2016: Senior Lecturer in Applied Linguistics, Cardiff University.
- 2014 – 2015: Senior Lecturer in Applied Linguistics, Newcastle University.
- 2011 – 2014: Lecturer in Applied Linguistics, Newcastle University.
- 2009 – 2011: Part-time Research Fellow and lecturer on BA and M-Level home and distance learning modules, The University of Nottingham.
- 2006 – 2009: Part-time Research Assistant and lecturer on BA and M-Level home and distance learning modules, The University of Nottingham.
- 2005 – 2006: Full-time Research Assistant, ESRC funded HeadTalk interdisciplinary project, The University of Nottingham.
- 2004 – 2005: Resident Hall Tutor, Hugh Stewart Hall, The University of Nottingham.
Committees and reviewing
- Editorial board member of Applied Linguistics (journal, 2021+)
- Ambassador of the Data Innovation Research Institute (DIRI) at Cardiff University. In this role I lead a special interest group (SIG) that facilitates deep interdisciplinary collaboration across the University in the area of data science (2018+).
- Editorial board member of Elements in Corpus Linguistics (book series) published by Cambridge University Press.
- Lead organiser and Chair of the 2020 online BAAL conference. Over 400 members of the association registered to participate in this conference.
- Lead organiser of the biennial International Corpus Linguistics Conference (CL2019), a 5-day globally leading conference for academics working within this discipline (2018-2019).
- Member of the ESRC’s Centres for Doctoral Training (CDT) Peer Review College (2016+)
- Honorary Visiting Fellow at the Centre for Research in Applied Linguistics (CRAL), The University of Nottingham (May–July 2018, during Research Leave)
- Visiting Researcher at the Department of English Language and Applied Linguistics, Swansea University (April–July 2018, during Research Leave)
- General Secretary for BAAL, the British Association for Applied Linguistics (2013 - 2018); Meetings Secretary for BAAL (2010-2013); Postgraduate Development and Liaison Officer for BAAL (2007-2009).
- Co-organiser of the IVACS (Inter-Varietal and Applied Corpus Studies) 2006 and IVACS 2014 conferences.
- Editor (with Professor Svenja Adolphs) of the Routledge Handbook of English Language and the Digital Humanities [under contract].
- Reviews Editor for the Yearbook of Corpus Linguistics and Pragmatics, 2012-2015 (Springer Verlag).
- Editorial board member for the journal Discourse, Context and Media
- Reviewer for International Journal of Corpus Linguistics (IJCL), Journal of Pragmatics, Context and Discourse, Corpora Journal and the BAAL annual book prize.
- Programme committee member: Big Data and Natural Language Processing workshop hosted at IEEE Big Data, December 2016.
- Programme committee member: 9th International Corpus Linguistics conference, July 2017, University of Birmingham; Challenges in the Management of Large Corpora + Big Data and Natural Language Processing joint meeting, July 2017, University of Birmingham.
- Advisory Editorial Board member for the Journal of Corpus Linguistics and Pragmatics (Springer Verlag).
- Advisory board member for Language, Texts and Society (LTS) – a journal produced at the University of Nottingham.
- Advisory board member for CLiC – a corpus tool for the analysis of literary texts, led by Professor Mahlberg, University of Birmingham (funded by the AHRC).
Supervisions
- Corpus linguistics
- Corpus pragmatics
- Language use in context
- Non-verbal communication
- Discourse analysis
- Digital interaction (‘E-language’)
Current supervision
Jen Jordan-Grote
Research student
Debbie Cabral Lima
Research student
Yipei Kou
Research student
Charlie Brookes
Research student
Past projects
In addition to the students listed above, I also supervised the RAs involved in work on the CorCenCC, IVO and FreeTxt projects and co-supervised the following PhD students to completion (at 50%, unless otherwise stated):
- Shanru Yang (30:70 with Steve Walsh, Newcastle University)
- Rezan Alharbi (with Mei Lin, Newcastle University)
- Vigneshwaran Muralidaran (with Irena Spasic, COMSC)
- David Griffin (with Christopher Heffer, ENCAP)
- Emily Powell (with Christopher Heffer, ENCAP)
- Kate Barber (with Amanda Potts, ENCAP)
Contact Details
+44 29208 76325
John Percival Building, Room 3.57, Colum Drive, Cardiff, CF10 3EU
Research themes
Specialisms
- Applied linguistics and educational linguistics
- Multimodal discourse analysis
- Discourse and pragmatics
- Corpus linguistics