Dr Luis Espinosa-Anke

Available for postgraduate supervision

Teams and roles for Luis Espinosa-Anke

Senior Lecturer
School of Computer Science and Informatics

I am a lecturer at the School of Computer Science and Informatics. Prior to joining Cardiff, I was a Natural Language Processing (NLP) scientist at Savana Médica (a Madrid-based company focused on delivering data-driven healthcare solutions), and before that I was a PhD student at Pompeu Fabra University (Barcelona). My research, in the area of Artificial Intelligence and NLP, centers on meaning representation, computational semantics, multilingual NLP and computational lexicography. I am a laCaixa Fellow, and a Fulbright and Erasmus Mundus alumnus.

I have been PI of the Don't Patronize Me! project, supported by a Kaggle Open Research grant (2,000 USD). I am also Co-I on a project funded by Snap Inc. on modeling meaning shift in social media (10,000 USD), with Jose Camacho-Collados (PI, COMSC), Daniel Loureiro (University of Porto), and Francesco Barbieri and Leonardo Neves from Snap Inc. In addition, I am Co-I on the £90,000 Welsh Government-funded project 'Learning English-Welsh bilingual embeddings and applications in text categorisation' (2020-2021), an interdisciplinary project with Dr. Dawn Knight (PI), Irena Spasic and Padraig Corcoran from the School of Computer Science and Informatics, and Geraint Palmer from the School of Mathematics.


2025

Siddique, Z., Turner, L. and Espinosa-Anke, L. 2025. Dialz: A Python toolkit for steering vectors. Presented at: The 63rd Annual Meeting of the Association for Computational Linguistics, Vienna, Austria, 27 July - 1 August 2025. In: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), Vol. 3. Vienna, Austria: Association for Computational Linguistics, pp. 363-375.

2024

Borkakoty, H. and Espinosa-Anke, L. 2024. HOAXPEDIA: A unified Wikipedia hoax articles dataset. Presented at: 2024 Conference on Empirical Methods in Natural Language Processing, Miami, Florida, 12-16 November 2024. In: Lucie-Aimée, L. et al. eds. Proceedings of the First Workshop on Advancing Natural Language Processing for Wikipedia. Association for Computational Linguistics, pp. 53–66.
Es, S., Janes, J., Espinosa-Anke, L. and Schockaert, S. 2024. RAGAs: Automated evaluation of retrieval augmented generation. Presented at: The 18th Conference of the European Chapter of the Association for Computational Linguistics (System Demonstrations), St Julian's, Malta, 17-22 March 2024. In: Aletras, N. and De Clercq, O. eds. Proceedings of the EACL 2024, pp. 150-158.
Siddique, Z., Turner, L. and Espinosa-Anke, L. 2024. Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models. Presented at: The 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, USA, 12-16 November 2024. In: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 18601-18619. (10.18653/v1/2024.emnlp-main.1035)
Almeman, F., Schockaert, S. and Espinosa-Anke, L. 2024. WordNet under scrutiny: Dictionary examples in the era of large language models. Presented at: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), Torino, Italy, 20-24 May 2024. In: Calzolari, N. et al. eds. Main conference proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation. ELRA, pp. 17683-17695.
Gajbhiye, A., Bouraoui, Z., Espinosa-Anke, L. and Schockaert, S. 2024. AMenDeD: Modelling concepts by aligning mentions, definitions and decontextualised embeddings. Presented at: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), Torino, Italy, 20-25 May 2024. In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation. European Language Resources Association, pp. 801-811.

2023

Antypas, D. et al. 2023. SuperTweetEval: A challenging, unified and heterogeneous benchmark for social media NLP research. Presented at: The 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-10 December 2023. In: Bouamor, H., Pino, J. and Bali, K. eds. Findings of the Association for Computational Linguistics: EMNLP 2023. Association for Computational Linguistics, pp. 12590-12697. (10.18653/v1/2023.findings-emnlp.838)
Boisson, J., Espinosa-Anke, L. and Camacho Collados, J. 2023. Construction artifacts in metaphor identification datasets. Presented at: 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-10 December 2023. In: Bouamor, H., Pino, J. and Bali, K. eds. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 6581–6590. (10.18653/v1/2023.emnlp-main.406)
Doval, Y., Camacho Collados, J., Espinosa-Anke, L. and Schockaert, S. 2023. Meemi: a simple method for post-processing and integrating cross-lingual word embeddings. Natural Language Engineering 29(3), pp. 746-768. (10.1017/S1351324921000280)
Owen, D., Antypas, D., Hassoulas, A., Pardinas, A., Espinosa-Anke, L. and Camacho Collados, J. 2023. Enabling early health care intervention by detecting depression in users of web-based forums using Language models: longitudinal analysis and evaluation. JMIR AI 2, article number: e41205. (10.2196/41205)
Gajbhiye, A., Bouraoui, Z., Li, N., Chatterjee, U., Espinosa-Anke, L. and Schockaert, S. 2023. What do deck chairs and sun hats have in common? Uncovering shared properties in large concept vocabularies. Presented at: Conference on Empirical Methods in Natural Language Processing, EMNLP, Singapore, 6-10 December 2023. In: Bouamor, H., Pino, J. and Bali, K. eds. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 10587–10596. (10.18653/v1/2023.emnlp-main.654)
Borkakoty, H. and Espinosa-Anke, L. 2023. WIKITIDE: A Wikipedia-based timestamped definition pairs dataset. Presented at: RANLP 2023: International Conference on Recent Advances in Natural Language Processing, 4-6 September 2023. In: Proceedings of Recent Advances in Natural Language Processing. Shoumen, Bulgaria: INCOMA Ltd, pp. 207-216. (10.26615/978-954-452-092-2_023)
Almeman, F., Sheikhi, H. and Espinosa-Anke, L. 2023. 3D-EX: A unified dataset of definitions and dictionary examples. Presented at: RANLP 2023: International Conference on Recent Advances in Natural Language Processing, 4-6 September 2023. In: Proceedings of Recent Advances in Natural Language Processing. Shoumen, Bulgaria: INCOMA Ltd, pp. 69-79. (10.26615/978-954-452-092-2_008)

2022

Alghanmi, I., Espinosa-Anke, L. and Schockaert, S. 2022. Self-supervised intermediate fine-tuning of biomedical language models for interpreting patient case descriptions. Presented at: 29th International Conference on Computational Linguistics (COLING), Gyeongju, Republic of Korea, 12-17 October 2022. In: Proceedings of the 29th International Conference on Computational Linguistics. International Committee on Computational Linguistics.
Gajbhiye, A., Espinosa-Anke, L. and Schockaert, S. 2022. Modelling commonsense properties using pre-trained bi-encoders. Presented at: 29th International Conference on Computational Linguistics (COLING), 12-17 October 2022. In: Proceedings of the 29th International Conference on Computational Linguistics. International Committee on Computational Linguistics, pp. 3971-3983.
Perez-Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2022. SemEval-2022 task 4: patronizing and condescending language detection. Presented at: 16th International Workshop on Semantic Evaluation (SemEval-2022), Seattle, United States, July 2022. In: Emerson, G. et al. eds. Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022). Association for Computational Linguistics, pp. 298–307. (10.18653/v1/2022.semeval-1.38)
Alghanmi, I., Espinosa-Anke, L. and Schockaert, S. 2022. Interpreting patient descriptions using distantly supervised similar case retrieval. Presented at: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 11-15 July 2022. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). ACM, pp. 460-470. (10.1145/3477495.3532003)
Loureiro, D., Barbieri, F., Neves, L., Espinosa-Anke, L. and Camacho-Collados, J. 2022. TimeLMs: Diachronic Language Models from Twitter. Presented at: 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Dublin, Ireland, 22-27 May 2022. In: Basile, V., Kozareva, Z. and Stajner, S. eds. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, pp. 251-260. (10.18653/v1/2022.acl-demo.25)
Perez Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2022. Pre-training language models for identifying patronizing and condescending language: an analysis. Presented at: LREC 2022, Marseille, France, 20-25 June 2022. In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). European Language Resources Association, pp. 3902-3911.
Wang, Y., Bouraoui, Z., Espinosa-Anke, L. and Schockaert, S. 2022. Sentence selection strategies for distilling word embeddings from BERT. Presented at: LREC 2022, Marseille, France, 20-25 June 2022. In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). European Language Resources Association (ELRA), pp. 2591-2600.

2021

Alghanmi, I., Espinosa-Anke, L. and Schockaert, S. 2021. Probing pre-trained language models for disease knowledge. Findings 2021(August), pp. 3023-3033. (10.18653/v1/2021.findings-acl.266)
Espinosa-Anke, L., Palmer, G., Filimonov, M., Corcoran, P., Spasic, I. and Knight, D. 2021. English–Welsh cross-lingual embeddings. Applied Sciences 11(14), article number: 6541. (10.3390/app11146541)
Davies, C. et al. 2021. Multi-scale user migration on Reddit. Presented at: Workshop on Cyber Social Threats at the 15th International AAAI Conference on Web and Social Media (ICWSM 2021), Virtual, 7 June 2021. AAAI. (10.36190/2021.13)
Ushio, A., Espinosa-Anke, L., Schockaert, S. and Camacho Collados, J. 2021. BERT is to NLP what AlexNet is to CV: can pre-trained language models identify analogies? Presented at: 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021), Bangkok, Thailand, 1-6 August 2021.
Li, N., Bouraoui, Z., Camacho Collados, J., Espinosa-Anke, L., Gu, Q. and Schockaert, S. 2021. Modelling general properties of nouns by selectively averaging contextualised embeddings. Presented at: 30th International Joint Conference on Artificial Intelligence (IJCAI 2021), Virtual, 21-26 August 2021.
Wang, Y., Bouraoui, Z., Espinosa-Anke, L. and Schockaert, S. 2021. Deriving word vectors from contextualized language models using topic-aware mention selection. Presented at: 6th Workshop on Representation Learning for NLP (RepL4NLP 2021), Virtual / Bangkok, Thailand, 5 August 2021. In: RepL4NLP 2021 - 6th Workshop on Representation Learning for NLP, Proceedings of the Workshop. Association for Computational Linguistics, pp. 185-194.

2020

Perez Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2020. Don't patronize me! An annotated dataset with patronizing and condescending language towards vulnerable communities. Presented at: The 28th International Conference on Computational Linguistics (COLING 2020), Virtual, 8-13 December 2020. In: Proceedings of the 28th International Conference on Computational Linguistics. Barcelona, Spain: International Committee on Computational Linguistics, pp. 5891–5902. (10.18653/v1/2020.coling-main.518)
Owen, D., Camacho Collados, J. and Espinosa-Anke, L. 2020. Towards preemptive detection of depression and anxiety in Twitter. Presented at: Social Media Mining for Health Applications Workshop & Shared Task 2020, Barcelona, Spain, 8-13 December 2020. In: Gonzalez-Hernandez, G. et al. eds. Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task. Association for Computational Linguistics, pp. 82-89.
Perez Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2020. Don’t patronize me! An annotated dataset with patronizing and condescending language towards vulnerable communities. Presented at: 28th International Conference on Computational Linguistics (COLING), Barcelona, Spain, 13-18 December 2020. In: Scott, D., Bel, N. and Zong, C. eds. Proceedings of the 28th International Conference on Computational Linguistics. International Committee on Computational Linguistics, pp. 5891–5902. (10.18653/v1/2020.coling-main.518)
Espinosa-Anke, L., Martin-Vide, C. and Spasic, I. eds. 2020. Statistical language and speech processing: 8th International Conference, SLSP 2020, Cardiff, UK, October 14–16, 2020. Springer.
Tuxworth, D., Antypas, D., Espinosa-Anke, L., Camacho-Collados, J., Preece, A. and Rogers, D. 2020. Deriving disinformation insights from geolocalized Twitter callouts. Presented at: Workshop On Deriving Insights From User-Generated Text @KDD2021, 14-18 August 2021.
Pérez-Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2020. Don't patronize me! An annotated dataset with patronizing and condescending language towards vulnerable communities. Presented at: The 28th International Conference on Computational Linguistics (COLING 2020), Virtual, 8-13 December 2020.
Alghanmi, I., Espinosa-Anke, L. and Schockaert, S. 2020. Combining BERT with static word embeddings for categorizing social media. Presented at: 6th Workshop on Noisy User-generated Text (W-NUT 2020), Virtual, 19 November 2020.
Jeawak, S., Espinosa-Anke, L. and Schockaert, S. 2020. Cardiff University at SemEval-2020 Task 6: fine-tuning BERT for domain-specific definition classification. Presented at: International Workshop on Semantic Evaluation (SemEval 2020), Barcelona, Spain, 12-13 December 2020.
Hee Lee, J., Camacho Collados, J., Espinosa-Anke, L. and Schockaert, S. 2020. Capturing word order in averaging based sentence embeddings. Presented at: European Conference on Artificial Intelligence (ECAI2020), Santiago de Compostela, Spain, 29 August - 2 September 2020.
Camacho Collados, J., Doval, Y., Martínez-Cámara, E., Espinosa-Anke, L., Barbieri, F. and Schockaert, S. 2020. Learning cross-lingual word embeddings from Twitter via distant supervision. Proceedings of the International AAAI Conference on Web and Social Media 14(1), pp. 72-82.
Bouraoui, Z., Camacho Collados, J., Espinosa-Anke, L. and Schockaert, S. 2020. Modelling semantic categories using conceptual neighborhood. Presented at: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), New York, NY, USA, 7-12 February 2020.

2019

Camacho Collados, J., Espinosa-Anke, L., Jameel, S. and Schockaert, S. 2019. A latent variable model for learning distributional relation vectors. Presented at: IJCAI-19: International Joint Conference on Artificial Intelligence, Macau, China, 10-16 August 2019. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence Main track. IJCAI, pp. 4911-4917. (10.24963/ijcai.2019/682)
Zhou, Y., Shah, J. A. and Schockaert, S. 2019. Learning household task knowledge from WikiHow descriptions. Presented at: 5th Workshop on Semantic Deep Learning, Macau, China, 10-16 August 2019. In: Espinosa-Anke, L. et al. eds. Proceedings of the 5th Workshop on Semantic Deep Learning (SemDeep-5). Association for Computational Linguistics, pp. 50-56.
Perez Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2019. Cardiff University at SemEval-2019 Task 4: Linguistic features for hyperpartisan news detection. Presented at: SemEval-2019: International Workshop on Semantic Evaluation, Minneapolis, Minnesota, USA, 6-7 June 2019. Association for Computational Linguistics, pp. 929-933. (10.18653/v1/S19-2158)
Espinosa-Anke, L., Wanner, L. and Schockaert, S. 2019. Collocation classification with unsupervised relation vectors. Presented at: 57th Annual Meeting of the Association for Computational Linguistics (ACL), Florence, Italy, 28 July - 2 August 2019.
Camacho Collados, J., Espinosa-Anke, L. and Schockaert, S. 2019. Relational word embeddings. Presented at: 57th Annual Meeting of the Association for Computational Linguistics (ACL), Florence, Italy, 28 July - 2 August 2019.

2018

Espinosa-Anke, L. and Schockaert, S. 2018. SeVeN: Augmenting word embeddings with unsupervised relation vectors. Presented at: 27th International Conference on Computational Linguistics (COLING 2018), Santa Fe, NM, USA, 20-26 August 2018.
Espinosa-Anke, L. and Schockaert, S. 2018. Syntactically aware neural architectures for definition extraction. Presented at: 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, 1-6 June 2018.

2016

Oramas, S., Espinosa-Anke, L., Sordo, M., Saggion, H. and Serra, X. 2016. Information extraction for knowledge base construction in the music domain. Data and Knowledge Engineering 106, pp. 70-83, article number: 6. (10.1016/j.datak.2016.06.001)
Espinosa-Anke, L., Saggion, H., Ronzano, F. and Navigli, R. 2016. ExTaSem! Extending, taxonomizing and semantifying domain terminologies. Presented at: Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12-17 February 2016. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. AAAI, pp. 2594-2600.
Espinosa-Anke, L., Camacho-Collados, J., Delli Bovi, C. and Saggion, H. 2016. Supervised distributional hypernym discovery via domain adaptation. Presented at: EMNLP2016: Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1-5 November 2016. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 424-435. (10.18653/v1/D16-1041)

Articles

Doval, Y., Camacho Collados, J., Espinosa-Anke, L. and Schockaert, S. 2023. Meemi: a simple method for post-processing and integrating cross-lingual word embeddings. Natural Language Engineering 29(3), pp. 746-768. (10.1017/S1351324921000280)
Owen, D., Antypas, D., Hassoulas, A., Pardinas, A., Espinosa-Anke, L. and Camacho Collados, J. 2023. Enabling early health care intervention by detecting depression in users of web-based forums using Language models: longitudinal analysis and evaluation. JMIR AI 2, article number: e41205. (10.2196/41205)
Alghanmi, I., Espinosa-Anke, L. and Schockaert, S. 2021. Probing pre-trained language models for disease knowledge. Findings 2021(August), pp. 3023-3033. (10.18653/v1/2021.findings-acl.266)
Espinosa-Anke, L., Palmer, G., Filimonov, M., Corcoran, P., Spasic, I. and Knight, D. 2021. English–Welsh cross-lingual embeddings. Applied Sciences 11(14), article number: 6541. (10.3390/app11146541)
Camacho Collados, J., Doval, Y., Martínez-Cámara, E., Espinosa-Anke, L., Barbieri, F. and Schockaert, S. 2020. Learning cross-lingual word embeddings from Twitter via distant supervision. Proceedings of the International AAAI Conference on Web and Social Media 14(1), pp. 72-82.
Oramas, S., Espinosa-Anke, L., Sordo, M., Saggion, H. and Serra, X. 2016. Information extraction for knowledge base construction in the music domain. Data and Knowledge Engineering 106, pp. 70-83., article number: 6. (10.1016/j.datak.2016.06.001)

Books

Espinosa-Anke, L., Martin-Vide, C. and Spasic, I. eds. 2020. Statistical language and speech processing: 8th International Conference, SLSP 2020, Cardiff, UK, October 14–16, 2020. Springer.

Conferences

Siddique, Z., Turner, L. and Espinosa-Anke, L. 2025. Dialz: A Python toolkit for steering vectors. Presented at: The 63rd Annual Meeting of the Association for Computational Linguistics, Vienna, Austria, 27 July - 1 August 2025Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), Vol. 3. Vienna, Austria: Association for Computational Linguistics pp. 363-375.
Borkakoty, H. and Espinosa-Anke, L. 2024. HOAXPEDIA: A unified Wikipedia hoax articles dataset. Presented at: 2024 Conference on Empirical Methods in Natural Language Processing, Miami, Florida, 12-16 November 2024 Presented at Lucie-Aimée, L. et al. eds.Proceedings of the First Workshop on Advancing Natural Language Processing for Wikipedia. Association for Computational Linguistics pp. 53–66.
Es, S., Janes, J., Espinosa-Anke, L. and Schockaert, S. 2024. RAGAs: Automated evaluation of retrieval augmented generation. Presented at: The 18th Conference of the European Chapter of the Association for Computational Linguistics (System Demonstrations), St Julian's, Malta, 17-22 March 2024 Presented at Aletras, N. and De Clercq, O. eds.Proceedings of the EACL 2024. pp. 150-158.
Siddique, Z., Turner, L. and Espinosa-Anke, L. 2024. Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models. Presented at: The 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, USA, 12-16 November 2024Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics pp. 18601-18619., (10.18653/v1/2024.emnlp-main.1035)
Almeman, F., Schockaert, S. and Espinosa-Anke, L. 2024. WordNet under scrutiny: Dictionary examples in the era of large language models. Presented at: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), Torino, Italy, 20-24 May 2024 Presented at Calzolari, N. et al. eds.Main conference proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation. ELRA pp. 17683-17695.
Gajbhiye, A., Bouraoui, Z., Espinosa-Anke, L. and Schockaert, S. 2024. AMenDeD: Modelling concepts by aligning mentions, definitions and decontextualised embeddings. Presented at: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), Torino, Italy, 20-25 May 2024Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation. European Language Resources Association pp. 801-811.
Antypas, D. et al. 2023. SuperTweetEval: A challenging, unified and heterogeneous benchmark for social media NLP research. Presented at: The 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6 - 10 December 2023 Presented at Bouamor, H., Pino, J. and Bali, K. eds.Findings of the Association for Computational Linguistics: EMNLP 2023. Association for Computational Linguistics pp. 12590-12697., (10.18653/v1/2023.findings-emnlp.838)
Boisson, J., Espinosa-Anke, L. and Camacho Collados, J. 2023. Construction artifacts in metaphor identification datasets. Presented at: 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-10 December 2023 Presented at Bouamor, H., Pino, J. and Bali, K. eds.Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics pp. 6581–6590., (10.18653/v1/2023.emnlp-main.406)
Gajbhiye, A., Bouraoui, Z., Li, N., Chatterjee, U., Espinosa-Anke, L. and Schockaert, S. 2023. What do deck chairs and sun hats have in common? Uncovering shared properties in large concept vocabularies. Presented at: Conference on Empirical Methods in Natural Language Processing, EMNLP, Singapore, 6-10 December 2023 Presented at Bouamor, H., Pino, J. and Bali, K. eds.Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics pp. 10587–10596., (10.18653/v1/2023.emnlp-main.654)
Borkakoty, H. and Espinosa-Anke, L. 2023. WIKITIDE: A Wikipedia-based timestamped definition pairs dataset. Presented at: R A N L P 2 0 2 3 International conference recent advances in natural language processing, 4-6 September 2023Proceedings of Recent Advances in Natural Language Processing. Shoumen, Bulgaria: INCOMA Ltd pp. 207-216., (10.26615/978-954-452-092-2_023)
Almeman, F., Sheikhi, H. and Espinosa-Anke, L. 2023. 3D-EX: A unified dataset of definitions and dictionary examples. Presented at: R A N L P 2 0 2 3 International conference recent advances in natural language processing, 4-6 September 2023Proceedings of Recent Advances in Natural Language Processing. Shoumen, Bulgaria: INCOMA Ltd pp. 69-79., (10.26615/978-954-452-092-2_008)
Alghanmi, I., Espinosa-Anke, L. and Schockaert, S. 2022. Self-supervised intermediate fine-tuning of biomedical language models for interpreting patient case descriptions. Presented at: 29th International Conference on Computational Linguistics (COLING), Gyeongju, Republic of Korea, 12-17 October 2022Proceedings of the 29th International Conference on Computational Linguistics. International Committee on Computational Linguistics
Gajbhiye, A., Espinosa-Anke, L. and Schockaert, S. 2022. Modelling commonsense properties using pre-trained bi-encoders. Presented at: 29th International Conference on Computational Linguistics (COLING), 12-17 October 2022Proceedings of the 29th International Conference on Computational Linguistics. International Committee on Computational Linguistics pp. 3971-3983.
Perez-Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2022. SemEval-2022 task 4: patronizing and condescending language detection. Presented at: 16th International Workshop on Semantic Evaluation (SemEval-2022), Seattle, United States, July 2022 Presented at Emerson, G. et al. eds.Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022). Association for Computational Linguistics pp. 298–307., (10.18653/v1/2022.semeval-1.38)
Alghanmi, I., Espinosa-Anke, L. and Schockaert, S. 2022. Interpreting patient descriptions using distantly supervised similar case retrieval. Presented at: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 11-15 July 2022Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). ACM pp. 460-470., (10.1145/3477495.3532003)
Loureiro, D., Barbieri, F., Neves, L., Espinosa-Anke, L. and Camacho-collados, J. 2022. TimeLMs: Diachronic Language Models from Twitter. Presented at: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp. 251-260), Dublin, Ireland, 22 - 27 May 2022 Presented at Basile, V., Kozareva, Z. and Stajner, S. eds.Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics pp. 251-260., (10.18653/v1/2022.acl-demo.25)
Perez Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2022. Pre-training language models for identifying patronizing and condescending language: an analysis. Presented at: LREC 2022, Marseille; France, 20-25 June 2022Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). European Language Resources Association pp. 3902-3911.
Wang, Y., Bouraoui, Z., Espinosa-Anke, L. and Schockaert, S. 2022. Sentence selection strategies for distilling word embeddings from BERT. Presented at: LREC 2022, Marseille, France, 20-25 June 2022Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). European Language Resources Association (ELRA) pp. 2591-2600.
Davies, C. et al. 2021. Multi-scale user migration on Reddit. Presented at: Workshop on Cyber Social Threats at the 15th International AAAI Conference on Web and Social Media (ICWSM 2021), Virtual, 07 June 2021. AAAI, (10.36190/2021.13)
Ushio, A., Espinosa-Anke, L., Schockaert, S. and Camacho Collados, J. 2021. BERT is to NLP what AlexNet is to CV: can pre-trained language models identify analogies?. Presented at: 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021), Bangkok, Thailand, 1-6 August 2021.
Li, N., Bouraoui, Z., Camacho Collados, J., Espinosa-Anke, L., Gu, Q. and Schockaert, S. 2021. Modelling general properties of nouns by selectively averaging contextualised embeddings. Presented at: 30th International Joint Conference on Artificial Intelligence (IJCAI 2021), Virtual, 21-26 August 2021.
Wang, Y., Bouraoui, Z., Espinosa-Anke, L. and Schockaert, S. 2021. Deriving word vectors from contextualized language models using topic-aware mention selection. Presented at: 6th Workshop on Representation Learning for NLP (RepL4NLP 2021), Virtual / Bangkok, Thailand, 05 August 2021RepL4NLP 2021 - 6th Workshop on Representation Learning for NLP, Proceedings of the Workshop. Association for Computational Linguistics pp. 185-194.
Perez Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2020. Don't patronize me! an annotated dataset with patronizing and condescending language towards vulnerable communities. Presented at: The 28th International Conference on Computational Linguistics (COLING 2020), Virtual, 8-13 December 2020Proceedings of the 28th International Conference on Computational Linguistics. Barcelona, Spain: International Committee on Computational Linguistics pp. 5891–5902., (10.18653/v1/2020.coling-main.518)
Owen, D., Camacho Collados, J. and Espinosa-Anke, L. 2020. Towards preemptive detection of depression and anxiety in Twitter. Presented at: Social Media Mining for Health Applications Workshop & Shared Task 2020, Barcelona, Spain, 8-13 December 2020 Presented at Gonzalez-Hernandez, G. et al. eds.Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task. Association for Computational Linguistics pp. 82-89.
Perez Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2020. Don’t patronize me! An annotated dataset with patronizing and condescending language towards vulnerable communities. Presented at: 28th International Conference on Computational Linguistics (COLING), Barcelona, Spain, 13-18 December 2020 Presented at Scott, D., Bel, N. and Zong, C. eds.Proceedings of the 28th International Conference on Computational Linguistics. International Committee on Computational Linguistics pp. 5891–5902., (10.18653/v1/2020.coling-main.518)
Tuxworth, D., Antypas, D., Espinosa-Anke, L., Camacho-Collados, J., Preece, A. and Rogers, D. 2020. Deriving disinformation insights from geolocalized Twitter callouts. Presented at: Workshop On Deriving Insights From User-Generated Text @KDD2021, 14 -18 August 2021.
Pérez-Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2020. Don't patronize me! An annotated dataset with patronizing and condescending language towards vulnerable communities. Presented at: The 28th International Conference on Computational Linguistics (COLING 2020), Virtual, 8-13 December 2020.
Alghanmi, I., Espinosa-Anke, L. and Schockaert, S. 2020. Combining BERT with static word embeddings for categorizing social media. Presented at: 6th Workshop on Noisy User-generated Text (W-NUT 2020), Virtual, 19 November 2020.
Jeawak, S., Espinosa-Anke, L. and Schockaert, S. 2020. Cardiff University at SemEval-2020 Task 6: fine-tuning BERT for domain-specific definition classification. Presented at: International Workshop on Semantic Evaluation (SemEval 2020), Barcelona, Spain, 12-13 December 2020.
Hee Lee, J., Camacho Collados, J., Espinosa-Anke, L. and Schockaert, S. 2020. Capturing word order in averaging based sentence embeddings. Presented at: European Conference on Artificial Intelligence (ECAI2020), Santiago de Compostela, Spain, 29 August - 2 September.
Bouraoui, Z., Camacho Collados, J., Espinosa-Anke, L. and Schockaert, S. 2020. Modelling semantic categories using conceptual neighborhood. Presented at: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), New York, NY, USA, 7-12 February 2020.
Camacho Collados, J., Espinosa-Anke, L., Jameel, S. and Schockaert, S. 2019. A latent variable model for learning distributional relation vectors. Presented at: IJCAI-19: International Joint Conference on Artificial Intelligence, Macau, China, 10-16 August 2019. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence Main track. IJCAI pp. 4911-4917., (10.24963/ijcai.2019/682)
Zhou, Y., Shah, J. A. and Schockaert, S. 2019. Learning household task knowledge from WikiHow descriptions. Presented at: 5th Workshop on Semantic Deep Learning, Macau, China, 10-16 August 2019 Presented at Espinosa-Anke, L. et al. eds. Proceedings of the 5th Workshop on Semantic Deep Learning (SemDeep-5). Association for Computational Linguistics pp. 50-56.
Perez Almendros, C., Espinosa-Anke, L. and Schockaert, S. 2019. Cardiff University at SemEval-2019 Task 4: Linguistic features for hyperpartisan news detection. Presented at: SemEval-2019: International Workshop on Semantic Evaluation, Minneapolis, Minnesota, USA, 6-7 June 2019. Association for Computational Linguistics pp. 929-933., (10.18653/v1/S19-2158)
Espinosa-Anke, L., Wanner, L. and Schockaert, S. 2019. Collocation classification with unsupervised relation vectors. Presented at: 57th Annual Meeting of the Association for Computational Linguistics (ACL), Florence, Italy, 28 July - 2 August 2019.
Camacho Collados, J., Espinosa-Anke, L. and Schockaert, S. 2019. Relational word embeddings. Presented at: 57th Annual Meeting of the Association for Computational Linguistics (ACL), Florence, Italy, 28 July - 2 August 2019.
Espinosa-Anke, L. and Schockaert, S. 2018. SeVeN: Augmenting word embeddings with unsupervised relation vectors. Presented at: 27th International Conference on Computational Linguistics (COLING 2018), Santa Fe, NM, USA, 20-26 August 2018.
Espinosa-Anke, L. and Schockaert, S. 2018. Syntactically aware neural architectures for definition extraction. Presented at: 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, 1-6 June 2018.
Espinosa-Anke, L., Saggion, H., Ronzano, F. and Navigli, R. 2016. ExTaSem! Extending, taxonomizing and semantifying domain terminologies. Presented at: Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12-17 February 2016. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. AAAI pp. 2594-2600.
Espinosa-Anke, L., Camacho-Collados, J., Delli Bovi, C. and Saggion, H. 2016. Supervised distributional hypernym discovery via domain adaptation. Presented at: EMNLP2016: Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1-5 November 2016. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics pp. 424-435., (10.18653/v1/D16-1041)

Database Systems
- MySQL
- MongoDB
Machine Learning
Natural Language Processing
Data Science
Python

I received my BA in English Philology from the University of Alicante (2001 - 2006), and an MA in English for Specific Purposes from the same institution (2006 - 2008). I worked as a Spanish and English teacher for two years in Madrid, until I moved to the US with a Fulbright FLTA scholarship, where I taught undergraduate Spanish at the Lincoln University of the Commonwealth of Pennsylvania. After that, I received a fellowship from laCaixa (a Spanish bank and foundation) to pursue an MA in Natural Language Processing, a joint program between the University of Wolverhampton (UK) and the Universitat Autònoma de Barcelona (Spain) (2011 - 2013). I then completed my PhD (2013 - 2017) at Pompeu Fabra University while also working as an NLP scientist for Savana, a Spain-based company focused on delivering AI-driven solutions for healthcare.

I am currently co-supervising the following PhD students:

Yixiao Wang, who is working on meaning representations, contextual word embeddings and relational encodings.
Israa Alghanmi, who is working on the intersection between NLP for social media and enriched contextual word embeddings.
David Owen, who is working on NLP for health applications, currently specialising in the detection of anxiety and depression disorders in social media.
Joanne Boisson, who is working on the identification and categorization of metaphors.

Current supervision

Zara Siddique

Contact Details

Espinosa-AnkeL@cardiff.ac.uk
+44 29225 10054