Dr Luis Espinosa-Anke
- Available for postgraduate supervision
Teams and roles for Luis Espinosa-Anke
Senior Lecturer
Overview
I am a lecturer at the School of Computer Science and Informatics. Prior to joining Cardiff, I was a Natural Language Processing (NLP) scientist at Savana Médica (a Madrid-based company focused on delivering data-driven healthcare solutions), and before that I was a PhD student at Pompeu Fabra University (Barcelona). My research, in the area of Artificial Intelligence and NLP, is centered on meaning representation, computational semantics, multilingual NLP and computational lexicography. I am a laCaixa Fellow and a Fulbright and Erasmus Mundus alumnus.
I have been PI of the Don't Patronize Me! project, supported by a Kaggle Open Research grant (2,000 USD). I am also Co-I on a project funded by Snap Inc. on modeling meaning shift in social media (10,000 USD), with Jose Camacho-Collados (PI, COMSC), Daniel Loureiro (University of Porto), and Francesco Barbieri and Leonardo Neves from Snap Inc. In addition, I am Co-I on the £90,000 Welsh Government-funded 'Learning English-Welsh bilingual embeddings and applications in text categorisation' project (2020-2021), an interdisciplinary project with Dr. Dawn Knight (PI), Irena Spasic and Padraig Corcoran from the School of Computer Science and Informatics, and Geraint Palmer from the School of Mathematics.
Publication
2025
- Borkakoty, H. and Espinosa-Anke, L. 2025. TACTICAL: A framework for building Wikipedia-derived timelines of atomic changes. Presented at: 28th European Conference on Artificial Intelligence Bologna, Italy 25-30 October 2025. Published in: Lynce, I. et al., ECAI 2025. Frontiers in Artificial Intelligence and Applications IOS Press. , pp.4410-4417. (10.3233/faia251339)
- Borkakoty, H. and Espinosa-Anke, L. 2025. WiDe-Analysis: enabling one-click content moderation analysis on Wikipedia’s articles for deletion. Presented at: ECAI 2025 Workshop on Intelligent Management Information Systems (IMIS 2025) Bologna, Italy 25-30 October 2025. Published in: Hernes, M. , Walaszczyk, E. and Rot, A. eds. Emerging Challenges in Intelligent Management Information Systems: Proceedings of 28th European Conference on Artificial Intelligence ECAI 2025 - IMIS Workshop, Volume 2. Vol. 2. Cham: Springer. , pp.351-365. (10.1007/978-3-032-06611-4_27)
- Gajbhiye, A. et al. 2025. Grouping entities with shared properties using multi-facet prompting and property embeddings. Presented at: The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP) Suzhou, China 4-9 November 2025.
- Siddique, Z. , Turner, L. and Espinosa-Anke, L. 2025. Dialz: A Python toolkit for steering vectors. Presented at: The 63rd Annual Meeting of the Association for Computational Linguistics Vienna, Austria 27 July - 1 August 2025. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations). Vol. 3.Vienna, Austria: Association for Computational Linguistics. , pp.363-375. (10.18653/v1/2025.acl-demo.35)
2024
- Almeman, F. , Schockaert, S. and Espinosa-Anke, L. 2024. WordNet under scrutiny: Dictionary examples in the era of large language models. Presented at: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING) Torino, Italy 20-24 May 2024. Published in: Calzolari, N. et al., Main conference proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation. ELRA. , pp.17683-17695.
- Borkakoty, H. and Espinosa-Anke, L. 2024. HOAXPEDIA: A unified Wikipedia hoax articles dataset. Presented at: 2024 Conference on Empirical Methods in Natural Language Processing Miami, Florida 12-16 November 2024. Published in: Lucie-Aimée, L. et al., Proceedings of the First Workshop on Advancing Natural Language Processing for Wikipedia. Association for Computational Linguistics. , pp.53–66.
- Es, S. et al., 2024. RAGAs: Automated evaluation of retrieval augmented generation. Presented at: The 18th Conference of the European Chapter of the Association for Computational Linguistics (System Demonstrations) St Julian's, Malta 17-22 March 2024. Published in: Aletras, N. and De Clercq, O. eds. Proceedings of the EACL 2024. , pp.150-158.
- Gajbhiye, A. et al. 2024. AMenDeD: Modelling concepts by aligning mentions, definitions and decontextualised embeddings. Presented at: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING) Torino, Italy 20-25 May 2024. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation. European Language Resources Association. , pp.801-811.
- Siddique, Z. , Turner, L. and Espinosa-Anke, L. 2024. Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models. Presented at: The 2024 Conference on Empirical Methods in Natural Language Processing Miami, FL, USA 12-16 November 2024. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. , pp.18601-18619. (10.18653/v1/2024.emnlp-main.1035)
2023
- Almeman, F. , Sheikhi, H. and Espinosa-Anke, L. 2023. 3D-EX: A unified dataset of definitions and dictionary examples. Presented at: RANLP 2023: International Conference on Recent Advances in Natural Language Processing 4-6 September 2023. Proceedings of Recent Advances in Natural Language Processing. Shoumen, Bulgaria: INCOMA Ltd. , pp.69-79. (10.26615/978-954-452-092-2_008)
- Antypas, D. et al., 2023. SuperTweetEval: A challenging, unified and heterogeneous benchmark for social media NLP research. Presented at: The 2023 Conference on Empirical Methods in Natural Language Processing Singapore 6 - 10 December 2023. Published in: Bouamor, H. , Pino, J. and Bali, K. eds. Findings of the Association for Computational Linguistics: EMNLP 2023. Association for Computational Linguistics. , pp.12590-12697. (10.18653/v1/2023.findings-emnlp.838)
- Boisson, J. , Espinosa-Anke, L. and Camacho Collados, J. 2023. Construction artifacts in metaphor identification datasets. Presented at: 2023 Conference on Empirical Methods in Natural Language Processing Singapore 6-10 December 2023. Published in: Bouamor, H. , Pino, J. and Bali, K. eds. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. , pp.6581–6590. (10.18653/v1/2023.emnlp-main.406)
- Borkakoty, H. and Espinosa-Anke, L. 2023. WIKITIDE: A Wikipedia-based timestamped definition pairs dataset. Presented at: RANLP 2023: International Conference on Recent Advances in Natural Language Processing 4-6 September 2023. Proceedings of Recent Advances in Natural Language Processing. Shoumen, Bulgaria: INCOMA Ltd. , pp.207-216. (10.26615/978-954-452-092-2_023)
- Doval, Y. et al., 2023. Meemi: a simple method for post-processing and integrating cross-lingual word embeddings. Natural Language Engineering 29 (3), pp.746-768. (10.1017/S1351324921000280)
- Gajbhiye, A. et al. 2023. What do deck chairs and sun hats have in common? Uncovering shared properties in large concept vocabularies. Presented at: Conference on Empirical Methods in Natural Language Processing, EMNLP Singapore 6-10 December 2023. Published in: Bouamor, H. , Pino, J. and Bali, K. eds. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. , pp.10587–10596. (10.18653/v1/2023.emnlp-main.654)
- Owen, D. et al. 2023. Enabling early health care intervention by detecting depression in users of web-based forums using Language models: longitudinal analysis and evaluation. JMIR AI 2 e41205. (10.2196/41205)
2022
- Alghanmi, I. , Espinosa-Anke, L. and Schockaert, S. 2022. Interpreting patient descriptions using distantly supervised similar case retrieval. Presented at: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 11-15 July 2022. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). ACM. , pp.460-470. (10.1145/3477495.3532003)
- Alghanmi, I. , Espinosa-Anke, L. and Schockaert, S. 2022. Self-supervised intermediate fine-tuning of biomedical language models for interpreting patient case descriptions. Presented at: 29th International Conference on Computational Linguistics (COLING) Gyeongju, Republic of Korea 12-17 October 2022. Proceedings of the 29th International Conference on Computational Linguistics. International Committee on Computational Linguistics
- Gajbhiye, A. , Espinosa-Anke, L. and Schockaert, S. 2022. Modelling commonsense properties using pre-trained bi-encoders. Presented at: 29th International Conference on Computational Linguistics (COLING) 12-17 October 2022. Proceedings of the 29th International Conference on Computational Linguistics. International Committee on Computational Linguistics. , pp.3971-3983.
- Loureiro, D. et al., 2022. TimeLMs: Diachronic Language Models from Twitter. Presented at: 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations Dublin, Ireland 22-27 May 2022. Published in: Basile, V. , Kozareva, Z. and Stajner, S. eds. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics. , pp.251-260. (10.18653/v1/2022.acl-demo.25)
- Perez Almendros, C. , Espinosa-Anke, L. and Schockaert, S. 2022. Pre-training language models for identifying patronizing and condescending language: an analysis. Presented at: LREC 2022 Marseille, France 20-25 June 2022. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). European Language Resources Association. , pp.3902-3911.
- Perez-Almendros, C. , Espinosa-Anke, L. and Schockaert, S. 2022. SemEval-2022 task 4: patronizing and condescending language detection. Presented at: 16th International Workshop on Semantic Evaluation (SemEval-2022) Seattle, United States July 2022. Published in: Emerson, G. et al., Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022). Association for Computational Linguistics. , pp.298–307. (10.18653/v1/2022.semeval-1.38)
- Wang, Y. et al. 2022. Sentence selection strategies for distilling word embeddings from BERT. Presented at: LREC 2022 Marseille, France 20-25 June 2022. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). European Language Resources Association (ELRA). , pp.2591-2600.
2021
- Alghanmi, I. , Espinosa-Anke, L. and Schockaert, S. 2021. Probing pre-trained language models for disease knowledge. Findings 2021 (August), pp.3023-3033. (10.18653/v1/2021.findings-acl.266)
- Davies, C. et al. 2021. Multi-scale user migration on Reddit. Presented at: Workshop on Cyber Social Threats at the 15th International AAAI Conference on Web and Social Media (ICWSM 2021) Virtual 07 June 2021. AAAI. (10.36190/2021.13)
- Espinosa-Anke, L. et al. 2021. English–Welsh cross-lingual embeddings. Applied Sciences 11 (14) 6541. (10.3390/app11146541)
- Li, N. et al., 2021. Modelling general properties of nouns by selectively averaging contextualised embeddings. Presented at: 30th International Joint Conference on Artificial Intelligence (IJCAI 2021) Virtual 21-26 August 2021.
- Ushio, A. et al. 2021. BERT is to NLP what AlexNet is to CV: can pre-trained language models identify analogies?. Presented at: 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021) Bangkok, Thailand 1-6 August 2021.
- Wang, Y. et al. 2021. Deriving word vectors from contextualized language models using topic-aware mention selection. Presented at: 6th Workshop on Representation Learning for NLP (RepL4NLP 2021) Virtual / Bangkok, Thailand 05 August 2021. Proceedings of the 6th Workshop on Representation Learning for NLP. Association for Computational Linguistics. , pp.185-194. (10.18653/v1/2021.repl4nlp-1.19)
2020
- Alghanmi, I. , Espinosa-Anke, L. and Schockaert, S. 2020. Combining BERT with static word embeddings for categorizing social media. Presented at: 6th Workshop on Noisy User-generated Text (W-NUT 2020) Virtual 19 November 2020. Published in: Xu, W. et al., Proceedings of the Sixth Workshop on Noisy User-generated Text. Association for Computational Linguistics. , pp.28-33. (10.18653/v1/2020.wnut-1.5)
- Bouraoui, Z. et al., 2020. Modelling semantic categories using conceptual neighborhood. Presented at: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20) New York, NY, USA 7-12 February 2020. Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34(05).PKP Publishing Services. , pp.7448-7455. (10.1609/aaai.v34i05.6241)
- Camacho Collados, J. et al. 2020. Learning cross-lingual word embeddings from Twitter via distant supervision. Proceedings of the International AAAI Conference on Web and Social Media 14 (1), pp.72-82.
- Jeawak, S. S. , Espinosa-Anke, L. and Schockaert, S. 2020. Cardiff University at SemEval-2020 Task 6: fine-tuning BERT for domain-specific definition classification. Presented at: International Workshop on Semantic Evaluation (SemEval 2020) Barcelona, Spain 12-13 December 2020. Published in: Herbelot, A. et al., Proceedings of the Fourteenth International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics. International Committee for Computational Linguistics. , pp.361-366.
- Lee, J. H. et al. 2020. Capturing word order in averaging based sentence embeddings. Presented at: European Conference on Artificial Intelligence (ECAI2020) Santiago de Compostela, Spain 29 August - 2 September 2020. Published in: De Giacomo, G. et al., 24th European Conference on Artificial Intelligence. Vol. 325.IOS Press. , pp.2062-2069. (10.3233/FAIA200328)
- Owen, D. , Camacho Collados, J. and Espinosa-Anke, L. 2020. Towards preemptive detection of depression and anxiety in Twitter. Presented at: Social Media Mining for Health Applications Workshop & Shared Task 2020 Barcelona, Spain 8-13 December 2020. Published in: Gonzalez-Hernandez, G. et al., Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task. Association for Computational Linguistics. , pp.82-89.
- Perez Almendros, C. , Espinosa-Anke, L. and Schockaert, S. 2020. Don’t patronize me! An annotated dataset with patronizing and condescending language towards vulnerable communities. Presented at: 28th International Conference on Computational Linguistics (COLING) Barcelona, Spain 13-18 December 2020. Published in: Scott, D. , Bel, N. and Zong, C. eds. Proceedings of the 28th International Conference on Computational Linguistics. International Committee on Computational Linguistics. , pp.5891–5902. (10.18653/v1/2020.coling-main.518)
- Tuxworth, D. et al. 2020. Deriving disinformation insights from geolocalized Twitter callouts. Presented at: Workshop On Deriving Insights From User-Generated Text @KDD2021 14 -18 August 2021.
2019
- Camacho Collados, J. et al. 2019. A latent variable model for learning distributional relation vectors. Presented at: IJCAI-19: International Joint Conference on Artificial Intelligence Macau, China 10-16 August 2019. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence Main track. IJCAI. , pp.4911-4917. (10.24963/ijcai.2019/682)
- Camacho Collados, J. , Espinosa-Anke, L. and Schockaert, S. 2019. Relational word embeddings. Presented at: 57th Annual Meeting of the Association for Computational Linguistics (ACL) Florence, Italy 28 July - 2 August 2019. Published in: Korhonen, A. , Traum, D. and Marquez, L. eds. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics. , pp.3286-3296.
- Espinosa-Anke, L. , Schockaert, S. and Wanner, L. 2019. Collocation classification with unsupervised relation vectors. Presented at: 57th Annual Meeting of the Association for Computational Linguistics (ACL) Florence, Italy 28 July - 2 August 2019. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics. , pp.5765-5772. (10.18653/v1/P19-1576)
- Perez Almendros, C. , Espinosa-Anke, L. and Schockaert, S. 2019. Cardiff University at SemEval-2019 Task 4: Linguistic features for hyperpartisan news detection. Presented at: SemEval-2019: International Workshop on Semantic Evaluation Minneapolis, Minnesota, USA 6-7 June 2019. Association for Computational Linguistics. , pp.929-933. (10.18653/v1/S19-2158)
2018
- Espinosa-Anke, L. and Schockaert, S. 2018. SeVeN: Augmenting word embeddings with unsupervised relation vectors. Presented at: 27th International Conference on Computational Linguistics (COLING 2018) Santa Fe, NM, USA 20-26 August 2018.
- Espinosa-Anke, L. and Schockaert, S. 2018. Syntactically aware neural architectures for definition extraction. Presented at: 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies New Orleans 1-6 June 2018.
2016
- Espinosa-Anke, L. et al. 2016. Supervised distributional hypernym discovery via domain adaptation. Presented at: EMNLP2016: Conference on Empirical Methods in Natural Language Processing Austin, TX, USA 1-5 November 2016. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. , pp.424-435. (10.18653/v1/D16-1041)
- Espinosa-Anke, L. et al. 2016. ExTaSem! Extending, taxonomizing and semantifying domain terminologies. Presented at: Thirtieth AAAI Conference on Artificial Intelligence Phoenix, AZ, USA 12-17 February 2016. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. AAAI. , pp.2594-2600.
- Oramas, S. et al., 2016. Information extraction for knowledge base construction in the music domain. Data and Knowledge Engineering 106 , pp.70-83. 6. (10.1016/j.datak.2016.06.001)
Articles
- Alghanmi, I. , Espinosa-Anke, L. and Schockaert, S. 2021. Probing pre-trained language models for disease knowledge. Findings 2021 (August), pp.3023-3033. (10.18653/v1/2021.findings-acl.266)
- Camacho Collados, J. et al. 2020. Learning cross-lingual word embeddings from Twitter via distant supervision. Proceedings of the International AAAI Conference on Web and Social Media 14 (1), pp.72-82.
- Doval, Y. et al., 2023. Meemi: a simple method for post-processing and integrating cross-lingual word embeddings. Natural Language Engineering 29 (3), pp.746-768. (10.1017/S1351324921000280)
- Espinosa-Anke, L. et al. 2021. English–Welsh cross-lingual embeddings. Applied Sciences 11 (14) 6541. (10.3390/app11146541)
- Oramas, S. et al., 2016. Information extraction for knowledge base construction in the music domain. Data and Knowledge Engineering 106 , pp.70-83. 6. (10.1016/j.datak.2016.06.001)
- Owen, D. et al. 2023. Enabling early health care intervention by detecting depression in users of web-based forums using Language models: longitudinal analysis and evaluation. JMIR AI 2 e41205. (10.2196/41205)
Conferences
- Alghanmi, I. , Espinosa-Anke, L. and Schockaert, S. 2020. Combining BERT with static word embeddings for categorizing social media. Presented at: 6th Workshop on Noisy User-generated Text (W-NUT 2020) Virtual 19 November 2020. Published in: Xu, W. et al., Proceedings of the Sixth Workshop on Noisy User-generated Text. Association for Computational Linguistics. , pp.28-33. (10.18653/v1/2020.wnut-1.5)
- Alghanmi, I. , Espinosa-Anke, L. and Schockaert, S. 2022. Interpreting patient descriptions using distantly supervised similar case retrieval. Presented at: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 11-15 July 2022. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). ACM. , pp.460-470. (10.1145/3477495.3532003)
- Alghanmi, I. , Espinosa-Anke, L. and Schockaert, S. 2022. Self-supervised intermediate fine-tuning of biomedical language models for interpreting patient case descriptions. Presented at: 29th International Conference on Computational Linguistics (COLING) Gyeongju, Republic of Korea 12-17 October 2022. Proceedings of the 29th International Conference on Computational Linguistics. International Committee on Computational Linguistics
- Almeman, F. , Sheikhi, H. and Espinosa-Anke, L. 2023. 3D-EX: A unified dataset of definitions and dictionary examples. Presented at: R A N L P 2 0 2 3 International conference recent advances in natural language processing 4-6 September 2023. Proceedings of Recent Advances in Natural Language Processing. Shoumen, Bulgaria: INCOMA Ltd. , pp.69-79. (10.26615/978-954-452-092-2_008)
- Almeman, F. , Schockaert, S. and Espinosa-Anke, L. 2024. WordNet under scrutiny: Dictionary examples in the era of large language models. Presented at: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING) Torino, Italy 20-24 May 2024. Published in: Calzolari, N. et al., Main conference proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation. ELRA. , pp.17683-17695.
- Antypas, D. et al., 2023. SuperTweetEval: A challenging, unified and heterogeneous benchmark for social media NLP research. Presented at: The 2023 Conference on Empirical Methods in Natural Language Processing Singapore 6 - 10 December 2023. Published in: Bouamor, H. , Pino, J. and Bali, K. eds. Findings of the Association for Computational Linguistics: EMNLP 2023. Association for Computational Linguistics. , pp.12590-12697. (10.18653/v1/2023.findings-emnlp.838)
- Boisson, J. , Espinosa-Anke, L. and Camacho Collados, J. 2023. Construction artifacts in metaphor identification datasets. Presented at: 2023 Conference on Empirical Methods in Natural Language Processing Singapore 6-10 December 2023. Published in: Bouamor, H. , Pino, J. and Bali, K. eds. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. , pp.6581–6590. (10.18653/v1/2023.emnlp-main.406)
- Borkakoty, H. and Espinosa-Anke, L. 2024. HOAXPEDIA: A unified Wikipedia hoax articles dataset. Presented at: 2024 Conference on Empirical Methods in Natural Language Processing Miami, Florida 12-16 November 2024. Published in: Lucie-Aimée, L. et al., Proceedings of the First Workshop on Advancing Natural Language Processing for Wikipedia. Association for Computational Linguistics. , pp.53–66.
- Borkakoty, H. and Espinosa-Anke, L. 2025. TACTICAL: A framework for building Wikipedia-derived timelines of atomic changes. Presented at: 28th European Conference on Artificial Intelligence Bologna, Italy 25-30 October 2025. Published in: Lynce, I. et al., ECAI 2025. Frontiers in Artificial Intelligence and Applications IOS Press. , pp.4410-4417. (10.3233/faia251339)
- Borkakoty, H. and Espinosa-Anke, L. 2025. WiDe-Analysis: enabling one-click content moderation analysis on Wikipedia’s articles for deletion. Presented at: ECAI 2025 Workshop on Intelligent Management Information Systems (IMIS 2025) Bologna, Italy 25-30 10 2025. Published in: Hernes, M. , Walaszczyk, E. and Rot, A. eds. Emerging Challenges in Intelligent Management Information Systems: Proceedings of 28th European Conference on Artificial Intelligence ECAI 2025 - IMIS Workshop, Volume 2. Vol. 2.Chem: Springer. , pp.351-365. (10.1007/978-3-032-06611-4_27)
- Borkakoty, H. and Espinosa-Anke, L. 2023. WIKITIDE: A Wikipedia-based timestamped definition pairs dataset. Presented at: R A N L P 2 0 2 3 International conference recent advances in natural language processing 4-6 September 2023. Proceedings of Recent Advances in Natural Language Processing. Shoumen, Bulgaria: INCOMA Ltd. , pp.207-216. (10.26615/978-954-452-092-2_023)
- Bouraoui, Z. et al., 2020. Modelling semantic categories using conceptual neighborhood. Presented at: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20) New York, NY, USA 7-12 February 2020. Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34(05).PKP Publishing Services. , pp.7448-7455. (10.1609/aaai.v34i05.6241)
- Camacho Collados, J. et al. 2019. A latent variable model for learning distributional relation vectors. Presented at: IJCAI-19: International Joint Conference on Artificial Intelligence Macau, China 10-16 August 2019. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence Main track. IJCAI. , pp.4911-4917. (10.24963/ijcai.2019/682)
- Camacho Collados, J. , Espinosa-Anke, L. and Schockaert, S. 2019. Relational word embeddings. Presented at: 57th Annual Meeting of the Association for Computational Linguistics (ACL) Florence, Italy 28 July - 2 August 2019. Published in: Korhonen, A. , Traum, D. and Marquez, L. eds. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics. , pp.3286-3296.
- Davies, C. et al. 2021. Multi-scale user migration on Reddit. Presented at: Workshop on Cyber Social Threats at the 15th International AAAI Conference on Web and Social Media (ICWSM 2021) Virtual 07 June 2021. AAAI(10.36190/2021.13)
- Es, S. et al., 2024. RAGAs: Automated evaluation of retrieval augmented generation. Presented at: The 18th Conference of the European Chapter of the Association for Computational Linguistics (System Demonstrations) St Julian's, Malta 17-22 March 2024. Published in: Aletras, N. and De Clercq, O. eds. Proceedings of the EACL 2024. , pp.150-158.
- Espinosa-Anke, L. et al. 2016. Supervised distributional hypernym discovery via domain adaptation. Presented at: EMNLP2016: Conference on Empirical Methods in Natural Language Processing Austin, TX, USA 1-5 November 2016. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. , pp.424-435. (10.18653/v1/D16-1041)
- Espinosa-Anke, L. et al. 2016. ExTaSem! Extending, taxonomizing and semantifying domain terminologies. Presented at: Thirtieth AAAI Conference on Artificial Intelligence Phoenix, AZ, USA 12-17 February 2016. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. AAAI. , pp.2594-2600.
- Espinosa-Anke, L. and Schockaert, S. 2018. SeVeN: Augmenting word embeddings with unsupervised relation vectors. Presented at: 27th International Conference on Computational Linguistics (COLING 2018) Santa Fe, NM, USA 20-26 August 2018.
- Espinosa-Anke, L. and Schockaert, S. 2018. Syntactically aware neural architectures for definition extraction. Presented at: 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies New Orleans 1-6 June 2018.
- Espinosa-Anke, L. , Schockaert, S. and Wanner, L. 2019. Collocation classification with unsupervised relation vectors. Presented at: 57th Annual Meeting of the Association for Computational Linguistics (ACL) Florence, Italy 28 July - 2 August 2019. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics. , pp.5765-5772. (10.18653/v1/P19-1576)
- Gajbhiye, A. et al. 2025. Grouping entities with shared properties using multi-facet prompting and property embeddings. Presented at: The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP) Suzhou, China 4-9 November 2025.
- Gajbhiye, A. et al. 2024. AMenDeD: Modelling concepts by aligning mentions, definitions and decontextualised embeddings. Presented at: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING) Torino, Italy 20-25 May 2024. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation. European Language Resources Association. , pp.801-811.
- Gajbhiye, A. et al. 2023. What do deck chairs and sun hats have in common? Uncovering shared properties in large concept vocabularies. Presented at: Conference on Empirical Methods in Natural Language Processing, EMNLP Singapore 6-10 December 2023. Published in: Bouamor, H. , Pino, J. and Bali, K. eds. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. , pp.10587–10596. (10.18653/v1/2023.emnlp-main.654)
- Gajbhiye, A. , Espinosa-Anke, L. and Schockaert, S. 2022. Modelling commonsense properties using pre-trained bi-encoders. Presented at: 29th International Conference on Computational Linguistics (COLING) 12-17 October 2022. Proceedings of the 29th International Conference on Computational Linguistics. International Committee on Computational Linguistics. , pp.3971-3983.
- Jeawak, S. S. , Espinosa-Anke, L. and Schockaert, S. 2020. Cardiff University at SemEval-2020 Task 6: fine-tuning BERT for domain-specific definition classification. Presented at: International Workshop on Semantic Evaluation (SemEval 2020) Barcelona, Spain 12-13 December 2020. Published in: Herbelot, A. et al., Proceedings of the Fourteenth International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics. International Committee for Computational Linguistics. , pp.361-366.
- Lee, J. H. et al. 2020. Capturing word order in averaging based sentence embeddings. Presented at: European Conference on Artificial Intelligence (ECAI2020) Santiago de Compostela, Spain 29 August - 2 September 2020. Published in: De Giacomo, G. et al., 24th European Conference on Artificial Intelligence. Vol. 325.IOS Press. , pp.2062-2069. (10.3233/FAIA200328)
- Li, N. et al., 2021. Modelling general properties of nouns by selectively averaging contextualised embeddings. Presented at: 30th International Joint Conference on Artificial Intelligence (IJCAI 2021) Virtual 21-26 August 2021.
- Loureiro, D. et al., 2022. TimeLMs: Diachronic Language Models from Twitter. Presented at: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp. 251-260) Dublin, Ireland 22 - 27 May 2022. Published in: Basile, V. , Kozareva, Z. and Stajner, S. eds. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics. , pp.251-260. (10.18653/v1/2022.acl-demo.25)
- Owen, D. , Camacho Collados, J. and Espinosa-Anke, L. 2020. Towards preemptive detection of depression and anxiety in Twitter. Presented at: Social Media Mining for Health Applications Workshop & Shared Task 2020 Barcelona, Spain 8-13 December 2020. Published in: Gonzalez-Hernandez, G. et al., Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task. Association for Computational Linguistics. , pp.82-89.
- Perez Almendros, C. , Espinosa-Anke, L. and Schockaert, S. 2019. Cardiff University at SemEval-2019 Task 4: Linguistic features for hyperpartisan news detection. Presented at: SemEval-2019: International Workshop on Semantic Evaluation Minneapolis, Minnesota, USA 6-7 June 2019. Association for Computational Linguistics. , pp.929-933. (10.18653/v1/S19-2158)
- Perez Almendros, C. , Espinosa-Anke, L. and Schockaert, S. 2020. Don't patronize me! an annotated dataset with patronizing and condescending language towards vulnerable communities. Presented at: The 28th International Conference on Computational Linguistics (COLING 2020) Virtual 8-13 December 2020. Proceedings of the 28th International Conference on Computational Linguistics. Barcelona, Spain: International Committee on Computational Linguistics. , pp.5891–5902. (10.18653/v1/2020.coling-main.518)
- Perez Almendros, C. , Espinosa-Anke, L. and Schockaert, S. 2020. Don’t patronize me! An annotated dataset with patronizing and condescending language towards vulnerable communities. Presented at: 28th International Conference on Computational Linguistics (COLING) Barcelona, Spain 13-18 December 2020. Published in: Scott, D. , Bel, N. and Zong, C. eds. Proceedings of the 28th International Conference on Computational Linguistics. International Committee on Computational Linguistics. , pp.5891–5902. (10.18653/v1/2020.coling-main.518)
- Perez Almendros, C. , Espinosa-Anke, L. and Schockaert, S. 2022. Pre-training language models for identifying patronizing and condescending language: an analysis. Presented at: LREC 2022 Marseille, France 20-25 June 2022. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). European Language Resources Association. , pp.3902-3911.
- Pérez-Almendros, C. , Espinosa-Anke, L. and Schockaert, S. 2020. Don't patronize me! An annotated dataset with patronizing and condescending language towards vulnerable communities. Presented at: The 28th International Conference on Computational Linguistics (COLING 2020) Virtual 8-13 December 2020.
- Perez-Almendros, C. , Espinosa-Anke, L. and Schockaert, S. 2022. SemEval-2022 task 4: patronizing and condescending language detection. Presented at: 16th International Workshop on Semantic Evaluation (SemEval-2022) Seattle, United States July 2022. Published in: Emerson, G. et al., Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022). Association for Computational Linguistics. , pp.298–307. (10.18653/v1/2022.semeval-1.38)
- Siddique, Z. , Turner, L. and Espinosa-Anke, L. 2025. Dialz: A Python toolkit for steering vectors. Presented at: The 63rd Annual Meeting of the Association for Computational Linguistics Vienna, Austria 27 July - 1 August 2025. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations). Vol. 3.Vienna, Austria: Association for Computational Linguistics. , pp.363-375. (10.18653/v1/2025.acl-demo.35)
- Siddique, Z. , Turner, L. and Espinosa-Anke, L. 2024. Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models. Presented at: The 2024 Conference on Empirical Methods in Natural Language Processing Miami, FL, USA 12-16 November 2024. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. , pp.18601-18619. (10.18653/v1/2024.emnlp-main.1035)
- Tuxworth, D. et al. 2020. Deriving disinformation insights from geolocalized Twitter callouts. Presented at: Workshop On Deriving Insights From User-Generated Text @KDD2021 14-18 August 2021.
- Ushio, A. et al. 2021. BERT is to NLP what AlexNet is to CV: can pre-trained language models identify analogies? Presented at: 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021) Bangkok, Thailand 1-6 August 2021.
- Wang, Y. et al. 2021. Deriving word vectors from contextualized language models using topic-aware mention selection. Presented at: 6th Workshop on Representation Learning for NLP (RepL4NLP 2021) Virtual / Bangkok, Thailand 05 August 2021. Proceedings of the 6th Workshop on Representation Learning for NLP. Association for Computational Linguistics. , pp.185-194. (10.18653/v1/2021.repl4nlp-1.19)
- Wang, Y. et al. 2022. Sentence selection strategies for distilling word embeddings from BERT. Presented at: LREC 2022 Marseille, France 20-25 June 2022. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). European Language Resources Association (ELRA). , pp.2591-2600.
Teaching
- Database Systems
- MySQL
- MongoDB
- Machine Learning
- Natural Language Processing
- Data Science
- Python
Biography
I received my BA in English Philology from the University of Alicante (2001 - 2006), and an MA in English for Specific Purposes from the same institution (2006 - 2008). I worked as a Spanish and English teacher for two years in Madrid, until I moved to the US with a Fulbright FLTA scholarship, where I taught undergraduate Spanish at the Lincoln University of the Commonwealth of Pennsylvania. After that, I received a fellowship from laCaixa (a Spanish bank and foundation) to pursue an MA in Natural Language Processing, a joint program between the University of Wolverhampton (UK) and the Universitat Autònoma de Barcelona (Spain) (2011 - 2013). I then completed my PhD (2013 - 2017) at Pompeu Fabra University while working as an NLP scientist for Savana, a Spain-based company focused on delivering AI-driven solutions for healthcare.
Supervisions
I am currently co-supervising the following PhD students:
- Yixiao Wang, who is working on meaning representations, contextual word embeddings and relational encodings.
- Israa Alghanmi, who is working at the intersection of NLP for social media and enriched contextual word embeddings.
- David Owen, who is working on NLP for health applications, currently specialising in the detection of anxiety and depression disorders in social media.
- Joanne Boisson, who is working on the identification and categorization of metaphors.