Atnafu Lambebo Tonja

Google DeepMind Academic Fellow

University College London · London, UK

About

I'm a Google DeepMind Academic Fellow at University College London. Previously I was a postdoctoral researcher at MBZUAI in the UAE, working with Prof. Thamar Solorio. I hold a PhD in Computer Science from Instituto Politécnico Nacional, Mexico, where I was advised by Prof. Alexander Gelbukh and Prof. Olga Kolesnikova.

My research focuses on natural language processing for the world's under-resourced languages — building multilingual language models, evaluation benchmarks, and speech and multimodal systems that serve communities typically left out of mainstream NLP.

Research interests
01

Under-resourced languages

Bringing modern NLP to languages with little digital text — corpora, models, and benchmarks for African and Indigenous languages.

02

Multilingual LMs

Training and evaluating small and large multilingual language models that work across high- and low-resource languages.

03

Evaluation benchmarks

Culturally aware, linguistically honest benchmarks, so that models are measured on what they actually need to do, not on convenient proxies.

04

Speech & multimodal

Speech recognition for African accents and clinical domains; vision-and-language datasets that reflect cultures outside the Western web.

Spotlight
★ Outstanding Paper EMNLP 2024

The Zeno's Paradox of "Low-Resource" Languages

Hellina H. Nigatu, Atnafu Lambebo Tonja, Benjamin Rosman, Thamar Solorio, Monojit Choudhury

A critique of the "low-resource" label itself — arguing that the term collapses meaningfully different language situations and obscures what actually needs fixing.

News
  1. 2026 2 papers accepted at ACL 2026: AfriMCQA: Multimodal Cultural Question Answering for African Languages; and CommonLID: Re-evaluating State-of-the-Art Language Identification Performance on Web Data.
  2. 2026 Joined UCL as a Google DeepMind Academic Fellow.
  3. 2025 Joined MBZUAI as a postdoctoral researcher with Prof. Thamar Solorio.
  4. 2025 1 paper accepted at NAACL 2025: ProverbEval — LLM evaluation for low-resource languages.
  5. 2024 1 paper at NeurIPS 2024 D&B: CVQA, a culturally diverse multilingual VQA benchmark.
  6. 2024 2 papers at EMNLP 2024: Zeno's Paradox of "Low-Resource" Languages (★ Outstanding Paper) and Walia-LLM (Amharic).
  7. 2024 1 paper at LREC-COLING 2024: EthioLLM.
  8. 2023 2 papers at EMNLP 2023; 1 at TACL (AfriSpeech-200); Best Paper at AACL (MasakhaNEWS); 1 at INTERSPEECH (AfriNames).