For my latest publications, please visit my Google Scholar page.

2024

<b>ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding</b><br> Israel Abebe Azime, <b>Atnafu Lambebo Tonja</b>, Tadesse Destaw Belay, Yonas Chanie, Bontu Fufa Balcha, Negasi Haile Abadi, Henok Biadglign Ademtew, Mulubrhan Abebe Nerea, Debela Desalegn Yadeta, Derartu Dagne Geremew, Assefa Atsbiha Tesfau, Philipp Slusallek, Thamar Solorio, Dietrich Klakow

<b>InkubaLM: A small language model for low-resource African languages</b><br> <b>Atnafu Lambebo Tonja</b>, Bonaventure F. P. Dossou, Jessica Ojo, Jenalea Rajab, Fadel Thior, Eric Peter Wairagala, Anuoluwapo Aremu, Pelonomi Moiloa, Jade Abbott, Vukosi Marivate, Benjamin Rosman

<b>Walia-LLM: Enhancing Amharic-LLaMA by Integrating Task-Specific and Generative Datasets</b><br> Israel Abebe Azime, <b>Atnafu Lambebo Tonja</b>, Tadesse Destaw Belay, Mitiku Yohannes Fuge, Aman Kassahun Wassie, et al.

<b>CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark</b><br> David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo, Teresa Lynn, Injy Hamed, Aditya Nanda Kishore, Aishik Mandal, Alina Dragonetti, Artem Abzaliev, <b>Atnafu Lambebo Tonja</b>, et al.

<b>Gender Bias Evaluation in Machine Translation for Amharic, Tigrigna, and Afaan Oromoo</b><br> Walelign Tewabe Sewunetie, <b>Atnafu Lambebo Tonja</b>, Tadesse Destaw Belay, Hellina Hailu Nigatu, Gashaw Kidanu, Zewdie Mossie, Hussien Seid, Seid Muhie Yimam

<b>NLP Progress in Indigenous Latin American Languages</b><br> <b>Atnafu Lambebo Tonja</b>, Fazlourrahman Balouchzahi, Sabur Butt, Olga Kolesnikova, Hector Ceballos, Alexander Gelbukh, Thamar Solorio

<b>EthioMT: Parallel Corpus for Low-resource Ethiopian Languages</b><br> <b>Atnafu Lambebo Tonja</b>, Olga Kolesnikova, Alexander Gelbukh, Jugal Kalita

<b>EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation</b><br> <b>Atnafu Lambebo Tonja</b>, Israel Abebe Azime, Tadesse Destaw Belay, Mesay Gemeda Yigezu, et al.

2023

<b>Cross-lingual Open-Retrieval Question Answering for African Languages</b><br> Odunayo Ogundepo, Tajuddeen Gwadabe, Clara Rivera, Jonathan H Clark, Sebastian Ruder, David Adelani, Bonaventure Dossou, <b>Atnafu Lambebo Tonja</b>, et al.

<b>The Less the Merrier? Investigating Language Representation in Multilingual Models </b><br> Hellina Hailu Nigatu, <b>Atnafu Lambebo Tonja </b>, Jugal Kalita. <i>In EMNLP 2023 </i>

<b>AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR</b><br> Tobi Olatunji, Tejumade Afonja, Aditya Yadavalli, Chris Chinenye Emezue, Sahib Singh, Bonaventure F.P. Dossou, Joanne Osuchukwu, Salomey Osei, <b>Atnafu Lambebo Tonja</b>, Naome Etori, Clinton Mbataku. <i>In TACL 2023</i>

<b>MasakhaNEWS: News Topic Classification for African languages</b><br> David Ifeoluwa Adelani, Marek Masiak, Israel Abebe Azime, Jesujoba O. Alabi, <b>Atnafu Lambebo Tonja</b>, Christine Mwase, Odunayo Ogundepo, Bonaventure F. P. Dossou, Akintunde Oladipo, …, and Pontus Stenetorp. <i>In IJCNLP-AACL [Best Paper Award …Area Chair Award (Resources and Evaluation)], 2023 & AfricaNLP Workshop 2023</i>.

<b>AfriNames: Most ASR models "butcher" African Names</b><br> Tobi Olatunji, Tejumade Afonja, Bonaventure FP Dossou, <b>Atnafu Lambebo Tonja</b>, Chris Chinenye Emezue, Amina Mardiyyah Rufai, Sahib Singh. <i>In INTERSPEECH 2023</i>

<b>Parallel Corpus for Indigenous Language Translation: Spanish-Mazatec and Spanish-Mixtec</b><br> <b>Atnafu Lambebo Tonja</b>, Christian Maldonado-Sifuentes, David Alejandro Mendoza Castillo, Olga Kolesnikova, Noé Castro-Sánchez, Grigori Sidorov, Alexander Gelbukh. <i>In AmericasNLP Workshop at ACL 2023</i>

<b>Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models</b><br> <b>Atnafu Lambebo Tonja</b>, Hellina Hailu Nigatu, Olga Kolesnikova, Grigori Sidorov, Alexander Gelbukh, Jugal Kalita. <i>In AmericasNLP Workshop at ACL 2023</i>

<b>Masakhane-Afrisenti at SemEval-2023 Task 12: Sentiment Analysis using Afro-centric Language Models and Adapters for Low-resource African Languages</b><br> Israel Abebe Azime, Sana Al-azzawi, <b>Atnafu Lambebo Tonja</b>, Iyanuoluwa Shode, Jesujoba Alabi, Ayodele Awokoya, Mardiyyah Oduwole, Tosin Adewumi, Samuel Fanijo, Awosan Oyinkansola. <i>In SemEval-2023 Workshop at ACL 2023</i>

<b>Natural Language Processing in Ethiopian Languages: Current State, Challenges, and Opportunities</b><br> <b>Atnafu Lambebo Tonja</b>, Tadesse Destaw Belay, Israel Abebe Azime, Abinew Ali Ayele, Moges Ahmed Mehamed, Olga Kolesnikova, Seid Muhie Yimam. <i>In RAIL-2023 Workshop at EACL 2023</i>

<b>Low-Resource Neural Machine Translation Improvement Using Source-Side Monolingual Data</b><br> <b>Atnafu Lambebo Tonja</b>, Olga Kolesnikova, Alexander Gelbukh, Grigori Sidorov. <i>Journal of Applied Sciences</i>

2022

<b>AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages</b><br> Bonaventure F. P. Dossou, <b>Atnafu Lambebo Tonja</b>, Oreen Yousuf, Salomey Osei, Abigail Oppong, Iyanuoluwa Shode, Oluwabusayo Olufunke Awoyomi, Chris Chinenye Emezue. <i>In SustaiNLP Workshop, co-located with EMNLP 2022</i>

<b>The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation</b><br> Tadesse Destaw Belay, <b>Atnafu Lambebo Tonja</b>, Olga Kolesnikova, Seid Muhie Yimam, Abinew Ali Ayele, Silesh Bogale Haile, Grigori Sidorov, Alexander Gelbukh. <i>In 2022 International Conference on Information and Communication Technology for Development for Africa (ICT4DA) </i>

<b>Improving neural machine translation for low resource languages using mixed training: The case of Ethiopian languages</b><br> <b>Atnafu Lambebo Tonja</b>, Olga Kolesnikova, Muhammad Arif, Alexander Gelbukh, Grigori Sidorov. <i>In Mexican International Conference on Artificial Intelligence</i>

<b>Early Ginger Disease Detection Using Deep Learning Approach</b><br> Mesay Gemeda Yigezu, Michael Melese Woldeyohannis, <b>Atnafu Lambebo Tonja</b>. <i> In International Conference on Advances of Science and Technology</i>

2021

<b>A parallel corpora for bi-directional neural machine translation for low resourced Ethiopian languages</b><br> <b>Atnafu Lambebo Tonja</b>, Michael Melese Woldeyohannis, Mesay Gemeda Yigezu. <i>In 2021 International Conference on Information and Communication Technology for Development for Africa (ICT4DA)</i>

<b>Multilingual neural machine translation for low resourced languages: Ometo-English</b><br> Mesay Gemeda Yigezu, Michael Melese Woldeyohannis, <b>Atnafu Lambebo Tonja</b>. <i>In 2021 International Conference on Information and Communication Technology for Development for Africa (ICT4DA)</i>