For my latest publications, please visit my Google Scholar page.

2024

InkubaLM: A small language model for low-resource African languages

Atnafu Lambebo Tonja, Bonaventure F. P. Dossou, Jessica Ojo, Jenalea Rajab, Fadel Thior, Eric Peter Wairagala, Anuoluwapo Aremu, Pelonomi Moiloa, Jade Abbott, Vukosi Marivate, Benjamin Rosman

Walia-LLM: Enhancing Amharic-LLaMA by Integrating Task-Specific and Generative Datasets

Israel Abebe Azime, Atnafu Lambebo Tonja, Tadesse Destaw Belay, Mitiku Yohannes Fuge, Aman Kassahun Wassie, et al.

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark

David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo, Teresa Lynn, Injy Hamed, Aditya Nanda Kishore, Aishik Mandal, Alina Dragonetti, Artem Abzaliev, Atnafu Lambebo Tonja et al.

Gender Bias Evaluation in Machine Translation for Amharic, Tigrigna, and Afaan Oromoo

Walelign Tewabe Sewunetie, Atnafu Lambebo Tonja, Tadesse Destaw Belay, Hellina Hailu Nigatu, Gashaw Kidanu, Zewdie Mossie, Hussien Seid, Seid Muhie Yimam

NLP Progress in Indigenous Latin American Languages

Atnafu Lambebo Tonja, Fazlourrahman Balouchzahi, Sabur Butt, Olga Kolesnikova, Hector Ceballos, Alexander Gelbukh, Thamar Solorio

EthioMT: Parallel Corpus for Low-resource Ethiopian Languages

Atnafu Lambebo Tonja, Olga Kolesnikova, Alexander Gelbukh, Jugal Kalita

EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation

Atnafu Lambebo Tonja, Israel Abebe Azime, Tadesse Destaw Belay, Mesay Gemeda Yigezu, et al.

2023

Cross-lingual Open-Retrieval Question Answering for African Languages

Odunayo Ogundepo, Tajuddeen Gwadabe, Clara Rivera, Jonathan H Clark, Sebastian Ruder, David Adelani, Bonaventure Dossou, Atnafu Lambebo Tonja et al.

The Less the Merrier? Investigating Language Representation in Multilingual Models
Hellina Hailu Nigatu, Atnafu Lambebo Tonja, Jugal Kalita. In EMNLP 2023

AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR
Tobi Olatunji, Tejumade Afonja, Aditya Yadavalli, Chris Chinenye Emezue, Sahib Singh, Bonaventure F.P. Dossou, Joanne Osuchukwu, Salomey Osei, Atnafu Lambebo Tonja, Naome Etori, Clinton Mbataku. In TACL 2023

MasakhaNEWS: News Topic Classification for African languages
David Ifeoluwa Adelani, Marek Masiak, Israel Abebe Azime, Jesujoba O. Alabi, Atnafu Lambebo Tonja, Christine Mwase, Odunayo Ogundepo, Bonaventure F. P. Dossou, Akintunde Oladipo, …, and Pontus Stenetorp. In IJCNLP-AACL 2023 [Best Paper Award … Area Chair Award (Resources and Evaluation)] & AfricaNLP Workshop 2023.

AfriNames: Most ASR models "butcher" African Names
Tobi Olatunji, Tejumade Afonja, Bonaventure FP Dossou, Atnafu Lambebo Tonja, Chris Chinenye Emezue, Amina Mardiyyah Rufai, Sahib Singh. In INTERSPEECH 2023

Parallel Corpus for Indigenous Language Translation: Spanish-Mazatec and Spanish-Mixtec
Atnafu Lambebo Tonja, Christian Maldonado-Sifuentes, David Alejandro Mendoza Castillo, Olga Kolesnikova, Noé Castro-Sánchez, Grigori Sidorov, Alexander Gelbukh. In AmericasNLP Workshop at ACL 2023

Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models
Atnafu Lambebo Tonja, Hellina Hailu Nigatu, Olga Kolesnikova, Grigori Sidorov, Alexander Gelbukh, Jugal Kalita. In AmericasNLP Workshop at ACL 2023

Masakhane-Afrisenti at SemEval-2023 Task 12: Sentiment Analysis using Afro-centric Language Models and Adapters for Low-resource African Languages
Israel Abebe Azime, Sana Al-azzawi, Atnafu Lambebo Tonja, Iyanuoluwa Shode, Jesujoba Alabi, Ayodele Awokoya, Mardiyyah Oduwole, Tosin Adewumi, Samuel Fanijo, Awosan Oyinkansola. In SemEval-2023 Workshop at ACL 2023

Natural Language Processing in Ethiopian Languages: Current State, Challenges, and Opportunities
Atnafu Lambebo Tonja, Tadesse Destaw Belay, Israel Abebe Azime, Abinew Ali Ayele, Moges Ahmed Mehamed, Olga Kolesnikova, Seid Muhie Yimam. In RAIL-2023 Workshop at EACL 2023

Low-Resource Neural Machine Translation Improvement Using Source-Side Monolingual Data
Atnafu Lambebo Tonja, Olga Kolesnikova, Alexander Gelbukh, Grigori Sidorov. Journal of Applied Sciences

2022

AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages
Bonaventure F. P. Dossou, Atnafu Lambebo Tonja, Oreen Yousuf, Salomey Osei, Abigail Oppong, Iyanuoluwa Shode, Oluwabusayo Olufunke Awoyomi, Chris Chinenye Emezue. In SustaiNLP Workshop, co-located with EMNLP 2022

The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation
Tadesse Destaw Belay, Atnafu Lambebo Tonja, Olga Kolesnikova, Seid Muhie Yimam, Abinew Ali Ayele, Silesh Bogale Haile, Grigori Sidorov, Alexander Gelbukh. In 2022 International Conference on Information and Communication Technology for Development for Africa (ICT4DA)

Improving neural machine translation for low resource languages using mixed training: The case of Ethiopian languages
Atnafu Lambebo Tonja, Olga Kolesnikova, Muhammad Arif, Alexander Gelbukh, Grigori Sidorov. In Mexican International Conference on Artificial Intelligence

Early Ginger Disease Detection Using Deep Learning Approach
Mesay Gemeda Yigezu, Michael Melese Woldeyohannis, Atnafu Lambebo Tonja. In International Conference on Advances of Science and Technology

2021

A parallel corpora for bi-directional neural machine translation for low resourced Ethiopian languages
Atnafu Lambebo Tonja, Michael Melese Woldeyohannis, Mesay Gemeda Yigezu. In 2021 International Conference on Information and Communication Technology for Development for Africa (ICT4DA)

Multilingual neural machine translation for low resourced languages: Ometo-English
Mesay Gemeda Yigezu, Michael Melese Woldeyohannis, Atnafu Lambebo Tonja. In 2021 International Conference on Information and Communication Technology for Development for Africa (ICT4DA)