A benchmark for evaluating cultural and regional knowledge across African contexts through images paired with multilingual questions and answers.
Representative samples showing culturally grounded questions in multiple African languages.
Afri‑MCQA is a culturally nuanced, multilingual VQA benchmark centered on African languages and communities. It measures whether multimodal LLMs understand local artifacts, social practices, and region‑specific entities, beyond generic visual recognition.
Exact counts TBD after final curation.
We host the dataset on Hugging Face Datasets (train/dev/test splits) and release loaders in Python.
```python
from datasets import load_dataset

# The dev split is named "validation" on the Hugging Face Hub.
ds = load_dataset("afri-nlp/afri-mcqa", split="validation")
print(ds[0])
```
| Model | Vision Encoder | Language | Accuracy (dev) |
|---|---|---|---|
| Open‑CLIP + LLM (demo) | ViT‑L/14 | Multilingual | — |
| Qwen‑VL 7B (demo) | — | Multilingual | — |
| GPT‑4o‑mini (demo) | — | Multilingual | — |
Reproduce with our evaluation scripts; update this table for your camera‑ready.
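For A/B/C/D questions, dev accuracy is exact match against the gold `answer`. Below is a minimal scoring sketch, assuming predictions follow the submission format further down and the gold file is a JSON list of records with `id` and `answer` fields (an assumed layout; the scripts in the repo are authoritative):

```python
import json

def mcq_accuracy(preds_path: str, gold_path: str) -> float:
    """Exact-match accuracy for A/B/C/D predictions, keyed by question id."""
    with open(preds_path) as f:
        preds = {p["question_id"]: p["answer"] for p in json.load(f)["preds"]}
    with open(gold_path) as f:
        gold = {g["id"]: g["answer"] for g in json.load(f)}
    # An unanswered question counts as wrong.
    return sum(preds.get(qid) == ans for qid, ans in gold.items()) / len(gold)
```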
Submit predictions to our EvalAI leaderboard. We provide a starter notebook and a submission validator.
```json
{
  "submission_version": 1,
  "preds": [
    {"question_id": "AMCQA_000001", "answer": "B"},
    {"question_id": "AMCQA_000002", "answer": "D"}
  ]
}
```
Multiple‑choice labels are A/B/C/D. For multilingual free‑form answers, use the text field instead of a letter.
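The validator's checks amount to schema conformance. An illustrative version is sketched below; the official validator shipped with the starter notebook may check more (e.g. question-ID coverage):

```python
import json

def validate_submission(path: str) -> None:
    """Illustrative schema checks for a submission file (not the official validator)."""
    with open(path) as f:
        sub = json.load(f)
    assert sub.get("submission_version") == 1, "unsupported submission_version"
    assert isinstance(sub.get("preds"), list) and sub["preds"], "preds must be a non-empty list"
    for p in sub["preds"]:
        assert isinstance(p.get("question_id"), str) and p["question_id"], "missing question_id"
        ans = p.get("answer")
        # Multiple-choice answers are the letters A-D; free-form answers are arbitrary text.
        assert isinstance(ans, str) and ans, "answer must be a non-empty string"
```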
Preprint coming soon on arXiv.
Dataset builders and evaluation scripts on GitHub: github.com/afri-nlp/afri-mcqa.
Multimodal LLMs increasingly claim global competence, yet most benchmarks under‑represent African cultures, languages, and visual context. Afri‑MCQA aims to close this gap by measuring culturally grounded understanding across regions and languages at scale.
*Indicative; exact figures to be updated in the paper.
```json
{
  "id": "AMCQA_000123",
  "image": "images/region/sample.jpg",
  "language": "sw",
  "question": "...",
  "choices": {"A": "...", "B": "...", "C": "...", "D": "..."},
  "answer": "B",
  "category": "clothing",
  "country": "TZ",
  "rationale": "..."
}
```
See the dataset card for full annotation guidelines and redaction policies.
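For baselines, each record maps naturally onto a multiple-choice prompt. A minimal sketch using the fields above (the prompt wording is ours, not a prescribed template):

```python
def format_prompt(ex: dict) -> str:
    """Render one Afri-MCQA record as a multiple-choice prompt (illustrative wording)."""
    choices = "\n".join(f"{label}. {text}" for label, text in sorted(ex["choices"].items()))
    return f"{ex['question']}\n{choices}\nAnswer with a single letter (A, B, C, or D)."
```

Pair the rendered prompt with the image at `ex["image"]` when querying a multimodal model.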
Afri‑MCQA is community‑driven. We welcome contributions across languages, annotation, model baselines, and documentation.
```bibtex
@misc{afri_mcqa2025,
  title={Afri-MCQA: Multimodal Cultural Question Answering for African Languages},
  author={First Author and Second Author and others},
  year={2025},
  eprint={XXXX.XXXXX},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```