Knowledge Structures for Decolonial AI

Contributors

Introduction

As a researcher and practitioner building Artificial Intelligence (AI) on the African continent and contributing to its governance and policy, I have observed several concerning practices over the last few years that, if not addressed, threaten the self-determination of African communities in the AI age and inadvertently extend neo-colonial values. These practices fall under three themes that I analyze later through the artifacts presented:

  1. Participation in AI infrastructure and value chains;

  2. Representation, Cultural Context and Indigenous Knowledge in AI; and

  3. Funding of AI infrastructure.

These observations have sparked important conversations about which actors truly constitute the African AI ecosystem and who is meaningfully participating in its development. In its current setup, the ecosystem perpetuates colonial power dynamics: it extracts from marginalized communities without delivering equitable outcomes.

This essay is an exposition of approaches that are collectively resisting the status quo. Researchers, AI developers, legal scholars, activists and communities working on and from the African continent are decolonizing and crafting new narratives for African AI and the underlying knowledge structures. African AI here means AI that is developed, governed, and deployed in ways that reflect the unique geographical, cultural, and political contexts of African communities, not simply AI that happens to be used in Africa (Wairegi, Omino, and Rutenberg, 2021). Further, my use of the term "decolonizing" in this essay refers to the use of knowledge, systems and actions to actively overturn what Aníbal Quijano termed "the colonial matrix of power that still endures after formal colonialism through global capitalism". This colonial matrix of power is racist and operates through "control of economy (in the form of labor exploitation and resource extraction), control of authority (through institutions and governance mechanisms), control of gender and sexuality (social reproduction and representation in systems), and control of subjectivity and knowledge (through epistemology or education)" (Quijano, 2000; Mignolo and Walsh, 2018). It is racist because it centers modernism and acceptable practices on European cultures, thereby legitimizing current economic systems, science, and knowledge through Eurocentric norms and mechanisms while delegitimizing other ways of knowing and organizing.

By decolonial AI, I mean AI that decenters Western knowledge structures through practical action rather than critique alone: what Mignolo calls "delinking" from the colonial matrix of power, and what Walsh grounds in praxis as the fundamental mode of decolonial work (Mignolo, 2007; Walsh, in Mignolo and Walsh, 2018). My argument in this article is that the colonial matrix of power is playing out in real-time in how big tech and funding mechanisms operate on the continent under the guise of ensuring that the AI systems they are selling are inclusive. If this feels like a harsh judgement of their practices, it is because big tech's labor exploitation and knowledge extraction is obscured through discourses of the beneficence of the systems being developed and deployed; that AI is a magic pill for development and prosperity (Hao, 2022; Hao, 2025; Mohamed, Png, and Isaac, 2020).

My artifacts demonstrate this by presenting concrete examples of tokenistic participation of Africans in AI value chains, extractive data collection practices funded by big tech and philanthropy, and the perpetuation of Western knowledge hierarchies through Large Language Models (LLMs) trained predominantly on Western languages and contexts, with African representation slapped on at the end. Rather than dwelling extensively on the problems therein (many of which are deeply investigated by scholars including Ruha Benjamin (2019), Safiya Umoja Noble (2018), Joy Buolamwini (2023), Timnit Gebru and colleagues (Bender et al., 2021), Abebe Birhane (2021), and Shoshana Zuboff (2019)), or diving deeper into theoretical frameworks of decolonization, the essay and artifacts I present offer insight into a decolonial moment. This is an optimistic collection of artifacts that showcase infrastructures and approaches that are being developed and an imaginary of what AI could look like when it prioritizes equitable distribution of resources and opportunities, technology transfer that builds local capacity, funding for local AI research and development in underrepresented regions, and non-extractive access to AI data and tools.

How do we begin to build otherwise from the colonial structures that enable the current extractive model, and to imagine AI that is accountable to the communities it draws from?

I propose non-standard and, in some cases, radical approaches to AI by showcasing alternative knowledge structures and highlighting how African communities are proactively organizing themselves to self-determine their futures in the AI age. The alternatives I present are not perfect blueprints nor do they completely address the problem of extractive AI. They are given as incomplete and in some cases already fragile examples that help us think through what it might mean to build otherwise, an otherwise where AI is accountable to the communities it draws from and is developed through practices that reimagine its underlying infrastructure.

Context

Communities lie at the heart of 'African decolonial AI'. It is through their participation in the AI value chain - their knowledge systems, their cultural identities and their resources - that infrastructures of AI can be "delinked" from the extractive colonial matrix of power and reoriented towards the communities that generate its most fundamental ingredient: knowledge!

I use the term knowledge structures to describe the ecosystems of decisions, data, models, and governance that determine whose knowledge becomes the foundation of AI.

This structure is layered, the first layer being data. AI systems are built from and learn from data, and the datasets that currently dominate AI training are drawn from Eurocentric contexts, sources, and demographics. Most African knowledge in the form of language, culture, history and civic identity is either absent from AI training datasets or, where it exists, has not been meaningfully captured on its own terms; this layer determines what gets recorded, preserved, and made legible to AI systems.

The second is the models layer of AI systems: what gets built from that data. This is where choices about architecture, scaling, and training determine whose knowledge defines the AI system and whose is operationalized, distorted, or excluded. The most pervasive AI systems in use today are Large Language Models, developed on the assumption that large models trained on vast quantities of data are the standard against which all other approaches should be measured. This assumption concentrates model development in institutions with massive compute resources, overwhelmingly located in the West, and treats African language and cultural knowledge as an add-on to an existing system rather than a foundation for a different one.

The third is the governance layer that defines who controls access, ownership, and the flow of value generated by data and models. This is the layer that determines whether communities that contribute knowledge to AI systems ever benefit from what is built with it. Licensing frameworks, funding conditions, platform ownership, and database architecture are all governance decisions, and they are all currently structured in ways that can route value away from the communities that generate it.

The colonial matrix of power that this essay traces through the African AI ecosystem operates across all three layers simultaneously. The artifacts I present show interventions at different layers of this knowledge infrastructure and, in some cases, attempt to reimagine what the infrastructure itself could look like if it were designed from African communities outward rather than retrofitted onto systems designed elsewhere. They are not presented here as perfect solutions or embodiments of decolonial AI: some are incomplete, others are fragile, and some have already disappeared. They are presented as signals that make visible an imagination of decolonial AI. That visibility, which this essay aims to bring, is itself a form of knowledge infrastructure.

This leads to several questions that I use to analyze and reflect on my artifacts:

  1. What does meaningful participation in AI development look like, and when does it become extractive or tokenistic?

  2. How do well-intentioned actors (funders, researchers, open-source advocates) reproduce colonial structures without knowing it?

  3. What would African AI look like if it were designed without reference to what already exists?

Conclusion

The artifacts gathered in this essay do not tell a single story; they sit in tension with one another. The Gen-Z developers who built the Finance Bill GPT participated meaningfully, with urgency and lived expertise, and yet the infrastructure they built on captured the value of their labor. The funders who mandated open-sourcing of African language datasets believed they were advancing inclusion, and yet they inadvertently routed community knowledge toward global North institutions before local researchers could extract fair value from it. OpenAI included Tigrinya in its model and produced gibberish. Good intentions, operationalized through existing structures, reproduce those structures.

This is what the colonial matrix of power looks like in the AI age: embedded in the architecture of systems. It is embedded in platform ownership, in funding conditions, in the assumption that openness is always progressive, in the scaling logic that treats a large model trained on Western data as the default against which all other models are measured. It does not require bad actors, only infrastructure that goes unexamined.

The artifacts in this essay intervene at different layers (data, models, governance) and with different tools. The NOODL license intervenes at the governance layer, insisting that value must flow back to communities before it flows outward. Federated databases intervene at the infrastructure layer, embedding data sovereignty into architecture rather than treating it as a policy afterthought. InkubaLM intervenes at the model layer, rejecting the assumption that African AI should be a smaller, cheaper version of what already exists. The People's Archive intervenes at the layer of memory itself, insisting that a historical moment belongs to the people who lived it.

None of these interventions is complete. The archive is curator-led rather than community-led. The NOODL license is new and untested at scale. Lacuna Fund, which seeded much of the data infrastructure these alternatives depend on, closed in 2025. InkubaLM is ambitious but under-resourced relative to the models it must coexist with. The Gen-Z GPTs are gone. These are not failure stories, but they are honest ones. They show that decolonial AI is not a destination but a practice, constantly contested and constantly at risk of being absorbed back into the structures it is trying to displace.

What they collectively make visible, however, is a different imaginary. African AI does not have to be legible to Silicon Valley to be valid. It does not have to be large to be powerful. It does not have to be open to be generous. And it does not have to wait for external permission (from funders, from Big Tech, from global governance frameworks) to begin. The dung beetle does not wait for a larger creature to carry the load. It builds the capacity to carry what matters, with what it has, on its own terms.

References

Benjamin, R. (2019). Race After Technology: Abolitionist Tools for the New Jim Code. Polity Press.

Bender, E. M., Gebru, T., McMillan-Major, A., and Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21). ACM. https://doi.org/10.1145/3442188.3445922

Birhane, A. (2021). Algorithmic Injustice: A Relational Ethics Approach. Patterns, 2(2), 100205. https://doi.org/10.1016/j.patter.2021.100205

Buolamwini, J. (2023). Unmasking AI: My Mission to Protect What Is Human in a World of Machines. Random House.

Carnegie Endowment for International Peace. (2024). How African NLP experts are navigating the challenges of copyright, innovation, and access. https://carnegieendowment.org/europe/research/2024/04/how-african-nlp-experts-are-navigating-the-challenges-of-copyright-innovation-and-access

Hao, K. (2022). Artificial intelligence is creating a new colonial world order. MIT Technology Review, April 19. https://www.technologyreview.com/2022/04/19/1049592/artificial-intelligence-colonialism/

Hao, K. (2025). Empire of AI: Dreams and Nightmares in Sam Altman's OpenAI. Penguin Press.

Mignolo, W. D. (2007). Delinking: The Rhetoric of Modernity, the Logic of Coloniality and the Grammar of Decoloniality. Cultural Studies, 21(2–3), 449–514. https://doi.org/10.1080/09502380601162647

Mignolo, W. D. and Walsh, C. E. (2018). On Decoloniality: Concepts, Analytics, Praxis. Duke University Press.

Mohamed, S., Png, M.-T., and Isaac, W. (2020). Decolonial AI: Decolonial Theory as Sociotechnical Foresight in Artificial Intelligence. Philosophy & Technology, 33, 659–684. https://doi.org/10.1007/s13347-020-00405-8

Noble, S. U. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism. New York University Press.

Okorie, C. and Omino, M. (2024). Nwulite Obodo Open Data License (NOODL), Version 1.0. Data Science Law Lab, University of Pretoria. https://licensingafricandatasets.com/nwulite-obodo-license

Ògúnrẹ̀mí, T., Nekoto, W. O., and Samuel, S. (2023). Decolonizing NLP for "Low-resource Languages": Applying Abebe Birhane's Relational Ethics. GRACE: Global Review of AI Community Ethics, 1(1). https://doi.org/10.60690/q2xhtx18

Perrigo, B. (2023). Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic. TIME, January 18. https://time.com/6247678/openai-chatgpt-kenya-workers/

Quijano, A. (2000). Coloniality of Power, Eurocentrism, and Latin America. Nepantla: Views from South, 1(3), 533–580.

Quijano, A. (2007). Coloniality and Modernity/Rationality. Cultural Studies, 21(2–3), 168–178. https://doi.org/10.1080/09502380601164353

Sloane, M., Moss, E., Awomolo, O., and Forlano, L. (2022). Participation Is not a Design Fix for Machine Learning. In Proceedings of the 2nd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO '22). ACM. https://doi.org/10.1145/3551624.3555285

Tonja, A. L., Dossou, B. F. P., Ojo, J., Rajab, J., Thior, F., Wairagala, E. P., Anuoluwapo, A., Moiloa, P., Abbott, J., Marivate, V., et al. (2024). InkubaLM: A small language model for low-resource African languages. arXiv preprint arXiv:2408.17024.

Wairegi, A., Omino, M., and Rutenberg, I. (2021). AI in Africa: Framing AI through an African Lens. Communication, technologies et développement, 10. https://doi.org/10.4000/ctd.4775

Zuboff, S. (2019). The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. PublicAffairs.