We are proud to announce Aula Fellow Victoria Kuketz’s recent appointment as an Obama Fellow. Follow Victoria for news of her Fellowship this year, during which she will be concentrating on inclusion and rational governance.
Hard Questions: Inclusion
-

Saptarishi Futures: An Indian Intergenerational Wayfinding Framework
An Intergenerational Futures Study model contextualized within Indian mythology, folklore, and generational value systems. This fusion of ancient cultural wisdom and modern anticipatory governance is used to imagine just, inclusive, and regenerative futures across generations.
-

‘Mind the gap’: artificial intelligence and journalism training in Southern African journalism schools
This article examines journalism schools’ (J-schools’) responses to the Artificial Intelligence (AI) ‘disruption’. It provides a critical, exploratory examination of how J-schools in Southern Africa are responding to the AI wave in their journalism curriculums, answering the question: how are Southern African J-schools responding to AI in their curriculums? Using a disruptive innovation theoretical lens and a documentary review of university teaching initiatives and accredited journalism curriculums, augmented by in-depth interviews, we demonstrate that AI has opened up new horizons for journalism training in multi-dimensional ways. However, it has also brought challenges, including covert forms of resistance to AI integration by some journalism educators. Furthermore, resource constraints and the obduracy of J-schools’ curriculums contribute to the slow introduction of AI in J-schools.
-

IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding
Spoken by more than 1.5 billion people in the Indian subcontinent, Indic languages present unique challenges and opportunities for natural language processing (NLP) research due to their rich cultural heritage, linguistic diversity, and complex structures. IndicMMLU-Pro is a comprehensive benchmark designed to evaluate Large Language Models (LLMs) across Indic languages, building upon the MMLU-Pro (Massive Multitask Language Understanding) framework. Covering major languages such as Hindi, Bengali, Gujarati, Marathi, Kannada, Punjabi, Tamil, Telugu, and Urdu, our benchmark addresses the unique challenges and opportunities presented by the linguistic diversity of the Indian subcontinent. The benchmark encompasses a wide range of tasks in language comprehension, reasoning, and generation, meticulously crafted to capture the intricacies of Indian languages. IndicMMLU-Pro provides a standardized evaluation framework to push the research boundaries in Indic language AI, facilitating the development of more accurate, efficient, and culturally sensitive models. This paper outlines the benchmark’s design principles, task taxonomy, and data collection methodology, and presents baseline results from state-of-the-art multilingual models.
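The abstract does not include the evaluation harness, but MMLU-style scoring follows a familiar pattern: present each multiple-choice item to the model, parse the option it selects, and report accuracy per language. The Python sketch below illustrates that pattern under assumed conventions; the JSONL field names and the answer_with_model() call are hypothetical placeholders, not the authors’ actual code or data format.

import json
from collections import defaultdict

def answer_with_model(question: str, choices: list[str]) -> int:
    """Hypothetical stand-in for the model under evaluation.

    A real harness would prompt the LLM with the question and options
    (in the target Indic language) and parse the chosen option index.
    """
    raise NotImplementedError("plug in a model client here")

def evaluate(benchmark_path: str) -> dict[str, float]:
    """Compute per-language accuracy over a JSONL file of multiple-choice items.

    Each line is assumed to look like:
    {"language": "hi", "question": "...", "choices": ["...", "..."], "answer": 2}
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    with open(benchmark_path, encoding="utf-8") as f:
        for line in f:
            item = json.loads(line)
            lang = item["language"]
            prediction = answer_with_model(item["question"], item["choices"])
            correct[lang] += int(prediction == item["answer"])
            total[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}

if __name__ == "__main__":
    for lang, acc in sorted(evaluate("indicmmlu_pro.jsonl").items()):
        print(f"{lang}: {acc:.3f}")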
-

Data Journalism Appropriation in African Newsrooms: A Comparative Study of Botswana and Namibia
Data journalism has received relatively limited academic attention in Southern Africa, with even less focus on smaller countries such as Botswana and Namibia. This article seeks to address this gap by exploring how selected newsrooms in these countries have engaged with data journalism, the ways it has enhanced their daily news reporting, and its impact on newsgathering and production routines. The study reveals varied patterns in the adoption of technology for data journalism across the two contexts. While certain skills remain underdeveloped, efforts to train journalists in data journalism have been evident. These findings support the argument that in emerging economies, the uneven adoption of data journalism technologies is influenced by exposure to these tools and practices.
-

Potential and perils of large language models as judges of unstructured textual data
Rapid advancements in large language models (LLMs) have unlocked remarkable capabilities for processing and summarizing unstructured text data. This has implications for the analysis of rich, open-ended datasets, such as survey responses, where LLMs hold the promise of efficiently distilling key themes and sentiments. However, as organizations increasingly turn to these powerful AI systems to make sense of textual feedback, a critical question arises: can we trust LLMs to accurately represent the perspectives contained within these text-based datasets? While LLMs excel at generating human-like summaries, there is a risk that their outputs may inadvertently diverge from the true substance of the original responses. Discrepancies between LLM-generated outputs and the actual themes present in the data could lead to flawed decision-making, with far-reaching consequences for organizations. This research investigates the effectiveness of LLM-as-judge models in evaluating the thematic alignment of summaries generated by other LLMs. We used an Anthropic Claude model to generate thematic summaries from open-ended survey responses, with Amazon’s Titan Express, Nova Pro, and Meta’s Llama serving as judges. This LLM-as-judge approach was compared to human evaluations using Cohen’s kappa, Spearman’s rho, and Krippendorff’s alpha, validating a scalable alternative to traditional human-centric evaluation methods. Our findings reveal that while LLM-as-judge models offer a scalable solution comparable to human raters, humans may still excel at detecting subtle, context-specific nuances. Our research contributes to the growing body of knowledge on AI-assisted text analysis. We also provide recommendations for future research, emphasizing the need for careful consideration when generalizing LLM-as-judge models across contexts and use cases.
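The agreement statistics named above (Cohen’s kappa, Spearman’s rho, Krippendorff’s alpha) are standard and straightforward to compute once human and LLM-judge ratings are paired item by item. The Python sketch below is a minimal illustration, not the authors’ code: the score lists are invented placeholders, and it assumes scikit-learn, SciPy, and the third-party krippendorff package are installed.

# Agreement between one human rater and one LLM judge on the same items.
# The ordinal scores (e.g. 1-5 thematic-alignment ratings) are placeholders.
from sklearn.metrics import cohen_kappa_score
from scipy.stats import spearmanr
import krippendorff  # pip install krippendorff

human_scores = [5, 4, 4, 2, 3, 5, 1, 4]
judge_scores = [5, 4, 3, 2, 3, 4, 2, 4]

# Cohen's kappa: chance-corrected agreement between two raters on categorical labels.
kappa = cohen_kappa_score(human_scores, judge_scores)

# Spearman's rho: rank correlation, appropriate for ordinal scores.
rho, p_value = spearmanr(human_scores, judge_scores)

# Krippendorff's alpha: generalizes to multiple raters and missing ratings;
# reliability_data is shaped (raters, items).
alpha = krippendorff.alpha(
    reliability_data=[human_scores, judge_scores],
    level_of_measurement="ordinal",
)

print(f"Cohen's kappa:        {kappa:.3f}")
print(f"Spearman's rho:       {rho:.3f} (p = {p_value:.3f})")
print(f"Krippendorff's alpha: {alpha:.3f}")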







