Research
My research centers on interpretable AI and reinforcement learning: designing systems whose decisions can be understood, trusted, and improved by humans. I favor transparent approaches over black-box models, and I believe that if we can't explain why a system makes its decisions, we can't truly trust it.
Publications
General Game Playing as a Bandit-Arms Problem: A Multiagent Monte-Carlo Solution Exploiting Nash Equilibria
Oberlin College Honors Papers, 116 (2019)
An interpretable AI model for multi-agent reinforcement learning that makes robust decisions under uncertainty. The model uses a Monte Carlo Tree Search variant that converges to Nash equilibria, producing informed self-play data rather than uniform random samples. This yields fully transparent decision pathways and achieves high accuracy on chess puzzles through two-stage reinforcement learning. It runs in real time without pre-training and generalizes to any domain that can be formally defined as a game, including autonomous navigation and real-time strategy.
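The bandit-arms framing in the title can be illustrated with the UCB1 selection rule, the same rule that drives node selection in UCT-style Monte Carlo Tree Search. This is a minimal stdlib sketch, not the thesis's actual method: the Bernoulli arms and their payoff probabilities are invented for the example.

```python
import math
import random

def ucb1_select(counts, values, t):
    """Pick the arm maximizing mean reward plus an exploration bonus (UCB1)."""
    for arm, n in enumerate(counts):
        if n == 0:  # play every arm once before applying the bonus
            return arm
    return max(
        range(len(counts)),
        key=lambda a: values[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a]),
    )

def run_bandit(payoffs, rounds, seed=0):
    """Simulate Bernoulli arms with the given (hypothetical) success rates."""
    rng = random.Random(seed)
    counts = [0] * len(payoffs)
    values = [0.0] * len(payoffs)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, values, t)
        reward = 1.0 if rng.random() < payoffs[arm] else 0.0
        counts[arm] += 1
        values[arm] += reward
    return counts

# Over enough rounds, play concentrates on the strongest arm while the
# weaker arms are still sampled occasionally (informed, not random, search).
counts = run_bandit([0.2, 0.5, 0.8], rounds=2000)
best = counts.index(max(counts))
```

The exploration bonus is what makes the search "informed" rather than exhaustive: arms that look weak are revisited only often enough to keep their estimates honest.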
Applied Research
Software Engineer (L4)
- Led a research agenda on how LLMs can automate developer workflows in open-source codebases. Scoped the problem space, designed experiments, and developed a prompting framework for automated test generation across AOSP and ChromeOS
- Designed language-agnostic code-graph representations using Kythe to construct structured C++/Java training datasets from large-scale open-source codebases
- Collaborated with DeepMind and CoreML teams to define data specifications and quality criteria for training Google's internal models on open-source code
- Developed an unsupervised root-cause diagnosis method for Android using density-based clustering on T5X embeddings to group infrastructure errors by underlying cause
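The error-grouping step in the last bullet can be sketched with a minimal density-based clustering pass. This stdlib-only toy stands in for running DBSCAN over T5X embeddings; the 2-D "embedding" vectors and the `eps`/`min_pts` thresholds below are invented for illustration.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: label each point with a cluster id, or -1 for noise."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def neighbors(i):
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # noise for now; a later cluster may still claim it
            continue
        cluster += 1
        labels[i] = cluster
        frontier = [j for j in nbrs if j != i]
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # j is itself a core point: expand
                frontier.extend(j_nbrs)
    return labels

# Toy "error embeddings": two tight failure groups plus one outlier.
errors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
          (20.0, 20.0)]
labels = dbscan(errors, eps=0.5, min_pts=2)
```

The appeal of a density-based method here is that it needs no preset number of clusters and leaves genuinely one-off failures unclustered as noise, rather than forcing them into the nearest group.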
Research Interests
Interpretable AI
Designing machine learning systems whose decisions can be understood and trusted by humans. Developing transparent approaches that provide causal explanations rather than opaque predictions.
Reinforcement Learning
Designing RL agents that learn effectively from limited experience in complex environments. My honors thesis applied multi-agent RL with game-theoretic principles to general game playing, focusing on informed exploration over exhaustive search.
Constraint-Based Mathematical Modeling
Using linear optimization and constraint satisfaction to solve problems with structure, from mathematical art to resource allocation. I'm drawn to approaches where the constraints themselves encode meaning.
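As a tiny worked instance of constraints encoding meaning, here is a 0/1 resource-allocation model solved by exhaustive search. It is a stdlib toy stand-in for a real LP or constraint solver, and the task values, costs, and budget are invented for the example.

```python
from itertools import product

def best_allocation(values, costs, budget):
    """Exhaustively search 0/1 selections maximizing value under a budget.

    Each decision variable x_i in {0, 1} says whether task i is funded;
    the single constraint is sum(cost_i * x_i) <= budget.
    """
    best_value, best_x = -1, None
    for x in product((0, 1), repeat=len(values)):
        cost = sum(c * xi for c, xi in zip(costs, x))
        if cost > budget:
            continue  # violates the budget constraint
        value = sum(v * xi for v, xi in zip(values, x))
        if value > best_value:
            best_value, best_x = value, x
    return best_x, best_value

# Three hypothetical tasks competing for a budget of 5 units.
x, v = best_allocation(values=[6, 10, 12], costs=[1, 2, 3], budget=5)
```

The same model handed to an LP/ILP solver scales far beyond brute force, but the point survives even in the toy: the budget constraint, not the objective, is what shapes which combinations are admissible.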
Future Directions
- Formalizing data-efficient reinforcement learning in domains with intractable decision spaces. Rather than simulating every possibility, I want to develop methods that identify which regions of a search space are worth exploring, using interpretable reasoning to guide and explain that process.