This document is still relevant. A more up-to-date (and messier) version can be found here. Beyond that, this work-in-progress paper is a step in the new direction this research is taking.

My current plan is to pursue research that enables policymakers to make more informed decisions, and to develop foundational technologies that improve AI-human cooperation.

Cooperation has been the key factor allowing humans to flourish [1],[2]. Without the ability to coordinate and work toward common goals, humankind might have been overcome by nature or by our own destructive tendencies [3]. Yet despite our unprecedented capacity for cooperation, the societies we have built in the past and build today grow beyond our ability to coordinate them effectively. The collapse of past civilizations illustrates this: among other causes, a breakdown in coordination was frequently an underlying factor [4]. This pattern suggests that our social structures outpace our biological capacity for managing complexity. Individuals end up choosing inferior alternatives because they cannot grasp the complexity of the choices before them [5],[6]. Our innate tendency toward irrationality and biased thinking is exemplified by psychological paradoxes like those Peter Singer points out [7], in which we favor actions because of their geographic proximity over clearly more rational, altruistic options. Ultimately, I believe this irrationality is the root of our collective failure to cooperate.

The goals of my research are twofold:

  1. Developing modeling technologies to assist policymakers in: a) creating more realistic scenarios of current economic and political systems, and b) testing more complex algorithms to solve the coordination problems these models can capture.
  2. Enhancing our understanding of cooperative technologies: how AIs perceive them, and how we can teach AIs to cooperate.

For the first goal, I hope to collaborate with researchers working on market shaping. An example is the work of Rachel Glennerster, president of the Center for Global Development, whose team applies Markov modeling to understand market behavior, including in extreme scenarios such as pandemics. Backed by the model's results, the team lobbied for investment in vaccine development [8]; their work is estimated to have saved millions of lives and billions of dollars during the COVID-19 pandemic [9]. My current research with Professor Wolfram Barfuss at the University of Bonn builds on Markov models, extending them with temporal difference (TD) learning methods. TD learning offers several advantages over traditional Markov models, including the ability to learn from incomplete sequences and to update estimates based on other learned estimates (bootstrapping). Furthermore, my research expands current modeling technologies to explore how heterogeneous agents interact under various game-theoretic scenarios [10]. Integrating heterogeneous actors within a TD learning environment is a significant advance in representational accuracy: it allows us, for example, to model and compare vastly different settings, from the economies of extremely impoverished nations to those of wealthy countries, and from the effects of a pandemic to societal collapse.
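To make the bootstrapping idea behind TD learning concrete, here is a minimal, hypothetical sketch (not the actual research code) of tabular TD(0) value estimation on a toy random-walk Markov chain. The environment, parameter names, and reward scheme are illustrative assumptions:

```python
import random

def td0_value_estimates(n_states=5, episodes=2000, alpha=0.1, gamma=0.9, seed=0):
    """Estimate state values with tabular TD(0) on a toy random walk.

    The agent starts in the middle of a chain, moves left or right with
    equal probability, and receives reward 1 on reaching the rightmost
    terminal state. The key TD property: each update moves V[s] toward
    the bootstrapped target r + gamma * V[s'], i.e. it learns from
    single transitions rather than waiting for complete episodes.
    """
    rng = random.Random(seed)
    V = [0.0] * (n_states + 2)  # states 0 and n_states+1 are terminal
    for _ in range(episodes):
        s = (n_states + 1) // 2  # start in the middle of the chain
        while 0 < s < n_states + 1:
            s_next = s + rng.choice((-1, 1))
            r = 1.0 if s_next == n_states + 1 else 0.0
            # TD(0) update toward the bootstrapped target
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V
```

After training, the estimated values increase from the left end of the chain toward the rewarding right end, reflecting each state's proximity to the reward.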

Regarding the second goal, the advancement of AI has led me to agree with forecasters [11] who worry that these systems could be exploited to enhance collusion, further undermining the ability of our democratic systems to resist malicious actors. Moreover, I believe that sufficiently advanced Transformative AI (TAI) or Artificial General Intelligence (AGI) systems might develop superhuman cooperation capabilities. Given that cooperation was the pivotal characteristic enabling our species to dominate the planet, this development could pose a significant threat to life as we know it. My line of research, Cooperative AI, goes beyond improving models of human coordination; it also aims to enhance cooperation between AI systems and humans. More robust modeling frameworks also prove valuable for simulating the behavior of AI systems simpler than TAI, such as Large Language Models (LLMs). We approach the study of TAI-human cooperation by first understanding how present-day AIs behave in game-theoretic environments. We believe this is extremely valuable for ensuring more cooperative outcomes when these systems interact with human systems.
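As an illustration of the kind of game-theoretic environment this involves, here is a minimal, hypothetical sketch (not our experimental setup) in which two independent Q-learning agents, deliberately given different learning rates as a simple stand-in for heterogeneous agents, play a repeated prisoner's dilemma:

```python
import random

# One-shot prisoner's dilemma payoffs: (row player, column player).
# Actions: 0 = cooperate, 1 = defect.
PAYOFFS = {
    (0, 0): (3, 3),  # mutual cooperation
    (0, 1): (0, 5),  # exploited cooperator
    (1, 0): (5, 0),  # successful defector
    (1, 1): (1, 1),  # mutual defection
}

def play_repeated_pd(rounds=5000, alphas=(0.2, 0.05), epsilon=0.1, seed=0):
    """Two independent epsilon-greedy Q-learners in a stateless
    repeated prisoner's dilemma. The differing learning rates in
    `alphas` are an illustrative form of agent heterogeneity."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[player][action]
    history = []
    for _ in range(rounds):
        acts = []
        for p in (0, 1):
            if rng.random() < epsilon:
                acts.append(rng.randrange(2))  # explore
            else:
                acts.append(0 if q[p][0] >= q[p][1] else 1)  # exploit
        rewards = PAYOFFS[(acts[0], acts[1])]
        for p in (0, 1):
            # Stateless Q-update: move toward the observed reward
            q[p][acts[p]] += alphas[p] * (rewards[p] - q[p][acts[p]])
        history.append(tuple(acts))
    return q, history
```

In this stateless setting defection dominates, so both learners typically converge to mutual defection; designing mechanisms that sustain cooperation instead is exactly the kind of question Cooperative AI asks.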

[1] https://www.scientificamerican.com/article/humans-evolved-to-be-friendly/
[2] https://necsi.edu/complexity-rising-from-human-beings-to-human-civilization-a-complexity-profile
[3] https://www.sciencedirect.com/science/article/pii/S0960982219303343
[4] Joseph Tainter, The Collapse of Complex Societies (book)
[5] Herbert A. Simon, Models of Bounded Rationality (book)
[6] Thomas H. Davenport and John C. Beck, The Attention Economy: Understanding the New Currency of Business (book)
[7] https://www.utilitarianism.net/study-guide/peter-singer-famine-affluence-and-morality
[8] https://www.nber.org/system/files/working_papers/w32059/w32059.pdf
[9] https://80000hours.org/podcast/episodes/rachel-glennerster-market-shaping-incentives/
[10] https://docs.google.com/presentation/d/1a3Dc969Lc4xaHohcjnDenScpFtLa842JdTFG2QPSZO8/edit
[11] https://arxiv.org/abs/2012.08630