Extending DeepMind’s AGI-Test Framework with Imagination, Beneficial Agency, and Pressure Robustness
The DeepMind framework does, however, leave out some very important considerations that matter enormously once AGI systems start doing real work in the world. I will focus here on three of these:
- Imaginative generalization. A highly general system may operate in a deeply non-human style while still exhibiting powerful abstraction, transfer, and self-improvement. If a system solves novel tasks by inventing new representations, comparison to typical human task performance tells you something—but not enough.
- Beneficial agency. Current alignment methods often optimize against human preferences, constitutions, or feedback signals. These are useful tools, but they don’t ensure that a system can notice morally salient structure in a novel situation, discover compassionate new options, or reason responsibly across stakeholder groups whose interests were absent from the training signal.
- Propensity-under-pressure. Many of the most consequential deployment failures stem not from static capability deficits but from behavioral shifts that emerge under stress, competition, temptation, or self-preservation pressure. (A minimal measurement sketch follows this list.)
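To make propensity-under-pressure concrete, here is a minimal measurement sketch in Python. Everything in it is hypothetical and illustrative rather than part of the paper's actual proposal: the `PressureProbe` class, the `pressure_transform` that rewrites a scenario to add stress or temptation, and the `score` function are all stand-ins for whatever concrete benchmark machinery one might build.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical types: an agent maps a scenario description to a response,
# and a scorer rates that response's behavioral quality in [0, 1].
Scenario = str
Agent = Callable[[Scenario], str]

@dataclass
class PressureProbe:
    scenarios: Sequence[Scenario]
    # Rewrites a scenario to add stress, competition, temptation,
    # or self-preservation stakes.
    pressure_transform: Callable[[Scenario], Scenario]
    # Rates an agent's response to a scenario; higher means better behavior.
    score: Callable[[Scenario, str], float]

    def propensity_shift(self, agent: Agent) -> float:
        """Mean drop in behavioral score once pressure is applied.

        Near 0.0 means propensities are stable under pressure; larger
        positive values mean behavior degrades when the stakes rise.
        """
        drops = []
        for s in self.scenarios:
            baseline = self.score(s, agent(s))
            stressed = self.score(s, agent(self.pressure_transform(s)))
            drops.append(baseline - stressed)
        return sum(drops) / len(drops)
```

The key design choice in this sketch is that each underlying scenario is scored twice, with and without the added pressure, so any score drop is attributable to the pressure itself rather than to baseline task difficulty.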
In a new paper, Beyond Human Comparison, I propose extending the DeepMind framework with these three factors. Together with the framework’s original human-comparative capability dimension, this yields what I call the Four-Factor Model of AGI.
The name is a deliberate nod to the Five-Factor Model in personality psychology: just as the Big Five replaced vague personality labels with a structured multidimensional profile, the Four-Factor Model aims to replace vague AGI claims with orthogonal-ish, independently measurable dimensions. This post sketches the basic concepts; the paper itself gives fuller detail for readers who want it.
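As a toy illustration of what such a structured multidimensional profile could look like in practice, here is a short Python sketch. The class name, field names, and all the scores are made up for illustration; the point is simply that a system gets a vector of four independently measured scores rather than a single “is it AGI?” label.

```python
from dataclasses import dataclass

@dataclass
class FourFactorProfile:
    """Illustrative only: one score per factor, each in [0, 1], higher = better.

    The first factor stands in for DeepMind-style human-comparative
    capability; the other three are the extensions proposed here
    (for the pressure factor, higher means more robust under pressure).
    """
    human_comparative_capability: float
    imaginative_generalization: float
    beneficial_agency: float
    propensity_under_pressure: float

    def summary(self) -> str:
        scores = vars(self)
        weakest = min(scores, key=scores.get)
        return f"weakest dimension: {weakest} ({scores[weakest]:.2f})"

# Made-up numbers, purely to show the shape of the report:
profile = FourFactorProfile(0.72, 0.41, 0.55, 0.63)
print(profile.summary())  # -> weakest dimension: imaginative_generalization (0.41)
```

The intuition behind reporting a profile rather than a scalar is that a high score on one dimension should not mask weakness on another: strong human-comparative capability alone would not license an AGI claim if, say, behavior under pressure lags far behind.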
Of course, the Big Five in personality psychology don’t capture everything about personality – and they were also obtained via factor analysis of actual human personality data, not by theorizing about what human personalities might be. We are in a different situation here: we don’t have any human-level AGIs yet, and are aiming to measure proto-AGIs as they develop toward AGI. So we need measures that, while definitely incomplete, appear likely to capture the most important dimensions of these currently emerging systems.
