The article argues that as AI becomes more powerful, aligning its behavior with human values is no longer just a technical challenge — it’s a moral one. We need to think deeply about what values we’re training AI on, because whatever “model spec” we build into our systems will shape their future decisions. If those values are unclear or flawed, the AI could develop behaviors that don’t reflect what we actually want.
The piece highlights a debate: if AI really is trained on humanity’s collective experience, are we sure that experience reflects universal or timeless values? According to the author, our current moral framework is fragmented and has drifted from its deeper spiritual foundations. Without a clear set of enduring human principles, telling AI to “do what humanity values” may lead to unpredictable or even dangerous outcomes.
There’s also a larger existential concern: it’s not enough to align AI with where we are today, because our values will keep evolving, and AI must be able to accommodate that change. If we build systems that rigidly mirror our present-day ethics, we risk locking future generations into our current patterns. The real task, then, is to clarify and elevate what we consider truly important as humans before we bake those values into our machines.
In short, the article suggests that AI alignment is fundamentally about self-reflection. To design AI systems that truly serve humanity, we must first understand “what we are training ourselves on” — the moral architecture of our own existence — and decide what we want AI to reflect, not just now, but for the long term.