Is “Good” AI Harder than “Bad” AI?
8 min read · Mar 29, 2024
Many definitions of AI alignment describe the goal as “making AI systems follow human values” while skirting difficult moral-philosophy questions about what those values should be. In this post, I argue that making “good” AI systems is probably harder than making “bad” ones, and that we should consider the possibility that different methods may be required to produce “good” systems — in other words, aligning AI to “good”…