I have just finished reading "Human Compatible" by Stuart Russell. Mind-blowing book. Published in 2019, so before the open letter requesting a pause in the development of next-gen AI systems.
I picked up his book after watching him interviewed, though I ended up reading "The Alignment Problem" by Brian Christian first, by accident. The Alignment Problem discusses the process of learning, both biological and machine, at length. Human Compatible deals more with the ethical and philosophical dilemmas that arise once we have built machines more intelligent than ourselves.
The top five ethical dilemmas discussed in the books:
1) The Value Alignment Problem
2) The Control Problem
3) Bias and Fairness in AI Systems
4) The Impact of AI on Jobs and Society
5) Existential Risks and AI Safety
My takeaway from reading both of these books is that the span of disciplines needed to understand the dilemma is vast: computer science, statistics, neuroendocrinology, psychology, sociology, economics.
Humbled as I am by these writers, I cannot help but feel that even with the span of topics they have mastered, some of the systems perspective was missed. These thinkers portray an external machine system that is becoming more "intelligent" than us. They see it in the future. I think we have been swimming through it, and in it, all along.
My own feeling is that our societies, economies and ecosystems are intelligent in ways that we don't really understand. Perhaps we will feel our own identities become a lesser force as we become smaller cogs in this machinery. The mistake in the thinking is failing to see the whole as a system, already moving toward a set of objectives that we need to understand. For a while there, it seemed that our economies functioned to let billionaires take joy rides into space, which seemed a curious objective for the human project. Making this system work in a manner that increases human flourishing and diminishes suffering is a perennial problem, not a "tomorrow" problem. There is, however, a shift happening right now in the authority that sets the objectives.
Stuart Russell proposes a solution to the Alignment Problem called Cooperative Inverse Reinforcement Learning, or CIRL.
In this framing, an AI should (a toy sketch in code follows the list):
1) Presume that it did not understand the objective correctly.
2) Presume that it will cause harm if it pursues a poorly understood objective.
3) Seek to minimize harm.
4) Seek to confirm its understanding of the objective by checking in with humans.
5) Know that it can always avoid causing harm by turning itself off.
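To make those principles concrete, here is a minimal toy sketch of that behavior in Python. It is my own illustration, not Russell's CIRL formalism: the candidate objectives, the ask_human function, and the confidence threshold are all invented for the example.

```python
# Toy sketch of an agent that is uncertain about its objective, asks the
# human when unsure, and prefers switching off over acting on a badly
# understood goal. All names and numbers here are invented for illustration.

CONFIDENCE_THRESHOLD = 0.9  # act only when one interpretation clearly dominates

# Principle 1: the agent keeps a belief over what the human might have meant,
# rather than assuming it already knows the objective.
candidate_objectives = {
    "tidy the room": 0.40,
    "tidy the room, but never move the cat": 0.50,
    "do nothing": 0.10,
}


def ask_human(question: str) -> bool:
    """Stand-in for checking in with the human (principle 4).
    Here the simulated human confirms only the cautious interpretation."""
    print(f"[agent asks] {question}")
    return "never move the cat" in question


def act_or_defer(beliefs: dict[str, float]) -> None:
    guess = max(beliefs, key=beliefs.get)
    confidence = beliefs[guess]

    if confidence < CONFIDENCE_THRESHOLD:
        # Principles 2 and 3: a poorly understood objective is presumed
        # harmful, so the agent checks in before acting.
        if ask_human(f"Did you mean: '{guess}'?"):
            print(f"Confirmed. Acting on: {guess}")
        else:
            # Principle 5: the agent can always avoid harm by switching off.
            print("Objective still unclear; switching off instead of acting.")
    else:
        print(f"Acting on: {guess}")


act_or_defer(candidate_objectives)
```

The point of the sketch is only the control flow: uncertainty about the objective leads to a question rather than an action, and switching off remains the fallback when the objective stays unclear.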
I used ChatGPT to help me summarize the books. The impressions of the books are my own, after having read them both.