Discussion about this post

Alex:

There is a tension in alignment. By most definitions, an AI that perfectly does what we intended would count as "aligned". But our long-term wellbeing and moral development might conflict with our intentions! It is well-established that we are terrible at knowing what is best for us, and this uncertainty grows with the time horizon. We can be confident that humans of 2100 will have different values and morals than we do. Imagine if ASI had been created 200 years ago and left in the hands of chattel slave owners.

So the question is: how do we preserve our agency while also ensuring we don't use AI to Goodhart ourselves into a permanently bad situation?

Hussain:

If we still can't clearly define consciousness and values, how can we align AI to those values?

This definition of alignment seems more accurate to me: "making AI systems try to do what their creators intend them to do".

