How DeepMind thinks it can make chatbots safer

Some technologists hope that one day we will develop a superintelligent AI system that people can converse with. Ask it a question, and it will offer an answer that sounds as though it were written by a panel of human experts. You could use it to ask for medical advice or to help plan a vacation. Well, that’s the idea, at least.

In fact, we are still a long way from that goal. Even today’s most sophisticated systems are pretty stupid. Meta’s AI chatbot, BlenderBot, once told me that a prominent Dutch politician was a terrorist. And in experiments where AI chatbots were used to offer medical advice, they told fake patients to kill themselves. Doesn’t fill you with optimism, does it?

That’s why AI labs are working to make their conversational AIs safer and more helpful before letting them loose in the real world. I just published a story about the latest effort from Alphabet-owned AI lab DeepMind: a new chatbot called Sparrow.

DeepMind’s new trick for building a good AI chatbot was to have humans tell it how to behave, and to force it to back up its claims with Google searches. Human participants were then asked to rate how trustworthy the AI system’s answers were. The idea is to keep training the AI through dialogue between humans and machines.
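To make the idea concrete, here is a minimal toy sketch of the preference-rating step this kind of training relies on: raters compare two candidate answers, and a small reward model learns to score the preferred one higher. This is an illustration under my own assumptions, not DeepMind’s method or code; all data, features, and names below are hypothetical.

```python
# Toy sketch of learning from human preference ratings (hypothetical data,
# not DeepMind's actual training code). Raters pick the better of two
# answers; a tiny linear reward model is fit with a Bradley-Terry loss so
# that preferred answers get higher scores.

import math
import random

# Hypothetical rater data: (prompt, preferred answer, rejected answer).
preferences = [
    ("capital of france", "Paris is the capital of France.", "It is Berlin."),
    ("boiling point of water", "100 degrees Celsius at sea level.", "Around 50 degrees."),
]

def features(prompt, answer):
    """Toy features: word overlap with the prompt, and answer length."""
    p, a = set(prompt.lower().split()), set(answer.lower().split())
    return [len(p & a), len(a)]

weights = [0.0, 0.0]

def reward(prompt, answer):
    """Linear reward score for a candidate answer."""
    return sum(w * f for w, f in zip(weights, features(prompt, answer)))

# Gradient ascent on log sigmoid(reward(preferred) - reward(rejected)),
# which pushes the model to rank preferred answers above rejected ones.
lr = 0.1
for _ in range(100):
    prompt, good, bad = random.choice(preferences)
    margin = reward(prompt, good) - reward(prompt, bad)
    grad_scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))  # d/d(margin) of log sigmoid
    fg, fb = features(prompt, good), features(prompt, bad)
    weights = [w + lr * grad_scale * (g - b) for w, g, b in zip(weights, fg, fb)]

print("learned reward weights:", weights)
```

In full-scale systems, a learned reward model like this is then used as the training signal for the chatbot itself; the toy version above only shows the rating-to-reward step.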

For this story, I interviewed Sara Hooker, director of the nonprofit artificial intelligence research lab Cohere for AI.

She told me that one of the biggest barriers to deploying conversational AI systems safely is their brittleness: they perform impressively right up until they are taken into unfamiliar territory, where their behavior becomes unpredictable.

“It’s also a tough question to resolve because any two people can disagree on whether a conversation is inappropriate. And even if we agree that something is appropriate now, that may change over time, or rely on a shared context that can be subjective,” Hooker said.

Nonetheless, DeepMind’s findings underscore that AI safety is about more than just technical fixes. It requires human participation.
