149 — Building a Conversational Agent Overnight with Dialogue Self-Play
One enormous shortcoming of machine-speech systems is that robots don’t know how to talk goodly. And as a result of this, they can’t synthesize their own sentences: They need a human to hard-code grammar for them, from which they can construct sentences. For example,
I am feeling happy to be alive Today I am so good I'm very content
from which a robot can choose an item from column A and combine it with column B. But how do you know what the robot needs to say before the need arises?
These Google engineers have designed a system to “rapidly bootstrap end-to-end dialogue agents.” In other words, this paper describes a protocol that you can use to train chatbots to speak natural language. Spoiler alert: It still requires a lot of human effort.
This system, dubbed “Machines Talking To Machines” (M2M), simulates both a chatbot and its human user, corresponding with unnatural sentences like “Book movie with name IS Inside Out and date IS tomorow,” or “Offer theatre IS Cinemark 16 AND time IS 6pm.”
These querylike languages are then shown to a human, who receives the instructions to “humanize” the text to something more like “I want to buy tickets for Inside Out for tomorrow”, or “How about the 6pm show at Cinemark 16?”
This outline-to-paraphrase conversion can be repeated multiple times for the same outline text: This enabled the researchers to construct the 3000-dialogue dataset listed in the paper. Though they do not propose a method for automatically generating text based on this system, it’s easy to see how this sort of enormous dataset could begin to pave the way for automation.