
8 Googlers invented modern artificial intelligence. This is the inside story


The last two weeks before the deadline were frantic. Though officially some of the team still had desks in Building 1945, they mostly worked in Building 1965 because it had a better espresso machine in its micro-kitchen. “People weren’t sleeping,” said Gomez, who as an intern lived in a constant debugging frenzy while also producing the visualizations and diagrams for the paper. It’s common on projects like this to run ablations: take things out and see whether what’s left is enough to get the job done.

“There were all kinds of possible combinations of tricks and modules, which ones help and which ones don’t. Let’s rip it out. Let’s replace it with this,” Gomez said. “Why is the model behaving in this counterintuitive way? Oh, it’s because we didn’t remember to do the masking correctly. Does it work yet? Okay, on to the next one. All of these components of what we now call the transformer were the output of this extremely fast-paced, iterative trial and error.” The ablations, aided by Shazeer’s implementations, produced “something minimalist,” Jones said. “Noam is a wizard.”

Vaswani recalls one night, while the team was writing up the paper, collapsing onto the couch in his office. As he stared at the curtains separating the sofa from the rest of the room, he was struck by the pattern on the fabric, which looked to him like synapses and neurons. Gomez was there, and Vaswani told him that what they were working on would go beyond machine translation. “Ultimately, just like the human brain, you need to unify all these modalities, speech, audio, vision, under one architecture,” he said. “I had a strong hunch we were onto something more general.”

Among Google’s top brass, however, the effort was viewed as just another interesting artificial intelligence project. I asked several of the transformer folks whether their bosses had ever called them in for updates on the project. Not so much. But “we knew this was potentially a big deal,” Uszkoreit said, “and it made us obsess over the last sentence of the paper, where we comment on future work.”

That sentence foreshadowed what might come next: the application of transformer models to essentially all forms of human expression. “We are excited about the future of attention-based models,” they wrote. “We plan to extend the Transformer to problems involving input and output modalities other than text” and to investigate “images, audio and video.”

A few nights before the deadline, Uszkoreit realized they needed a title. Jones noted that the team had arrived at a radical rejection of the accepted best practices, most notably LSTMs, in favor of one technique: attention. The Beatles, Jones recalled, had named a song “All You Need Is Love.” Why not call the paper “Attention Is All You Need”?

The Beatles?

“I’m British,” Jones said. “It only took about five seconds of thought. I didn’t think they were going to use it.”

They kept gathering experimental results right up until the deadline. “The English-to-French numbers came in five minutes before we submitted the paper,” Parmar said. “I was sitting in the micro-kitchen in Building 1965, getting in that last result.” With barely two minutes to spare, they sent off the paper.
