STEVEN LEVY
In 2016, Google engineer Illia Polosukhin had lunch with a colleague, Jacob Uszkoreit. Polosukhin had been frustrated by a lack of progress in his project, using AI to provide useful answers to questions posed by users, and Uszkoreit suggested he try a technique he had been brainstorming that he called self-attention. Thus began an eight-person collaboration that ultimately resulted in a 2017 paper called “Attention Is All You Need,” which introduced the concept of transformers as a way to supercharge artificial intelligence. It changed the world.
Eight years later, though, Polosukhin is not completely happy with the way things are shaking out. A big believer in open source, he’s concerned about the secretive nature of transformer-based large language models, even from companies founded on the basis of transparency. (Gee, who can that be?) We don’t know what they’re trained on or what the weights are, and outsiders certainly can’t tinker with them. One giant tech company, Meta, does tout its systems as open source, but Polosukhin doesn’t consider Meta’s models truly open: “The parameters are open, but we don’t know what data went into the model, and data defines what bias might be there and what kinds of decisions are made,” he says.
As LLM technology improves, he worries it will get more dangerous, and that the need for profit will shape its evolution. “Companies say they need more money so they can train better models. Those models will actually be better at manipulating people, and you can tune them better for generating revenue,” he says.