Adam Straker
Shawn "swyx" Wang is a programming expert (author, for example, of 'The Coding Career Handbook') who yesterday laid out his theory of how open source is eating artificial intelligence in his personal newsletter. To do so, he drew a parallel between the development of generative AI in the fields of text and images; his analysis is a useful review of some key points of a technology that has not stopped generating headlines in recent months.
Do you remember GPT-2?
In February 2019, OpenAI announced GPT-2, a 'language model' (a text-generating artificial intelligence, in plain terms) that, it claimed, was capable of producing texts so convincing that they "could be used for disinformation or propaganda". For that reason, only a cut-down version would be made available to the public: 117 million parameters, versus 1.5 billion for the complete model.
The reaction of Anima Anandkumar, director of research at Nvidia, was blunt:
"What you are doing is the complete opposite of 'open'. It is unfortunate that [you endanger] the reproducibility of results and the scientific effort.
[…] The progress in AI is, for the most part, attributable to open source and open publishing."
Shortly after, OpenAI released a somewhat less restricted version, with 345 million parameters. But that same summer, some students managed to replicate the full version of GPT-2, and OpenGPT-2 was born. By the end of the year, OpenAI had released the original version of its 'dangerous' model in full.
Fast-forward a few months: OpenAI released GPT-3 in May 2020, with an API in closed beta, and granted Microsoft an "exclusive license" to use it shortly thereafter. Meanwhile, EleutherAI had been founded as a truly 'open' alternative to 'OpenAI': it published its 800 GB training dataset in January 2021, and by March it had already launched its GPT-Neo model with 2.7 billion parameters.
Before the year was out, OpenAI had removed the GPT-3 waiting list. The latest big news in this field was last June's launch of BLOOM, an AI that generates text in 59 languages, with 176 billion parameters… and 100% open source.
Do you remember that in January there was no DALL-E 2? And that in July you didn’t know about Stable Diffusion?
As you can see, it has been an intense three and a half years in the field of AI text generation; a period in which the power of the models has skyrocketed and in which, although a closed model was the first to make a splash, an open-source model ended up prevailing and democratizing access to the technology.
But what has happened with image-generating AIs? Exactly the same, but at a much faster rate: everything has happened over the course of this year.
The generative AI craze was started by GPT-2 in the realm of text and by DALL-E 2 in the realm of images. Both fields are today dominated by free alternatives: BLOOM and Stable Diffusion
Midjourney and DALL-E 2 announced the launch of their closed betas in March and April of this same year (with great repercussions in the case of the latter). But the August launch of Stable Diffusion, an image-generating model that offers revolutionary results and is already the most widely used AI in this field, has been decisive in the recent elimination of the waiting list for the new DALL-E 2 open beta, a step toward openness and democratized access to the new technology.
But, of course, the most remarkable thing about 'open source' is that it allows tools to be customized and integrated. Thus, in the few weeks since its launch, Stable Diffusion already has several user interfaces (web and desktop), specific plugins for various design tools, and several freemium online platforms running their own implementations of the model.
In fact, the huge success of Stable Diffusion would have been unattainable for a closed tool, since it has been built on the wealth of documentation (guides, tutorials, YouTube courses, Twitter threads) that users in every language have been able to produce thanks to its open availability (the official documentation is not very beginner-friendly). It has even been possible to spread tricks for running SD on originally incompatible systems, such as M1 Macs.