
23 August 2023

The future of AI lies in open source

Matt Barker

I'm almost getting sick of hearing about AI and its ability to change the world for the better, for the worse, for who knows what? But when you get to the heart of what AI is and how it can be applied to unlock value in businesses and everyday life, you have to admit that we're standing on the edge of a revolution. This revolution is likely to change our lives significantly in the short term, and perhaps tremendously so in the medium term.

It wasn't that long ago that I felt sold short by the promise of AI. About eight years ago I saw someone demonstrating a machine's ability to recognise certain flowers. Although impressive, it was a clunky experience, and while I could imagine applications, it didn't excite me. Fast forward a few years, and my real moment of surprise came when I found thispersondoesnotexist. My brain couldn't work out why these were not real people, and it stuck with me. My next big moment was podcast.ai and their first AI-generated discussion between Joe Rogan and Steve Jobs. But, like everyone else on the planet, my real breakthrough came with ChatGPT and the conversation I had with the 'Ghost in the Machine'.

Having worked in the open source space for the best part of a decade, I was interested to explore where AI development is going and what it means for open source.

Open source is already inherent in the foundations of AI

Although many of the major visible breakthroughs have come from proprietary models, open source is foundational to how they run. Kubernetes, for example, underpins OpenAI.

But after Facebook released LLaMA in February, RedPajama launched an open source reproduction, which sparked an explosion of exploration and progress. This in turn led to the incredible memo leaked from Google, which admitted 'We Have No Moat, And Neither Does OpenAI'. Whether you believe in open source models or not, you can be sure that open source will have a part to play in the development of the ecosystem going forward.

Open source will become (even more) efficient

Time and time again we have seen the open source model applied to drive outstanding speed, scale, and innovation. Yes, it was partly thanks to a leak, but open source has already replicated 96 percent of OpenAI's capabilities in a matter of months using LLaMA.

Can one company be confident in its ability to compete against millions of developers working towards the same goal, in the open and with the greater visibility of bugs and issues that affords? I'm certainly not saying there won't be a place for proprietary models that take advantage of specific requirements, but I'm not going to bet against the power of the open source model. On top of this, open source code itself is also going to get its own efficiency boost thanks to the application of AI to help build and optimise it. Just look at the power of applying GitHub Copilot to your coding practice, or K8sGPT to your Kubernetes cluster. This is just the beginning.

Open data and hardware will have a big part to play

For a long time, we've recognised the growing value of 'big data' in driving deeper and more effective insights. There is a tremendous amount of data sitting inside enterprises that can be unlocked by applying AI models to it. My feeling is that, although companies are currently reluctant to share this data, competitive pressures will lead to more data being opened up -- or perhaps even leaked. In turn, I see this data being used to help build more effective models. In parallel, my hope is that public organisations will continue to make more data open and accessible, to drive innovation and public good through the application of AI.

Just like in code, a lot of the hardware that runs AI models is currently proprietary. Because it's so expensive to build and train models, I can imagine pressure on this market leading to open hardware breakthroughs. This may take a while, but when I hear from friends that we’re already reaching certain limits for what can be done with AI because of hardware limitations, I wonder whether we will uncover brand new paradigms. We could even see quantum computing being used to supercharge the outputs.

Economics and risk will drive open source adoption and on-prem experimentation

Building and training a model is expensive. To do anything productive you need to spend hundreds of thousands on hardware alone. At the same time, most mature organisations are starting to realise the value of their data in this AI revolution. This means, for the short term at least, big organisations will play their cards close to their chest when it comes to data. Given some of the disasters we've already seen with data being leaked to ChatGPT, I wouldn't be surprised if companies with deep pockets end up experimenting with AI on-prem in the short term. If you're going to do that, it's likely you'll be pulling the latest and greatest open source model from GitHub.
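
To make that concrete, here is a minimal sketch of what on-prem experimentation can look like, using the open source Hugging Face transformers library to run an openly available model on your own hardware. The model name is only an illustration (an OpenLLaMA reproduction of LLaMA); swap in whichever openly licensed checkpoint you are actually evaluating, and bear in mind that even a 7B model wants a decent GPU or a lot of patience on CPU.

    # A minimal sketch of on-prem experimentation with an open model.
    # Assumes the Hugging Face transformers library is installed; the model
    # name below is illustrative -- substitute any openly licensed checkpoint.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="openlm-research/open_llama_7b",  # downloaded once, then cached and run locally
    )

    prompt = "List the risks of pasting internal company data into a third-party AI chatbot:"
    result = generator(prompt, max_new_tokens=100, do_sample=True, temperature=0.7)
    print(result[0]["generated_text"])

The appeal for a nervous enterprise is that nothing leaves the building: the weights, the prompt, and the output all stay on infrastructure you control, rather than being sent to a hosted service.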

The future of AI and open source

The big 'step changes' in technology -- 'the internet', then 'open source software / Linux', then 'public cloud' -- have always brought with them a reduced barrier to entry for startups. I consider AI to be the next of these step changes. Where getting off the ground might have taken a team of five and $5m a decade ago, AI creates opportunities for a couple of smart people and a very small amount of money.

Most big software companies will be worried about being outcompeted by AI startups, and the increased competition will lead to big winners and big losers. But what's clear is that taking advantage of open source is going to be one of the best ways for a team to make itself successful with the technology.

For those who worry about the consequences for the human race, I empathise, but I also know that the genie is already out of the bottle. At this point, it's better to embrace and help direct AI -- to make it useful and safe -- than it is to try and fight the tide. I, for one, am ready to get going.
