- So, what next?
- Will the interaction model be standardised? What about vendor lock-in?
- How will we overcome chatty interfaces?
- Are AI interfaces a front-end engineer killer?
- Could we auto-generate clients?
- That’s it
The internet is alive with ChatGPT and generative AI discussions. Some say "it's all hype!", likening it to the recent crypto trend, whilst others believe this is the single biggest legitimate step change in technology we've seen in decades. Regardless, it's interesting to keep up with and see the opportunities as they emerge. Let's talk about the recently announced ChatGPT plugin system. This won't be an in-depth explanation of AI, LLMs, or ChatGPT, but rather my initial response to the social media reactions to the ChatGPT plugin system. I'm curious about the reaction: where people think it's going, where I think it's going, the limitations, and so on.
Here's the TL;DR of how it works…
Plugin developers expose one or more API endpoints, accompanied by a standardized manifest file and an OpenAPI specification. These define the plugin's functionality, allowing ChatGPT to consume the files and make calls to the developer-defined APIs. - ChatGPT plugins announcement.
ChatGPT does not know anything about your API other than what is defined in the OpenAPI specification and manifest file - Plugin manifest documentation
In a nutshell, OpenAI announced the ability to build ChatGPT plugins, which essentially allows ChatGPT to interface with APIs and data that it does not yet have access to, opening up a wealth of potential opportunities. Mitchell's tweet caught my eye in particular though, notably: "absolutely no glue code". Anyone who has built or consumed APIs knows that on the surface APIs appear simple, but under the hood they get fiddly: versioning, API chaining, rate limiting, authentication, backwards compatibility. Can ChatGPT drastically simplify how we consume and work with APIs?
The answer: Actually… kinda… yeah.
Here are a couple of quotes I plucked from the release announcement that I found particularly useful or interesting…
The only thing language models can do out-of-the-box is emit text. This text can contain useful instructions, but to actually follow these instructions you need another process. - ChatGPT plugins announcement.
That's the current known limitation of LLMs: they're mostly based on text and don't have a great understanding of the world, like physics, gravity, mechanics, that type of thing. So, are ChatGPT plugins going to help AI better understand the real-world context needed to reach another level of intelligence beyond a very, very advanced game of "guess the next word in this string of words"? That's the question.
When a user asks a relevant question, the model may choose to invoke an API call from your plugin if it seems relevant; for POST requests, we require that developers build a user confirmation flow. - Chat Plugins Introduction.
So, there is actually an interface between ChatGPT and your APIs. It's not just magically understanding the APIs and their context, sadly. You need to provide an OpenAPI specification, so that the data, behaviours and parameters can be understood, and a manifest file, which is mostly static metadata but does have a few interesting instructional properties, such as the descriptions you provide to instruct the model on how to work with the API set.
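To make that concrete, here's a rough sketch of a manifest. The field names follow the plugin documentation, but the values (and the table-booking plugin itself) are hypothetical, and I've trimmed fields like the logo and contact details. The description_for_model field is the interesting one: it's prose addressed to ChatGPT itself, telling it when and how to use your API.

```python
import json

# Hypothetical manifest for an imaginary table-booking plugin. ChatGPT
# fetches this from /.well-known/ai-plugin.json on your domain.
manifest = {
    "schema_version": "v1",
    "name_for_human": "Table Booker",
    "name_for_model": "table_booker",
    "description_for_human": "Find and book restaurant tables.",
    # Instructions aimed at the model, not at users:
    "description_for_model": (
        "Use this plugin whenever the user wants to find or book a "
        "restaurant table. Always confirm the date and party size."
    ),
    "auth": {"type": "none"},
    "api": {"type": "openapi", "url": "https://example.com/openapi.json"},
}

with open("ai-plugin.json", "w") as f:
    json.dump(manifest, f, indent=2)
```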
So, really there's not that much to it. You feed in a manifest file with some metadata, you point ChatGPT at your endpoints, and it roughly figures out how to call your GET APIs, feed in parameters, and so on. This is super cool, but it's mostly fun to think about: what next? What does this enable? What are some opportunities? Flaws? Where could we see this going?
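The other half is the OpenAPI specification itself. As a minimal sketch, continuing the hypothetical booking API from above and assuming FastAPI (which generates an OpenAPI document for you at /openapi.json), the summary and description strings below are the part ChatGPT actually reads when deciding how to call the endpoint:

```python
from fastapi import FastAPI

app = FastAPI(
    title="Table Booker API",
    description="Hypothetical example API for finding restaurant tables.",
)

@app.get(
    "/tables",
    summary="Search for available tables",
    description="Returns available tables for a given city and party size.",
)
def search_tables(city: str, party_size: int = 2):
    # Stub response; a real implementation would query a booking system.
    return {"city": city, "party_size": party_size, "tables": []}
```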
So, what next?
Since this technology is super new, we'll simply have to wait and see what comes of it. So here, instead of answers, I want to pose a series of questions that came to me as I was reading through the announcements and the reactions…
Will the interaction model be standardised? What about vendor lock-in?
Don't be too invested because it'll be gone soon anyway. OpenAI will enforce stronger rate limits or prices will become too steep or they'll nerf the API functionality or they'll take your idea and sell it themselves or you may just lose momentum - Hacker News
OpenAI have mentioned that they are looking to standardise the interface (this is really like an API for APIs that… read APIs?), which would mean some interoperability between AI providers, so it might be less of the "app store" moment than it first appears. This should go some way to calming those who say it's too early to jump on this trend.
We expect that open standards will emerge to unify the ways in which applications expose an AI-facing interface. We are working on an early attempt at what such a standard might look like, and we’re looking for feedback - ChatGPT plugins announcement.
And it does seem that OpenAI also intend to expose this behaviour beyond embedding it directly in ChatGPT and their own plugin ecosystem. Again, it's less of an "app store" moment; it seems they're just starting with plugins inside ChatGPT, but eventually they can be used elsewhere. Unlike Google with search, I don't see the ChatGPT interface being the core way we interact with this technology, so I fully expect plugins to move outside of the ChatGPT interface.
And we plan to enable developers using OpenAI models to integrate plugins into their own applications beyond ChatGPT - ChatGPT plugins announcement.
How will we overcome chatty interfaces?
An immediate thought was the engineer in me wondering how effective and efficient these clients will be. Won't they flood APIs with requests? What about optimisation? These ChatGPT plugins are almost like GraphQL, but for humans: an abstraction on top of APIs where you pay for the flexibility in API request cost, because the queries are not optimised for minimising data transfer or the number of requests; it's all about getting the "right" or most complete answer. It's like a much more flexible query language, but the underlying API is potentially going to get flooded with requests.
Consider implementing rate limiting on the API endpoints you expose. While the current scale is limited, ChatGPT is widely used and you should expect a high volume of requests. You can monitor the number of requests and set limits accordingly. - ChatGPT plugins announcement.
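Rate limiting itself is well-trodden ground. As a minimal sketch, sticking with the same hypothetical FastAPI service, a naive in-memory token bucket per client IP might look like this (a real deployment would use a shared store such as Redis so the limit holds across processes):

```python
import time

from fastapi import FastAPI, HTTPException, Request

app = FastAPI()

RATE = 5       # tokens refilled per second
CAPACITY = 20  # maximum burst size
_buckets: dict[str, tuple[float, float]] = {}  # client -> (tokens, last seen)

def allow(client: str) -> bool:
    """Refill the client's bucket based on elapsed time, then spend a token."""
    tokens, last = _buckets.get(client, (CAPACITY, time.monotonic()))
    now = time.monotonic()
    tokens = min(CAPACITY, tokens + (now - last) * RATE)
    if tokens < 1:
        _buckets[client] = (tokens, now)
        return False
    _buckets[client] = (tokens - 1, now)
    return True

@app.get("/tables")
def search_tables(request: Request, city: str, party_size: int = 2):
    if not allow(request.client.host):
        # 429 tells the caller (ChatGPT included) to back off.
        raise HTTPException(status_code=429, detail="Rate limit exceeded")
    return {"city": city, "party_size": party_size, "tables": []}
```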
Are AI interfaces a front-end engineer killer?
Another thought that came to me is where this leaves front-end development. If ChatGPT plugins wrap APIs, isn't that the job of a front-end developer: to take a set of APIs and make them coherent to a human? There are still many use cases for regular UIs, especially when the interface is particularly relevant to the query, but for things like booking a restaurant or getting directions, I don't see these use cases being served better than through natural language anyway; there is rarely a visual component. Where one is needed, I'd expect natural language paired with augmented or mixed reality to cover it.
Could we auto-generate clients?
One other wacky thought I have is… can ChatGPT generate API clients? ChatGPT itself is essentially an API client that reads OpenAPI specs, so why can’t we use that for regular programming clients? Let’s say we are on version 1.0 of an API, and the API is updated to 2.0. If all clients are using AI interfaces that could immediately adapt to the new API contract, could that eliminate the need for backwards compatibility?
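As a toy illustration of the idea, here's a sketch of a "spec-driven" client: rather than hard-coding paths, it looks each operation up in the OpenAPI document at call time, so a v2 spec that renames a path would be picked up without changing the client. This is entirely hypothetical and glosses over request bodies, auth, and validation; an LLM-driven client would presumably handle those, and ambiguity, far better:

```python
import requests

class SpecClient:
    """Resolves calls against an OpenAPI document instead of hard-coded paths."""

    def __init__(self, spec_url: str):
        self.spec = requests.get(spec_url).json()
        # Assumes an OpenAPI 3 document with at least one server entry.
        self.base = self.spec["servers"][0]["url"]

    def call(self, operation_id: str, **params):
        # Scan the spec for the operation, then build the request from it.
        for path, methods in self.spec["paths"].items():
            for method, op in methods.items():
                if op.get("operationId") == operation_id:
                    url = self.base + path.format(**params)
                    return requests.request(method, url).json()
        raise KeyError(f"No operation named {operation_id!r} in spec")

# Hypothetical usage: if v2 of the spec moves /tables to /v2/tables,
# this call keeps working without touching the client code.
# client = SpecClient("https://example.com/openapi.json")
# client.call("search_tables", city="London")
```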
That’s it
So, that's it for now. Nothing too in-depth, just some initial thoughts and ideas that I wanted to share as I wrapped my own head around this super interesting announcement. If you're keen to learn more, the announcement post and the documentation go a long way towards explaining how the plugin system works; those two links are mostly all you need. If you have thoughts about where things are headed, reach out, I'd love to hear them.