Beyond the Hype

There is little value in paying to host a generic ChatGPT for your brand: it would be the same as every other ChatGPT demo already on the internet and would offer no competitive advantage. Value can only come from using it to support your business processes in a way that either lowers your operational cost or drives more sales. One example is supplementing or partially replacing expensive support services, such as customer service, with an automated service that scales at minimal cost. This is the field where chatbots are currently state-of-the-art, and we will explore whether and how we can leverage ChatGPT to lower operational cost or increase revenue.

Let’s tackle the simplest solution first. Given how powerful ChatGPT is, why not just replace the chatbot with a specialized ChatGPT instance? This is a bad idea because ChatGPT’s responses can be unpredictable or simply incorrect, and in ChatGPT’s standard implementation it is impossible to control the responses completely. Since this chatbot would represent your brand, an incorrect or offensive response could do serious reputational damage.

What about using ChatGPT to generate synthetic user input for more training data? Acquiring actual user data can be time-consuming and expensive, and it may not cover many of the edge cases. This is often a blocking problem, especially when building a chatbot from scratch. In theory, ChatGPT could produce mountains of training data by generating variations of an input sentence of plausible customer text.

However, this introduces other issues, as Rasa explains well in their article. Augmenting NLU training data this way does not yield significant improvements to the model, because there is a trade-off between faithfulness and variability. Either we generate data with low variability, which deviates only slightly from the source sentence and thus brings no new information to the model, or we generate data with high variability, which deviates too much from the source sentence and is thus unrelated and cannot be used to train the bot.

Natural data generation might not be the best use case for now, but there are other NLP tasks required for a chatbot. Simply put, given a user sentence in natural language, a chatbot tries to do the following two things:

1. Understand the user's intent (NLU – Natural Language Understanding)

2. Generate the answer (NLG – Natural Language Generation)
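
The two steps above can be sketched as a tiny pipeline. This is only a toy illustration, assuming keyword matching for the NLU step and fixed templates for the NLG step; the intent names, keywords, and reply texts are hypothetical placeholders, and a production chatbot would use trained models for both steps.

```python
# Hypothetical intents and keywords, chosen only for illustration.
INTENT_KEYWORDS = {
    "cancel_order": {"cancel", "refund"},
    "opening_hours": {"open", "hours", "closing"},
}

# Hypothetical reply templates, one per intent plus a fallback.
RESPONSES = {
    "cancel_order": "I can help you cancel your order. What is your order number?",
    "opening_hours": "Here are our opening hours.",
    "fallback": "Let me connect you with a human agent.",
}


def understand(user_text):
    """NLU step: map the user's sentence to the best-matching intent."""
    words = set(user_text.lower().replace("?", " ").split())
    best_intent, best_hits = "fallback", 0
    for intent, keywords in INTENT_KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best_intent, best_hits = intent, hits
    return best_intent


def generate(intent):
    """NLG step: produce the answer for an intent (here, a fixed template)."""
    return RESPONSES.get(intent, RESPONSES["fallback"])


def reply(user_text):
    """Run both steps: understand the intent, then generate the answer."""
    return generate(understand(user_text))
```

When no keyword matches, the sketch falls back to a human agent, which mirrors the hand-over that a better NLU step is meant to reduce.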

The better a chatbot is at these two tasks, the more services it can provide without human agent intervention, thus lowering the workload of your call centers. Any improvement, even a small one, automatically scales to all conversations the chatbot handles, so the savings grow with the number of customer interactions while the operational cost stays low.
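
The trade-off between faithfulness and variability described earlier in this section can be made concrete with a small filter over generated variations. The sketch below is an assumption-laden illustration, not the method from the Rasa article: it measures similarity as plain word overlap (Jaccard), and the thresholds `low` and `high` are arbitrary example values. A real pipeline would more likely compare meaning, for instance with sentence embeddings.

```python
def word_overlap(a, b):
    """Jaccard overlap between the word sets of two sentences (0.0 to 1.0)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    union = words_a | words_b
    if not union:
        return 0.0
    return len(words_a & words_b) / len(union)


def keep_useful_variations(source, candidates, low=0.3, high=0.9):
    """Keep candidates that are neither near-copies nor unrelated.

    Scores above `high` add no new information (low variability);
    scores below `low` have drifted away from the source (low faithfulness).
    Both thresholds are illustrative, not tuned values.
    """
    kept = []
    for text in candidates:
        score = word_overlap(source, text)
        if low <= score <= high:
            kept.append((text, score))
    return kept


source = "I want to cancel my order"
candidates = [
    "I want to cancel my order",      # identical: nothing new to learn
    "please cancel my order today",   # related but differently worded
    "what is the weather",            # unrelated to the source
]
useful = keep_useful_variations(source, candidates)
```

With these example sentences only the second candidate survives, which shows how narrow the usable band between "too similar" and "too different" can be.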

Standing on the Shoulders of Giants

ChatGPT builds on the GPT-3 family of models. GPT-3, with 175 billion parameters, is one of the largest and most powerful language models available today. The model is one of the best when it comes to making a good guess as to what a human would say next, given what was already said. For example, given that a story starts with “Once upon a time there was a white rabbit”, what could the following sentence be? Or, more specifically for chatbots, given the sentence “What does your product do?”, the model would generate a sentence or even a paragraph of what it assumes a human would say.

The downside of this broad, general-purpose knowledge is that the model lacks understanding of narrow, domain-specific questions, which are characteristic of enterprise chatbots.
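
The idea of guessing what comes next, given what was already said, can be illustrated at a vastly smaller scale. The sketch below is a toy bigram counter, not how GPT-3 works internally: it only looks at the single previous word and picks the follower it has seen most often, and the three training sentences are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram_counts(sentences):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Made-up training text for illustration only.
corpus = [
    "once upon a time there was a white rabbit",
    "once upon a time there was a king",
    "the white rabbit was late",
]
counts = train_bigram_counts(corpus)
guess = predict_next(counts, "white")
```

A word the toy model has never seen yields no guess at all, which is a small-scale version of the domain gap described above: the model can only continue text in ways its training data supports.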

Making Your Own ChatGPT

Given the costly nature of training a state-of-the-art large language model that needs 45 terabytes of text, it is likely that future chatbots will simply be built on top of a pre-trained model and then finetuned to fit specific purposes, brands or products. “Finetuning” refers to refining models like ChatGPT to better recognize specific patterns of input and output, for example, sentences and words related to your company’s field. It is possible to finetune the large model with our own custom dataset so that the NLU and NLG fit our particular use cases. In this case, we still leverage the advantages of a large model, but with the NLU and NLG constrained to our domain through finetuning. Potential use cases will be discussed in a later paragraph.

This would allow you to use a state-of-the-art chatbot that has the brand values and tone unique to your company without incurring the cost of creating the model from scratch.
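
To give a feel for what a custom finetuning dataset might look like, the sketch below turns pairs of customer text and on-brand replies into JSON Lines records. The `prompt`/`completion` field names are an assumption for illustration; the exact schema depends on the provider and model you finetune, and the example pairs are hypothetical.

```python
import json


def to_finetune_records(examples):
    """Turn (customer_text, brand_reply) pairs into prompt/completion records."""
    records = []
    for customer_text, brand_reply in examples:
        records.append(
            {"prompt": customer_text.strip(), "completion": brand_reply.strip()}
        )
    return records


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


# Hypothetical examples written in the brand's own tone.
examples = [
    ("What does your product do? ", "It helps you track your orders in one place."),
    ("How do I reset my password?", " You can reset it from the login page."),
]
jsonl_text = to_jsonl(to_finetune_records(examples))
```

Each line of `jsonl_text` is one training example, so the dataset can grow simply by appending more reviewed pairs from your own domain.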

Data Privacy

To finetune the ChatGPT model specifically, you would have to agree to let OpenAI use your data to improve their services. Data privacy laws in most countries would require highly restrictive contracts with third parties before customer PII (personally identifiable information) could be shared.