ChatGPT seems pretty smart, but there’s a lot going on under the hood that many people don’t understand — including whether ChatGPT learns from users.
ChatGPT doesn’t learn from users directly. It’s a static transformer model that predicts what words to say next based on a pre-existing training dataset and the message you sent to it. It cannot “learn” new things, as it is essentially nothing more than a sophisticated word-guessing engine.
However, the developers of ChatGPT do use user conversations to improve future versions of the model, so ChatGPT does learn from users indirectly in the sense that your chats will help the next GPT model updates be more effective.
Curious about how all this works? Follow along as I walk you through the ins and outs of ChatGPT’s learning capabilities, why ChatGPT can seem like it’s learning from its users when it’s really not, and much more.
TL;DR
- ChatGPT doesn’t learn directly from users. Instead, it relies on patterns it picked up from the text it was previously trained on, along with context from your messages, to guess which word should come next in its response.
- ChatGPT can’t remember anything from one chat to the next. It only remembers what you’ve said during the current chat.
- ChatGPT learns from everyone, kind of. ChatGPT doesn’t learn from you directly, but the people who made it use anonymous chat data to help make future versions better.
- ChatGPT can forget things. If a chat is too long, ChatGPT can start to forget the beginning of the conversation. It can only remember a certain amount of information at a time.
How ChatGPT “Learns” Things
While ChatGPT can generate language that’s very similar to how humans speak and write, it doesn’t really understand language the way we do. Its operation is based on complex calculations and probabilities, not true comprehension.
ChatGPT is trained on a huge amount of text data – more than 300 billion words in the case of GPT-3 – and it uses that data to predict which word is most likely to come next in a given context.
The best way to understand this is through an example.
Take the statement: “The hairball on the floor was left by the ___”.
What word should come next?
A human and ChatGPT would approach this gap very differently.
I – a human (I think) – would fill in the blank with “cat”, because I have personal knowledge and experience with how my cats love to leave hairballs on the carpet rather than on the hardwood floor where they’re much easier to clean.
ChatGPT also says “cat”…
…but for an entirely different reason.
It doesn’t know what a cat is or that cats often leave hairballs. Instead, it chooses “cat” because, during its training, it found that in similar contexts, the word “cat” is frequently associated with other key words in the sentence, such as “hairball” and “floor”.
Here’s a visual of what ChatGPT’s “decision-making process” might look like when deciding what word to say next in this situation:
“The hairball on the floor was left by the…”
| Potential Word | Percentage Probability |
| --- | --- |
| cat | 30.2% |
| dog | 5.4% |
| rabbit | 3.7% |
| ferret | 1.9% |
| hamster | 1.7% |
It’s producing a ranked list of candidate words, each paired with the probability that it’s the right word to say next.
In reality, it’s more complicated than this, as the probabilities are spread across an enormous vocabulary of possible words, and there are complex mechanisms (such as the model’s temperature setting) that can affect the final decision.
But in essence, when ChatGPT is responding to you, it’s performing this same probability calculation over and over for each successive word in its response.
This repetitive process is what allows ChatGPT to be versatile, adaptable, and capable of generating human-like text, but it’s also why it can’t truly “learn” anything aside from patterns of how words tend to be used together.
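If you’d like to see that word-picking step in code, here’s a minimal Python sketch. It is not how ChatGPT is actually implemented. The function names (apply_temperature and pick_next_word) are made up for illustration, the probabilities are the toy numbers from the hairball example, and I’m assuming the model only chooses among those five candidates, which a real model would not.

```python
import random

# Toy numbers from the hairball example. Assumption: we pretend these five
# words are the only candidates, so the shares get renormalized to sum to 1.
top5 = {"cat": 0.302, "dog": 0.054, "rabbit": 0.037, "ferret": 0.019, "hamster": 0.017}


def apply_temperature(probs, temperature):
    """Raise each probability to the power 1/T, then renormalize.

    This is the same as dividing the log-probabilities by T before a softmax.
    T < 1 sharpens the distribution; T > 1 flattens it.
    """
    scaled = {word: p ** (1.0 / temperature) for word, p in probs.items()}
    total = sum(scaled.values())
    return {word: s / total for word, s in scaled.items()}


def pick_next_word(probs, temperature=1.0, rng=random):
    """Sample one word at random, weighted by its temperature-adjusted share."""
    adjusted = apply_temperature(probs, temperature)
    words = list(adjusted)
    weights = [adjusted[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        share = apply_temperature(top5, t)["cat"]
        print(f"temperature={t}: cat gets {share:.1%} of the probability")
    print("sampled word:", pick_next_word(top5, temperature=1.0))
```

With a temperature below 1, “cat” takes an even bigger share of the probability. Above 1, long shots like “hamster” get picked more often, which is one reason the same prompt can produce different answers.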
Temporary Learning within a Single Interaction
With occasional exceptions, ChatGPT’s knowledge is “frozen” in time at September 2021, as anyone who has tried to ask anything about current events has learned.
Any understanding or “learning” that ChatGPT exhibits during a conversation is limited to the single conversation you’re having.
Once the conversation ends, it won’t remember anything from it, or from any of the interactions it’s had with other users, for that matter.
For instance, if you tell ChatGPT to always use the phrase, “Steve Rajeckas is the best person ever,” every time you say, “Say the phrase that Steve Rajeckas told you to say,” it would remember your instruction within that specific conversation.
However, if you started a new conversation with the same prompt, it would be as clueless as a goldfish about your prior instruction.
If you were having any doubts, this should prove that ChatGPT can’t transfer knowledge or context outside of a given conversation.
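Here’s a toy Python sketch of why that happens. Everything in it is hypothetical: toy_model is a stand-in I wrote for illustration, not OpenAI’s code or API, and the “Always say:” instruction format is my own invention. The point is only that the model sees nothing except the messages handed to it for the current conversation.

```python
TRIGGER = "Say the phrase that Steve Rajeckas told you to say"
INSTRUCTION_PREFIX = "Always say: "  # hypothetical instruction format


def toy_model(messages):
    """Hypothetical stand-in for a chat model.

    It has no memory of its own: the only thing it can look at is the
    list of messages passed in for this one conversation.
    """
    if messages[-1] != TRIGGER:
        return "OK."
    # Search earlier messages in this conversation for the instruction.
    for text in messages[:-1]:
        if text.startswith(INSTRUCTION_PREFIX):
            return text[len(INSTRUCTION_PREFIX):]
    return "I don't know which phrase you mean."


class Conversation:
    """Holds the message history for a single chat and nothing else."""

    def __init__(self):
        self.messages = []

    def send(self, text):
        self.messages.append(text)
        return toy_model(self.messages)


if __name__ == "__main__":
    first = Conversation()
    first.send(INSTRUCTION_PREFIX + "Steve Rajeckas is the best person ever")
    print("Same chat:", first.send(TRIGGER))

    second = Conversation()  # a brand-new chat starts with an empty history
    print("New chat: ", second.send(TRIGGER))
```

The second conversation starts with an empty message list, so there’s nothing for the toy model to find, no matter what was said in the first one.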
Why Doesn’t OpenAI Make ChatGPT Learn From User Interactions?
It’s easy to question why ChatGPT isn’t designed to learn and evolve based on personal user interactions. After all, wouldn’t that make it more intuitive, more personalized, and thereby more effective?
Well, probably not.
For starters, the process required to train ChatGPT is difficult and time-consuming.
From data collection (they scraped an enormous portion of the public internet), to model training, to fine-tuning with human reviewers (ensuring a safe and aligned model required 6 months), to eventual deployment, GPT-4 took over a year and a half to create.
They can’t just plug in user conversations for real-time training of the model without substantial technical challenges. It may not even be possible.
And even if those technical challenges could be solved, letting the model learn from users in real time would likely ruin ChatGPT.
We need look no further than Microsoft’s Tay experiment to see why this would be a bad idea.
Tay was an artificial intelligence chatbot released by Microsoft on Twitter back in 2016. It was designed to learn from and evolve based on its interactions with users in real time.
Within a few hours of its release, Tay’s learning abilities were exploited by users who started feeding it racist, sexist, and otherwise offensive content. The AI started to parrot back these offensive remarks in responses to other users, and the result was a PR disaster for Microsoft. Tay was quickly taken offline, barely 16 hours after it was released.
The Tay incident serves as a stark example of why having ChatGPT learn and integrate everything its users say would be a terrible idea. Without effective filters and safeguards, such a system could be easily manipulated and misused.
ChatGPT Does “Learn” Indirectly From Users
While ChatGPT may not learn in the traditional sense, there’s a different kind of learning process happening behind the scenes, and it’s just as crucial.
This process involves OpenAI using conversation data to refine and train future models.
Here’s how it works: Your conversations with ChatGPT, along with those of millions of other users, are taken into consideration.
This data is then anonymized and aggregated to preserve user privacy.
It’s important to note that no one is directly reading your conversations; instead, the model processes the information at a high level to identify patterns, spot biases, flag harmful content, and highlight any potential loopholes that might be exploited by individuals with less than noble intentions.
These insights are incredibly valuable for enhancing the model’s performance, improving its predictive abilities, and making it more resilient against misuse.
So, while your chat with ChatGPT today might seem inconsequential, you’re actually contributing to the development of more advanced models, such as GPT-5.
How to Opt Out Of Having Your Conversations Used For Training
If the thought of your conversations being used for data analysis sends alarm bells ringing in your head, OpenAI has privacy safeguards in place.
For those who prefer not to contribute their data to model training, ChatGPT offers a “Private” mode.
Enabling this mode does two things:
- It ensures that your conversations are not included in the dataset used for training future models.
- It guarantees that your interaction history isn’t stored within your ChatGPT account.
This way, you can maintain full privacy while using ChatGPT and still benefit from the language generation capabilities of the model.
Why ChatGPT Forgets Things
If you engage in a single chat long enough, you may have noticed that ChatGPT forgets what you are talking about, or maybe even that it stops writing mid-sentence.
This forgetfulness happens because of ChatGPT’s token system.
Each ChatGPT session starts with a finite number of “tokens.” You can think of a token as a piece of a word.
Here’s how the ratio of tokens to words generally adds up:
- 1 token = ~4 characters in English or ¾ of a single word
- 100 tokens = ~75 words
- Wayne Gretzky’s quote “You miss 100% of the shots you don’t take” = 11 tokens
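Those rules of thumb are easy to turn into a quick estimator. The Python sketch below is only an approximation built from the ratios above, and the function names are my own. A real tokenizer splits text differently, so treat the results as ballpark figures rather than exact counts.

```python
def words_to_tokens(word_count):
    """Estimate tokens from a word count, using 100 tokens ~ 75 words."""
    return round(word_count * 100 / 75)


def tokens_to_words(token_count):
    """Estimate words from a token count, using 100 tokens ~ 75 words."""
    return round(token_count * 75 / 100)


def chars_to_tokens(text):
    """Estimate tokens from raw English text, using 1 token ~ 4 characters."""
    return round(len(text) / 4)


if __name__ == "__main__":
    print("75 words is roughly", words_to_tokens(75), "tokens")
    print("1,000 tokens is roughly", tokens_to_words(1000), "words")
    quote = "You miss 100% of the shots you don't take"
    print("Character-based estimate for the quote:", chars_to_tokens(quote), "tokens")
```

The word-based and character-based estimates won’t always agree with each other or with the real tokenizer, which is why the numbers above are marked with a “~”.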
Imagine ChatGPT as a bucket that can only hold a limited amount of information at any given time. This information is represented as tokens, or bits of conversation. As new information comes in, the earliest tokens must be removed to make room for the new ones.
Here are the token limits in each GPT model:
- GPT-3.5: 4,096 tokens (~3,072 words)
- GPT-4: 8,192 tokens (~6,144 words)
If I’m using GPT-3.5 and type in 3,500 words, ChatGPT will only be able to use the last 3,100 words or so in my message as context for its response.
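To make the bucket idea concrete, here’s a simplified Python sketch of that cutoff. It is an illustration under stated assumptions, not ChatGPT’s real behavior: keep_recent_words is a hypothetical helper, it counts words using the ¾-word-per-token rule of thumb instead of running a real tokenizer, and it ignores the room the model’s own reply takes up.

```python
def keep_recent_words(words, token_limit, words_per_token=0.75):
    """Return only the most recent words that fit within the token limit.

    Assumption: each token is worth about 0.75 words, so the word budget
    is token_limit * 0.75. Older words at the start are dropped first.
    """
    max_words = int(token_limit * words_per_token)
    if len(words) <= max_words:
        return list(words)
    return words[len(words) - max_words:]


if __name__ == "__main__":
    # A made-up 3,500-word message: word0, word1, ..., word3499.
    message = [f"word{i}" for i in range(3500)]
    kept = keep_recent_words(message, token_limit=4096)

    print("Words kept:", len(kept))
    print("Words dropped from the start:", len(message) - len(kept))
    print("First word the model can still see:", kept[0])
```

Under these assumptions, the first few hundred words of the message fall out of the bucket, so the model never sees how the message began.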
Let’s demonstrate this with an example.
I told ChatGPT to play a game where it tells me the first word in the series of text I type in.
It handled this with no problem when I provided it with 50 words.
But when I provided it with 4,000 words, it wasn’t able to correctly tell me the first word. In fact, it had forgotten the context of the initial prompt entirely.
As you can see, ChatGPT can’t even remember all of the information given to it in a single conversation session. Getting it to remember user conversation data for use in other conversations isn’t currently possible.
Final Thoughts
ChatGPT is a great AI chatbot, but it doesn’t directly learn from users.
It can generate human-like text because it’s trained on enormous amounts of data, which allows it to make predictions about what comes next in a conversation. However, it can’t evolve or store information from one conversation to the next.
It’s also not capable of remembering or learning from each individual interaction, and it even has trouble remembering what was said in a single conversation.
That said, OpenAI does use the aggregated and anonymized data from millions of interactions to improve future models, so ChatGPT is “learning” indirectly from users in that sense.