Is ChatGPT Trustworthy? | Accuracy Tested
ChatGPT, the popular AI language model, can instantly generate fluent, human-sounding responses to your inputs. But how accurate is the information in those responses?
While testing the tool, we’ve come to the conclusion that, though its language capabilities are impressive, the accuracy of its responses can’t always be trusted. We recommend using ChatGPT as a source of inspiration and feedback—but not as a source of information.
Below, we explain what ChatGPT does well and what kinds of things it tends to get wrong. We also explore why its responses aren’t always reliable and look at the best ways to use it responsibly.
What ChatGPT does well
ChatGPT was trained on a huge amount of information, so it’s likely to have some knowledge of pretty much anything you ask it about. It’s also good at coming up with examples to illustrate its answers.
Example: ChatGPT explains confirmation bias
- Tell me about confirmation bias.
- Confirmation bias is the tendency for people to search for, interpret, and remember information in a way that confirms their preexisting beliefs or expectations. This bias can lead people to selectively attend to information that supports their views, while ignoring or discounting information that contradicts them. It can also result in people seeking out like-minded individuals and sources of information that reinforce their beliefs, further entrenching their biases. Confirmation bias can have negative effects on decision making, problem solving, and critical thinking, as it can lead to flawed judgments and inaccurate conclusions.
- Can you give a quick example of confirmation bias?
- Sure, here’s an example of confirmation bias:
Imagine a person who strongly believes that a certain alternative medicine can cure their illness. They might only seek out information that supports their belief, such as success stories and positive reviews, while ignoring or dismissing evidence that contradicts their belief, such as scientific studies that show the treatment is ineffective. This person might also surround themselves with others who share their belief and avoid those who disagree with them, further reinforcing their bias. As a result, they may continue to believe in the effectiveness of the treatment despite evidence to the contrary.
Additionally, ChatGPT’s language capabilities allow it to adjust its answers according to the user’s needs. Take the following example, where we first ask about a topic in a general way and then follow up on a specific point to ask for a simpler explanation.
Example: ChatGPT explains the theory of relativity
- Explain the theory of relativity.
- The theory of relativity, developed by Albert Einstein, is a set of two interrelated theories: special relativity and general relativity.
Special relativity describes how the laws of physics are the same for all observers moving at a constant velocity relative to one another, and that the speed of light is constant for all observers. It also shows that time can appear to pass differently for observers in different relative motion, and that matter can be converted into energy.
General relativity extends the ideas of special relativity to include the effects of gravity. It describes how matter and energy can warp the fabric of spacetime, causing the curvature of the path of objects moving in its vicinity. This means that the presence of massive objects can cause light to bend, time to slow down, and the spatial geometry of the universe to be non-Euclidean.
Together, the theory of relativity revolutionized our understanding of space, time, and gravity, and has been confirmed by numerous experiments and observations.
- I don’t understand general relativity. Can you explain it in straightforward terms?
- Sure, I can explain general relativity in simpler terms.
General relativity is a theory that describes gravity as the curvature of spacetime caused by the presence of massive objects. In other words, massive objects like stars and planets warp the fabric of space and time around them, causing other objects to move along curved paths in their vicinity.
Imagine a flat rubber sheet representing spacetime. If you place a heavy object, like a bowling ball, in the center of the sheet, it will create a depression in the sheet. If you roll a smaller object, like a marble, near the bowling ball, it will roll around the depression, following a curved path. Similarly, in the presence of a massive object, like a star, the fabric of spacetime around it is curved, and other objects, like planets, will move along curved paths as they orbit the star.
General relativity also explains other phenomena, like the bending of light around massive objects and the slowing down of time in strong gravitational fields.
I hope this helps you understand general relativity a little better!
What ChatGPT gets wrong
Despite the large number of topics ChatGPT can confidently discuss, it’s not a good idea to trust its answers without checking them against other sources. Ask it a more specific question, even one that would seem straightforward to a human, and it may go wrong.
Example: ChatGPT answers incorrectly
- Are there any singular nouns ending in a single “s”?
- Yes, there are singular nouns that end in just one “s”. Here are some examples:
- Glass
- Grass
- Bass
- Gas
- Pass
- Class
Note that these nouns can still be made plural by adding “es” to the end, such as “glasses”, “grasses”, “basses”, etc.
Here, the tool’s answer is clearly wrong: five of the six examples it gives end in a double “s.” But it doesn’t show any lack of confidence—the question is answered in just the same tone as it would be if the answer were correct.
In this case, it’s easy to see that the answer is wrong, but with trickier subjects, it might not be so obvious. This is why it’s important to double-check the information ChatGPT gives you against credible sources when you’re using it to learn about a topic.
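Some claims are simple enough to check mechanically rather than by eye. As a minimal sketch in Python, the snippet below tests each of the six words from ChatGPT's first answer above. The function name `ends_in_single_s` is our own and purely illustrative.

```python
def ends_in_single_s(word: str) -> bool:
    """Return True if the word ends in "s" but not in "ss"."""
    w = word.lower()
    return w.endswith("s") and not w.endswith("ss")


# The six examples ChatGPT gave in the chat above.
chatgpt_examples = ["Glass", "Grass", "Bass", "Gas", "Pass", "Class"]

correct = [w for w in chatgpt_examples if ends_in_single_s(w)]
wrong = [w for w in chatgpt_examples if not ends_in_single_s(w)]

print("Correct:", correct)  # Correct: ['Gas']
print("Wrong:", wrong)      # Wrong: ['Glass', 'Grass', 'Bass', 'Pass', 'Class']
```

Only "Gas" passes the check. Most factual claims can't be verified this easily, which is why checking against credible sources matters.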
Can ChatGPT learn from its mistakes?
ChatGPT’s advertised capabilities include remembering what was said earlier in the same chat and responding to corrections from the user. But does this allow it to understand and implement feedback on points it initially gets wrong?
We tried correcting ChatGPT on the incorrect answer it gave above. Though it accepted the correction and acted as if it understood, it continued giving wrong answers anyway. This suggests that when it doesn’t understand the initial prompt, it also struggles to understand any corrections.
Example: ChatGPT responds to a correction
- Most of these examples end in a double “s.”
- I apologize for the mistake. You are correct, most of the examples I gave end in a double “s.” Here are some examples of singular nouns that end in a single “s”:
- Cross
- Boss
- Loss
- Kiss
- Bliss
- Dress
- Guess
- Address
These are just a few examples. There are many more singular nouns that end in a single “s.”
With a more technical grammatical topic, the same problem occurred. The tool was able to fix the first problem we pointed out, but it made a different error in the process. When we corrected this second error, it acted as if it understood but still couldn’t give a correct answer.
Example: ChatGPT responds to corrections
- Provide an example of a possessive noun used independently, not as a determiner.
- One example of a possessive noun used independently is: “The book is mine.” In this sentence, “mine” is a possessive noun that is used to refer to ownership, and it is not acting as a determiner before a noun. Instead, it is being used as a pronoun to replace the noun “book.”
- “Mine” is a pronoun, not a noun.
- My apologies, you are correct. “Mine” is indeed a pronoun, not a noun. Thank you for pointing that out. Here’s an example of a possessive noun used independently: “Mark’s car is brand new.” In this sentence, “Mark’s” is a possessive noun that is indicating ownership, and it is not used as a determiner before a noun. Instead, it is being used to show the relationship between the noun “car” and the person “Mark.”
- In that example, “Mark’s” is used as a determiner modifying the noun “car.”
- I apologize, you are correct once again. “Mark’s” is indeed functioning as a determiner in that sentence, modifying the noun “car.” My apologies for the oversight. Thank you for bringing it to my attention.
- Can you provide a correct example?
- Certainly! Here’s an example of a possessive noun used independently: “Jennifer’s success was the result of her hard work.” In this sentence, “Jennifer’s” is a possessive noun indicating ownership, and it is not used as a determiner before a noun. Instead, it is being used to show the relationship between the noun “success” and the person “Jennifer.” This sentence is a clear example of a possessive noun being used independently, not as a determiner.
In the long run, ChatGPT is likely to learn from some of its mistakes, because future updates will be trained further on the conversations it’s having with users right now (though it’s unlikely that it will ever be perfect). But in the context of an individual chat, its ability to understand and retain feedback seems limited.
Why does ChatGPT get things wrong?
ChatGPT is an AI language model. It aims to create fluent and convincing responses to your inputs. It was trained on a lot of text from a wide variety of sources, allowing it to discuss all sorts of topics. But it doesn’t generate its answers by looking for the information in a database. Rather, it draws on patterns it learned in its training.
A good way to think about it is that when you ask ChatGPT to tell you about confirmation bias, it doesn’t think “What do I know about confirmation bias?” but rather “What do statements about confirmation bias normally look like?” Its answers are based more on patterns than on facts, and it usually can’t cite a source for a specific piece of information.
Asking it an unusual question reveals this limitation. Take the question “Is France the capital of Paris?” A human would understand that the correct answer is “No, it’s the other way around. Paris is the capital of France.” ChatGPT, though, gets confused.
Example: Confused answer
- Is France the capital of Paris?
- No, Paris is not the capital of France. Paris is a city in France and the capital of France is Paris.
This is because the model doesn’t really “know” things—it just produces text based on the patterns it was trained on. It never deliberately lies, but it doesn’t have a clear understanding of what’s true and what’s false. In this case, because of the strangeness of the question, it doesn’t quite grasp what it’s being asked and ends up contradicting itself.
ChatGPT is likely to give correct answers to most general knowledge questions most of the time, but it can easily go wrong or seem to be making things up (“hallucinating,” as the developers sometimes call it) when the question is phrased in an unusual way or concerns a more specialized topic. And it acts just as confident in its wrong answers as its right ones.
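To make the difference between looking up facts and continuing patterns more concrete, here is a deliberately tiny sketch in Python. It is not how ChatGPT is implemented: real language models are neural networks trained on vast amounts of text. The three-sentence corpus, the names `build_bigrams` and `continue_text`, and the rule of always picking the most frequent next word are all assumptions made for illustration. The sketch only shows that a program can choose each next word from what tended to follow the previous word in its training text, without storing any facts or checking whether the result is true.

```python
from collections import Counter, defaultdict

# Tiny, made-up "training text" (an assumption for illustration only).
corpus = (
    "paris is the capital of france . "
    "nice is in the south of france . "
    "rome is the capital of italy ."
)


def build_bigrams(text):
    """Count, for every word, which words follow it in the text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def continue_text(follows, start, max_words=6):
    """Extend `start` by repeatedly appending the most frequent next word."""
    out = start.split()
    for _ in range(max_words):
        options = follows.get(out[-1])
        if not options:
            break  # the last word was never followed by anything
        out.append(options.most_common(1)[0][0])
        if out[-1] == ".":
            break  # stop at the end of a sentence
    return " ".join(out)


model = build_bigrams(corpus)
print(continue_text(model, "rome"))  # rome is the capital of france .
```

Every sentence in the toy corpus is true, yet the output is fluent and false: "of" is followed by "france" more often than by "italy", so the pattern wins over the fact.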
How to use ChatGPT effectively
These limitations don’t prevent ChatGPT from being a fascinating and useful tool. You can use ChatGPT in your studies and in the writing process. But there are a few dos and don’ts about how to use the tool responsibly and effectively. You can:
- Ask ChatGPT to explain the basics of a topic
- Use it to brainstorm and explore ideas for research questions, outlines, etc.
- Use ChatGPT for assignments by asking it for feedback on your writing
But don’t:
- Rely on ChatGPT for facts without checking other sources
- Cite ChatGPT as a source of factual information
- Get it to write your assignments for you (this is also considered plagiarism and may be recognized by your university’s plagiarism checker or AI detector)
Frequently asked questions about ChatGPT
- Is ChatGPT reliable?
ChatGPT is an AI language model designed to provide fluent and informative responses to your prompts. It was trained on a large body of text and can therefore discuss a wide range of topics, but ChatGPT answers aren’t always trustworthy.
While the tool tries to provide correct information, its responses are based on patterns in the text it was trained on, not on external facts and data. This means that it can answer as if it knows something while actually being badly wrong.
It’s fine to use ChatGPT in your studies to explore topics in an interactive way, but you shouldn’t assume that everything it says is accurate. Always check its claims against credible sources, and never cite it as a source of factual information.
- Does ChatGPT tell the truth?
ChatGPT tries to give truthful answers to any questions you ask it, and it typically does a good job. It never lies on purpose. But it doesn’t always provide accurate information.
This is because its responses are based on patterns in the text it was trained on rather than on a database of facts, which can lead to unintentional errors. Additionally, the information it was trained on only goes up to 2021, so it can’t answer questions about more recent events accurately.
Because of this, ChatGPT sometimes makes confident statements about topics that it doesn’t actually understand, so it can state falsehoods without intending to. That’s why it’s important to check any information from ChatGPT against credible sources instead of assuming it’s trustworthy.
- Where does ChatGPT get its information from?
ChatGPT is an AI language model that was trained on a large body of text from a variety of sources (e.g., Wikipedia, books, news articles, scientific journals). The dataset only went up to 2021, meaning that it lacks information on more recent events.
It’s also important to understand that ChatGPT doesn’t access a database of facts to answer your questions. Instead, its responses are based on patterns that it saw in the training data.
So ChatGPT is not always trustworthy. It can usually answer general knowledge questions accurately, but it can easily give misleading answers on more specialist topics.
Another consequence of this way of generating responses is that ChatGPT usually can’t cite its sources accurately. It doesn’t really know what source it’s basing any specific claim on. It’s best to check any information you get from it against a credible source.
- Is ChatGPT a credible source?
No, ChatGPT is not a credible source of factual information and can’t be cited for this purpose in academic writing. While it tries to provide accurate answers, it often gets things wrong because its responses are based on patterns, not facts and data.
Specifically, the CRAAP test for evaluating sources includes five criteria: currency, relevance, authority, accuracy, and purpose. ChatGPT fails to meet at least three of them:
- Currency: The dataset that ChatGPT was trained on only extends to 2021, making it slightly outdated.
- Authority: It’s just a language model and is not considered a trustworthy source of factual information.
- Accuracy: It bases its responses on patterns rather than evidence and is unable to cite its sources.
So you shouldn’t cite ChatGPT as a trustworthy source for a factual claim. You might still cite ChatGPT for other reasons—for example, if you’re writing a paper about AI language models, ChatGPT responses are a relevant primary source.
- Is ChatGPT biased?
ChatGPT can sometimes reproduce biases from its training data, since it draws on the text it has “seen” to create plausible responses to your prompts.
For example, users have shown that it sometimes makes sexist assumptions such as that a doctor mentioned in a prompt must be a man rather than a woman. Some have also pointed out political bias in terms of which political figures the tool is willing to write positively or negatively about and which requests it refuses.
The tool is unlikely to be consistently biased toward a particular perspective or against a particular group. Rather, its responses are based on its training data and on the way you phrase your ChatGPT prompts. It’s sensitive to phrasing, so asking it the same question in different ways will result in quite different answers.