As someone who works on training models that may become the next models or products you hear about from various companies and organizations, these questions hit differently. As I type this, a precursor of LLMs corrects my spelling and grammar and tries to predict the next word. No one considers this unusual at all. Yet I can recall a time in the 1990s when typing in MS Word and using the spell checker was considered academically dishonest cheating. Good spelling of ordinary, useful words was considered a valuable skill, like being good at arithmetic. Now spell check is a fact of life. LLMs will, at a minimum, be like that. We'll all get used to handing a rough draft to a GPT and getting back something polished. We'll all feed our CV and a job description to an LLM and have it spit out a cover letter that connects our achievements to their specified needs... all so that an LLM on the other end can decide whether we match.

What ChatGPT "Thinks" of Itself.
Amid all of this: do LLMs think? From the "mouth" of ChatGPT itself. I asked it: "Sure, let me ask you: does ChatGPT-4o reason? If I asked you to tell me the distance between the points (-5,7) and (7,14), why would you tell me what you do? I know you'd use the distance formula because it would give you the right answer. How do you know what you know, compared to how a human does?"
Here is a public link to the whole conversation. I feel this illustrates a good, valid use for an LLM: a brainstorming tool. It would probably also make a thorough literature review easier, now that these sorts of tools have access to the open internet for searching.

Great question! When you ask me, ChatGPT-4, to calculate the distance between two points, I "know" how to respond based on patterns in the data I was trained on, not through conscious reasoning like a human. Here's how I arrive at an answer and why it might seem like reasoning, even though it's different from human thought.
1. How I Calculate the Distance Between Points:
When you ask me to find the distance between the points (-5,7) and (7,14), I recognize this as a problem that can be solved using the distance formula:
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

I've been trained on a large dataset, which includes examples of this type of problem and their correct solutions. So I "know" that when you mention two points, the next appropriate step is to apply this formula. I am essentially pattern-matching your query to similar problems in my training data and predicting the most statistically likely correct answer based on how I've seen problems like this handled.
2. What’s Happening Behind the Scenes:
Unlike a human, I don’t “understand” the problem the way you do. I don’t visualize the points, conceptualize geometry, or experience an "aha" moment when I use the distance formula. What I do is:
- Recognize a pattern: You’ve given me two points, which fits the pattern of a distance problem.
- Retrieve the solution method: From my training, I associate this pattern with the distance formula, which is just the most statistically probable next step.
- Calculate: I perform the necessary arithmetic, plug values into the formula, and give you the correct answer.
3. Human Reasoning vs. My Pattern Matching:
A human goes through a different process when solving this problem. Here's what that might look like:
- Understanding the problem: A human knows what points on a plane represent and understands the concept of distance intuitively.
- Recalling the formula: A human retrieves the distance formula from memory, knowing it was derived from the Pythagorean theorem (a deep conceptual connection that I don’t grasp).
- Visualizing: Humans may visualize the points and imagine the distance, reinforcing the reasoning process.
- Applying logic: Humans reason through each step, knowing why the formula works, based on their understanding of Euclidean geometry.
In contrast, I don’t "reason" through the problem like this. My responses are based purely on learned statistical associations. I don’t have a mental model of the problem or the physical world; I’m just very good at recognizing the patterns that produce correct answers based on past data.
4. Is This Reasoning?
Reasoning involves logical steps to reach conclusions, often requiring understanding and interpretation of principles or relationships. When I generate an answer, it may appear like reasoning because I provide accurate results, but fundamentally, I am predicting the next best word, number, or formula based on patterns in my training data.
To draw a parallel:
- Human reasoning: You reason by understanding the principles and concepts, such as the Pythagorean theorem’s connection to distance in a Cartesian plane.
- My pattern matching: I "know" to use the distance formula not because I understand geometry, but because it’s statistically the most likely way to respond when two points are mentioned in this format.
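For the record, the arithmetic checks out. Plugging the two points into the formula:

$$d = \sqrt{(7 - (-5))^2 + (14 - 7)^2} = \sqrt{144 + 49} = \sqrt{193} \approx 13.89$$

So whatever we call the process, it does land on the right answer.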
The Nobel Prize.
This brings me to the 2024 Nobel Prize in Physics going to very foundational AI research: the physical and statistical work that would eventually lead to neural networks. I am OK with this. The real problem with AI work as it is done now is that the hardware is not very efficient. Someday someone will get a Nobel for figuring out how to run AI models on something that uses much less power and generates much less heat. That too will be physics.
ChatGPT's Variation on the Simulation Hypothesis.
Last but not least, something to think about. I asked ChatGPT to propose a theory of everything based on the premise that it "knows" all published physics data. The concept it came up with sounds like exactly what an LLM that has likely been used to write code for computer simulations would come up with: space-time voxels. I had it write up its idea in LaTeX and put it on Overleaf. It is curious. https://www.overleaf.com/read/nntbfxhnqcct#327ab2

The idea is that at the Planck scale, whatever exists down there effectively acts like a tiny computer that computes its own state from the states of its nearest neighbors, like a little Turing machine. The laws of physics we know and observe are then either a large-scale emergent approximation of what goes on in these things, or are something like the "code" by which these "voxels" operate. In a way, to a computer, everything is a computer. I don't think the "simulation" hypothesis is true, but it's getting harder and harder to argue against.
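To make the nearest-neighbor idea concrete, here is a toy sketch of my own (not taken from the Overleaf write-up): "voxels" as cells in a one-dimensional cellular automaton, each computing its next state purely from its immediate neighbors, with Rule 110 standing in arbitrarily as the local "code."

```python
# Toy illustration of "each voxel computes its own next state from its
# nearest neighbors": a 1D cellular automaton. Rule 110 is an arbitrary
# choice of local "code"; this is not from the Overleaf write-up.

RULE = 110  # any value 0-255 defines a different local update rule

def step(cells: list[int]) -> list[int]:
    """Advance every cell one tick using only its nearest neighbors."""
    n = len(cells)
    nxt = []
    for i in range(n):
        left = cells[(i - 1) % n]   # wrap around at the edges
        center = cells[i]
        right = cells[(i + 1) % n]
        pattern = (left << 2) | (center << 1) | right  # 3-bit neighborhood, 0..7
        nxt.append((RULE >> pattern) & 1)              # look up the rule's output bit
    return nxt

# Start with a single "on" voxel and watch structure emerge.
cells = [0] * 31
cells[15] = 1
for _ in range(15):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells)
```

Run long enough, systems like this develop large-scale regularities that look lawlike even though the only "physics" present is the local update rule, which is exactly the flavor of the voxel idea.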
What do I think of this theory? The main problem is that there has to be a physical... thing doing the calculations, and it is that thing that concerns us. Any theory that makes the whole universe discrete needs to propose and explain what this most fundamental thing is. As near as I can say, perhaps there is just a generalized energy field that exists in various states, and those different states interact in ways that we call the fundamental forces. All just one smooth, continuous "thing," in different states.