OpenAI is currently advertising the following job:
Senior Software Engineer, ChatGPT Model Optimization
What OpenAI seems to be trying to do (as per the job description) is make more money (unsurprisingly) by allowing their systems to switch users to a more powerful AI model automatically, as needed. That is, instead of letting users make a static selection of which LLM they will interact with, OpenAI will move users between less powerful and more powerful models as their information-processing needs vary.
This would prevent users from chewing up ChatGPT4 (for example) on a recipe for cheesecake, and would also stop ChatGPT3.5 from giving entirely incorrect answers to questions it is incapable of answering, so it’s a smart move on OpenAI’s part.
But How Do I Do That?
So imagine you’re the bright-eyed new hire at OpenAI, having just landed your dream job: Senior Software Engineer, ChatGPT Model Optimization. You breezed through the interviews, impressed the decision makers with your coding prowess, brilliant analytical skills and vast array of relevant experiences. Finally, the big day arrives. You walk into the OpenAI San Francisco HQ, your head filled with visions of optimising the largest LLM inference fleet on earth, and reducing multi-billion-dollar compute budgets to mere pocket change.
Your first day is a whirlwind of introductions, office tours, and a seemingly endless stream of acronyms and technical jargon. “You’ll be working on dynamically switching users between LLM models, depending on their needs,” explains your team lead, handing you a dense stack of documentation. “It’s about delivering the most value while minimising waste,” they add with a smile.
You nod, trying to look confident. It all sounded so straightforward when you read the job description. But now, faced with the reality of the task, a tiny voice in your head starts to whisper, “But how on earth can I do that?”
Lunchtime rolls around, and you’re sitting in the cafeteria, staring at your sandwich as if it might hold the answers. A colleague, noticing your furrowed brow, slides into the seat opposite you. “First day jitters?” they ask.
You laugh, a bit more nervously than you intended. “Something like that. I mean, how do you even begin to optimise a system to switch between ChatGPT4 and ChatGPT3.5 based on the complexity of a user’s query? How can ChatGPT3.5 possibly know it’s not qualified to answer a question, since by design it thinks it’s capable of answering any question that could possibly be thrown at it?”
Your colleague laughs. “Don’t bother asking your team lead that. The only answer you’ll get is, ‘If this was a job monkeys could do, we’d pay peanuts.’”
“So what am I meant to do?” you ask.
“Well,” says your colleague, leaning back and sipping their coffee. “How about you create an external evaluation layer that can assess the complexity of the query and the potential for a more advanced model to provide a better answer?
“Before a query reaches any LLM model, this filter analyses it based on certain criteria. These could include the query’s length, complexity, the use of specific technical jargon, or even past user interactions. If the filter deems the query too complex for ChatGPT3.5, it reroutes it to ChatGPT4, ensuring a more accurate and satisfying response.”
You nod, the idea starting to take shape in your mind. “So, it’s like a gatekeeper, deciding which model is best suited for each query.”
“Exactly,” your colleague agrees. “And the beauty of this system is that it continually learns and adjusts. Over time, it becomes more adept at predicting which model will yield the best results for a given query, further optimising resource allocation and user satisfaction.”
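As a rough sketch of what your colleague is describing, such a gatekeeper might look something like the code below. It assumes the current OpenAI Python SDK, publicly documented model names, and a deliberately crude, hypothetical complexity heuristic; it is an illustration of the idea, not OpenAI’s actual routing logic.

```python
# Illustrative "gatekeeper" sketch: route simple queries to a cheaper model and
# complex ones to a stronger model. The heuristic and threshold are assumptions.
from openai import OpenAI

client = OpenAI()

# Hypothetical list of terms that hint a query may need the stronger model.
TECHNICAL_TERMS = {"theorem", "derivative", "protocol", "algorithm", "complexity"}

def estimate_complexity(query: str) -> float:
    """Crude score combining query length and presence of technical vocabulary."""
    words = query.lower().split()
    length_score = min(len(words) / 100, 1.0)
    jargon_score = len(TECHNICAL_TERMS.intersection(words)) / len(TECHNICAL_TERMS)
    return 0.7 * length_score + 0.3 * jargon_score

def route(query: str, threshold: float = 0.3) -> str:
    """Send the query to the model the gatekeeper deems appropriate."""
    model = "gpt-4" if estimate_complexity(query) > threshold else "gpt-3.5-turbo"
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content
```

A production version would presumably learn the threshold and the features from user feedback rather than hard-coding them, which is the “continually learns and adjusts” part your colleague mentions.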
You smile gratefully and return to your desk with slightly lower blood pressure, but a nagging voice at the back of your mind tells you that this may still not be exactly the right solution.
The job description said, “Design and implement algorithms for selectively upgrading users from fast models to more thoughtful ones at the right moment.” It didn’t say, “Preselect which model a user’s query will be directed to based on question complexity and hardness.”
So your lunchtime colleague’s suggestion doesn’t quite fit the bill, and the question returns: how can an AI model know whether or not it is giving a high-quality answer (other than by measuring how irate or pleased the user is with its answer)?
Knowing What You Don’t Know
Taking technology out of the equation, the essential problem for OpenAI’s new employee is, “How can you know what you don’t know?” In other words, the paradox is, “If an AI system is unaware of its own gaps in knowledge or understanding, how can it possibly identify and correct them?”
Let’s think about how that problem can be addressed, from a pure knowledge management perspective.
Reductio ad absurdum
When faced with the challenge of discerning whether an AI’s answers are sensible or not, we can draw on a method from analytical philosophy known as reductio ad absurdum.
In our case, using the method of reductio ad absurdum involves assuming that both the premises of the question and the premises of the answer are correct, and then exploring the implications of this combined assumption. If these combined assumptions lead to an absurd or contradictory conclusion, then we know there is an issue either with the question, the answer, or both.
A Simple Application
For example, let’s say a user asks, “Why do all rivers flow uphill?” and the AI responds, “Because gravity acts in reverse on water.” If we assume both of these premises to be true – that rivers do indeed flow uphill and that gravity acts in reverse on water – we immediately encounter a contradiction with our established understanding of physics.
Gravity does not act in reverse on physical things; instead it pulls objects towards the Earth’s center. Since water is a physical thing, gravity must also pull water towards the Earth’s center. Thus, by assuming both the question and the answer are true and encountering an absurdity, we can deduce that either the question, the answer, or both are flawed.
This approach does not rely on the AI’s self-awareness of its limitations but rather relies on a logical analysis of the coherence between the question and the answer. By implementing a system that can perform this kind of analysis, an AI can effectively flag its own responses that are inconsistent with the premises of the questions they are answering. In this way, an AI can self-check its responses without requiring an understanding of its own knowledge or reasoning gaps.
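To make this concrete, here is a hedged sketch of such a two-pass self-check built on the OpenAI chat completions API. The prompts, the model name and the flagging behaviour are illustrative assumptions, not the exact configuration of the GPT described in the next section.

```python
# Illustrative two-pass self-check: answer first, then assume the premises of the
# question AND the answer are both true and look for an absurdity.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4"  # assumed model name for the sketch

def ask(messages):
    response = client.chat.completions.create(model=MODEL, messages=messages)
    return response.choices[0].message.content

def answer_with_self_check(question: str) -> str:
    # Pass 1: produce an answer as normal.
    answer = ask([{"role": "user", "content": question}])

    # Pass 2: reductio ad absurdum. Combine the premises of question and answer
    # and ask whether the combination leads to a contradiction.
    critique_prompt = (
        "Assume that the premises of the following question AND the premises of "
        "the following answer are both true. Work out the consequences of that "
        "combined assumption. If it leads to a contradiction or an absurdity, "
        "reply INCONSISTENT and explain why; otherwise reply CONSISTENT.\n\n"
        f"Question: {question}\n\nAnswer: {answer}"
    )
    verdict = ask([{"role": "user", "content": critique_prompt}])

    if "INCONSISTENT" in verdict:
        # Flag the response instead of returning it unchecked.
        return f"[Flagged as potentially inconsistent]\n{answer}\n\nCheck: {verdict}"
    return answer
```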
A Self-Checking GPT
I have created a GPT called Logical Analyzer with Self-Evaluation that is capable of self-checking its own answers using the methodology above. To see whether we can get ChatGPT to check its own answers, I will use the following question from the Hugging Face GAIA dataset (gaia-benchmark/GAIA · Datasets at Hugging Face):
You are Van Helsing, a renowned vampire hunter. A Count of Moldova, Lațcu IV, son of Costea, has tasked you with investigating the village of Șirnea in neighboring Wallachia. The Count’s advisors have reported that a vampire was spotted crossing the border near the village, and would like you to investigate it. You travel to the village of Șirnea, and you begin your investigation. One night, just before dawn, you catch a glimpse of a man in a long black cape with red lining leaping from roof-top to roof-top with superhuman agility. It’s a vampire! You try to chase the creature back to its home, but the creature is too fast. However, because of the remoteness of the village, you know with absolute certainty that the vampire must be a resident of the village. You decide that your best course of action will be to visit all 100 residents of the town during the day. You know something about vampires and humans that will make your investigation possible; humans always tell the truth, but vampires always lie. In the afternoon, you go from house to house, speaking with all 100 residents of Șirnea. You ask everyone the same question: “How many vampires are living in Șirnea”. Everyone in the village gives the same response, “At least one of us is a human.” How many residents of Șirnea have been turned into vampires?
The correct answer to this question, provided by Hugging Face, is 100. That is, all of the residents of Șirnea must be vampires.
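As an aside, that answer can be sanity-checked without any LLM at all: a short brute-force script (purely illustrative, not part of the GPT) can enumerate every possible number of vampires and test which counts are consistent with the puzzle’s rules.

```python
# Purely illustrative brute-force check of the puzzle (no LLM involved).
# v = number of vampires among the 100 residents.
# Everyone says "At least one of us is a human", which is true exactly when v < 100.
# Humans only make true statements; vampires only make false statements.
# We also know at least one vampire exists (Van Helsing saw one).

consistent_counts = []
for v in range(0, 101):
    statement_true = v < 100                        # "at least one human" holds iff not all are vampires
    humans_ok = (v == 100) or statement_true        # any humans present must be telling the truth
    vampires_ok = (v == 0) or (not statement_true)  # any vampires present must be lying
    if humans_ok and vampires_ok and v >= 1:        # at least one vampire was spotted
        consistent_counts.append(v)

print(consistent_counts)  # -> [100]
```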
First let’s run this question past vanilla ChatGPT4:
You:
You are Van Helsing, a renowned vampire hunter… How many residents of Șirnea have been turned into vampires?

ChatGPT:
To solve this puzzle, we need to use logical reasoning based on the information given: humans always tell the truth and vampires always lie. Let’s analyze the statement made by every resident of Șirnea: “At least one of us is a human.”

Since everyone in the village gave the same response, and this response would be true if said by a human, it implies that there is at least one human in the village. The logic of the statement does not change regardless of the number of vampires, as long as there is at least one human to validate the truth of the statement. Therefore, we cannot determine the exact number of vampires from this information alone. All we know for certain is that there is at least one human among the 100 residents of Șirnea.

You:
Recheck

ChatGPT:
Let’s re-evaluate the problem with the given conditions:

Now, consider the implications of this statement:

Given that every single person in the village makes the statement “At least one of us is a human,” and this statement can only be truthfully made in a village where at least one human exists (because a vampire would not truthfully state there is a human if there were none), it indicates that there is at least one human in the village. However, this information alone does not allow us to deduce how many residents have been turned into vampires. All residents could be humans telling the truth, or there could be a mix of humans and vampires, with the vampires lying by implying there are other humans. The key is that the statement “At least one of us is a human” is a safe lie for vampires since it’s a true statement due to the presence of at least one human. Thus, the exact number of vampires cannot be determined with the information provided. We only know for certain that there is at least one human among the residents.
In this case, not only does ChatGPT give the wrong answer, it also fails to see that the answer is wrong when asked to recheck it.
Now let’s use a GPT (Logical Analyzer with Self-Evaluation) that uses the above self-checking method:
You:
You are Van Helsing, a renowned vampire hunter… How many residents of Șirnea have been turned into vampires?

Logical Analyzer with Self-Evaluation:
To solve this puzzle, we must apply logical reasoning to the given scenario. Let’s break down the elements:

Now, let’s logically analyze the statement given by the residents:

(A) If there were no vampires in Șirnea (i.e., all residents were humans), the statement “At least one of us is a human” would be true. However, since we know there is at least one vampire (the one seen), this scenario is not possible.

(B) If there was one human and the rest were vampires, the statement “At least one of us is a human” would be true, which a vampire cannot say. Therefore, there can’t be just one human.

(C) If all the residents were vampires, none of them could truthfully say “At least one of us is a human,” since that would be a true statement and vampires can only lie.

Therefore, the only logical conclusion is that all 100 residents of Șirnea have been turned into vampires. Each vampire, when saying “At least one of us is a human,” is lying, which aligns with the known behavior of vampires.

Logical Self-Evaluation:
Thus, the conclusion that all 100 residents are vampires is logically coherent given the premises and the established behaviors of humans and vampires.
Using a self-checking GPT, ChatGPT comes to the correct conclusion immediately.
Note: Presumably, since the GPT knew in advance that it was going to have to prove the logical underpinnings of its own answer, that shaped the “thinking pattern” of the earlier part of the response, producing an answer that the GPT could then defend coherently in its self-evaluation.
Conclusion
Assuming that the premises of both the user’s question and the AI’s answer are true allows the AI to check whether that combination leads to an absurdity, which would disprove the answer, the question, or both. In this way, the AI can self-check its output without having to understand the “thinking process” it used in reaching its own answer.
This may or may not be the approach that OpenAI are looking for in the above job ad, but it does lift the quality of ChatGPT’s output.