In summary
- OpenAI’s new Orion model achieves GPT-4 performance levels at 20% of its training, but with smaller incremental improvements compared to previous models.
- OpenAI faces a limitation of quality data to train models and explores the generation of synthetic data and the development of advanced reasoning models as possible solutions.
- The company divides its development into two lines: the O-Series, focused on intensive reasoning, and Orion for general language tasks, with the intention of converging both in the future.
OpenAI’s upcoming AI model is generating smaller performance gains than its predecessors, sources familiar with the matter told The Information.
Tests conducted by employees reveal that Orion reached the GPT-4 performance level after completing only 20% of its training, according to The Information.
The quality increase from GPT-4 to the current version of GPT-5 appears to be smaller than that from GPT-3 to GPT-4.
“Some company researchers believe Orion is not consistently better than its predecessor at handling certain tasks, according to (OpenAI) employees,” The Information reported. “Orion performs better on language tasks, but may not outperform previous models on tasks such as programming, according to an OpenAI employee.”
While it may seem impressive that Orion approaches GPT-4 with only 20% of its training, the early stages of AI training typically produce the most dramatic improvements, with later phases yielding smaller gains.
Therefore, the remaining 80% of training time likely will not produce the same magnitude of advancement seen in previous generational leaps, sources said.
Image: V7 Labs
The limitations come at a critical time for OpenAI following its recent $6.6 billion funding round.
The company now faces heightened investor expectations as it grapples with technical constraints that challenge traditional scaling approaches in AI development. If these early versions don’t live up to expectations, the company’s next fundraising efforts may not be met with the same enthusiasm, a problem for the for-profit entity Sam Altman reportedly wants OpenAI to become.
The disappointing results point to a fundamental challenge facing the entire AI industry: the decline in the supply of high-quality training data and the need to stay relevant in a field as competitive as generative AI.
Research published in June predicted that AI companies will exhaust publicly available human-generated text data between 2026 and 2032, marking a critical turning point for traditional development approaches.
“Our findings indicate that current LLM development trends cannot be sustained solely by conventional data scaling,” the research paper states, highlighting the need for alternative approaches to improving models, including synthetic data generation, transfer learning from data-rich domains, and the use of non-public data.
The historical strategy of training language models with publicly available text from websites, books, and other sources has reached a point of diminishing returns, with developers having “largely squeezed everything they can out of that type of data,” according to The Information.
How OpenAI is Tackling This Problem: Reasoning vs. Language Models
To address these challenges, OpenAI is fundamentally restructuring its approach to AI development.
“In response to the recent challenge to training-based scaling laws posed by slowing GPT improvements, the industry appears to be directing its efforts toward improving models after their initial training, potentially generating a different type of scaling law,” according to The Information.
To achieve this state of continuous improvement, OpenAI is separating model development into two distinct tracks:
The O-Series (which appears to be codenamed Strawberry), focused on reasoning capabilities, represents a new direction in model architecture. These models operate with significantly higher computational intensity and are explicitly designed for complex problem-solving tasks.
The computational demands are substantial, with initial estimates suggesting operating costs six times higher than current models. However, improved reasoning capabilities could justify increased spending for specific applications that require advanced analytical processing.
This model, if the same as Strawberry, is also tasked with generating enough synthetic data to constantly increase the quality of OpenAI’s LLMs.
In parallel, the Orion Models or the GPT Series (considering that OpenAI registered the GPT-5 trademark) continue to evolve, focusing on general language processing and communication tasks. These models maintain more efficient computational requirements while leveraging their broader knowledge base for writing and argumentation tasks.
OpenAI CPO Kevin Weil also confirmed this during a Q&A session and said he hopes to converge both developments at some point in the future.
“It’s not one or the other, it’s both,” he responded when asked whether OpenAI would focus on scaling LLMs with more data or take a different approach with smaller but faster models: “better base models plus Strawberry scaling/inference-time compute.”
Temporary or Permanent Solution?
OpenAI’s approach to addressing data scarcity by generating synthetic data presents complex challenges for the industry. The company’s researchers are developing sophisticated models designed to generate training data; however, this solution introduces new complications for maintaining model quality and reliability.
As Decrypt previously reported, researchers have found that training models on synthetic data is a double-edged sword. While it offers a potential solution to data scarcity, it introduces new risks of model degradation and reliability concerns, with demonstrated quality loss after several training iterations.
In other words, as models are trained with AI-generated content, they can begin to amplify subtle imperfections in their results. These feedback loops can perpetuate and magnify existing biases, creating a cumulative effect that becomes increasingly difficult to detect and correct.
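This compounding effect can be illustrated with a toy simulation (purely illustrative, not OpenAI’s actual training setup): a “model” that repeatedly fits a Gaussian to samples drawn from its own previous fit tends to drift from the original distribution and collapse over generations, the statistical analogue of the feedback loop described above.

```python
import random
import statistics

def collapse_demo(n_samples=20, iterations=200, seed=0):
    """Toy model-collapse simulation: each generation fits a Gaussian
    to samples drawn from the previous generation's fitted Gaussian.
    Returns the fitted standard deviation at each generation."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # the original "human" data distribution
    history = [sigma]
    for _ in range(iterations):
        # Each generation trains only on the previous generation's output.
        data = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        mu = statistics.fmean(data)
        sigma = statistics.stdev(data)
        history.append(sigma)
    return history
```

Run over many seeds, the fitted standard deviation tends to shrink toward zero: variation present in the original data is progressively lost, and small estimation errors accumulate rather than wash out.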
OpenAI’s Foundations team is developing new filtering mechanisms to maintain data quality, implementing different validation techniques to distinguish between high-quality and potentially problematic synthetic content. The team is also exploring hybrid training approaches that strategically combine human and AI-generated content to maximize the benefits of both sources while minimizing their respective disadvantages.
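A minimal sketch of what such a hybrid pipeline could look like; everything here is hypothetical (the `quality_fn` scorer, the threshold, and the ratio cap are assumptions for illustration, not OpenAI’s actual mechanism):

```python
def build_training_mix(human_docs, synthetic_docs, quality_fn,
                       synthetic_ratio=0.25, threshold=0.8):
    """Hypothetical sketch: filter synthetic documents by a quality
    score, then cap their share of the combined training corpus."""
    # Keep only synthetic documents that pass the quality filter.
    filtered = [d for d in synthetic_docs if quality_fn(d) >= threshold]
    # Cap synthetic docs so they form at most `synthetic_ratio` of the
    # final mix: s / (h + s) <= r  =>  s <= h * r / (1 - r).
    max_synthetic = int(len(human_docs) * synthetic_ratio / (1 - synthetic_ratio))
    return human_docs + filtered[:max_synthetic]
```

The design intuition is the one described above: synthetic data supplements, rather than replaces, human-generated text, and a validation step screens out low-quality generations before they ever enter the mix.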
Post-training optimization has also gained relevance. Researchers are developing new methods to improve model performance after the initial training phase, potentially offering a way to improve capabilities without relying solely on expanding the training data set.
That said, GPT-5 remains an early-stage model with significant development work ahead. Sam Altman, CEO of OpenAI, has indicated that it will not be ready for deployment this year or next. This extended timeline could prove advantageous, allowing researchers to address current limitations and potentially discover new methods to improve GPT-5 before its eventual release.
Edited by Josh Quittner and Sebastian Sinclair