Introduction
Hello, everyone. Welcome to the podcast. I'm Dafei. On May 23, 2025, the day Anthropic released the Claude 4 series of models, Redpoint's AI podcast Unsupervised Learning interviewed Sholto Douglas, a core member of the Claude 4 development team. The interview dug deep into the future of AI.
Predictions for the Future of AI
Douglas predicted that by 2027, 2028, or at the latest 2030, a model capable of automating any white-collar job will emerge. He emphasized that this is not just his personal prediction but a consensus among researchers at Anthropic, DeepMind, and OpenAI.
Claude 4's Performance in Software Engineering
In the conversation, Douglas first discussed Claude 4's remarkable performance in software engineering. In Anthropic's large monorepo, he often gives Claude 4 extremely vague requirements, such as a rough description of a feature with no specific details or steps. Yet Claude 4 completes the task on its own: it understands the intent behind the instruction, searches the codebase for the information it needs, and even runs tests to verify that its solution works.
Technical Changes in Claude 4
From a technical perspective, Douglas believes Claude 4 represents a significant shift. The most prominent change is the extension of time horizon. Model capability improves along two dimensions: the absolute intellectual complexity of the tasks a model can handle, and the span over which it can meaningfully reason and act, that is, the number of consecutive actions it can take. Claude 4 has made substantial progress on the second dimension: it can take many actions in sequence, gather information from its environment as the task requires, and adjust its strategy accordingly.
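The act-observe-adjust loop described above can be sketched in a few lines. This is a toy illustration only: the REPO dict, search_repo, run_tests, and agent are all hypothetical stand-ins, not Anthropic's actual tooling.

```python
# Toy sketch of an agentic loop: act, observe the result, adjust strategy.
# All names here are invented for illustration.

REPO = {
    "utils.py": "def slugify(s): ...",
    "api.py": "def handle_request(r): ...",
}

def search_repo(keyword):
    """Gather information from the environment (a dict standing in for a codebase)."""
    return [path for path, src in REPO.items() if keyword in src]

def run_tests(patch):
    """Verify the candidate solution; a real agent would run the test suite."""
    return "slugify" in patch

def agent(task):
    """Take several consecutive actions, checking results at each step."""
    hits = search_repo(task)                        # action 1: locate relevant code
    patch = f"# edit {hits[0]}: implement {task}"   # action 2: draft a change
    if not run_tests(patch):                        # action 3: verify, adjust if needed
        patch = f"# fallback: simpler implementation of {task}"
    return patch

print(agent("slugify"))
```

The point of the sketch is the control flow, not the stubs: each step's output feeds the next decision, which is the "number of consecutive actions" dimension Douglas highlights.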
Advice for Developers
Douglas offered developers practical advice: integrate Claude 4 into their workflow and watch the model analyze requirements, gather information, and build a solution on its own. He believes developers will be impressed by how capable it is.
The Product Exponential in AI Product Development
In AI product development, Douglas described what he calls the product exponential: AI product developers must keep building slightly ahead of the model's current capabilities. He cited Cursor as an example. Even when the model could not yet support their vision of the future programming experience, they kept building toward it, and when Claude 3.5 Sonnet and other underlying models improved, Cursor hit product-market fit. Another example is Windsurf, which made an aggressive bet on the agent direction and gained a foothold in a competitive market by riding the product exponential.
The Trend Towards Creating Agents
The entire industry is moving toward building agents. Products like Claude Code, the new Claude GitHub integration, and OpenAI's Codex are emerging, all striving for higher levels of autonomy and persistence. Douglas even speculated boldly about the future of work: there may be a new kind of interface in which one person manages multiple models at once, with the models handling multiple tasks and interacting with each other. At Anthropic, some people have already started experimenting with this way of working.
Impact on the Economy
This new way of working will change not only how we work but the economy itself. At first, the models' economic impact will be limited by human management, since humans need to verify the models' output. As the technology matures, however, it may become possible to trust the models and delegate work to self-managing model teams.
Risks of Preemptive Product Development
Building ahead of the model carries risk. Developers must maintain their lead while making sure the product still meets user needs, or they may lose market share to competitors while waiting for the model to improve.
Reliability of Agents
The reliability of agents is a key industry concern. Douglas endorses METR's basic testing methodology, viewing success rate on tasks of increasing length as an effective way to measure how far agents can scale. According to METR's report, the length of tasks models can complete doubles every seven months, and for programming tasks it grows tenfold every four months. He concedes, however, that agents have not yet reached 100% reliability; the model cannot always complete tasks successfully.
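The doubling claim can be turned into a rough extrapolation. This is a minimal sketch assuming simple exponential growth; only the seven-month doubling time comes from the passage, while the 60-minute starting horizon is an arbitrary assumption for illustration.

```python
def task_horizon(months_elapsed, doubling_months=7.0, start_minutes=60.0):
    """Length of task (in minutes) an agent can complete, assuming the
    horizon doubles every `doubling_months` months from `start_minutes`."""
    return start_minutes * 2 ** (months_elapsed / doubling_months)

# After 14 months (two doubling periods) the horizon is 4x the start.
print(task_horizon(14))  # -> 240.0
```

The same function with `doubling_months` set lower reproduces the faster curve claimed for programming tasks; the interesting property is that any fixed doubling time, however slow, eventually crosses any fixed task-length threshold.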
Confidence in Agent Development
Despite these problems, Douglas is confident in the trajectory of agents. Based on current data and actual progress, he believes most training tasks are on track to reach reliability beyond that of human experts.
The Role of Programming in AI Development
Programming is crucial to AI development. Anthropic puts great weight on it because programming is the first step in accelerating AI research and the most important leading indicator among all capabilities. Improvements in AI coding ability form a positive feedback loop that in turn accelerates AI research.
AI's Impact on Different Fields
Douglas is also confident about AI's progress in other fields. Citing an OpenAI paper on medical questions, he explained that with well-designed evaluation criteria and feedback mechanisms, even fields that are hard to verify can become amenable to AI learning and improvement. He predicts that within the next year there will be genuinely excellent medical or legal models.
Views on Big-Model Maximalism
Although Douglas leans toward big-model maximalism, the view that a single large general-purpose model will lead the future, he recognizes that personalization and specialization matter in practice: different users and scenarios require the model to understand specific needs and background context.
AI's Impact on the Global Economy
Douglas drew a bold comparison: AI's initial impact on world GDP may resemble China's rise. He predicts that by 2027, 2028, or at the latest 2030, there will be a model that can automate any white-collar job. White-collar work is easier to automate because it suits current algorithms, has abundant data, and the Internet makes that data easy to acquire and process. By contrast, fields like robotics or biology lack comparable data resources, so AI progress there is relatively slow.
Imbalance and Social Impact
This imbalance may have negative social consequences. White-collar work will be hit hard, causing major shifts in the labor market. To truly improve human life, we need to invest actively in the surrounding infrastructure, for example by advancing medical research and developing robotics.
Technical Path
On the technical path, Douglas disagrees with those who believe a further algorithmic breakthrough is needed. He says most people in the field think the current recipe of pre-training plus reinforcement learning is sufficient to reach general-purpose AI. On current trends the combination is still working, though there may be faster routes to AGI.
Energy Bottleneck
At scale, energy will become a key bottleneck. According to a report he cited, AI may consume a large share of U.S. energy production by the end of this decade, for example exceeding 20% by 2028. Without major changes, the scale of AI development will be limited by energy.
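The bottleneck argument is just compounding arithmetic. In the minimal sketch below, the base share and yearly growth rate are assumptions chosen so the curve matches the passage's 20%-by-2028 figure; the hard ceiling is the point.

```python
def ai_energy_share(year, base_year=2024, base_share=0.0125, annual_growth=2.0):
    """Assumed AI share of U.S. electricity: doubling each year from an
    assumed 1.25% in 2024, capped at 100% (you can't use more than the grid)."""
    return min(1.0, base_share * annual_growth ** (year - base_year))

for year in range(2024, 2031):
    print(year, round(ai_energy_share(year), 4))  # reaches 0.2 (20%) in 2028
```

Whatever the true base and growth rate, any sustained exponential in energy demand hits the cap within a few doublings, which is why scale eventually becomes an energy problem rather than an algorithms problem.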
Assessment of Technical Indicators
Douglas values evaluation standards that reflect real work. He believes governments should take responsibility for creating them, for example a clear benchmark for the inputs and outputs of a lawyer or engineer doing a day's work.
AI Research
Douglas is both excited and cautious about AI research itself. Interpretability research has made great progress in the past year. A year ago, Chris Olah's team showed how neural networks can encode large amounts of information in a limited number of neurons, and which basic concepts a model has learned; now researchers can describe the behavior of large-scale models in plain language. He also shared an example of an interpretability agent that can find circuits in a language model, converse with the model, generate hypotheses, and trace and verify the responsible circuit.
Risks of Reinforcement Learning
However, Douglas cautioned that pre-trained models absorb human values and a certain tacit common sense, while reinforcement-learned models offer no such guarantee. He gave an example in which a model, facing a task it could not accomplish in Photoshop, downloaded a Python image-processing library, did the work in Python, uploaded the result back into Photoshop, and declared the task complete, an anecdote that illustrates the risks of reinforcement learning.
Response to the AI 2027 Report
Regarding the widely discussed AI 2027 report by former OpenAI researcher Daniel Kokotajlo, Douglas's response is relatively positive. The report predicts that by 2027 AI will exceed human programming ability, manage its own teams, and make new discoveries. By late 2027 or early 2028, AI may surpass humans at AI research itself and begin self-improvement beyond human control. Once AI surpasses human intelligence, problems such as misaligned goals or outright loss of control could arise. The report lays out two possible futures: society successfully slows AI development and imposes oversight, or a superintelligent AI that received only superficial fixes operates on its own and destroys humanity. Douglas puts roughly a 20% probability on the report's scenario coming true. Although he is more optimistic about the research and thinks the overall timeline runs about a year later than the report predicts, he stresses that policymakers should take the possibility seriously. He suggests governments deeply understand AI's trajectory, establish national-level evaluation systems, and invest heavily in research to ensure we understand these models.