Is Artificial General Intelligence coming soon?
I think AGI, or something that resembles it, is going to arrive by the end of 2026.
I do not like sensationalism. In this newsletter, I try to write mostly about proven tools and facts. Major announcements in AI often fail to live up to the hype, so whenever a company or a group of researchers announces something fantastic, I prefer to wait until such tools are available in our hands before talking about them.
But recently, there have been several significant pieces of news hinting that groundbreaking advancements may not be too far off. I feel compelled to cover them.
Let’s look at a piece of stranger-than-fiction news involving Sam Altman, the CEO of OpenAI, who is known for his relatively grounded and conservative style of speech. Yet earlier this month he said he is seeking 7 trillion dollars in investment to build AI chips.
I don’t want to be Captain Obvious here, but 7 trillion dollars is a lot of money. It sounds more like a line from Dr. Evil than that of a sensible businessman. The market cap of the world’s largest computer chip maker, TSMC of Taiwan, is 600 billion dollars. Even after adding the market cap of the second-largest chipmaker, Samsung, we still fail to break 900 billion dollars. The Wall Street Journal called Altman’s plan a “moonshot” because of the sheer scale of the numbers. Even Jensen Huang, the CEO of Nvidia, thinks the number is too high. Yet Altman has been busy talking to multinational players, including the UAE, to realize his plan. Allegedly, even the US government is wary of this move because of the UAE’s ties to China. How will Altman pull off this grand scheme?
Perhaps the question “how” is not as important as “why.” Why does Altman want 7 trillion dollars? It’s a sum large enough to dwarf the entire existing semiconductor industry. He must be envisioning a future where the demand for AI chips will eclipse the contemporary demand for every single computer chip put together.
Does he know something the rest of us don’t? A silly question, I know. But let’s imagine the possibility that he KNOWS that artificial general intelligence is around the corner, and he knows that AGI can be applied to all aspects of our lives. If that assumption is true, then his haste to prepare for this impending future makes some sense.
I would like to share two pieces of news that suggest radical changes could come to the AI world sooner rather than later.
Google’s Gemini 1.5: Game Changer?
Shortly after introducing its chatbot powered by Gemini 1.0, Google announced that Gemini 1.5 is coming. Google claims that Gemini 1.5 will not only outperform Gemini 1.0 Ultra, but will also have a context window of 1 million multimodal tokens.
What does that mean? According to Google, it means Gemini 1.5 can watch hours of video, frame by frame, and then answer specific questions about any part of it. It can also analyze a large codebase, and it can find a small piece of information hidden in a vast amount of data, a task known as the “needle in a haystack” evaluation.
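To make the “needle in a haystack” idea concrete, here is a minimal sketch of how such an evaluation is typically built: a known fact (the needle) is buried at a chosen depth inside a long filler context, and the model under test is asked to retrieve it. Everything below — the function names, the filler sentences, and the needle itself — is invented for illustration; this is not Google’s actual test harness.

```python
import random

def build_haystack(filler_sentences, needle, depth, total_sentences=1000):
    """Assemble a long context with the needle buried at a relative depth (0.0 to 1.0).

    This is a hypothetical helper for illustration only.
    """
    haystack = [random.choice(filler_sentences) for _ in range(total_sentences)]
    haystack.insert(int(depth * total_sentences), needle)
    return " ".join(haystack)

def score_response(response, expected_answer):
    """Crude pass/fail: did the model's answer mention the buried fact?"""
    return expected_answer.lower() in response.lower()

filler = [
    "The sky was clear over the harbor that morning.",
    "Quarterly figures were filed without incident.",
    "The committee adjourned until the following week.",
]
needle = "The secret ingredient in the chef's sauce is smoked paprika."
context = build_haystack(filler, needle, depth=0.5)

# In a real run, `context` plus a question like "What is the secret ingredient?"
# would be sent to the model under test; here we only exercise the harness itself.
print(needle in context)                                      # True
print(score_response("It is smoked paprika.", "smoked paprika"))  # True
```

In published versions of this test, the needle’s depth and the total context length are swept across a grid, producing a heat map of retrieval accuracy; a claim of a 1-million-token context is interesting precisely because most models degrade long before that point.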
However, given Google’s history of not always living up to its own hype, I want to reserve my judgment until Gemini 1.5 is available to the general public.
OpenAI’s Sora AI Video Generator
OpenAI recently unveiled its Sora AI video generator, and it caused quite a stir around the world.
Before Sora’s reveal, I was looking at different AI video generators to prepare an article on the topic. That effort might have gone to waste. If OpenAI’s Sora is half as good as the company claims, there is no real reason to look at the competition: Sora simply outclasses other AI video generators, including Google’s Lumiere, which until recently was hyped as the best of them.
To be fair, neither Sora nor Lumiere is available to the public yet. Still, if Lumiere is as good as Google claims and Sora is half as good as OpenAI claims, Sora still blows Lumiere out of the water.
Unlike other AI video generators, Sora actually seems to understand 3D space and the way objects and perspectives behave in it. Lumiere’s showcase still featured walking motions out of sync with their backgrounds; Sora fixes that problem. It is the first AI generator to produce plausible, consistent videos across complex object and camera movements. It seems to aim at generating not only statistically plausible subsequent frames, but physically plausible sequences of frames. Moreover, OpenAI claims Sora can generate 60 seconds of video at once, compared with other models that can only handle a handful of seconds at a time.
Again, I would like to reserve final judgment until Sora is actually available. But this was the moment that suggested to me that AI might eventually achieve anything we can imagine. The limitations of existing AI video generators had suggested that perhaps there is a limit to what AI can do in the near future. OpenAI’s Sora seems to break that barrier.
While we still need to manage our expectations, the recent news raises the possibility that major breakthroughs could be on the horizon, sooner rather than later.