Sora's Departure, Yet Not Entirely: Assessment of Abilities
The world is getting a sneak preview of one of the most transformative technologies of our era, one that may prove more disruptive than the other multimedia models we've encountered so far.
Sora, with its ability to generate full videos from prompts, is poised to render redundant various aspects of filmmaking, such as casting, staging, and expensive film operations. And amazingly, it's supposed to be available, at least to ChatGPT Plus and Pro users, right now!
A Seismic Shift
Consider the intricacies involved in manufacturing, using, and developing physical film for photos. Recall the impact the digital camera had on that industry.
Sora, with its potential for disruption on such a scale, is something we should certainly keep an eye on as it spreads to a larger user base.
Marques Brownlee, a popular YouTuber, has been granted early access to explore Sora.
In a YouTube review, Brownlee highlights the program's remarkable ability to generate realistic-looking footage, while acknowledging that it has room for improvement.
A Game of Guess Who?
In the style of a guessing game, Brownlee presents a series of short videos and asks viewers to decide whether each one is real or AI-generated.
The challenge is that the two can be quite difficult to tell apart: there are few obvious signs that distinguish AI-generated video from real footage.
Few Clues
While there are a few tells when it comes to AI video, they're primarily based on our understanding of the world around us.
Here are three that stood out to me:
Inaccuracy – one way to tell that a video is AI-generated is a factually incorrect element in the scene, particularly when it shows a location you're familiar with.
For instance, if you know that there's no dilapidated shack on top of a certain hill, you'll notice the disparity when the AI shows you an aerial view.
Lacking 'ugliness' – it appears that Sora shares the tendency of still-image and other generative AI tools to produce polished, attractive results. In other words, one of the few tells is that the program tends not to create mediocre-looking footage with real composition or lighting problems, or subjects that aren't camera-ready.
However, these shortcomings could presumably be overcome with more prompting.
Questionable credibility – another way to spot AI video is to notice fantastical elements creeping in from the margins, such as tentacle monsters. But once more, this has more to do with our knowledge of the world than with any visual assessment. If a tentacle god rises from a lake in a Sora video, it will look real.
After writing this down, I went on to watch the rest of Brownlee's video, and he touches upon some additional lapses in Sora's hyper-realism.
First, there's the absence of object permanence, where items or characters can appear and disappear spontaneously. There's also a phenomenon known as the 'ghost image', where an object lacks substance. For example, in Brownlee's video, we see cars passing through another car in a supposedly real-life street scene.
In summary, Brownlee notes, Sora struggles with physics. It doesn't always know how objects behave in motion, or what direction they were moving in, if it's working off a still image.
There are also some issues with speed, where a video may slow down or speed up for no apparent reason.
Despite these drawbacks, some of the videos will appear so realistic that it will be difficult to distinguish them from real footage.
Current Access to Sora
According to OpenAI's website, Sora access appears to have been temporarily halted due to high traffic.
Brownlee discusses this eventuality in his video:
"I kind of wonder how long it will take when it's open for everyone to use," he says, noting that a 1080p video of around 10 seconds takes a couple of minutes for him to generate.
Potential Applications for Sora
As Brownlee points out, Sora may prove particularly useful for individuals interested in creating cartoon or claymation features.
This is because realistic physics is difficult for the model to get right, while cartoons and stop-motion footage are more forgiving. They're more abstract, which is likely to make them one of the first areas in which Sora becomes useful.
However, Brownlee also cites fake CCTV footage as a possible use for the platform.
He showcases a 'Santa versus Frosty' Mortal Kombat game video created entirely by the AI, as well as a job interview scene in which Sora accurately depicts various details without any additional prompts.
However, he suggests that there are some uncertainties related to this technology, and we’re navigating uncharted waters.
As we all begin to use Sora, we'll need to consider its potential applications and the impact it will have on our lives.
People are curious about what's going to happen to major industries that are often centered around places like Burbank and Hollywood.
But the effects of the technology will likely extend far beyond that. Stay tuned as I continue to document what's emerging in this exciting time for generative AI.
The potential impact of Sora on the finance and venture capital (VC) sectors could be substantial, as big money may start flowing towards digital transformation companies that leverage AI technology for video generation. Furthermore, VC firms might seek partnerships or investments in companies that can improve Sora's capabilities or develop similar technology, recognizing the considerable market opportunities that lie ahead.