How AI turns your content into 3D
Did you know that Leia employs some of the most advanced neural networks in use today? They lie at the heart of how we create 3D imagery. Where our screen technology is the window into 3D Lightfield, Leia’s computer vision translates everything into 3D: all the 2D content in the world gets a huge upgrade thanks to Leia’s AI.
So, while we talk a lot about our 3D screens and how they are manufactured, Andrii Tsarov, SWE Director of Computer Vision and Machine Learning at Leia, offers some context on where we are now – and where we are going.
“I’m excited about what computer vision can bring to humanity,” says Andrii. “AI is another step to increase productivity while also reducing the burden on people. What’s more, AI democratizes some spheres – like computer vision. We’re able to analyze things that were simply not possible before. You can do it with visual information, in medical technology, and in some cases it democratizes art.”
Think of it this way: people who can envision something but can’t draw it can be helped by computer vision. If you have trouble editing and creating a movie, AI can speed that process up and simplify how things get made. It’s already happening on some levels, but as we go further, it will become easier to accomplish – on more devices, according to Andrii.
What brought Andrii to Leia, though, were our advances in computational photography. “It’s very exciting stuff,” he says.
“There’s a chicken-and-egg problem. If nobody makes 3D content, nobody will buy a 3D screen. If nobody buys 3D screens, nobody makes content. So we’re bridging this gap to make all content relevant.”
The most obvious examples are LeiaPix Converter, which takes 2D images and adds depth, and LeiaTube, which converts streaming social video into 3D on the fly. “The AI is essential to lift the levels of existing content,” Andrii says.
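Leia hasn’t published the converters’ internals, but the general recipe behind depth-based 2D-to-3D is well established: estimate a per-pixel depth map, turn it into horizontal disparity, and shift pixels to synthesize a second viewpoint. Here is a toy sketch of that last step – the function name and `max_disparity` parameter are illustrative, not Leia’s API:

```python
import numpy as np

def synthesize_second_view(image: np.ndarray, depth: np.ndarray,
                           max_disparity: int = 16) -> np.ndarray:
    """Toy depth-image-based rendering: shift each pixel horizontally by
    a depth-derived disparity to approximate a second eye's viewpoint.

    image: H x W x 3 uint8 frame.
    depth: H x W floats in [0, 1], where 1.0 is nearest to the camera.
    """
    h, w, _ = image.shape
    view = np.zeros_like(image)
    disparity = (depth * max_disparity).astype(np.int32)  # near moves more
    xs = np.arange(w)
    for y in range(h):
        new_x = np.clip(xs - disparity[y], 0, w - 1)
        view[y, new_x] = image[y]
    # A real converter would also inpaint the disocclusion holes left
    # black here, and render as many views as the lightfield display needs.
    return view
```

The depth-dependent shift is what creates parallax between the two views; everything else in a production pipeline (hole filling, temporal consistency across video frames, multi-view rendering) is refinement on top of this idea.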
HOW DOES LEIA’S AI WORK?
AI really excels at finding patterns. In fact, when it comes to face detection, neural networks now perform better than the average person at recognizing faces, Andrii points out.
“We train neural nets to analyze 3D scenes,” Andrii explains. “We take actual 3D scenes, the neural net looks at them and learns how to reconstruct each 3D scene from a single picture. Then there’s the human element: we review the results, and every time the subject is partially misidentified in a picture, we correct it.”
So, for example, we take a picture of someone wearing a hat. The hat – or part of it – gets misidentified as part of the background rather than the subject. We would then need to go back in and correct the depth map to show the neural net what it got wrong.
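Conceptually, that correction is just an edit to the ground-truth depth map before the example goes back into training. A minimal sketch, where the mask format and function are assumptions for illustration rather than Leia’s tooling:

```python
import numpy as np

def correct_depth(depth: np.ndarray, mask: np.ndarray,
                  subject_depth: float) -> np.ndarray:
    """Pull mislabeled pixels (say, a hat brim that fell into the
    background) back onto the subject's depth plane.

    depth: H x W depth map the network produced for the training image.
    mask:  H x W bool array a human annotator painted over the mistake.
    subject_depth: a depth value sampled from the correctly labeled subject.
    """
    corrected = depth.copy()
    corrected[mask] = subject_depth
    return corrected  # the fixed map becomes ground truth for retraining
```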
To teach the neural net, you collect several million pictures with 3D information available, then feed them to the distributed cloud to learn from. Processing that information can take anywhere from days to weeks per iteration.
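Leia hasn’t described its training code, but at this scale the loop itself is standard supervised learning; one widely used ingredient is a loss that forgives the unknown global scale of monocular depth predictions. Below is a hedged PyTorch sketch using the scale-invariant log-depth loss from Eigen et al. (2014) – the model and data loader are placeholders, not Leia’s network:

```python
import torch

def scale_invariant_loss(pred, target, eps=1e-6):
    """Scale-invariant log-depth loss (Eigen et al., 2014): penalizes
    relative depth errors while forgiving a global scale offset."""
    d = torch.log(pred + eps) - torch.log(target + eps)
    return (d ** 2).mean() - 0.5 * d.mean() ** 2

def train_epoch(model, loader, optimizer, device="cuda"):
    """One pass over (image, depth) pairs; with millions of images, this
    is the work that gets sharded across the cloud for days or weeks."""
    model.train()
    for images, depths in loader:
        images, depths = images.to(device), depths.to(device)
        optimizer.zero_grad()
        pred = model(images)            # per-pixel depth prediction
        loss = scale_invariant_loss(pred, depths)
        loss.backward()
        optimizer.step()
```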
“Our network has vastly improved from one iteration to the next, to the point where we feel confident in saying we are now surpassing the current state of the art in monocular depth estimation,” Andrii says. He provided some examples elsewhere in this story of how Leia’s results compare to the best-known work in the field (DPT-Large).
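DPT-Large, the reference method in that comparison, is publicly available through Intel’s MiDaS project, so the baseline side is easy to try yourself. A sketch following the MiDaS README (this runs only the public reference model, not Leia’s own network):

```python
import cv2
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the DPT-Large monocular depth model and its matching preprocessing.
midas = torch.hub.load("intel-isl/MiDaS", "DPT_Large").to(device).eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").dpt_transform

img = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)

with torch.no_grad():
    prediction = midas(transform(img).to(device))
    # Upsample the prediction back to the input resolution.
    depth = torch.nn.functional.interpolate(
        prediction.unsqueeze(1), size=img.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze().cpu().numpy()  # relative inverse depth, one value per pixel
```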
Our AI has already made significant progress converting 2D imagery to 3D. The obvious, “magic” goal for Andrii is that you will not be able to distinguish converted content from content originally created in 3D.
“With neural networks that we have right now,” he says, “some videos look almost flawless. Some are close. The goal is to make sure that everything we convert looks pretty much professionally converted. That you can’t tell the original source.”
As one might guess, it’s still very much a learning process. For example, Andrii says that while the AI does incredibly well with original content, cartoons, movies, and the like, content with UI overlays and multiple windows – Twitch streams, for instance – requires more work for the neural network to process. That’s one of the things being worked on right now.
HOW IT ALL COMES TOGETHER – HARDWARE + SOFTWARE
There are many specialized competitors in this field, but no direct ones, according to Andrii. “Snapchat, for example, has great video filters,” he says. “Facebook does those Live Photos, so that it scrolls and zooms a little with depth. And there are companies that do mono-to-stereo video conversions… but none of them do it online. It’s hard to do online conversion like we do, because they don’t control the device.”
No other company right now does this level of 3D conversion while also manufacturing the screens that can leverage the effect. That gives us a distinct advantage: we control every end of the process and know how to get the most out of what we call the 3D Lightfield effect.
So, while we have all this amazing technology to transform 2D content, we are also developing the hardware to leverage it – most notably the 15.6 Monitor platform we recently unveiled.
For all the advances made so far – and there’s still so much further we can go – Andrii says that to truly appreciate what we’re accomplishing, all you need to do is use LeiaTube and watch it work. It analyzes 2D video and creates 3D video as you watch, and one day you won’t even need to think about how it was made – just that it looks breathtakingly real.
Are you using AI and computer vision techniques with Leia products? We’d love to know!
Share and tag us at @LeiaInc in your posts. We love to amplify our community!