An Israeli team, working with partners in the EU, is developing a technology to combine videos of an event taken by different people from different points of view into a single, 3D view of the entire event.
“It’s the ultimate in crowdsourcing,” said Dr. Chen Sagiv, chief coordinator of SceneNet, the consortium that is developing the technology. “We take crowdsourced videos filmed by different devices, including phones and tablets, put them all together, enhance their resolution and add 3D effects. The system creates a single, high-resolution video that lets you see the action from any angle, in 3D — just as if you were there yourself.”
SceneNet could let viewers watch a rock concert from any angle or perspective they want, focusing on the drummer or guitarist, or watching the crowd. The effect would be like attending the event itself, moving through the crowd or going up front to get a better look at the action on stage.
In fact, it was at a rock concert that Sagiv and her husband Nitzan got the idea for SceneNet. “We were at a Depeche Mode concert in Tel Aviv about five years ago, enjoying the show, and Nitzan noticed how everyone was taking video of it with their mobile phones,” said Sagiv. “The video capture capabilities of phones back then guaranteed that the resulting footage would be very low quality. There was also lots of background noise and bad lighting, but the actual scenes were there. We realized that if we had the footage we could enhance it, and if we could stitch it together we could have a single presentation that would look just like the real thing.”
The system “stitches together the videos at their edges, matching the scenes uploaded by the crowdsourced devices. It’s a complicated process, because you have to match the colors and compensate for the different lighting, the capabilities of devices and other factors that cause one video of even the same scene to look very different,” Sagiv explained.
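The color and lighting compensation Sagiv describes can be illustrated with a simple statistics-matching step. The sketch below is not SceneNet's method, which the article does not detail; it assumes frames are plain lists of (r, g, b) pixel tuples, and the function names `match_channel` and `match_colors` are hypothetical. It shifts and scales each color channel of one clip so that its mean and spread line up with those of a reference clip.

```python
from statistics import mean, pstdev


def match_channel(source, reference):
    """Shift and scale `source` values so their mean and spread match `reference`."""
    src_mean, ref_mean = mean(source), mean(reference)
    src_std, ref_std = pstdev(source), pstdev(reference)
    # A flat channel has zero spread; fall back to a pure brightness shift.
    gain = ref_std / src_std if src_std > 0 else 1.0
    # Clamp to the valid 0-255 range after correction.
    return [min(255.0, max(0.0, (v - src_mean) * gain + ref_mean)) for v in source]


def match_colors(source_pixels, reference_pixels):
    """Correct R, G and B independently for a list of (r, g, b) tuples."""
    src_channels = list(zip(*source_pixels))
    ref_channels = list(zip(*reference_pixels))
    corrected = [match_channel(s, r) for s, r in zip(src_channels, ref_channels)]
    # Regroup the three corrected channels back into per-pixel tuples.
    return list(zip(*corrected))


# Illustrative data only: a dark phone clip and a brighter reference clip.
dark_clip = [(10, 20, 30), (30, 40, 50)]
bright_clip = [(100, 110, 120), (140, 150, 160)]
print(match_colors(dark_clip, bright_clip))
```

A real system would also have to handle differing white balance, exposure changes over time and sensor noise, which this per-channel adjustment ignores.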
The techniques used in the system, such as stitching scene edges, averaging and correcting colors, and ensuring that the vocals match the action, have been around for some time. SceneNet needs to leverage these technologies to sift through the thousands of videos that will be uploaded to the cloud, searching each one for the elements it shares with the others and determining what must be done to a clip to make it look like a natural part of the final presentation, said Sagiv.
SceneNet is a consortium consisting of SagivTech, Sagiv’s own Ra’anana-based company, and several European partners. The European Commission agreed to fund the program through 2016. Sagiv’s team is responsible for the video stitching and coordination part of the project. The color and audio coordination aspects of the project are led by Prof. Peter Maass from the University of Bremen in Germany and Prof. Pierre Vandergheynst from EPFL in Switzerland.
The big challenge in video stitching, said Sagiv, was developing a way to automatically identify which videos match and put them in the right place, just as in a puzzle. Fortunately, Sagiv’s company has the capabilities for the heavy-duty video processing needed to sort through all that content. “SagivTech, which has been in business since 2009, is an innovator in graphics processing unit computing and computer vision,” both of which have helped push the SceneNet program ahead, she said.
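The puzzle analogy can be made concrete with a toy example. The sketch below is only an illustration under simplifying assumptions, not the consortium's algorithm: frames are small grayscale grids (lists of rows), clips are assumed to sit side by side, and the helper names `edge_mismatch` and `best_right_neighbor` are hypothetical. It scores how well the left edge of each candidate frame continues the right edge of a given frame and picks the closest fit.

```python
def edge_mismatch(left_frame, right_frame):
    """Mean absolute difference between left_frame's right edge and right_frame's left edge."""
    right_edge = [row[-1] for row in left_frame]
    left_edge = [row[0] for row in right_frame]
    return sum(abs(a - b) for a, b in zip(right_edge, left_edge)) / len(right_edge)


def best_right_neighbor(frame, candidates):
    """Return the name of the candidate whose left edge best continues `frame`."""
    return min(candidates, key=lambda name: edge_mismatch(frame, candidates[name]))


# Illustrative data only: two-row grayscale frames from three imaginary phones.
stage_left = [[0, 50], [0, 60]]
candidates = {
    "phone_a": [[200, 0], [210, 0]],  # left edge is much brighter: poor fit
    "phone_b": [[52, 0], [61, 0]],    # left edge nearly continues stage_left
}
print(best_right_neighbor(stage_left, candidates))
```

Real footage is shot from different positions and angles, so a practical matcher would compare image features that survive changes in viewpoint rather than raw edge pixels.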
Sagiv’s team of 12 has developed a mobile infrastructure for the video feeds and a mechanism for tagging the videos and transmitting them to a cloud server. The team developed a 3D enhancement mechanism for the videos and a method to share content via online communities.
Development is going well, said Sagiv. The final product may not be ready for another five years, but parts of it are ready now. “This is a massive project, because video, especially in the cloud, requires a lot of processing power. But there is no doubt that this technology is going to become very important both in the business and the consumer market,” she said. Even before the entire system is ready, said Sagiv, her company may take portions of the technology and add them to devices and software.
According to Nitzan, who is also Chen’s partner in SagivTech, “the basic use for the technology would be a mass event, like a concert. But it could also bring about a change in the culture. Today, everyone uses their smartphone to film all kinds of events, including news events. This is the first real crowdsourced video platform, and by crowdsourcing videos taken by witnesses, you could have a whole new way to produce and watch news and consume content.”