Google debuted its $250 'Clips' lifelogging camera to the public during the October 2017 launch event for the Google Pixel 2. Now, with the camera set to become officially available in only a few weeks' time, the company has published a blog post explaining how the Clips camera's underlying algorithms were trained to identify and keep the best shots and discard the rest.
It turns out Google relied on the expertise of a documentary filmmaker, a photojournalist, and a fine arts photographer to train the AI and feed high-quality photography into its machine learning model. The group collected and analyzed footage recorded by members of the development team to try to answer the question: "What makes a memorable moment?"
"We needed to train models on what bad looked like," said Josh Lovejoy, Senior Interaction Designer at Google. "By ruling out the stuff the camera wouldn't need to waste energy processing (because no one would find value in it), the overall baseline quality of captured clips rose significantly. "
The learning process covers basic elements of photography, such as an understanding of focus and depth of field or the rule of thirds, but also things that are obvious to most humans yet less so to an algorithm: don't cover the lens with a finger, and avoid abrupt movements while recording.
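As a rough illustration of how such photographic basics can be turned into per-frame signals, here is a minimal sketch. It is not Google's implementation; the function names and thresholds are hypothetical, and it uses the common Laplacian-variance measure of sharpness and a simple distance-to-thirds-intersection check for framing.

```python
import cv2
import numpy as np

# Illustrative heuristics only; thresholds and names are hypothetical,
# not values taken from Google's Clips model.

def lens_probably_covered(frame_bgr, brightness_thresh=20.0):
    """Flag frames that are nearly black, e.g. a finger over the lens."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return gray.mean() < brightness_thresh

def sharpness_score(frame_bgr):
    """Variance of the Laplacian: higher means more in-focus detail."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def rule_of_thirds_score(subject_xy, frame_shape):
    """Score 0..1 for how close a subject point is to a thirds intersection."""
    h, w = frame_shape[:2]
    x, y = subject_xy
    intersections = [(w / 3, h / 3), (2 * w / 3, h / 3),
                     (w / 3, 2 * h / 3), (2 * w / 3, 2 * h / 3)]
    nearest = min(np.hypot(x - ix, y - iy) for ix, iy in intersections)
    return 1.0 - min(nearest / np.hypot(w, h), 1.0)
```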
Google admits there is still some way to go. It says the AI has been trained to look at "stability, sharpness, and framing," but without careful calibration a face at the edge of the frame is valued just as much as one at the center, even when the real point of interest is elsewhere in the image.
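To make the calibration point concrete, the hypothetical sketch below combines stability, sharpness, and framing into a single clip score. The weights, function names, and `calibrated` flag are inventions for illustration: with `calibrated=False`, every detected face counts equally regardless of position, which reproduces the edge-of-frame problem described above.

```python
import numpy as np

def framing_weight(face_center, frame_size, calibrated=True):
    """Weight a detected face by how central it sits in the frame.

    With calibrated=False, a face at the edge scores the same as one
    at the center.
    """
    if not calibrated:
        return 1.0
    x, y = face_center
    w, h = frame_size
    # Normalized distance from the frame center: 0 at center, ~1 at a corner.
    dist = np.hypot(x - w / 2, y - h / 2) / np.hypot(w / 2, h / 2)
    return max(0.0, 1.0 - dist)

def clip_score(stability, sharpness, faces, frame_size, calibrated=True):
    """Blend stability, sharpness, and framing (weights are made up)."""
    framing = max((framing_weight(f, frame_size, calibrated) for f in faces),
                  default=0.0)
    return 0.3 * stability + 0.3 * sharpness + 0.4 * framing
```

Under this toy scoring, a clip whose only face hugs the frame edge would rank just as high as a well-centered one unless the framing term is actually position-aware.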
"Success with Clips isn't just about keeps, deletes, clicks, and edits (though those are important)," Lovejoy says. "It's about authorship, co-learning, and adaptation over time. We really hope users go out and play with it. " More detail on the development and training process is available on the Google blog.
2018-01-26 18:54