If you can’t capture what you want to search for with just a picture, Google Lens will now let you take a video — and even use your voice to ask about what you’re seeing. The feature will surface an AI Overview and search results based on the video’s contents and your question. It’s rolling out in Search Labs on Android and iOS today.
Google first previewed using video to search at I/O in May. As an example, Google says someone curious about the fish they’re seeing at an aquarium can hold up their phone to the exhibit, open the Google Lens app, and then hold down the shutter button. Once Lens starts recording, they can say their question: “Why are they swimming together?” Google Lens then uses the Gemini AI model to provide a response, similar to what you see in the GIF below.
Speaking about the tech behind the feature, Rajan Patel, Google's vice president of engineering, told The Verge that Google is capturing the video “as a series of image frames and then applying the same computer vision techniques” previously used in Lens. But Google is taking things a step further by passing the information to a “custom” Gemini model developed to “understand multiple frames in sequence... and then provide a response that is rooted in the web.”
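The pipeline Patel describes starts by reducing the clip to a series of still frames before any model sees it. Here is a minimal sketch of that sampling step; the sampling rate, function name, and parameters are illustrative assumptions, not Google's implementation:

```python
# Sketch of reducing a video clip to "a series of image frames," per Patel's
# description. The 2-frames-per-second rate is an assumed value for illustration.

def sample_frame_indices(total_frames: int, fps: float,
                         samples_per_second: float = 2.0) -> list[int]:
    """Pick evenly spaced frame indices from a clip to send to a vision model."""
    if total_frames <= 0 or fps <= 0:
        return []
    # Take roughly `samples_per_second` frames out of every second of video.
    step = max(1, round(fps / samples_per_second))
    return list(range(0, total_frames, step))

# A 3-second clip at 30 fps yields six sampled frames: indices 0, 15, ..., 75.
# Each selected frame would then be encoded and passed, in order, to a
# multimodal model alongside the transcribed voice question.
indices = sample_frame_indices(total_frames=90, fps=30)
```

Treating video as an ordered frame sequence is what lets the existing single-image Lens techniques carry over, with the custom Gemini model handling the temporal reasoning across frames.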
There isn’t support for identifying the sounds in a video just yet — like if you’re trying to identify a bird you’re hearing — but Patel says that’s something Google has been “experimenting with.”
Google Lens is also updating its photo search feature with the ability to ask a question using your voice. To try it, aim your camera at your subject, hold down the shutter button, and then ask your question. Before this change, you could only type your question into Lens after snapping a picture. Voice questions are rolling out globally on Android and iOS, but they’re only available in English for now.