Doing things well takes attention to several things, and one of the key ones is speed: the speed of shipping new features, of iterating, and of managing release cycles. Both speed and quality are important. When you talk about latency, it's easy to say you want to reduce it, but at the same time you’re constantly adding new capabilities. The capability frontier keeps advancing, so you have to find the right balance between speed and expanding functionality. That’s where things get more complicated.
For example, in search, teams now have specific latency budgets, even down to the millisecond. If you ship something that shaves off three milliseconds, you might get credit for 1.5 milliseconds in your latency budget, while the other 1.5 milliseconds goes directly to the user. Depending on the situation, some teams may have a latency budget of 30 milliseconds, others just 10 milliseconds. Rigorous reviews ensure these budgets are respected, which shows how much latency really matters. For context, humans usually notice latency changes in the low hundreds of milliseconds.
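To make the arithmetic concrete, here is a minimal sketch of how such a budget could be tracked. The `LatencyBudget` class, its method names, and the fixed 50/50 split are assumptions for illustration only; the conversation describes the idea, not an actual system or API.

```python
from dataclasses import dataclass


@dataclass
class LatencyBudget:
    """Hypothetical per-team latency budget, in milliseconds."""

    team: str
    remaining_ms: float

    def credit_saving(self, saved_ms: float, team_share: float = 0.5) -> float:
        """Credit part of a latency saving to the team's budget.

        Returns the part of the saving that goes to the user as a faster
        experience. The 0.5 default mirrors the "3 ms saved, 1.5 ms credited"
        example; the real split is an assumption here.
        """
        if saved_ms < 0:
            raise ValueError("saved_ms must be non-negative")
        if not 0.0 <= team_share <= 1.0:
            raise ValueError("team_share must be between 0 and 1")
        credited = saved_ms * team_share
        self.remaining_ms += credited
        return saved_ms - credited

    def spend(self, cost_ms: float) -> bool:
        """Spend budget on a new feature; refuse if it would overdraw."""
        if cost_ms > self.remaining_ms:
            return False
        self.remaining_ms -= cost_ms
        return True


# Illustrative usage with made-up numbers.
budget = LatencyBudget(team="example-team", remaining_ms=10.0)
user_gain_ms = budget.credit_saving(3.0)  # team keeps 1.5 ms, user gets 1.5 ms
approved = budget.spend(12.0)             # costs more than the budget allows
```

In this sketch, a team that starts with 10 milliseconds and saves three ends up with 11.5 milliseconds to spend, and a feature costing 12 milliseconds would still be rejected.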
Our dashboards and metrics show that we’ve actually improved search latency by 30% over the last five years, even as the search product has gained a lot more functionality. That’s why, in projects like Gemini, we think deeply about the 'Pareto frontier': making sure the balance between capability and speed is maintained. For example, Flash models are about 90% as capable as Pro models but are much faster and more efficient to serve. Vertical integration also helps improve this balance.
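For readers unfamiliar with the term, a Pareto frontier is the set of options that no other option beats on every axis at once. The sketch below shows the idea for capability versus latency. The model names, scores, and latencies are hypothetical placeholders rather than measurements; only the rough 90% capability ratio echoes the figure mentioned above.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    capability: float  # higher is better (hypothetical score)
    latency_ms: float  # lower is better (hypothetical value)


def dominates(a: Model, b: Model) -> bool:
    """True if `a` is at least as good as `b` on both axes and strictly better on one."""
    no_worse = a.capability >= b.capability and a.latency_ms <= b.latency_ms
    strictly_better = a.capability > b.capability or a.latency_ms < b.latency_ms
    return no_worse and strictly_better


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Return the models that no other model dominates."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


# Made-up candidates, for illustration only.
candidates = [
    Model("pro-like", capability=100.0, latency_ms=900.0),
    Model("flash-like", capability=90.0, latency_ms=300.0),
    Model("slow-and-weak", capability=85.0, latency_ms=700.0),
]
frontier = pareto_frontier(candidates)
```

With these placeholder numbers, both the "pro-like" and "flash-like" entries sit on the frontier, because each wins on one axis, while "slow-and-weak" drops out because "flash-like" is both more capable and faster.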
Looking to the future, many people are talking about chat as the new interface for search, especially with technologies like Gemini being incorporated into search or providing AI results. There’s also increasing conversation about 'agentic flows,' where everyone could have a personal agent. Instead of just typing queries, these agents could do things for you—like planning a trip instead of just searching for information about it. The question is whether search will become a distribution mechanism, a new product, or just one of several ways people interact with information.
With every shift in search technology, we’re able to do more. We have to absorb these new capabilities and keep evolving. Especially in mobile situations—like when you quickly get out of a New York subway and need to find a webpage—you expect search to be fast and helpful. People's expectations are constantly shifting, so search needs to keep up.