Today’s The Fast and the Curious post explores how Chrome achieved the highest score on the new Speedometer 3.0, an upgraded browser benchmark for measuring the performance of web applications. Try out Chrome today!

Speedometer 3.0 is a recently published benchmark for measuring browser performance that was created as an industry collaboration between companies including Google, Apple, Mozilla, Intel, and Microsoft. This benchmark helped us identify areas in which we could optimize Chrome to deliver a faster browser experience to all our users.

Here’s a closer look at how we further optimized Chrome to achieve its highest score ever on Speedometer 3, by carefully tracking Chrome’s performance over time as the updated benchmark was being developed. Since the inception of Speedometer 3 in May 2022, we've driven a 72% increase in Chrome’s Speedometer score, translating into performance gains for our users:



Optimizing workloads

By looking at the workloads in Speedometer and the functions in which Chrome was spending the most time, we were able to make targeted optimizations that each drove an increase in Chrome’s score. For example, the SpaceSplitString function is used heavily to turn space-separated strings, such as those in “class='foo bar'”, into a list representation; we removed some unnecessary bounds checks from it. When we detect duplicated stylesheets, we now dedupe them and reference a single stylesheet instance. We reduced the cost of drawing paths and arcs by tuning memory allocations. We removed unnecessary processing that occurred when form elements and their editors are created. And within querySelector, we detect which selectors are commonly used and take a fast path for them.
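To give a feel for the last of those, here is the general idea behind such a selector fast path, sketched in JavaScript (Chrome's real implementation is C++ inside Blink, and fastQuerySelector is a hypothetical name for this sketch):

    // Illustrative sketch: route common selector shapes to cheaper lookups
    // instead of the general CSS selector-matching machinery.
    function fastQuerySelector(root, selector) {
      // Single id selector, e.g. "#nav": answered from the document's id map.
      if (root === document && /^#[\w-]+$/.test(selector)) {
        return document.getElementById(selector.slice(1));
      }
      // Single class selector, e.g. ".item": answered from the class index.
      if (/^\.[\w-]+$/.test(selector)) {
        return root.getElementsByClassName(selector.slice(1))[0] ?? null;
      }
      // Everything else takes the general selector-matching path.
      return root.querySelector(selector);
    }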

We previously shared how we optimized innerHTML using specialized fast paths for parsing, an implementation that also made its way into WebKit. Some workloads in Speedometer 3 use DOMParser so we extended the same optimization for another 1% gain.
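For reference, DOMParser is the standard API for parsing an HTML string into a separate document, and the Speedometer workloads exercise it roughly like this minimal example:

    // Minimal DOMParser usage of the kind these workloads exercise.
    const parser = new DOMParser();
    const doc = parser.parseFromString("<ul><li>a</li><li>b</li></ul>", "text/html");
    console.log(doc.querySelectorAll("li").length); // logs 2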

We also worked with the HarfBuzz maintainer to optimize how Chrome renders AAT fonts, such as those used for Apple macOS system fonts. Text starts as a stream of Unicode characters, is transformed into a glyph stream, and is then run through a state machine defined in the AAT font. The optimization allows us to determine more quickly whether glyphs actually participate in the rules of the state machine, leading to speed-ups when processing text using AAT.
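Very roughly, the gist of the idea looks like this in JavaScript (hypothetical names and shapes; the real logic lives in HarfBuzz's C++): precompute which glyph classes the state machine's rules can react to, then skip the whole pass for runs that contain none of them.

    // Once per font: collect the glyph classes that any transition reacts to.
    function buildRelevantClassSet(stateMachine) {
      const relevant = new Set();
      for (const transition of stateMachine.transitions) {
        relevant.add(transition.glyphClass);
      }
      return relevant;
    }

    // Per text run: if no glyph participates in any rule, the state machine
    // pass can be skipped entirely.
    function machineApplies(glyphs, relevant) {
      return glyphs.some((glyph) => relevant.has(glyph.glyphClass));
    }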

Picking the right code to focus on

An important strategy for achieving high performance is tiering up code: picking the right code to further optimize within the engine. Intel contributed profile-guided tiering to V8, which remembers tiering decisions from past runs, so that if a function was stably tiered up in the past, we eagerly tier it up on future runs.
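A schematic sketch of that idea in JavaScript (hypothetical names; the real implementation is inside V8):

    // Remember which functions were tiered up in past runs.
    const tieringProfile = new Map(); // function id -> tiered up before?

    function optimize(fn) {
      // Hand the function to the optimizing compiler (stub for the sketch).
    }

    function onFunctionBecomesHot(fn) {
      tieringProfile.set(fn.id, true); // record the decision for future runs
      optimize(fn);
    }

    function onFunctionFirstCompiled(fn) {
      // If the profile says this function was stably tiered up before,
      // optimize it eagerly instead of waiting for it to warm up again.
      if (tieringProfile.get(fn.id)) optimize(fn);
    }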

Improving garbage collection

Another area of changes that drove around a 3% improvement on Speedometer 3 was garbage collection. V8’s garbage collector has a long history of making use of renderer idle time to avoid interfering with actual application code. The recent changes follow this spirit by extending existing mechanisms to prefer garbage collection in idle time, where possible, even on otherwise very active renderers. Specifically, DOM finalization code that runs when objects are reclaimed is now also run in idle time; previously, such operations would compete with regular application code for CPU resources. In addition, V8 now supports a much more compact layout for objects that wrap DOM elements, i.e., all objects that are exposed to JavaScript frameworks. The compact layout reduces memory pressure and results in less time spent on garbage collection.
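The web platform exposes the same idle-time pattern to pages through requestIdleCallback; as an analogy only, this sketch shows the shape of the scheduling that V8 now applies to DOM finalization work internally:

    // Analogy using the web's requestIdleCallback: run cleanup only while
    // the browser reports spare time, so it never competes with app code.
    const pendingCleanup = [];
    let idleScheduled = false;

    function scheduleCleanup(task) {
      pendingCleanup.push(task);
      if (idleScheduled) return;
      idleScheduled = true;
      requestIdleCallback(function run(deadline) {
        while (pendingCleanup.length > 0 && deadline.timeRemaining() > 0) {
          pendingCleanup.shift()();
        }
        idleScheduled = pendingCleanup.length > 0;
        if (idleScheduled) requestIdleCallback(run); // finish later if needed
      });
    }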

Posted by Thomas Nattestad, Chrome Product Manager



On the Chrome team, we believe it’s not sufficient to be fast most of the time; we have to be fast all of the time. Today’s The Fast and the Curious post explores how we contributed to Core Web Vitals by analyzing field data on how Chrome responds to user interactions across all websites, ultimately improving the performance of the web.

As billions of people turn to the web to get things done every day, the browser becomes responsible for hosting a multitude of apps at once, and resource contention becomes a challenge. The multi-process Chrome browser contends for multiple resources: CPU and memory, of course, but also its own queues of work between its internal services (in this article, the network service).

This is why we’ve been focused on identifying and fixing slow interactions from Chrome users’ field data, which is the authoritative source when it comes to real user experiences. We gather this field data by recording anonymized Perfetto traces on Chrome Canary, and report them using a privacy-preserving filter.

When looking at field data of slow interactions, one particular cause caught our attention: recurring synchronous calls to fetch the current site’s cookies from the network service.

Let’s dive into some history.

Cookies under an evolving web

Cookies have been part of the web platform since the very beginning. They are commonly created like this:

    document.cookie = "user=Alice;color=blue"

And later retrieved like this:

    // Assuming a `getCookie` helper method:
    getCookie("user", document.cookie)

Its implementation was simple in single-process browsers, which kept the cookie jar in memory.

Over time, browsers became multi-process, and the process hosting the cookie jar became responsible for answering more and more queries. Because the web specification requires JavaScript to fetch cookies synchronously, answering each document.cookie query is a blocking operation.

The operation itself is very fast, so this approach was generally fine, but under heavy load scenarios where multiple websites are requesting cookies (and other resources) from the network service, the queue of requests could get backed up.

We discovered through field traces of slow interactions that some websites were triggering inefficient scenarios with cookies being fetched multiple times in a row. We landed additional metrics to measure how often a GetCookieString() IPC was redundant (same value returned as last time) across all navigations. We were astonished to discover that 87% of cookie accesses were redundant and that, in some cases, this could happen hundreds of times per second.

The simple design of document.cookie was backfiring as JavaScript on the web was using it like a local value when it was really a remote lookup. Was this a classic computer science case of caching?! Not so fast!
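The pattern is easy to fall into precisely because the code looks like a cheap local read. Something like this hypothetical snippet issues a synchronous query to the network service on every iteration:

    // Each document.cookie access here was a synchronous inter-process query.
    const rows = document.querySelectorAll("tr");
    for (const row of rows) {
      if (getCookie("theme", document.cookie) === "dark") {
        row.classList.add("dark");
      }
    }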

The web spec allows collaborating domains to modify each other’s cookies. Hence, a simple cache per renderer process didn’t work, as it would have prevented writes from propagating between such sites (causing stale cookies and, for example, unsynchronized carts in ecommerce applications).

A new paradigm: Shared Memory Versioning

We solved this with a new paradigm which we called Shared Memory Versioning. The idea is that each value of document.cookie is now paired with a monotonically increasing version. Each renderer caches its last read of document.cookie alongside that version. The network service hosts the version of each document.cookie in shared memory. Renderers can thus tell whether they have the latest version without having to send an inter-process query to the network service.
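In renderer-side pseudocode, the fast path looks roughly like this (sharedMemory and networkService are hypothetical stand-ins; the real code is C++ spanning the renderer and the network service):

    let cachedCookieString = null;
    let cachedVersion = -1;

    function readDocumentCookie() {
      // The version lives in shared memory, so checking it needs no IPC.
      const currentVersion = sharedMemory.readVersion();
      if (currentVersion !== cachedVersion) {
        // Stale (or first) read: fall back to the synchronous IPC once,
        // then cache the fresh value alongside its version.
        cachedCookieString = networkService.getCookieString();
        cachedVersion = currentVersion;
      }
      return cachedCookieString;
    }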



This reduced cookie-related inter-process messages by 80% and made document.cookie accesses 60% faster 🥳.

Hypothesis testing

Improving an algorithm is nice, but what we ultimately care about is whether that improvement results in improving slow interactions for users. In other words, we need to test the hypothesis that stalled cookie queries were a significant cause of slow interactions.

To achieve this, we used Chrome’s A/B testing framework to study the effect and determined that it, combined with other improvements to reduce resource contention, improved the slowest interactions by approximately 5% on all platforms. This further resulted in more websites passing Core Web Vitals 🥳. All of this adds up to a more seamless web for users.



Timeline of the weighted average of the slowest interactions across the web on Chrome as this was released to 1% (Nov), 50% (Dec), and then all users (Feb).

Onward to a seamless web!

By Gabriel Charette, Olivier Li Shing Tat-Dupuis, Carlos Caballero Grolimund, and François Doray, from the Chrome engineering team



How to get started

Because this change happens at the browser level, developers who use Chrome Custom Tabs will see it applied automatically starting with Chrome version M124. End users will see the Minimize icon in the Chrome Custom Tab toolbar.

Please note that this is a change in Chrome, and we hope other browsers will adopt similar functionality.

Posted by Victor Gallet, Senior Product Manager




Used billions of times each day, the Chrome address bar (which we call the “omnibox”) is a powerful tool to make searching the web easier, whether you’re trying to quickly find your tabs or bookmarks, return to a web page you previously visited, or find information.

With the latest release of Chrome (M124), we’re integrating machine learning models to power the Chrome omnibox on desktop, so that web page suggestions are more precise and relevant to you. In the future, these models will also help improve the relevance scoring of search suggestions. Here’s a closer look at some of the important insights that helped our team build this integration and where we hope the new model takes us.

How we got here

As the engineering lead for the team responsible for the omnibox, I find every launch special, but this one is truly near and dear to my heart. When I first started working on the Chrome omnibox, I asked around for ideas on how we could make it better for users. The number one answer I heard was, "improve the scoring system." The issue wasn't that the scoring was bad. In fact, the omnibox often feels magical in its ability to surface the URL or query you want! The issue was that it was inflexible. A set of hand-built and hand-tuned formulas did the job well but was difficult to improve or to adapt to new scenarios. As a result, the scoring system went largely untouched for a long time.

For most of that time, an ML-trained scoring model was the obvious path forward. But it took many false starts to finally get here. Our inability to tackle this challenge for so long was due to the difficulty of replacing the core mechanism of a feature used literally billions of times every day. Software engineering projects are sometimes described as "building the plane while flying it." This project felt more like "replacing all the seats in every plane in the world while they're all flying." The scale was enormous and the changes are felt directly by every user.

This ambitious undertaking would not have been possible without the work of such a talented and dedicated team. There were bumps in the road, walls we had to break through, and unanticipated issues that slowed us down, but the team was driven by a sincere belief in the impact of getting this right for our users.

A Surprising Insight

One of the fun things about working with ML systems is that training considers all the data at a scale that would be difficult, if not impossible, for any individual person or team. And that can lead to surprising insights.

The coolest example of this phenomenon on this project was when we looked at the scoring curve of one particular signal: time since last navigation. The expectation with this signal is that the smaller it is (the more recently you've navigated to a particular URL), the bigger the contribution that signal should make towards a higher relevance score.

And that is, in fact, what the model learned. But when we looked closer, we noticed something surprising: when the time since navigation was very low (seconds instead of hours, days or weeks), the model was decreasing the relevance score. It turns out that the training data reflected a pattern where users sometimes navigate to a URL that was not what they really wanted and then immediately return to the Chrome omnibox and try again. In that case, the URL they just navigated to is almost certainly not what they want, so it should receive a low relevance score during this second attempt.

In retrospect, this is obvious. And if we had not launched ML scoring, we definitely would have added a new rule to the old system to reflect this scenario. But before the training system observed and learned from this pattern, it never occurred to anyone that this might be happening.

The Future

We believe the new ML models will open up many possibilities to improve the user experience, such as incorporating new signals, like time of day, to improve relevance. We also want to explore training specialized versions of the model for particular environments: for example, mobile, enterprise or academic users, or perhaps different locales.

Additionally, we observe that the way users interact with the Chrome omnibox changes over time, and we believe the relevance scoring should change with them. With the new scoring system, we can now simply collect fresher signals, re-train, evaluate, and deploy new models periodically.

By Justin Donnelly, Chrome software engineer
