Chromium Blog

Distributed Reliability Testing

Wednesday, February 25, 2009

We want Google Chrome to be as stable as possible. No matter what site you browse to or what you do, Chrome should never crash. A system we call "distributed reliability testing" is one of the main tools we use to help turn that goal into reality.
One of the advantages to being associated with Google is that we have access to a lot of information about the Web, and a lot of computers to test on. About once an hour, our distributed test infrastructure takes the very latest version of Google Chrome in development and uses it to automatically load a large number of the pages that Google has seen are most popular around the world. When it's done, it produces a report like this on the Buildbot waterfall that all our developers (and anyone else) can see:
Results for top 500 web sites:
success: 499; crashes: 0; crash dumps: 0; timeout: 1
Results for top 500 web sites without sandbox:
success: 463; crashes: 0; crash dumps: 0; timeout: 2
Results for extended list of web sites:
success: 99768; crashes: 3; crash dumps: 3; timeout: 463
Here the final test got through a bit over 100,000 pages before stopping to make way for the next build to be tested. And before each Dev, Beta, or Stable channel release, we run with a much larger number of URLs.
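As a rough illustration of how a script could consume a summary line like the ones in this report, here is a minimal Python sketch. The function names `parse_summary` and `pages_attempted` are hypothetical, not part of our test infrastructure, and the sketch assumes that every crash dump belongs to an already counted crash, so dumps are not added to the total.

```python
def parse_summary(line: str) -> dict:
    """Turn 'success: 499; crashes: 0; ...' into {'success': 499, 'crashes': 0, ...}."""
    counts = {}
    for field in line.split(";"):
        name, _, value = field.partition(":")
        counts[name.strip()] = int(value.strip())
    return counts


def pages_attempted(counts: dict) -> int:
    """Total page loads tried; assumes crash dumps are not separate attempts."""
    return counts["success"] + counts["crashes"] + counts["timeout"]


extended = parse_summary("success: 99768; crashes: 3; crash dumps: 3; timeout: 463")
print(pages_attempted(extended))  # 100234
```

Under that assumption the extended run adds up to 100,234 page loads, which matches "a bit over 100,000 pages."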
In addition, we "fuzz-test" the user interface, automatically performing arbitrary sequences of actions (opening a new tab, pressing the spacebar, opening various dialogs, etc. — a total of more than 30 possible actions). These are also run in our distributed testing architecture, so we can exercise thousands of combinations for each new version of Google Chrome in progress. The same report that shows the page-load results above collects these UI test results too:
Results for automated UI test:
success: 64643; crashes: 0; crash dumps: 0; timeout: 0
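To make the idea of UI fuzz-testing concrete, here is a small Python sketch of generating and running arbitrary action sequences. It is an illustration only: the action names, `make_sequence`, and `run_sequence` are hypothetical, and seeding the generator so that a failing sequence can be replayed is an assumption of this sketch, not a description of our actual harness.

```python
import random

# Hypothetical action names; the real test chooses from more than 30 actions.
ACTIONS = ["new_tab", "press_space", "open_dialog", "close_tab", "reload"]


def make_sequence(seed: int, length: int) -> list:
    """Build a pseudo-random sequence of UI actions that is reproducible from its seed."""
    rng = random.Random(seed)
    return [rng.choice(ACTIONS) for _ in range(length)]


def run_sequence(sequence, perform) -> dict:
    """Run each action through perform(); record one success or one crash."""
    result = {"success": 0, "crashes": 0}
    try:
        for action in sequence:
            perform(action)
        result["success"] = 1
    except Exception:
        # In this sketch any exception stands in for a browser crash.
        result["crashes"] = 1
    return result


# Each worker could take a different seed and report its own counts.
outcome = run_sequence(make_sequence(seed=42, length=20), perform=lambda action: None)
print(outcome)  # {'success': 1, 'crashes': 0}
```

Because the same seed always yields the same sequence, a crash found on one machine could be replayed elsewhere by rerunning that seed.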
This sort of large-scale testing is great for finding crashes that happen only rarely, or that only affect pages that developers wouldn't have visited as part of their haphazard manual testing. Because a problem is caught right away, even if it's very rare, it's easier for developers to figure out what change caused the error and fix it before it ever gets close to showing up in Google Chrome itself.
Posted by Pamela Greene and Patrick Johnson, Software Engineers

Spell Check Dictionary Improvements

Wednesday, February 11, 2009

If you're anything like us, you're spending more and more of your time working online. The spellchecker built into Chromium can be a big help in keeping your blog, email, documents, and forum postings spelled correctly and easy to read. Chromium integrates the popular open source library Hunspell with WebKit's built-in spellchecking infrastructure to check words and to provide suggestions in 27 different languages.
The Hunspell dictionary maintainers have done a great job creating high-quality dictionaries that anybody can use, but one of the problems with any dictionary is that there are inevitably omissions, especially as new words appear or proper nouns come into common use. We at Google are in a good position to use our knowledge of the internet to identify and fix some of these omissions. The Google translation team used their language models to generate a sorted list of the most popular words in each language. This was cross-checked with the Hunspell dictionaries to generate a list of the top 1000 words not present in each dictionary. This list includes many popular words, but also common misspellings. To remove the misspellings, each list was reviewed by a specialist in that language. Generally, we tried to keep proper nouns and even foreign words as long as they were in common usage.
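The cross-check step can be sketched in a few lines of Python. This is a simplified illustration, not the tool we used: `top_missing_words` is a hypothetical name, the dictionary is treated as a flat set of words, and the comparison is case-insensitive. A real Hunspell dictionary also has affix rules, which this sketch ignores.

```python
def top_missing_words(ranked_words, dictionary, limit=1000):
    """Return the first `limit` words of a popularity-ranked list that the dictionary lacks."""
    known = {word.lower() for word in dictionary}
    missing = []
    for word in ranked_words:  # most popular first
        if word.lower() not in known:
            missing.append(word)
            if len(missing) == limit:
                break
    return missing


ranked = ["the", "webcam", "teh", "Obama", "browser"]
print(top_missing_words(ranked, {"the", "browser"}))  # ['webcam', 'teh', 'Obama']
```

The output still contains the misspelling "teh," which is why each candidate list needs a human review before anything is added.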
We hope that by using the existing GPL/LGPL/MPL tri-license for our additions, our work can be picked up by other users of Hunspell. We also hope to make more improvements in the future, both for additional languages like Turkish, and to refine the word lists we already have. If you're passionate about your language, you can help out by writing affix rules for the added words or reviewing more word lists.
The recent dev-channel release of Google Chrome (2.0.160.0) has the additional words we generated for 19 of the languages. Hopefully, you'll see fewer common words marked as misspelled. For example, the English dictionary now includes "antivirus," "anime," "screensaver," and "webcam," and commonly used names such as "BibTeX," "Mozilla," "Obama," and "Wikipedia." For our scientific users, we even have "gastroenterology," "oligonucleotide," and "Saccharomyces"! We'd like to give special thanks to the great help we got from the translation team who generated the words and the language search specialists who reviewed the lists.
Posted by Brett Wilson and Siddhartha Chattopadhyay, Software Engineers

Irregexp, Google Chrome's New Regexp Implementation

Wednesday, February 4, 2009

One of the new features in the most recent dev-channel release of Google Chrome (2.0.160.0) is Irregexp, a completely new implementation of regular expressions (regexps) in the V8 JavaScript engine. Irregexp builds on V8's existing infrastructure for memory management and native code generation and is tailored to work well for the kinds of regexps used by JavaScript programs on the web. The result is a considerable improvement in V8's regexp performance.
While the V8 team has been working hard to improve JavaScript performance, one part of the language that we have so far not given much attention is regexps. Our previous implementation was based on the widely used PCRE library developed by Philip Hazel at the University of Cambridge. The version we used, known as JSCRE, was adapted and improved by the WebKit project for use with JavaScript. Using JSCRE gave us a regular expression implementation that was compatible with industry standards and has served us well. However, as we've improved other parts of the language, regexps started to stand out as being slower than the rest. We felt it should be possible to improve performance by integrating with our existing infrastructure rather than using an external library. The SquirrelFish team is following a similar approach with their JavaScript engine.
A fundamental decision we made early in the design of Irregexp was that we would be willing to spend extra time compiling a regular expression if that would make running it faster. During compilation Irregexp first converts a regexp into an intermediate automaton representation. This is in many ways the "natural" and most accessible representation and makes it much easier to analyze and optimize the regexp. For instance, when compiling /Sun|Mon/ the automaton representation lets us recognize that both alternatives have an 'n' as their third character. We can quickly scan the input until we find an 'n' and then start to match the regexp two characters earlier. Irregexp looks up to four characters ahead and matches up to four characters at a time.
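The /Sun|Mon/ example can be imitated in plain Python to show the idea. This is only a sketch of the scanning trick: Irregexp emits native machine code rather than anything like this, and `find_sun_or_mon` is a hypothetical name. The loop at the end compares the result against Python's own `re` module on a few sample strings.

```python
import re


def find_sun_or_mon(text: str) -> int:
    """Return the index of the first 'Sun' or 'Mon' in text, or -1 if there is none."""
    # Both alternatives end in 'n' at offset 2, so scan for 'n' first.
    pos = text.find("n", 2)
    while pos != -1:
        start = pos - 2  # step back two characters and check the prefix
        if text[start:pos] in ("Su", "Mo"):
            return start
        pos = text.find("n", pos + 1)
    return -1


for sample in ["Sunday", "on Monday", "nnn", "Tuesday", "noon Sun"]:
    match = re.search(r"Sun|Mon", sample)
    expected = match.start() if match else -1
    assert find_sun_or_mon(sample) == expected
```

Scanning for a single character is cheap, so most positions in the input are rejected without trying either alternative.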
After optimization we generate native machine code which uses backtracking to try different alternatives. Backtracking can be time-consuming so we use optimizations to avoid as much of it as we can. There are techniques to avoid backtracking altogether but the nature of regexps in JavaScript makes it difficult to apply them in our case, though it is something we may implement in the future.
During development we have tested Irregexp against one million of the most popular webpages to ensure that the new implementation stays compatible with our previous implementation and the web. We have also used this data to create a new benchmark which is included in version 3 of the V8 Benchmark Suite. We feel this is a good reflection of what is found on the web.
If you want to try this out, and help us test it in the process, you can subscribe to the dev-channel; if you see problems that might be related to Irregexp, consider filing a bug.
And BTW, we'll have sessions on V8 and other Chrome-related topics in May at Google I/O, Google's largest developer conference.
Posted by Erik Corry, Christian Plesner Hansen and Lasse Reichstein Holst Nielsen, Software Engineers

ClickJacking

Thursday, January 29, 2009

Although the term "ClickJacking" is new, the underlying issue has been known for years. ClickJacking attacks affect all Web browsers because the attacks rely on standard browser features to trick the user into clicking on a dangerous spot on another Web page. A few months ago, Jeremiah Grossman and Robert Hansen sparked renewed interest in ClickJacking by demonstrating a clever application of the technique against Flash Player. Unfortunately, there is no "silver bullet" solution to all ClickJacking attacks. To find the best long-term solution, we're collaborating with other browser vendors and the standards community. If you're interested in ClickJacking solutions, I'd recommend reading Mark Pilgrim's summary of recent ClickJacking discussion in the HTML 5 working group and joining in the discussion.
Posted by Adam Barth, Software Engineer

Google Chrome User Experience Research

Monday, January 26, 2009

Why are the buttons where they are instead of where I want them to be? What's up with bookmarks? Why does the Google Chrome UI look and operate the way it does? These are probably questions that some, many or even all of you have about Google Chrome. We explained how we came to some of those decisions in a previous post:

"To achieve the streamlined feel we were after … we had our own intuitions about what was and wasn't useful in current browsers, we had no idea how those ideas matched to reality. So in typical Google fashion, we turned to data; we ran long studies of the browsing habits of thousands of volunteers, compiled giant charts of what features people did and didn't use, argued over and incorporated that data into our designs and prototypes, ran experiments, watched how our test users reacted, listened to their feedback, and then repeated the cycle over and over and over again."

To provide some more insight into this process, I should explain what we mean by "data." The data we turn to is both quantitative and qualitative. Usage logs provide statistics such as how many users have tried a feature and how frequently a feature gets used. These logs are collected only from people who have chosen to share usage statistics with us. This quantitative data tells us the "how" and the "when" but not the "why." For that, we use qualitative data gathered through research methods like surveys, interviews and contextual inquiry, which involves observing people in their home or work environments. Often we bring people to one of our usability labs where we can observe their interactions and collect feedback on a new feature we are working on. Many times we employ an eye tracker to find out exactly what people are looking at on our user interface. By incorporating data from all these sources into our design process, we hope to provide a user experience that satisfies the needs of the many Google Chrome users out there.
In the future, we are planning on releasing some of our research on this blog and on the UX Site to show how the data we are collecting is impacting the Chrome experience.
All of our research data comes from studying and observing people. But what kind of "people" do I mean? Probably someone just like you. So if you are interested in becoming a potential participant in a research study on Chrome or one of the many other Google products, I encourage you to sign up at google.com/usability.
Posted by David Choi, User Experience Researcher

Google Chrome Installation and Updates

Friday, January 16, 2009

With new browser exploits showing up on a regular basis, keep users free from the burden of checking for security updates.

Allow users who are not administrators to install Google Chrome.

Allow updates to happen automatically in the background even when Google Chrome is in use. The next time you open Google Chrome, it can simply start using the latest version.

Just like the minimal user interface (UI) of Google Chrome, limit or eliminate installer UI as much as possible.

Updates should be as small as possible. A security fix should be a small, fast download and should not need a full installer.

Uninstall should be clean and remove changes done by Google Chrome as much as possible.

Installation

Anyone can install Google Chrome, not just administrators.

On Windows Vista there are no 'security prompts' during install. If you are running as a non-elevated Administrator, you can still install Google Chrome without having to enter an administrator's password. However, we still need to ask for a password to make Google Chrome the default browser because of how Windows Vista requires browser applications to be registered with it.

You can choose to install or uninstall Google Chrome without affecting other people who use the same computer.

Updates

Google Talk, Gears, Google Earth Plugin

Maintaining different update channels, each with its own update schedules and Chrome versions.

Updating software in the background without any annoying dialogs.

Good proxy support that can handle various proxy configurations to download the installer payload.

Having only one instance of Google Update manage multiple Google programs installed on the machine.

Un-installation

Posted by Rahul Kuchhal, Software Engineer

Google Chrome Release Channels

Thursday, January 8, 2009

Release early, release often. We think that's the best way to develop software that delights people. With Google Chrome, we want to release fewer features more often instead of making you wait 12 months for the next Major Dot-Oh Release Jam-Packed With Features. We can get your feedback faster, fix things faster, and release new improvements as soon as they're ready. We want Google Chrome to stay nimble so it can keep pace with changes in the sites and web apps you use.

Because we don't have those big Dot-Oh release milestones on the calendar, we don't have long periods of Beta testing new features. Instead we use automatic update channels to release Google Chrome to a community of early adopters. The channels are essentially a never-ending Beta test and a continuous feedback loop that lets us rapidly develop new ideas into solid product features.
You can subscribe to one of our update channels:

Stable channel. Everyone is on the Stable channel when they first install Google Chrome. The Stable channel is updated with features and fixes once they have been thoroughly tested in the Beta channel. If you want a rock-solid browser but don't need the latest features, the Stable channel is for you.
Beta channel. People who like to use and help refine the latest features subscribe to the Beta channel. Every month or so, we promote stable and complete features from the Dev channel to the Beta channel. The Beta channel is more stable than Dev, but may lack the polish one expects from a finished product.
Developer preview channel. The Dev channel is where ideas get tested (and sometimes fail). The Dev channel can be very unstable at times, and new features usually require some manual configuration to be enabled. Still, simply using Dev channel releases is an easy (practically zero-effort) way for anyone to help improve Google Chrome.

To get more actively involved with Google Chrome, subscribe to the Dev or Beta channel. Just run a little program (found here) and that's it. After that, you'll automatically get early access updates.
If you're ready to try some new stuff, we've just released a Dev channel update that has a new version of WebKit, a new network stack, and some features like form autocomplete (read about it here). It's less polished than what Dev channel users have been getting during Google Chrome's Beta, so we've moved all of our existing Dev channel users to the Beta channel. If you were on the Dev channel, you can decide whether to switch to the new Dev channel or stay on the new Beta channel.
Posted by Mark Larson, Technical Program Manager