Making sure the deployed annotations are usable by search engines can be rather difficult, especially on sites with many pages, and site owners all around the world haven’t been shy about telling us so. Today we're releasing a feature that should make debugging rel-alternate-hreflang annotations much easier.

The Language Targeting section in the International Targeting feature enables you to identify two of the most common issues with hreflang annotations:
  • Missing return links: annotations must be confirmed from the pages they are pointing to. If page A links to page B, page B must link back to page A, otherwise the annotations may not be interpreted correctly.
    For each error of this kind, we report where and when we detected it, as well as where the return link is expected to be.

  • Incorrect hreflang values: The value of the hreflang attribute must either be a language code in ISO 639-1 format such as "es", or a combination of language and country code such as "es-AR", where the country code is in ISO 3166-1 Alpha 2 format.
    In case our indexing systems detect language or country codes that are not in these formats, we provide example URLs to help you fix them.
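
The return-link rule can be checked mechanically before deployment. The following is a minimal sketch, assuming the annotations have already been collected into a dictionary that maps each page URL to its hreflang entries; the data layout and the function name are illustrative assumptions, not part of the Webmaster Tools feature.

```python
from typing import Dict, List, Tuple

# Hypothetical input: for each page URL, the hreflang annotations found on it,
# stored as a mapping of hreflang value -> target URL.
Annotations = Dict[str, Dict[str, str]]


def find_missing_return_links(pages: Annotations) -> List[Tuple[str, str]]:
    """Return (source, target) pairs where target does not link back to source."""
    missing = []
    for source, links in pages.items():
        for target in links.values():
            if target == source:
                continue  # a self-reference needs no return link
            # A page we know nothing about is treated as having no annotations.
            targets_of_target = pages.get(target, {}).values()
            if source not in targets_of_target:
                missing.append((source, target))
    return missing


pages = {
    "http://example.com/en": {
        "en": "http://example.com/en",
        "es": "http://example.com/es",
    },
    # The Spanish page does not point back to the English page.
    "http://example.com/es": {
        "es": "http://example.com/es",
    },
}

print(find_missing_return_links(pages))
```

Here the English page is reported because the Spanish page it points to carries no annotation leading back to it.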


Additionally, we've moved the geographic targeting setting to this part of Webmaster Tools, so that you can find all information relevant to international and multilingual targeting in the same place.

We hope you'll find this new feature useful and that it helps you identify issues with the rel-alternate-hreflang implementation on your site. If you have comments or questions about the feature, please post in our Webmaster Help Forum.

Posted by Gary Illyes, Webmaster Trends

Let’s have a look at each in detail.

Show users worldwide the same content 

In this scenario, you decide to serve specific content for one given country and language on your homepage / generic URL (http://www.example.com). This content will be available to anyone who accesses that URL directly in their browser or those who search for that URL specifically. As mentioned above, all country & language versions should also be accessible on their own unique URLs.


Note: You can show a banner on your page to suggest a more appropriate version to users from other locations or with different language settings.

Let users choose which local version and which language they want 

In this configuration, you decide to serve a country selector page on your homepage / generic URL and to let users choose which content they want to see depending on country and language. All users who type in that URL can access the same page.

If you implement this scenario on your international site, remember to use the x-default rel-alternate-hreflang annotation for the country selector page, which was specifically created for these kinds of pages. The x-default value helps us recognize pages that are not specific to one language or region.

Automatically redirect users or dynamically serve the appropriate HTML content depending on users’ location and language settings

A third scenario would be to automatically serve the appropriate HTML content to your users depending on their location and language settings. You can do that either with server-side 302 redirects or by dynamically serving the right HTML content.

Remember to use the x-default rel-alternate-hreflang annotation on the homepage / generic page, even if the latter is a redirect page that users cannot access directly.

Note: Think about redirecting users for whom you do not have a specific version. For instance, French-speaking users on a website that has English, Spanish and Chinese versions. Show them the content that you consider the most appropriate.
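
One way to choose a version from a visitor's language settings is to read the Accept-Language request header and fall back to a default when none of the preferred languages is available. The sketch below is a simplified illustration under stated assumptions: the function names, the list of available versions, and the choice of English as the fallback are hypothetical, and the parsing ignores finer points of the header such as wildcard handling.

```python
from typing import List, Tuple


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags of an Accept-Language header, most preferred first."""
    weighted: List[Tuple[float, int, str]] = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without a q-value has the highest preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def choose_version(header: str, available: List[str], default: str) -> str:
    """Pick the best available version, or the default if nothing matches."""
    for tag in parse_accept_language(header):
        if tag in available:
            return tag
        language = tag.split("-")[0]  # "es-ar" -> "es"
        if language in available:
            return language
    return default


# Hypothetical site with English, Spanish and Chinese versions.
versions = ["en", "es", "zh"]
print(choose_version("es-AR,es;q=0.8,en;q=0.5", versions, default="en"))
print(choose_version("fr-FR,fr;q=0.9", versions, default="en"))
```

The second call models the French-speaking visitor from the note above: no French version exists, so the site serves whichever version it considers most appropriate, here assumed to be English.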

Whatever configuration you decide to go with, you should make sure all the pages – including country and language selector pages:
  • Have rel-alternate-hreflang annotations.
  • Are accessible for Googlebot's crawling and indexing: do not block the crawling or indexing of your localized pages.
  • Always allow users to switch local version or language: you can do that using a drop down menu for instance.
Reminder: As mentioned in the beginning, remember that you must have separate URLs for each country and language version. 

About rel-alternate-hreflang annotations

Remember to annotate all your pages - whatever method you choose. This will greatly help search engines to show the right results to your users.

Country selector pages and redirecting or dynamically serving homepages should all use the x-default hreflang, which was specifically designed for auto-redirecting homepages and country selectors. 

Finally, here are a few useful reminders about rel-alternate-hreflang annotations in general:
  • Your annotations must be confirmed from the other pages. If page A links to page B, page B must link back to page A, otherwise, your annotations may not be interpreted correctly.
  • Your annotations should be self-referential. Page A should include a rel-alternate-hreflang annotation linking to itself.
  • You can specify the rel-alternate-hreflang annotations in the HTTP header, in the head section of the HTML, or in a sitemap file. We strongly recommend that you choose only one way to implement the annotations, in order to avoid inconsistent signals and errors.
  • The value of the hreflang attribute must be in ISO 639-1 format for the language, and in ISO 3166-1 Alpha 2 format for the region. Specifying only the region is not supported. If you wish to configure your site only for a country, use the geotargeting feature in Webmaster Tools.
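
The last reminder lends itself to an automated check. The sketch below is a rough illustration, not an official validator: the two code sets are deliberately tiny, illustrative subsets, and a real check would load the full ISO 639-1 and ISO 3166-1 Alpha 2 lists. The function name is hypothetical.

```python
# Illustrative subsets only; a real check would load the complete ISO lists.
LANGUAGES = {"en", "es", "fr", "de", "zh"}   # ISO 639-1 language codes
REGIONS = {"GB", "US", "AU", "AR", "CA"}     # ISO 3166-1 Alpha 2 region codes


def is_valid_hreflang(value: str) -> bool:
    """Accept 'x-default', a language code, or a language-region pair."""
    if value.lower() == "x-default":
        return True
    parts = value.split("-")
    if len(parts) == 1:
        # A single code is always read as a language, never as a region.
        return parts[0].lower() in LANGUAGES
    if len(parts) == 2:
        return parts[0].lower() in LANGUAGES and parts[1].upper() in REGIONS
    return False


for value in ["es", "es-AR", "en-UK", "US", "x-default"]:
    print(value, is_valid_hreflang(value))
```

"en-UK" is rejected because the ISO 3166-1 code for the United Kingdom is GB, and "US" is rejected because a region on its own is not a supported value.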
Following these recommendations will help us better understand your localized content and serve more relevant results to your users in our search results. As always, if you have any questions or feedback, please tell us in the internationalization Webmaster Help Forum.


Webmaster Level: All

The homepages of multinational and multilingual websites are sometimes configured to point visitors to localized pages, either via redirects or by changing the content to reflect the user’s language. Today we’re introducing a new rel-alternate-hreflang annotation, supported by both Google and Yandex, that webmasters can use to mark such homepages.

To see this in action, let’s look at an example. The website example.com has content that targets users around the world as follows:

Map of the world illustrating which hreflang code to use for which locale

In this case, the webmaster can annotate this cluster of pages with rel-alternate-hreflang, either in Sitemaps or with HTML link tags like this:


<link rel="alternate" href="http://example.com/en-gb" hreflang="en-gb" />
<link rel="alternate" href="http://example.com/en-us" hreflang="en-us" />
<link rel="alternate" href="http://example.com/en-au" hreflang="en-au" />
<link rel="alternate" href="http://example.com/" hreflang="x-default" />

The new x-default hreflang attribute value signals to our algorithms that this page doesn’t target any specific language or locale and is the default page when no other page is better suited. For example, it would be the page our algorithms try to show French-speaking searchers worldwide or English-speaking searchers on google.ca.
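
To make the fallback behaviour concrete, here is a simplified illustration of how a consumer of these annotations might pick a URL. It is emphatically not Google's algorithm, only a sketch of the idea that x-default is used when nothing more specific matches; the function name and matching order are assumptions.

```python
from typing import Dict, Optional


def pick_url(annotations: Dict[str, str], language: str,
             region: Optional[str] = None) -> Optional[str]:
    """Prefer language-region, then language alone, then x-default."""
    by_value = {value.lower(): url for value, url in annotations.items()}
    if region:
        exact = by_value.get(f"{language}-{region}".lower())
        if exact:
            return exact
    by_language = by_value.get(language.lower())
    if by_language:
        return by_language
    return by_value.get("x-default")


# The cluster from the link tags above.
cluster = {
    "en-gb": "http://example.com/en-gb",
    "en-us": "http://example.com/en-us",
    "en-au": "http://example.com/en-au",
    "x-default": "http://example.com/",
}

print(pick_url(cluster, "en", "gb"))  # a UK English searcher
print(pick_url(cluster, "fr"))        # a French-speaking searcher
print(pick_url(cluster, "en", "ca"))  # an English-speaking searcher in Canada
```

The last two calls fall through to http://example.com/, matching the two examples in the paragraph above.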

The same annotation applies for homepages that dynamically alter their contents based on a user’s perceived geolocation or the Accept-Language headers. The x-default hreflang value signals to our algorithms that such a page doesn’t target a specific language or locale.

As always, if you have any questions or feedback, please tell us in the Internationalization Webmaster Help Forum.


If you’re signed in, the corrections made on your site will go live right away -- the next time a visitor translates a page on your website, they’ll see your correction. If one of your visitors contributes a better translation, the suggestion will wait until you approve it. You can also invite other editors to make corrections and add translation glossary entries. You can learn more about these new features in the Help Center.

This new experimental feature is currently free of charge. We hope this feature, along with Translator Toolkit and the Translate API, can provide a low cost way to expand your reach globally and help to break down language barriers.

We can’t recommend this enough: reuse the same template for all language versions, and always try to keep the HTML of your template simple.

Keeping the HTML code the same for all languages has its advantages when it comes to maintenance. Hacking around with the HTML code for each language to fix bugs doesn’t scale–keep your page code as clean as possible and deal with any styling issues in the CSS. To name just one benefit of clean code: most translation tools will parse out the translatable content strings from the HTML document and that job is made much easier when the HTML is well-structured and valid.

How long is a piece of string?

If your design relies on text playing nicely with fixed-size elements, then translating your text might wreak havoc. For example, your left-hand side navigation text is likely to translate into much longer strings of text in several languages–check out the difference in string lengths between some English and Dutch language navigation for the same content. Be prepared for navigation titles that might wrap onto more than one line by figuring out your line height to accommodate this (also worth considering when you create your navigation text in English in the first place).
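
A quick way to spot labels at risk of wrapping is to compare translated strings against a length budget before they reach the layout. The sketch below is only a rough aid: character count is a crude proxy for rendered width, the budget of 10 characters is arbitrary, and the sample labels are illustrative rather than taken from a real site.

```python
from typing import Dict, List


def labels_over_budget(labels: Dict[str, str], max_chars: int) -> List[str]:
    """Return the keys of labels whose text is longer than max_chars."""
    return [key for key, text in labels.items() if len(text) > max_chars]


# Illustrative navigation labels in English and Dutch.
english = {"settings": "Settings", "images": "Images", "news": "News"}
dutch = {"settings": "Instellingen", "images": "Afbeeldingen", "news": "Nieuws"}

print(labels_over_budget(english, max_chars=10))
print(labels_over_budget(dutch, max_chars=10))
```

Any key reported for a translation but not for the English source is a candidate for a taller line height or a more flexible container.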

Variable word lengths cause particular issues in form labels and controls. If your form layout displays labels on the left and fields on the right, for example, longer text strings can flow over into two lines, whereas shorter text strings do not seem associated with their form input fields–both scenarios ruin the design and impede the readability of the form. Also consider the extra styling you’ll need for right-to-left (RTL) layouts (more on that later). For these reasons we design forms with labels above fields, for easy readability and styling that will translate well across languages.

Screenshots of Chinese and German versions of web forms


Also avoid fixed-height columns–if you’re attempting to neaten up your layout with box backgrounds that match in height, chances are when your text is translated, the text will overrun areas that were only tall enough to contain your English content. Think about whether the UI elements you’re planning to use in your design will work when there is more or less text–for instance, horizontal vs. vertical tabs.

On the flip side

Source editing for bidirectional HTML can be problematic because many editors have not been built to support the Unicode bidirectional algorithm (more research on the problems and solutions). In short, the way your markup is displayed might get garbled:

<p>ابةتث <img src="foo.jpg" alt=" جحخد"< ذرزسش!</p>

Our own day-to-day usage has shown the following editors to currently provide decent solutions for bidirectional editing: particularly Coda, and also Dreamweaver, IntelliJ IDEA and JEditX.

When designing for RTL languages you can build most of the support you need into the core CSS and use the dir attribute of the html element (for backwards compatibility) in combination with a class on the body element. As always, keeping all styles in one core stylesheet makes for better maintainability.

Some key styling issues to watch out for: any elements floated right will need to be floated left and vice versa; extra padding or margin widths applied to one side of an element will need to be overridden and switched; and any text-align values should be reversed.

We generally use the following approach, including using a class on the body tag rather than a html[dir=rtl] CSS selector because this is compatible with older browsers:

Elements:

<body class="rtl">
<h1><a href="http://www.blogger.com/"><img alt="Google" src="http://www.google.com/images/logos/google_logo.png" /></a> Heading</h1>

Left-to-right (default) styling:

h1 {
  height: 55px;
  line-height: 2.05;
  margin: 0 0 25px;
  overflow: hidden;
}
h1 img {
  float: left;
  margin: 0 43px 0 0;
  position: relative;
}

Right-to-left styling:

body.rtl {
  direction: rtl;
}
body.rtl h1 img {
  float: right;
  margin: 0 0 0 43px;
}

(See this in action in English and Arabic.)

One final note on this subject: most of the time your content destined for right-to-left language pages will be bidirectional rather than purely RTL, because some strings will probably need to retain their LTR direction–for example, company names in Latin script or telephone numbers. The way to make sure the browser handles this correctly in a primarily RTL document is to wrap the embedded text strings with an inline element using an attribute to set direction, like this:

<h2>‫עוד ב- <span dir="ltr">Google</span>‬</h2>

In cases where you don’t have an HTML container to hook the dir attribute into, such as title elements or JavaScript-generated source code for message prompts, you can get the same effect with Unicode control characters: &#x202B; (right-to-left embedding) opens the embedded run and &#x202C; (pop directional formatting) closes it:

<title>&#x202B;‫הפוך את Google לדף הבית שלך‬&#x202C;</title>

Example usage in JavaScript code:
var ffError = '\u202B' +'כדי להגדיר את Google כדף הבית שלך ב\x2DFirefox, לחץ על הקישור \x22הפוך את Google לדף הבית שלי\x22, וגרור אותו אל סמל ה\x22בית\x22 בדפדפן שלך.'+ '\u202C';
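
The same wrapping can be applied wherever message strings are assembled on the server. This is a minimal Python sketch; embed_rtl is a hypothetical helper name, and the sample string is a placeholder rather than real message text.

```python
import unicodedata

RLE = "\u202B"  # opens a right-to-left embedding
PDF = "\u202C"  # closes the most recent embedding


def embed_rtl(text: str) -> str:
    """Wrap text so that it is laid out as a right-to-left run."""
    return f"{RLE}{text}{PDF}"


wrapped = embed_rtl("Google")
# Show the official Unicode names of the two control characters.
print(unicodedata.name(wrapped[0]))
print(unicodedata.name(wrapped[-1]))
```

The control characters are invisible when rendered, so printing their Unicode names is a convenient way to confirm they are present.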

(For more detail, see the W3C’s articles on creating HTML for Arabic, Hebrew and other right-to-left scripts and authoring right-to-left scripts.)

It’s all Greek to me…

If you’ve never worked with non-Latin character sets before (Cyrillic, Greek, and a myriad of Asian and Indic), you might find that both your editor and browser do not display content as intended.

Check that your editor and browser encodings are set to UTF-8 (recommended), and consider adding a <meta> element that declares the encoding, along with the lang attribute on the html element, to your HTML template so browsers know what to expect when rendering your page. This has the added benefit of ensuring that all Unicode characters are displayed correctly, so HTML entities such as &eacute; (é) are no longer necessary, saving valuable bytes! Check the W3C’s tutorial on character encoding if you’re having trouble; it contains in-depth explanations of the issues.
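
The byte saving is easy to measure. The short sketch below compares the same word written with an HTML entity and with the literal character encoded as UTF-8; the sample word is just an illustration.

```python
import html

with_entity = "caf&eacute;"
with_character = "caf\u00e9"  # "café" with a literal é

# Both forms represent the same text once the entity is decoded.
assert html.unescape(with_entity) == with_character

entity_bytes = len(with_entity.encode("utf-8"))
character_bytes = len(with_character.encode("utf-8"))
print(entity_bytes, character_bytes)  # 11 5
```

The entity costs eight bytes for a character that UTF-8 stores in two.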

A word on naming

Lastly, a practical tip on naming conventions when creating several language versions. Using a standard such as the ISO 639-1 language codes for naming helps when you start to deal with several language versions of the same document.

Using a conventional standard will help users understand your site’s structure as well as making it more maintainable for all webmasters who might develop the site, and using the language codes for other site assets (logo images, PDF documents) is handy to be able to quickly identify files.

See previous Webmaster Central posts for advice about URL structures and other issues surrounding working with multi-regional websites and working with multilingual websites.

That’s a summary of the main challenges we wrestle with on a daily basis; but we can vouch for the fact that putting in the planning and work up front towards well-structured HTML and robust CSS pays dividends during localization!


Because you’re trying to target a multilingual audience, once Javier hits “Publish,” his profile becomes immediately available in other languages with the translated templates. Also, each of the new language versions is served on a separate URL.


Two localized versions, http://en.example.com/javier-lopez in English and http://fr.example.com/javier-lopez in French

Background on the old issue: duplicate content caused by language variations

The configuration above allowed visitors speaking different languages to more easily interpret the content, but for search engines it was slightly problematic: there are three URLs (English, French, and Spanish versions) for the same main content in Javier’s profile. Webmasters wanted to avoid duplicate content issues (such as PageRank dilution) from these multiple versions and still ensure that we would serve the appropriate version to the user.

A new solution for localized templates

First of all, just to be clear, the strategy we’re proposing isn’t appropriate for multilingual sites that completely translate each page’s content. We’re trying to specifically improve the situation where the template is localized but the main content of a page remains duplicate/identical across language/country variants.

Before we get into the specific steps, our prior advice remains applicable:
  • Have one URL associated with one piece of content. We recommend against using the same URL for multiple languages, such as serving both French and English versions on example.com/page.html based on user information (IP address, Accept-Language HTTP header).

  • When multiple languages are at play, it’s best to include the language or country indication in the URL, e.g., example.com/en/welcome.html and example.com/fr/accueil.html (which specify “en” and “fr”) rather than example.com/welcome.html and example.com/accueil.html (which don’t contain an explicit country/language specification). More suggestions can be found in our blog posts about designing localized URLs and multilingual sites.
For the new feature:
Step 1: Select the proper canonical.
The canonical designates the version of your content you’d like indexed and returned to users.
The first step towards making the right content indexable is to pick one canonical URL that best reflects the genuine locale of the page’s main content. In the example above, since Javier is a Spanish-speaking user and he created his profile on es.example.com, http://es.example.com/javier-lopez is the logical canonical. The title and snippet in all locales will be selected from the canonical URL.

Once you have the canonical URL picked out, you can either:
A. 301 (permanent redirect) from the language variants to the canonical

As an example, if a French speaker visits fr.example.com/javier-lopez (not the canonical), have this page set a cookie to remember the user's language preference of French. Then permanently redirect from fr.example.com/javier-lopez to the canonical at es.example.com/javier-lopez. Because of the cookie, es.example.com/javier-lopez will still render its boilerplate in French (even on the es.example.com subdomain!). Similarly, en.example.com/javier-lopez would set the value of this cookie to English and then 301 redirect to es.example.com/javier-lopez.
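
A rough sketch of option A follows, independent of any web framework. The cookie name "lang" and the function name are hypothetical; the one detail that matters is that the cookie is scoped to the parent domain, so that a cookie set on fr.example.com is also sent to es.example.com.

```python
from typing import List, Tuple

CANONICAL_HOST = "es.example.com"
LANGUAGE_COOKIE = "lang"  # hypothetical cookie name

Response = Tuple[int, List[Tuple[str, str]]]


def redirect_to_canonical(host: str, path: str) -> Response:
    """Return (status, headers) for a request to a language variant."""
    if host == CANONICAL_HOST:
        return 200, []  # already on the canonical; serve the page normally
    language = host.split(".")[0]  # "fr.example.com" -> "fr"
    headers = [
        ("Location", f"http://{CANONICAL_HOST}{path}"),
        # Domain=example.com makes the cookie visible on every subdomain.
        ("Set-Cookie", f"{LANGUAGE_COOKIE}={language}; Domain=example.com; Path=/"),
    ]
    return 301, headers


status, headers = redirect_to_canonical("fr.example.com", "/javier-lopez")
print(status)
for name, value in headers:
    print(f"{name}: {value}")
```

The canonical page would then read the cookie and render its template in the remembered language.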

Including a language selection link is also helpful should a multilingual user prefer a different experience of your site.

B. Use rel=”canonical”

On the other language variants, include a link rel=”canonical” tag pointing to your chosen canonical. In our example, since the canonical for Javier’s profile is the Spanish version, the English and French pages (and optionally even the Spanish page itself) would include <link rel="canonical" href="http://es.example.com/javier-lopez" />.

Cookies are not involved in this setup. Therefore, a French speaker will be served es.example.com/javier-lopez with a Spanish template. Implement step 2 if you want the French speakers to be served the French version at fr.example.com/javier-lopez in Google search results.
Step 2: In the canonical URL, specify the various language versions via the rel=”alternate” link tag, using its hreflang attribute.

rel=”alternate” URLs can be displayed in search results in accordance with a user’s language preference. The title and snippet, however, remain generated from the canonical URL (as is customary with rel=”canonical”), not from the content of any rel=”alternate”.
You can help Google display the correctly localized variant of your URL to our international users by adding the following tags to http://es.example.com/javier-lopez, the selected canonical:

<link rel="alternate" hreflang="en" href="http://en.example.com/javier-lopez" />

<link rel="alternate" hreflang="fr" href="http://fr.example.com/javier-lopez" />

rel=”alternate” indicates that an alternate version of the page is located at the URI given in the href value. hreflang identifies the language of the alternate URL and can be specified with an ISO 639-1 code.

Please note: If your site supports many languages and you’re worried about the increased file size when declaring numerous rel=”alternate” URLs, please see our Help Center article about configuring rel=”alternate” with file size constraints.
Once the steps are completed, the configuration on “The Network” would look like this:
  • http://en.example.com/javier-lopez
    either 301s with a language cookie or contains <link rel="canonical" href="http://es.example.com/javier-lopez" />
  • http://fr.example.com/javier-lopez
    either 301s with a language cookie or contains <link rel="canonical" href="http://es.example.com/javier-lopez" />
  • http://es.example.com/javier-lopez
    is the canonical and contains
    <link rel="alternate" hreflang="en" href="http://en.example.com/javier-lopez" />
    and
    <link rel="alternate" hreflang="fr" href="http://fr.example.com/javier-lopez" />
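
On a site with many profiles, these tags would normally be generated by the template rather than written by hand. The following sketch shows one possible way to do that for the rel=”canonical” variant of the setup; the function name and the dictionary layout are assumptions made for illustration.

```python
import html
from typing import Dict, List


def link_tags(page_language: str, canonical_language: str,
              urls: Dict[str, str]) -> List[str]:
    """Return the link tags for one language version of a page."""
    if page_language != canonical_language:
        # Language variants point at the canonical and carry nothing else.
        canonical_url = html.escape(urls[canonical_language], quote=True)
        return [f'<link rel="canonical" href="{canonical_url}" />']
    # The canonical lists every other language version as an alternate.
    return [
        f'<link rel="alternate" hreflang="{language}" '
        f'href="{html.escape(url, quote=True)}" />'
        for language, url in urls.items()
        if language != canonical_language
    ]


urls = {
    "es": "http://es.example.com/javier-lopez",
    "en": "http://en.example.com/javier-lopez",
    "fr": "http://fr.example.com/javier-lopez",
}

for language in urls:
    print(language, link_tags(language, "es", urls))
```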

Results of the above implementation
  • When your content is returned in search results, users will likely see the URL that corresponds to their language preference, whether or not it’s the canonical. (Good news!) This is because with rel=”canonical” or a 301 redirect, we can cluster the language variations with the canonical. With rel=”alternate” hreflang=”x”, at serve-time we can deliver the URL of the most appropriate language to the user: English speakers will be served en.example.com/javier-lopez as the result for the URL in Javier’s profile, French speakers will see fr.example.com/javier-lopez, and Spanish speakers will see es.example.com/javier-lopez.

  • By implementing step 1, only content from the canonical version will be available for users in search results (i.e. content from the duplicate versions won’t be searchable). Because the Spanish version es.example.com/javier-lopez is the canonical, queries that include template content from this page, e.g. [Javier Lopez familia] -- when using any language preference -- may return his profile (content from the canonical version). On the other hand, queries that include template content of the “duplicate” version, e.g. [Javier Lopez family], are less likely to return his profile page. If you would like the other language versions indexed separately and searchable, avoid using rel=”canonical” and rel=”alternate”.

  • Indexing properties, such as linking information, from the duplicate language variants will be consolidated with the canonical.

To recap (one more time, with feeling!)

For sites that have their template localized but keep their pages’ main content untranslated:

Step 1: Once you have the canonical picked out you can use either rel=”canonical” or a 301 (permanent redirect) from the various localized pages to the canonical URL.

Step 2: On the canonical URL, specify the language-specific duplicated content with different boilerplate via the rel=”alternate” link tag, using its hreflang attribute. This way, Google can show the correctly-localized variant of your URLs to our international users.

We realize this can be a little complicated, so if you have questions, please ask in our webmaster forum!

  • Check your backlinks. Since it won't be possible to set up a redirection from the old .yu domain to your new one, all links pointing to .yu domains will lead to dead ends. This means that it will be increasingly difficult for search engines to retrieve your new content. To find out who is linking to you, sign up with Google Webmaster Tools and check the links to your site (you can also download this list as a "comma separated value" -- .csv -- file for ease of use). Then read through the list for sites that you recognize as important and contact their webmasters to make sure that they update their links to your new website.
  • Check your internal links. If you are planning to simply move your content in bulk from the old to the new site, make sure that the new internal navigation is up to date. For example, if you are renaming pages on your site from "www.example.yu/home.htm" to "www.example.com/home.htm" make sure that your internal navigation reflects such changes to prevent broken links.
  • Start moving the site to your new domain. It's a good idea to start moving while you can still maintain control of your old domain, so don't wait! As mentioned in our best practices when moving your site, we recommend starting by moving a single directory or subdomain, and testing the results before completing the move. Remember that you will not be able to keep a 301 redirection on your old domain after September 30, so start your test early.
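
Updating internal links in bulk is usually a matter of rewriting the host part of each URL. Here is a minimal sketch using the example domains from the list above; the function name is hypothetical, and it assumes links are absolute URLs that include a scheme.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "www.example.yu"
NEW_HOST = "www.example.com"


def rewrite_link(url: str) -> str:
    """Point a link at the new domain, leaving all other links untouched."""
    parts = urlsplit(url)
    if parts.netloc.lower() != OLD_HOST:
        return url
    # Keep the scheme, path, query and fragment; swap only the host.
    return urlunsplit(parts._replace(netloc=NEW_HOST))


print(rewrite_link("http://www.example.yu/home.htm"))
print(rewrite_link("http://other.example.org/page.htm"))
```

Running the rewritten links through a crawler or link checker afterwards helps confirm that nothing still points at the old domain.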
While you're moving your site, you can test how Google crawls and indexes your new site at its new location by submitting a Sitemap via Google Webmaster Tools. Although we may not crawl or index all the pages listed in each Sitemap, we recommend that you submit one because doing so helps Google understand your site better. You can read more on this topic in our answers to the most frequently asked questions on Sitemaps. And remember, if you have any questions or concerns, we're waiting for you in the Google Webmaster Help Forum!
Update: as mentioned here, we have introduced a new feature: Change of Address. Check it out if you are moving from one domain to another! By using this feature you will help us update our index faster and hopefully make the transition for your users smoother.


Users may already be translating your webpage using Google Translate, but you can make it even easier by including our "Translate My Page" gadget, available at http://translate.google.com/translate_tools.

The gadget will be rendered in the user's language, so if they come to your page and can't understand anything else, they'll be able to read the gadget, and translate your page into their language.

Sometimes there may be some content on your page that you don't want us to translate. You can now add class=notranslate to any HTML element to prevent that element from being translated. For example, you may want to do something like:
Email us at <span class="notranslate">sales at mydomain dot com</span>
And if you have an entire page that should not be translated, you can add:
<meta name="google" value="notranslate">
to the <head> of your page and we won't translate any of the content on that page.

Update on 12/15/2008: We also support:
<meta name="google" content="notranslate">
Thanks to chaoskaizer for pointing this out in the comments. :)

Lastly, if you want to do some fancier automatic translation integrated directly into your page, check out the AJAX Language API we launched last March.

With these tools we hope you can more easily make your content available in all the languages we support, including Arabic, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Filipino, Finnish, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swedish, Ukrainian, and Vietnamese.

If you have more questions on this topic, you can join our Webmaster Help Group to get more advice.