Most of the recommendations we've made in the past are for individual webmasters running their own websites. We thought we'd offer up some best practices for websites that allow users to create their own websites or host users' data, like Blogger or Google Sites. This class of websites is often referred to as freehosts, although these recommendations apply to certain "non-free" providers as well.
Webmaster Tools provides your users with detailed reports about their website's visibility in Google. Before we can grant your users access, we need to verify that they own their particular websites. Verifying ownership of a site in Webmaster Tools can be done using a custom HTML file, a meta tag, or seamless integration in your system via Google Services for Websites. Other website management suites such as Yahoo! Site Explorer and Bing Webmaster Tools may use similar verification methods; we recommend making sure your users can access each of these suites.
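As a rough illustration of the meta tag option, a hosting platform could let each user paste in their verification token and render it into the `<head>` of that user's home page. The sketch below assumes the `google-site-verification` meta name; the function name and the idea of storing a per-user token are hypothetical, and other suites use their own tag names.

```python
from html import escape


def verification_meta_tag(token: str) -> str:
    """Render a verification meta tag for the <head> of a user's home page.

    `token` is the value the user copied from the verification screen.
    It is HTML-escaped so a pasted value cannot break out of the attribute.
    """
    token = token.strip()
    if not token:
        raise ValueError("empty verification token")
    return '<meta name="google-site-verification" content="%s" />' % escape(
        token, quote=True
    )
```

Escaping the token means users never need permission to edit raw HTML in order to verify their site.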
Webmaster Tools verifies websites based on a single URL, but assumes that users should be able to see data for all URLs 'beneath' this URL in the site URL hierarchy, so each user's content should live under its own directory or hostname. See our article on verifying subdomains and subdirectories for more information. Beyond Webmaster Tools, many automated systems on the web, such as search engines or aggregators, expect websites to be structured in this way, and by doing so you'll be making it easier for those systems to find and organize your content.
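To make the idea of URLs 'beneath' a verified URL concrete, here is a simplified model of the hierarchy. It is only a sketch for reasoning about your own URL layout, not a description of how Webmaster Tools actually decides; the function name is hypothetical.

```python
from urllib.parse import urlsplit


def is_beneath(verified_url: str, candidate_url: str) -> bool:
    """Return True if candidate_url sits at or below verified_url.

    Simplified assumption: same scheme and hostname, and the candidate
    path starts with the verified path on a directory boundary.
    """
    v, c = urlsplit(verified_url), urlsplit(candidate_url)
    if (v.scheme.lower(), v.netloc.lower()) != (c.scheme.lower(), c.netloc.lower()):
        return False
    # Compare on a trailing slash so /users/alice does not match /users/alicia.
    base = v.path if v.path.endswith("/") else v.path + "/"
    path = c.path or "/"
    return path.startswith(base) or path + "/" == base
```

Under this model, `http://example.com/users/alice/` covers every page in Alice's directory but none of Bob's, and `http://alice.example.com/` covers only Alice's hostname.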
Let users set their own titles, or automatically set the title of each page on your users' websites to describe the content of that page. For example, user pages should not all be titled "Blogger: Create your free blog". Similarly, if a user's website has more than one page with different content, the pages should not all share the same title, such as "User XYZ's Homepage".
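One possible way to generate titles automatically is to combine each page's own heading with the name of the user's site, and to flag titles that are reused across pages. This is a minimal sketch; the function names and the "heading - site name" format are assumptions, not a required convention.

```python
from collections import Counter


def page_title(page_heading: str, site_name: str) -> str:
    """Build a title from the page's heading and the user's site name."""
    heading = " ".join(page_heading.split())  # collapse stray whitespace
    site = " ".join(site_name.split())
    if not heading:
        return site
    return f"{heading} - {site}" if site else heading


def duplicate_titles(titles):
    """Return the titles that are used by more than one page, sorted."""
    counts = Counter(titles)
    return sorted(title for title, n in counts.items() if n > 1)
```

Running `duplicate_titles` over all of a user's pages gives a quick list of titles worth prompting the user to change.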
Certain meta tags are reasonably useful for search engines, and users may want to control them. These include tags with the name attribute of "robots", "description", "googlebot", "slurp", or "msnbot". Each search engine's documentation explains what these tags do.
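If you would rather not let users edit raw HTML, one option is to accept values for just these names and render the tags yourself. The sketch below assumes the user's settings arrive as a plain dictionary; the function and constant names are hypothetical.

```python
from html import escape

# The name attributes listed above; anything else is ignored.
ALLOWED_META_NAMES = ("robots", "description", "googlebot", "slurp", "msnbot")


def render_meta_tags(user_settings: dict) -> str:
    """Render the meta tags a user has chosen to set, one per line."""
    lines = []
    for name in ALLOWED_META_NAMES:
        value = user_settings.get(name, "").strip()
        if value:
            lines.append(
                '<meta name="%s" content="%s" />' % (name, escape(value, quote=True))
            )
    return "\n".join(lines)
```

Because only the listed names are rendered and every value is escaped, users get control over these tags without being able to inject other markup.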
Google Analytics is free enterprise-class analytics software that can run on a website by just adding a snippet of JavaScript to the page. If you don't want to allow users to add arbitrary JavaScript for security reasons, note that the Google Analytics code differs from one site to another only by one simple ID. If you let your users tell you their Google Analytics ID, you can set up the rest for them. Users get more value out of your service if they can understand their traffic better. For example, see Weebly's support page on adding Google Analytics. We recommend considering similar methods you can use for enabling access to other third-party applications.
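A minimal sketch of the "just give us your ID" approach: validate the ID strictly, then substitute it into a snippet template that you control. The `UA-XXXXX-Y` pattern and the template below are assumptions for illustration; the real snippet should be copied from the Google Analytics documentation, with the ID replaced by the placeholder.

```python
import re

# Assumed ID shape (UA-<account>-<profile>); adjust if the format differs.
GA_ID_PATTERN = re.compile(r"UA-\d+-\d+")

# Placeholder template, not the real tracking code.
SNIPPET_TEMPLATE = "<script>/* analytics snippet for __GA_ID__ goes here */</script>"


def analytics_snippet(ga_id: str, template: str = SNIPPET_TEMPLATE) -> str:
    """Return the tracking snippet for a user, or raise if the ID looks wrong."""
    ga_id = ga_id.strip().upper()
    if not GA_ID_PATTERN.fullmatch(ga_id):
        raise ValueError("not a valid analytics ID: %r" % ga_id)
    return template.replace("__GA_ID__", ga_id)
```

Because the ID must match the whole pattern, a user cannot smuggle extra JavaScript into the page through the ID field.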
Tastes change. Someone on your service might want to change their account name or even move to another site altogether. Help them by allowing them to access their own data and by letting them tell search engines when they move part or all of their site, using 301 redirects to the new location. Similarly, if users want to remove a page or site instead of moving it, please return a 404 HTTP response code so that search engines will know that the page or site is no longer around. This allows users to use the urgent URL removal tool (if necessary), and makes sure that these pages drop out of search results as soon as possible.
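The moved-versus-removed distinction can be sketched as a small lookup that your request handler consults. The data structures here (a set of live paths and a dictionary of user-supplied redirect destinations) are hypothetical; a real platform would read them from its own storage.

```python
from http import HTTPStatus


def respond(path, live_pages, moved):
    """Pick a status code and headers for a requested path.

    live_pages: set of paths that currently exist.
    moved: dict mapping an old path to the destination URL the user chose.
    """
    if path in moved:
        # Permanent move: tell crawlers and browsers where the content went.
        return HTTPStatus.MOVED_PERMANENTLY, {"Location": moved[path]}
    if path in live_pages:
        return HTTPStatus.OK, {}
    # Removed or never existed: a real 404, not a "not found" page served as 200.
    return HTTPStatus.NOT_FOUND, {}
```

For example, `respond("/alice/", set(), {"/alice/": "http://alice.example.org/"})` yields a 301 with a `Location` header pointing at the new site.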
Search engines continue to crawl more and more of the web. Help our crawlers find the best content across your site. Allow us to crawl users' content, including media like user-uploaded images. Help us find users' content using XML Sitemaps. Help us to steer clear of duplicate versions of the same content so we can find more of the good stuff your users are creating by creating only one URL for each piece of content when possible, and by specifying your canonical URLs when not. If you're hosting blogs, create RSS feeds that we can discover in Google Blog Search. If your site is down or showing errors, please return 5xx response codes. This helps us avoid indexing lots of "We'll be right back" pages by letting crawlers know that the content is temporarily unavailable.
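As one illustration of the Sitemap suggestion, the sketch below builds a minimal XML Sitemap from a list of canonical URLs, dropping repeats so that each piece of content is listed once. The function name is hypothetical, and a production Sitemap would usually also include an XML declaration and optional fields such as `lastmod`.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal <urlset> document listing each URL once, in order."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in dict.fromkeys(urls):  # de-duplicate while keeping order
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped on output
    return ET.tostring(urlset, encoding="unicode")
```

Generating one Sitemap per user directory or hostname keeps each file aligned with the site that user has verified.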
Can you think of any other best practices that you would recommend for sites that host users' data or pages?
<meta name="title" content="Baroo? - cute puppies" />
<meta name="description" content="The cutest canine head tilts on the Internet!" />
<link rel="image_src" href="http://example.com/thumbnail_preview.jpg" />
<link rel="video_src" href="http://example.com/video_object.swf?id=12345"/>
<meta name="video_height" content="296" />
<meta name="video_width" content="512" />
<meta name="video_type" content="application/x-shockwave-flash" />
RDFa (Yahoo! SearchMonkey):
<object width="512" height="296" rel="media:video"
resource="http://example.com/video_object.swf?id=12345"
xmlns:media="http://search.yahoo.com/searchmonkey/media/"
xmlns:dc="http://purl.org/dc/terms/">
<param name="movie" value="http://example.com/video_object.swf?id=12345" />
<embed src="http://example.com/video_object.swf?id=12345"
type="application/x-shockwave-flash" width="512" height="296"></embed>
<a rel="media:thumbnail" href="http://example.com/thumbnail_preview.jpg" />
<a rel="dc:license" href="http://example.com/terms_of_service.html" />
<span property="dc:description" content="Cute Overload defines Baroo? as: Dogspeak for 'Whut the...?'
Frequently accompanied by the Canine Tilt and/or wrinkled brow for enhanced effect." />
<span property="media:title" content="Baroo? - cute puppies" />
<span property="media:width" content="512" />
<span property="media:height" content="296" />
<span property="media:type" content="application/x-shockwave-flash" />
<span property="media:region" content="us" />
<span property="media:region" content="uk" />
<span property="media:duration" content="63" />
</object>