About sitemaps and their submission to search engines

January 30, 2008

Sitemaps are a standardized way of telling search engines which pages they should crawl.

How useful are they? In my experience, for content-oriented websites and blogs, sitemaps are an efficient way to make your pages known to search engines.

For each post I write here, the first readers coming from Google search tend to arrive about 10 days after the posting date. As a consequence, I believe sitemaps are a worthwhile asset: they make it easier for your future readers to find you.

What does a sitemap look like?

Basically an XML file with URLs in it:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://blog.logeek.fr/2008/1/19/a-beginner-s-guide-to-datawarehouse</loc>
    <lastmod>2008-01-21T03:23:05+00:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
  <!-- more urls -->
</urlset>

If you manage or create a website, you can generate the sitemap yourself or use an existing tool. Most blogging engines provide a way to generate sitemaps either natively or through a plugin.
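If you go the do-it-yourself route, here is a minimal sketch of what generating the file could look like in Python, using only the standard library. The function name build_sitemap and the dictionary keys are my own choices for illustration, not part of any tool; the element names and the namespace are the ones shown in the sample above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (as a str) for an iterable of page dicts.

    Each dict is expected to carry a "loc" key; "lastmod", "changefreq"
    and "priority" are optional and are written only when present.
    (This dict layout is an assumption made for the example.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # Keep the child elements in the same order as the sample above.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    pages = [
        {
            "loc": "http://blog.logeek.fr/2008/1/19/a-beginner-s-guide-to-datawarehouse",
            "lastmod": "2008-01-21T03:23:05+00:00",
            "changefreq": "daily",
            "priority": 0.5,
        },
    ]
    print(build_sitemap(pages))
```

In a real site you would probably build the list of pages from your database or your posts directory and write the result to a file served at a stable URL.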

How do you use the sitemap once you have one?

There are a couple of ways to tell search engines that you are using a sitemap.

One option is to declare it in your robots.txt file. Another is to push the sitemap location to the search engines directly. If you have a new website, I believe this second option will help you spread the word.
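To make both options concrete, here is a small Python sketch. The first helper builds the line you would add to robots.txt; the second builds a ping URL of the form described by the sitemaps.org protocol. The sitemap location http://blog.logeek.fr/sitemap.xml and the engine address http://searchengine.example are placeholders I made up for the example, and each search engine documents its own submission address (some may not accept pings at all), so check before relying on this.

```python
from urllib.parse import quote


def robots_txt_line(sitemap_url):
    """Return the robots.txt directive that advertises a sitemap."""
    return "Sitemap: " + sitemap_url


def ping_url(engine_base, sitemap_url):
    """Build <engine_base>/ping?sitemap=<URL-encoded sitemap location>.

    engine_base is a placeholder here: the real address depends on the
    search engine and has to come from its own documentation.
    """
    return engine_base.rstrip("/") + "/ping?sitemap=" + quote(sitemap_url, safe="")


if __name__ == "__main__":
    sitemap = "http://blog.logeek.fr/sitemap.xml"  # assumed location
    print(robots_txt_line(sitemap))
    print(ping_url("http://searchengine.example", sitemap))
```

The sitemap location is URL-encoded because it travels as a query parameter; the sketch only builds the URL and leaves the actual HTTP request to you.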

Where do you submit the sitemap location? Most of the big search engines support sitemaps.

One last tip: check what your pages look like once indexed, using the site:mysite.com Google query. What you get should be easy to read and understand, so that people looking for content can find what they need.