The sitemap.xml File: A Most Important File for Visibility on the Web

The sitemap.xml file should list every page on your website that you want search engines to find. It is perhaps one of the most important maps you will make of your site. A proper, up-to-date site map gives search engines a list of your pages, along with when each one was last updated and how often it changes.

Luckily, the sitemap.xml file is really simple to create. The downside is that it can be a very long file if you have many web pages. In that case, you would probably want to break the site map into pieces representing the sections of your site.
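
When you do split things up, the pieces are tied together with a sitemap index file, which simply lists the smaller site map files (the sitemaps.org protocol caps a single site map at 50,000 URLs). Here is a rough sketch of one. The two file names are made up for illustration and are not real files on this site.

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <sitemap>
        <loc>http://www.techno-witch.com/sitemap-simplywebsites.xml</loc>
        <lastmod>2009-08-04</lastmod>
    </sitemap>
    <sitemap>
        <loc>http://www.techno-witch.com/sitemap-articles.xml</loc>
        <lastmod>2009-08-04</lastmod>
    </sitemap>
</sitemapindex>

Each file named in the index is an ordinary site map, laid out like the example that follows.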

Here is an example of a simple, two-page site map with an explanation of the entries.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>http://www.techno-witch.com/simplywebsites/default.aspx</loc>
        <lastmod>2009-08-04</lastmod>
        <changefreq>monthly</changefreq>
        <priority>0.5</priority>
    </url>
    <url>
        <loc>http://www.techno-witch.com/simplywebsites/usefulwebsitetools.aspx</loc>
        <lastmod>2009-08-04</lastmod>
        <changefreq>monthly</changefreq>
        <priority>0.5</priority>
    </url>
</urlset>

Ok, so for each page on our site we create a <url></url> block. Inside the block, there are four basic items we can adjust, and only the first one is required. The first is the <loc></loc> block, which contains the full URL of the page. It must start with the protocol, such as http://. So 'www.techno-witch.com' is not valid, but 'http://www.techno-witch.com' is.
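
One more thing to watch for in <loc>: because this is an XML file, special characters in the URL have to be escaped, so an ampersand is written as &amp;. A quick sketch, assuming a made-up query string (the parameters here are only for illustration and are not real options on this page):

        <loc>http://www.techno-witch.com/simplywebsites/default.aspx?page=2&amp;view=full</loc>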

The next block is <lastmod></lastmod>. It contains the date that the page was last modified. The format for the date and time is W3C Datetime. If you only want to give the date, the format is:

YYYY-MM-DD
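
If you want to be more exact, W3C Datetime also lets you add the time and a time zone offset. As a sketch (the time shown is invented for the example), either of these lines would be acceptable:

        <lastmod>2009-08-04</lastmod>
        <lastmod>2009-08-04T14:30:00+00:00</lastmod>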

By specifying the date of the last update, we can help ensure that as our page content is updated, our information on the search engines updates with it. It is entirely possible that some of your pages could be overlooked by a search engine spider if your lastmod date doesn't reflect the fact that you just made changes. The spider may simply look at the date and think, 'Oh, nothing's changed since the last time I looked, so why bother?' But if you say, 'Hey, it's just changed!', then the spider is likely to respond with, 'Hey, cool, I'll make sure I check it out asap!'

Next we have <changefreq></changefreq>. This block tells the bot how often the page is likely to change. It is optional, and not every engine uses this value. For those engines that do use it, it can only help to include it.
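
The values you can use are always, hourly, daily, weekly, monthly, yearly, and never. As a rough example, assuming a made-up news page (news.aspx does not exist on this site) that changes every day, the entry might look like this:

    <url>
        <loc>http://www.techno-witch.com/simplywebsites/news.aspx</loc>
        <lastmod>2009-08-04</lastmod>
        <changefreq>daily</changefreq>
    </url>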

The last value is <priority></priority> and it, too, is optional. It lets you rank the importance of your pages relative to one another, on a scale from 0.0 to 1.0, with 0.5 as the default. This value doesn't actually help your position in the search results compared with other sites. It simply helps the engines determine which of your pages to favor when they have to choose the most important ones.
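
To sketch how this might look, suppose the main page matters most and a made-up archive page (archive.aspx is only for illustration) matters least:

    <url>
        <loc>http://www.techno-witch.com/simplywebsites/default.aspx</loc>
        <priority>1.0</priority>
    </url>
    <url>
        <loc>http://www.techno-witch.com/simplywebsites/archive.aspx</loc>
        <priority>0.3</priority>
    </url>

Setting every page to the same high number gains nothing, since the value only compares your own pages with each other.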

Written by Trinity C. McKenzie for publication by Techno-Witch.com