The robots.txt File: What It Is and How to Use It

The robots.txt file is a simple tool for guiding the activity of robots (also called bots or spiders) on your website. With it you can point the spiders to your sitemap and content while asking them to stay out of your administration pages and other private areas.

The basic robots.txt file is extremely simple. Take a look at the following example:

User-agent: *
Sitemap: http://www.yourwebsite.com/sitemap.xml
Disallow: /yourprivatecontent/
Allow: /

The first line in the example above says that these instructions apply to all bots. On the second line we hand the visiting bots the sitemap, and then we start telling them the places they can't go.

Yep, the first thing is to tell them where not to go. Well-behaved bots stay out of those places, although not every bot respects these instructions. Once we have finished listing the places they can't go, we tell them where they can visit.

Taken together, the robots.txt example tells all bots where to find the sitemap, to stay out of /yourprivatecontent/, and that they may go anywhere else they please.
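
If you would like to check how a parser reads these rules before publishing them, the following is a minimal sketch using Python's standard-library urllib.robotparser module. The bot name "ExampleBot" and the page paths are made up for illustration, and this shows how one particular parser behaves, not how every bot on the web will act.

```python
from urllib.robotparser import RobotFileParser

# The example rules from this article, one directive per line.
EXAMPLE = """\
User-agent: *
Sitemap: http://www.yourwebsite.com/sitemap.xml
Disallow: /yourprivatecontent/
Allow: /
"""


def load_rules(text):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(EXAMPLE)

# "ExampleBot" is a hypothetical name; the "User-agent: *" group covers it.
# The page paths below are hypothetical as well.
private_ok = rules.can_fetch(
    "ExampleBot", "http://www.yourwebsite.com/yourprivatecontent/notes.html"
)
public_ok = rules.can_fetch(
    "ExampleBot", "http://www.yourwebsite.com/articles/index.html"
)
print("private area allowed:", private_ok)
print("public page allowed:", public_ok)

# site_maps() needs Python 3.8 or newer; it lists the Sitemap lines found.
print("sitemaps:", rules.site_maps())

# This parser applies the first rule that matches a path, so putting
# "Allow: /" ahead of the Disallow line changes the answer it gives.
reordered = load_rules(
    "User-agent: *\nAllow: /\nDisallow: /yourprivatecontent/\n"
)
reordered_private_ok = reordered.can_fetch(
    "ExampleBot", "http://www.yourwebsite.com/yourprivatecontent/notes.html"
)
print("private area allowed after reordering:", reordered_private_ok)
```

With the rules in their original order, this parser reports the private area as off limits and everything else as open. The reordered check at the end illustrates why it is safest to list the Disallow lines before the Allow line: a parser that stops at the first matching rule would otherwise see "Allow: /" first. Other bots may resolve such conflicts differently, for example by preferring the most specific path.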

More information on the robots.txt file can be found at www.robotstxt.org

Written by Trinity C. McKenzie for publication by Techno-Witch.com