Robots.txt File
More About Robots.txt
Robots.txt is a file created by webmasters to instruct search engines on crawl behaviour, such as whether a directory or file type should be crawled and indexed. It is used in many different scenarios, mainly to keep protected content, images and similar material out of search results. Reputable search engines obey robots.txt, which lets the webmaster set selective preferences for privacy reasons. The file itself is publicly readable and only advisory, so it complements real access control rather than replacing it.

Hundreds of thousands of websites are built and released online with no crawl restrictions on any of their directories. This presents major security implications, as search engines can list protected content unless instructed not to.

Robots.txt is also used for SEO purposes to prevent the indexing of various documents. Some webmasters prevent pages with duplicate content from being indexed, whilst others use robots.txt to stop Google from crawling internal link pages.

Robots.txt is a plain text file placed on a web server at the root of the domain, so that search engines can access it directly. Reputable search engines obey the robots.txt file, although compliance is a convention that crawlers follow voluntarily rather than something the server enforces. Over the years the robots.txt file has become more useful and offers substantial benefits, including tailoring instructions to each search engine based on its crawler's name.
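
Because the file lives at the root of the domain, its address can be derived from any page URL on the same site. The following is a minimal sketch using Python's standard library; the function name robots_txt_url and the example URL are illustrative assumptions, not part of any standard.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host serving page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path, query and fragment are discarded.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.example.com/shop/items/page.html?id=3"))
# prints http://www.example.com/robots.txt
```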

Search engines have bots that work on their behalf to crawl and index content. Each bot has a name, and these names can be used within robots.txt to allow or block access to various documents. For example, you can prevent Yahoo's bot (Slurp) from indexing a particular part of the website, whilst allowing other bots such as Google's bot (Googlebot) to index it.
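
The Slurp and Googlebot scenario can be checked offline with Python's standard urllib.robotparser module. This is only a sketch: the /members/ directory and the example.com URL are hypothetical, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: Slurp is kept out of /members/,
# while Googlebot may crawl everything (an empty Disallow allows all).
RULES = """\
User-agent: Slurp
Disallow: /members/

User-agent: Googlebot
Disallow:
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

url = "http://www.example.com/members/profile.html"
slurp_allowed = parser.can_fetch("Slurp", url)
googlebot_allowed = parser.can_fetch("Googlebot", url)
print(slurp_allowed, googlebot_allowed)  # prints False True
```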

This is very useful when tailoring results to each search engine's capabilities. Some search engines, for example, may not be able to index Flash. In that case you could allow Google and Yahoo to crawl it whilst disallowing other search engines, to prevent failed results.

This standard has become more widely used over the past 8 years as the internet and search engines have evolved, which is why more webmasters need to utilise the power of robots.txt.
More About Robots.txt
Search engines are numerous: there are over 50,000 online, yet only about 10 of them provide any worthwhile traffic. A list of search engine robot names can be found at http://www.robotstxt.org/db.html.

So what commands are available within robots.txt?

Several directives can be specified in robots.txt, such as Disallow and Allow, which we elaborate on below:

Example Robots.txt:

# robots.txt for http://www.example.com/

User-agent: *
Disallow: /cyberworld/map/ # This is an infinite virtual URL space
Disallow: /tmp/ # these will soon disappear
Disallow: /foo.html

Above is a snippet from an example robots.txt file. Here's how the commands break down.

Breakdown of Information
User-agent
User-agent is the name of the search engine bot. If you list several bots here, only the listed search engines are instructed to follow the commands beneath them. If you would like all search engines to follow the commands, simply use a star (*).
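
As a rough illustration of listing several bots versus using the star, the sketch below groups two named bots under one set of rules and gives every other bot a stricter catch-all. The /drafts/ directory, the ExampleBot name and the helper function allowed are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: two named bots share one group,
# and the star group applies to every other bot.
RULES = """\
User-agent: Googlebot
User-agent: Slurp
Disallow: /drafts/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Check a path on the illustrative host www.example.com."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


print(allowed("Googlebot", "/index.html"))   # named group applies: True
print(allowed("Slurp", "/drafts/a.html"))    # named group applies: False
print(allowed("ExampleBot", "/index.html"))  # falls back to the star group: False
```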

Disallow
Whatever file or directory follows this command will be skipped by the specified user agent. For example, if you disallow /mypage.html, the user agent will not crawl or index that file. You can also disallow entire directories by listing them as they are presented on the server, e.g. /mydirectory/.
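
To see how these Disallow lines behave in practice, the example file from earlier can be fed to Python's standard urllib.robotparser module. This is a sketch only: the test paths and the helper function allowed are invented for illustration, and this parser matches rules by simple path prefix, which may differ from how a particular search engine evaluates them.

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt from this article, unchanged.
EXAMPLE = """\
# robots.txt for http://www.example.com/

User-agent: *
Disallow: /cyberworld/map/ # This is an infinite virtual URL space
Disallow: /tmp/ # these will soon disappear
Disallow: /foo.html
"""

parser = RobotFileParser()
parser.parse(EXAMPLE.splitlines())


def allowed(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the given bot may fetch the path on www.example.com."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


for path in ("/index.html", "/foo.html", "/tmp/cache.html", "/cyberworld/map/a/b"):
    print(path, allowed(path))
```

With this parser only /index.html is reported as allowed. A directory rule such as /tmp/ matches everything beneath that directory, while /foo.html targets a single file.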

Assertive-Media © 2002 - 2009 All Rights Reserved.