How to keep robots out of your web site
THE ROBOTS.TXT FILE
Search engines were created to help people find information quickly on the Internet, and they acquire much of that information through robots (also known as spiders or crawlers) that look for web pages on their behalf.
These robots explore the web, looking for and recording all kinds of information. They usually start with URLs submitted by users, with links they find on other web sites, with sitemap files, or with the top level of a site.
Once a robot has accessed the home page, it recursively accesses all the pages linked from that page. A robot can also check every page it can find on a particular server.
After the robot finds a web page, it indexes the title, the keywords, the text, and so on. Sometimes, though, you might want to prevent search engines from indexing some of your content, such as news postings or specially marked web pages (for example, affiliate pages). There are conventions for asking robots to stay away, but whether individual robots comply with them is purely voluntary.
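The link-following behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the `SITE` dictionary is a hypothetical in-memory site standing in for real HTTP requests, and the sketch uses a queue rather than literal recursion so that each page is visited exactly once.

```python
from collections import deque

# Hypothetical in-memory site: each page maps to the pages it links to.
SITE = {
    "/": ["/about.html", "/e-books/"],
    "/about.html": ["/"],
    "/e-books/": ["/e-books/guide.html"],
    "/e-books/guide.html": [],
}


def crawl(start, links):
    """Visit every page reachable from `start`, each page only once."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        # Queue every linked page that has not been seen yet.
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


visited = crawl("/", SITE)
print(visited)
```

Starting from the home page, the sketch reaches every page in the hypothetical site, including the ones inside the e-books directory.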
ROBOTS EXCLUSION PROTOCOL
If you want robots to stay out of some of your web pages, you can ask them to ignore the pages that you don't want indexed. To do that, place a robots.txt file in the root directory of your web site.
For example, if you have a directory called e-books and you want to ask robots to keep out of it, your robots.txt file should read:
User-agent: *
Disallow: /e-books/
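One way to check what such a file asks of robots is Python's standard `urllib.robotparser` module. The sketch below is a minimal illustration under stated assumptions: the rules are written one directive per line with a leading slash on the path, and `www.example.com` and the robot name `AnyBot` are placeholders, not names from a real site or crawler.

```python
from urllib.robotparser import RobotFileParser

# The e-books rules, one directive per line, path starting with "/".
ROBOTS_TXT = """\
User-agent: *
Disallow: /e-books/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# www.example.com and "AnyBot" are placeholders for illustration only.
blocked = parser.can_fetch("AnyBot", "http://www.example.com/e-books/guide.html")
allowed = parser.can_fetch("AnyBot", "http://www.example.com/index.html")

print("e-books page may be fetched:", blocked)
print("home page may be fetched:", allowed)
```

A well-behaved robot performs a check like this before requesting a page; a robot that ignores the convention simply never asks.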
When you don't have enough control over your server to set up a robots.txt file, you can try adding a META tag to the head section of any HTML document.
For example, a tag like the following tells robots not to index the page and not to follow the links on it:
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">
Support for the META tag among robots is less widespread than support for the Robots Exclusion Protocol, but most major web indexes currently support it.
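From the robot's side, honouring the META tag means reading the page's head before deciding what to do with it. The following Python sketch, built on the standard `html.parser` module, shows one way this could be done. The class name `RobotsMetaParser` and the sample page are hypothetical, made up for this illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # The tag name and its directives are matched case-insensitively.
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


# Hypothetical page used only for this illustration.
PAGE = """<html><head>
<title>Affiliate page</title>
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">
</head><body><a href="/partner.html">Partner</a></body></html>"""

meta = RobotsMetaParser()
meta.feed(PAGE)
may_index = "noindex" not in meta.directives
may_follow = "nofollow" not in meta.directives

print("may index:", may_index)
print("may follow links:", may_follow)
```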
NEWS POSTINGS
If you want to keep the search engines out of your news postings, you can add an "X-no-archive" line to your postings' headers:
X-no-archive: yes
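To see where that line sits in a complete posting, here is a small Python sketch that builds a message with the standard `email.message` module. The address, newsgroup, and subject are placeholders, and the sketch only assembles the text of the posting; it does not send anything to a news server.

```python
from email.message import EmailMessage

post = EmailMessage()
post["From"] = "poster@example.com"   # placeholder address
post["Newsgroups"] = "misc.test"      # placeholder newsgroup
post["Subject"] = "Test posting"
post["X-no-archive"] = "yes"          # asks archives not to keep a copy
post.set_content("This posting asks archives not to keep a copy.\n")

posting_text = post.as_string()
print(posting_text)
```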
Although most common news clients allow you to add an X-no-archive line to the headers of your news postings, some of them do not.
The problem is that most search engines assume that all information they find is public unless marked otherwise.
So be careful: although the robot and archive exclusion standards may help keep your material out of the major search engines, there are others that respect no such rules.
If you are highly concerned about the privacy of your e-mail and Usenet postings, you should use anonymous remailers and PGP. You can read about them here:
http://www.well.com/user/abacard/remail.html
http://www.io.com/~combs/htmls/crypto.html
http://world.std.com/~franl/pgp/
Even if you are not particularly concerned about privacy, remember that anything you write can be indexed and archived somewhere for eternity, so use the robots.txt file wherever you need it.
Written by Dr. Roberto A. Bonomi




