Best Answer
Disadvantages of Web Robots

Network Performance

Robots traditionally get a bad press in discussions of bandwidth, even though some well-written and ethical robots ultimately serve to conserve it.

There are points to consider on the bandwidth front, since robots can span relatively large portions of Web-space over short periods. Bottlenecks can arise locally through high bandwidth consumption, particularly if the robot is in frequent or permanent use, or if it is used during network peak times. The problem is exacerbated if the frequency of requests for resources is unregulated.

Server-side Concerns

So-called "rapid-fire" requests (successive HTTP requests to a single server without delays) have been shown to be very resource consuming for a server under current HTTP implementations (in fact, this is the basis of several "denial of service" attacks). Here again, an unregulated robot can cause problems. Suitable delays and an ethical traversal algorithm can help resolve this.

The skewing of server logs is another issue that causes concern. A robot that indexes an entire site will distort the logs if it does not supply a recognised "user-agent" header, since its requests can then be hard to distinguish from those of regular users.
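
A common remedy is for the robot to identify itself with a descriptive User-Agent header on every request; a minimal sketch follows (the bot name and contact URL are made-up placeholders):

```python
import urllib.request

# A descriptive User-Agent lets server operators recognise and filter this
# robot's traffic in their logs; the name and URL below are placeholders.
USER_AGENT = "ExampleCrawler/1.0 (+https://example.org/bot-info)"

def fetch_identified(url):
    """Fetch a URL while announcing who the robot is."""
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read()
```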

Unethical Robots

A small number of rogue robots are in use, performing tasks that make them particularly unwelcome to server operators. Such tasks include email culling (harvesting addresses into large lists that can be sold to advertisers) and copyright violation through the copying of entire sites.

Additionally, robots can contribute to a site's "hit quota" and consume bandwidth that the site may pay for.

Q: What are the disadvantages of web crawlers?

Related questions

What is the part of the search engine responsible for collecting data on the web?

Web crawlers are charged with the responsibility of visiting webpages and reporting what they find to the search engines. Google has its own web crawlers (aka robots), which it calls Googlebots. Web crawlers have also been referred to as spiders, although I think that term is now more commonly replaced with "bots".
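
As a hedged illustration of how a site might spot such crawler visits in its logs, the snippet below checks a request's User-Agent string against a few well-known crawler tokens (a small assumed sample, not a complete list):

```python
# Tokens that commonly appear in the User-Agent strings of search-engine
# crawlers; an illustrative sample only.
KNOWN_CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp")  # Slurp is Yahoo's crawler

def is_search_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent string looks like a known search crawler."""
    ua = user_agent.lower()
    return any(token in ua for token in KNOWN_CRAWLER_TOKENS)

# Example:
# is_search_crawler("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
# -> True
```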


What web crawlers use PHP?

PHPCrawl and PHP Parallel Web Scraper; I'm sure there are many others.


What search engine uses web crawlers?

The most well-known are Google, Bing, and Yahoo.


What are crawlers and for what purpose are they used?

A crawler is a computer program that visits web sites and does something with the information it finds there. Many crawlers crawl for search engines, indexing whatever page they visit; such crawlers often return several times per day to check for updates. Another use is to gather information, such as email addresses, for whatever purpose suits the crawler's owner. Crawlers of this kind check all the links on a page and visit them after collecting its information, and in this way never stop but keep crawling over (the public parts of) the Web. A minimal sketch of this link-following loop appears below.
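
The sketch uses only Python's standard library and is an assumed, simplified version of the behaviour described above; a real crawler would add politeness delays, robots.txt handling, and persistent storage:

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collect the href targets of anchor tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_url, max_pages=20):
    """Breadth-first crawl: visit a page, queue its links, repeat."""
    seen, queue, visited = {seed_url}, deque([seed_url]), 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", errors="replace")
        except Exception:
            continue                      # skip pages that fail to load or parse
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            absolute = urljoin(url, link)
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return seen
```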


When was The Crawlers created?

The Crawlers was created in 1954.


What is a Google Sitemap?

Google Sitemaps is an experiment in Web crawling by using Sitemaps to inform and direct Google search crawlers. Webmasters can place a Sitemap-formatted file on their Web server which enables Google crawlers to find out what pages are present and which have recently changed, and to crawl your site accordingly. Google Sitemaps is intended for all web site owners, from those with a single web page to companies with millions of ever-changing pages.
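
To illustrate the mechanism described above, the sketch below fetches a Sitemap-formatted file and pulls out the listed URLs and their last-modified dates (the sitemap location in the example is an assumption; sites may publish it elsewhere or not at all):

```python
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by the Sitemap protocol (sitemaps.org).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def read_sitemap(sitemap_url):
    """Return (url, lastmod) pairs listed in a Sitemap-formatted file."""
    with urllib.request.urlopen(sitemap_url, timeout=10) as resp:
        root = ET.fromstring(resp.read())
    entries = []
    for url_elem in root.findall("sm:url", NS):
        loc = url_elem.findtext("sm:loc", default="", namespaces=NS)
        lastmod = url_elem.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod))
    return entries

# Example (assumed location):
# read_sitemap("https://example.org/sitemap.xml")
```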


When was Creepy Crawlers created?

Creepy Crawlers was created in 1964.


When was The Sky Crawlers created?

The Sky Crawlers was created in 2001.


How does one raise night crawlers?

How do you raise night crawlers


What are disadvantages of web directories?

nuffin'


How does Lycos fetch submitted documents?

Lycos fetches submitted documents by sending out automated web crawlers, also known as spiders, to systematically browse and index content from publicly accessible web pages. These crawlers follow links from one page to another, collecting information to be stored in the search engine's database for retrieval in response to user queries.
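
As a rough sketch of the "store for retrieval" step (not Lycos's actual implementation), the snippet below builds a tiny inverted index mapping words to the URLs of the pages that contain them, which is the basic structure a search engine queries:

```python
from collections import defaultdict

def build_index(pages):
    """pages: dict mapping URL -> plain text of the fetched page.
    Returns an inverted index: word -> set of URLs containing that word."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs whose pages contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```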