Information Retrieval System - A Domain Specific Parallel Crawler

English · Paperback / Softback

Description

The World Wide Web is an interlinked collection of billions of documents formatted using HTML. Because the web is growing and dynamic, traversing and handling all the URLs found in web documents has become a challenge, so parallelizing the crawling process has become imperative. The crawling process is parallelized in the form of an ecology of crawler workers that download information from the web in parallel. This paper proposes a novel parallel crawler architecture based on domain-specific crawling. It makes the crawling task more effective and scalable, and it shares the load among the different crawlers, which download in parallel the web pages related to URLs of different specific domains.
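
The idea of domain-specific parallel crawling can be illustrated with a minimal sketch. This is not the architecture proposed in the book. It assumes that "domain" means the last label of a URL's host name (for example "edu" or "org"), and the names domain_of, partition_by_domain, crawl_domain, parallel_crawl and fake_fetch are hypothetical. The download step is a stand-in function, so the sketch runs without network access.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse


def domain_of(url):
    """Return the last label of the host name (e.g. 'edu'); an assumed notion of 'domain'."""
    host = urlparse(url).hostname or ""
    return host.rsplit(".", 1)[-1]


def partition_by_domain(urls):
    """Group seed URLs so that each crawler worker receives a single domain."""
    groups = defaultdict(list)
    for url in urls:
        groups[domain_of(url)].append(url)
    return dict(groups)


def crawl_domain(domain, urls, fetch):
    """One crawler worker: download every URL assigned to its domain."""
    return domain, [fetch(url) for url in urls]


def parallel_crawl(urls, fetch, max_workers=4):
    """Submit one worker task per domain and collect the pages per domain."""
    groups = partition_by_domain(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(crawl_domain, d, u, fetch) for d, u in groups.items()]
        return dict(f.result() for f in futures)


def fake_fetch(url):
    """Hypothetical stand-in for an HTTP download so the sketch runs offline."""
    return f"<html>{url}</html>"


seeds = ["http://example.edu/a", "http://example.org/b", "http://example.edu/c"]
pages = parallel_crawl(seeds, fake_fetch)
```

Partitioning the URLs before any download starts is what shares the load: each worker handles only the URLs of its own domain, so no two workers fetch the same page.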

Product details

Authors Nidhi Tyagi
Publisher VDM Verlag Dr. Müller
 
Languages English
Product format Paperback / Softback
Released 25.08.2011
 
EAN 9783639377798
ISBN 978-3-639-37779-8
No. of pages 92
Subject Natural sciences, medicine, IT, technology > IT, data processing > Internet
