Implementation of a Hidden Web crawler

dc.contributor.author: MAHAMEDI, Soundous
dc.contributor.author: Supervisor: SAOUDI, Lalia
dc.date.accessioned: 2023-05-24T08:26:03Z
dc.date.available: 2023-05-24T08:26:03Z
dc.date.issued: 2015-06-10
dc.description.abstract: Current-day crawlers retrieve content only from the publicly indexable Web, i.e., the set of web pages reachable purely by following hypertext links, ignoring search forms and pages that require authorization or prior registration. In particular, they ignore the tremendous amount of high-quality content "hidden" behind search forms in large searchable electronic databases. In this work, we provide a framework for extracting content from this hidden Web. To that end, we have built a task-specific hidden Web crawler called the Intelligent Hidden Web Crawler (IHiWC). We describe the architecture of IHiWC and present a number of new techniques that went into its design, approach, and implementation. We also present results from experiments we conducted to test and validate our techniques. [en_US]
dc.identifier.uri: http://dspace.univ-msila.dz:8080//xmlui/handle/123456789/38681
dc.language.iso: en [en_US]
dc.publisher: University of M'sila [en_US]
dc.subject: Deep crawler, Hidden Web Crawling, forms classification, forms submission [en_US]
dc.title: Implementation of a Hidden Web crawler [en_US]
dc.type: Thesis [en_US]

Files

Original bundle
Name: MAHAMEDI Soundous.PDF
Size: 9.25 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
