The Skramstad Slurp Bot
The Skramstad Slurp Bot is a robot crawler that indexes the web to build a document repository
for its search engine. It discovers web documents by following links.
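
As a rough illustration of link discovery, the sketch below extracts the links from one HTML page using only the Python standard library. It is not the Slurp Bot's actual code; the names LinkExtractor and extract_links and the example.com URLs are assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags as absolute URLs (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags are considered; other link-bearing tags are ignored here.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the URL of the page.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the absolute URLs linked from the given HTML text."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    page = '<a href="page.html">Page</a> <a href="/about">About</a>'
    for link in extract_links("https://www.example.com/docs/", page):
        print(link)
```

A crawler would typically queue each discovered URL for a later fetch, skipping URLs it has already visited.
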
The robot crawler obeys robots.txt directives (e.g. Disallow: /private-directory). robots.txt files are placed
in the root of a website, for example https://www.cia.gov/robots.txt.
The syntax of the Robots Exclusion Standard is documented
at www.robotstxt.org/robotstxt.html.
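
The effect of a Disallow directive can be checked with Python's standard urllib.robotparser module. The following is a minimal sketch, not the crawler's own implementation. The user-agent token "ExampleBot", the example.com URLs, and the helper name is_allowed are assumptions for illustration; the token the Slurp Bot actually sends is not stated here.

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt using the directive mentioned above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private-directory
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file contents as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    agent = "ExampleBot"  # hypothetical user-agent token
    for url in (
        "https://www.example.com/private-directory/page.html",
        "https://www.example.com/index.html",
    ):
        print(url, is_allowed(ROBOTS_TXT, agent, url))
```

With these rules, the page under /private-directory is reported as not allowed, while the index page is allowed.
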