Unit 7 What Is Involved In Creating A Web Crawler

Discussion Assignment Unit 7
What is involved in creating a web crawler? What are the differences between static and dynamic web content?

Introduction:
When we search for a file or document, a search engine must tell us where it can be found. To find information on the hundreds of millions of Web pages that exist on the Internet, a search engine uses special software robots, called spiders, to build lists of the words found on Web sites. The process of gathering Web pages in order to index them and support a search engine is called Web crawling (Manning, Raghavan & Schütze, 2009).
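To make the idea of "lists of the words found on Web sites" concrete, here is a minimal sketch in Python. It assumes the page text has already been downloaded; the function name build_index and the example.com URLs are made up for illustration and are not taken from the cited text.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        # Lowercase the text and split it into simple alphanumeric words.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


# Hypothetical pages, standing in for text a spider has already fetched.
pages = {
    "http://example.com/a": "Web crawlers fetch pages",
    "http://example.com/b": "Search engines index pages",
}

index = build_index(pages)
print(sorted(index["pages"]))  # URLs of every page containing the word "pages"
```

A real search engine stores much more than this (word positions, rankings, and so on), but the basic lookup from a word to the pages that contain it is the same idea.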

The following operations are involved in creating a web crawler:
• A crawler picks a URL from one or more URLs that constitute a seed set.

“A duplicate elimination module that determines whether an extracted link is already in the URL frontier or has recently been fetched” (Manning, Raghavan & Schütze, 2009).
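The seed set, the URL frontier, and duplicate elimination can be sketched together in a few lines of Python. This is only an illustration under simplifying assumptions: the function crawl, the get_links callback, and the tiny LINKS graph are hypothetical, nothing is downloaded over the network, and the "seen" set never forgets a URL, whereas a real crawler may only remember recently fetched ones.

```python
from collections import deque


def crawl(seed_urls, get_links, max_pages=100):
    """Breadth-first crawl; returns URLs in the order they were fetched."""
    frontier = deque()  # URLs waiting to be fetched (the URL frontier)
    seen = set()        # every URL ever placed in the frontier
    for url in seed_urls:
        if url not in seen:
            seen.add(url)
            frontier.append(url)

    fetched = []
    while frontier and len(fetched) < max_pages:
        url = frontier.popleft()
        fetched.append(url)
        for link in get_links(url):
            # Duplicate elimination: skip links already queued or fetched.
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return fetched


# Hypothetical link graph standing in for real pages and their links.
LINKS = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["d"],
    "d": [],
}

order = crawl(["a"], lambda url: LINKS.get(url, []))
print(order)  # ['a', 'b', 'c', 'd']
```

Even though page "b" links back to "a" and both "a" and "b" link to "c", each page is fetched only once, which is the job of the duplicate elimination step.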

The differences between static and dynamic web content
Dynamic Web sites, as opposed to the static ones on which the Web was first built, are easier to maintain, are more responsive to users, and can alter their content in response to differing situations. A dynamic page does not exist on the web server as a stored file. Instead, a program residing on the web server creates and formats the page when it is requested, often using input from the user. The page created by the program is then sent to the user’s browser. A copy of the page is not (usually) kept on the web server.
A static web page does not change in relation to user requests or input. The page is created by a web developer and resides on the web server. When the user requests the page via a browser, a copy of the page is sent to the browser for display.
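The contrast can be illustrated with a small Python sketch. It is not a real web server: STATIC_FILES, serve_static, and serve_dynamic are hypothetical names, and the "server" is just a dictionary and two functions, chosen to show where the page content comes from in each case.

```python
from html import escape

# Static content: pages written ahead of time and stored on the server.
STATIC_FILES = {
    "/about.html": "<html><body>About us</body></html>",
}


def serve_static(path):
    """Return the stored page unchanged; every visitor gets the same copy."""
    return STATIC_FILES[path]


def serve_dynamic(query):
    """Build a page at request time from the user's input; nothing is stored."""
    term = escape(query.get("q", ""))  # escape user input before placing it in HTML
    return f"<html><body>Results for {term}</body></html>"


print(serve_static("/about.html"))
print(serve_dynamic({"q": "crawler"}))
print(serve_dynamic({"q": "spider"}))
```

The static page is identical on every request, while the two dynamic requests produce different pages because the program builds each one from the input it receives.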

References:
1. Manning, C. D., Raghavan, P., & Schütze, H. (2009). An Introduction to Information Retrieval (Online ed.). Cambridge, England: Cambridge University Press. Available at
