Chapter 5 Flashcards
Some web pages are “invisible.” That is, no search engine will return them in a query. Why do these pages exist?
No other web page links to them, they are synthetic (generated on the fly), or they are a file type the crawler doesn't understand.
The main responsibility of a crawler is to
Build a list of tokens that are associated with each page
When picking additional sources you should choose
Independent Sources
Enclosing search terms in quotes asks for pages with
The search terms in the exact order as written
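The difference between an unquoted and a quoted query can be sketched in a few lines of Python. This is only an illustration under simple assumptions (whitespace tokenization, case-insensitive matching); the function names `contains_all_terms` and `contains_exact_phrase` are hypothetical and say nothing about how any real search engine is implemented.

```python
def contains_all_terms(text, query):
    """Unquoted query: every term appears somewhere on the page, in any order."""
    words = text.lower().split()
    return all(term in words for term in query.lower().split())


def contains_exact_phrase(text, phrase):
    """Quoted query: the terms appear next to each other, in the order written."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    # Slide a window of n words across the page and compare it to the phrase.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


page = "the fox is quick and brown"
print(contains_all_terms(page, "quick brown fox"))     # True
print(contains_exact_phrase(page, "quick brown fox"))  # False
```

The sample page contains all three words, so the unquoted query matches, but the words are not in the exact order written, so the quoted query does not.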
Google usually ignores numbers. What symbol could you add to a query to make sure Google uses the number as part of the query?
+ (the plus sign)
When searching on Google, this is the same as using the AND keyword.
& (the ampersand)
A primary source is
A person with direct knowledge
Who is in charge of the World Wide Web?
No one
When should a researcher be skeptical of a primary source?
Always. The researcher should verify the information against other sources.
After finding the web page you want, what is the next question you should ask yourself?
Is the information authoritative?
The higher the _____ is, the closer to the top of the list a web page will be in the returned results of a search query.
PageRank
_____ in Google search queries are interpreted as AND.
Spaces
The main work of the _____ is to build an index.
Crawler
If a web page meets all the authoritative rules given in this chapter, it can still contain _____ information.
False
_____ is the keyboard shortcut for finding certain words on a web page.
Ctrl+F or Command+F
Treating query terms as independent is almost _____ what you actually want.
Always
Wikipedia is validated by
No one
Crawlers crawl less than _____ of the Web.
Half
To find corroborating information on the Web, you should always _____ with another independent web page.
Cross-check (validate)
Wikipedia is not considered a _____ source.
Credible
Why do we need search engines? What do search engines do? Answer both of these questions in great detail.
Web content is not organized, so search engines crawl the Web and organize what they find into an index that can be searched.
Explain what a crawler is and what it does.
A crawler is software, part of a search engine, that crawls web pages. It keeps a to-do list of pages to visit, visits every web page it can find, and builds an index (a list of tokens associated with each page).
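The to-do list and the index described above can be sketched in Python. This is a minimal sketch, not a real crawler: it assumes a tiny in-memory "web" (the hypothetical `TOY_WEB` dictionary, with made-up page names) instead of fetching pages over the network, and it treats whitespace-separated words as tokens.

```python
from collections import deque

# Hypothetical in-memory "web": page name -> (page text, links to other pages).
TOY_WEB = {
    "a.example": ("cats and dogs", ["b.example"]),
    "b.example": ("dogs chase cats", ["a.example", "c.example"]),
    "c.example": ("birds sing", []),
    "d.example": ("nobody links here", []),  # no page links to this one
}


def crawl(start_url, web):
    """Visit every page reachable from start_url and build an index."""
    todo = deque([start_url])  # the crawler's to-do list
    seen = {start_url}         # pages already placed on the to-do list
    index = {}                 # token -> set of pages containing that token
    while todo:
        url = todo.popleft()
        text, links = web[url]
        # Associate each token on the page with the page.
        for token in text.lower().split():
            index.setdefault(token, set()).add(url)
        # Add newly discovered links to the to-do list.
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                todo.append(link)
    return index


index = crawl("a.example", TOY_WEB)
print(sorted(index["cats"]))  # ['a.example', 'b.example']
print("nobody" in index)      # False
```

Because no page links to `d.example`, it never reaches the to-do list and its words never enter the index, which is one way a page ends up "invisible" to a search engine.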
Explain why in the past physical books were trusted much more than web pages are trusted today.
Physical books went through a detailed publishing process involving many experts, whereas anyone can post anything on a web page.
Explain the differences between the AND and OR logical operators and when you should use them in web searches.
With AND, all of the terms must be on the page; with OR, at least one of the terms must be on the page. Use them whenever you can.
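One way to picture the two operators is as set operations over an index: AND keeps only the pages that appear under every term, while OR keeps the pages that appear under any term. The sketch below assumes a hypothetical index (`INDEX`, with made-up page names) and hypothetical helper functions `search_and` and `search_or`; it is an illustration, not a description of how Google evaluates queries.

```python
# Hypothetical index: token -> set of pages containing that token.
INDEX = {
    "red": {"page1", "page2"},
    "fish": {"page2", "page3"},
}


def search_and(index, terms):
    """AND: pages that contain every term (set intersection)."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, terms):
    """OR: pages that contain at least one term (set union)."""
    return set().union(*(index.get(term, set()) for term in terms))


print(sorted(search_and(INDEX, ["red", "fish"])))  # ['page2']
print(sorted(search_or(INDEX, ["red", "fish"])))   # ['page1', 'page2', 'page3']
```

AND narrows the results and OR widens them, so AND suits cases where every term matters and OR suits alternatives such as synonyms.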