Chapter 5 Flashcards

1
Q

Some web pages are “invisible.” That is, no search engine will return them in a query. Why do these pages exist?

A

No other web page links to them, they are synthetic (generated on demand), or they are in a file format the crawler does not recognize.

2
Q

The main responsibility of a crawler is to

A

Build a list of tokens that are associated with each page

3
Q

When picking additional sources you should choose

A

Independent Sources

4
Q

Enclosing search terms in quotes asks for pages with

A

The search terms in the exact order as written

5
Q

Google usually ignores numbers. What symbol could you add to a query to make sure Google uses the number as part of the query?

A

+ the plus sign

6
Q

When searching on Google, this is the same as using the AND keyword.

A

& the and symbol

7
Q

A primary source is

A

A person with direct knowledge

8
Q

Who is in charge of the World Wide Web?

A

NO ONE

9
Q

When should a researcher be skeptical of a primary source?

A

Always, the researcher should verify the information from other sources

10
Q

After finding the web page you want, what is the next question you should ask yourself?

A

Is the information authoritative?

11
Q

The higher the ______ is, the closer to the top of the list a web page will be in the returned results of a search query.

A

PageRank

12
Q

____ in Google search queries are interpreted as AND.

A

Spaces

13
Q

The main work of the _____ is to build an index.

A

Crawler

14
Q

If a web page meets all the authoritative rules given in this chapter, it can still contain ______ information.

A

False

15
Q

_____ is the keyboard shortcut for finding certain words on a web page.

A

Ctrl+F or Command+F

16
Q

Treating query terms as independent is almost _____ what you actually want.

A

Always

17
Q

Wikipedia is validated by

A

NO ONE

18
Q

Crawlers crawl less than ____ of the Web

A

Half

19
Q

To find corroborating information on the Web, you should always _____ with another independent web page.

A

Cross-check (validate)

20
Q

Wikipedia is not considered a _____ source.

A

Credible

21
Q

Why do we need search engines? What do search engines do? Answer both of these questions in great detail.

A

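Web content is not organized, so there is no built-in way to find a page on a given topic. Search engines look around the Web and organize what they find into an index that can be searched.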

22
Q

Explain what a crawler is and what it does.

A

A crawler is software, part of a search engine, that crawls web pages. It keeps a to-do list, visits every web page it can find, and builds an index (a list of tokens associated with each page).
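
The to-do list and index described in this answer can be illustrated with a minimal sketch. It assumes a tiny in-memory "web" instead of real HTTP requests; the `TOY_WEB` data, the example URLs, and the `crawl` function are hypothetical names made up for illustration, not part of any real search engine.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/home": ("welcome to the home page", ["a.example/about", "b.example/news"]),
    "a.example/about": ("about the home team", ["a.example/home"]),
    "b.example/news": ("news about the web", []),
    "c.example/hidden": ("no page links here", []),  # nothing links to this page
}

def crawl(seed_url, web):
    """Visit every page reachable from seed_url and build a token index."""
    to_do = deque([seed_url])  # the crawler's to-do list
    seen = {seed_url}          # URLs already placed on the to-do list
    index = {}                 # token -> set of URLs whose text contains it
    while to_do:
        url = to_do.popleft()
        text, links = web[url]
        # Associate each token on the page with the page's URL.
        for token in text.split():
            index.setdefault(token, set()).add(url)
        # Add newly discovered links to the to-do list.
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                to_do.append(link)
    return index

index = crawl("a.example/home", TOY_WEB)
print(sorted(index["home"]))  # ['a.example/about', 'a.example/home']
print("links" in index)       # False: the unlinked page was never visited
```

Because no page links to `c.example/hidden`, the sketch never visits it, which mirrors one reason a page can be invisible to search engines.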

23
Q

Explain why in the past physical books were trusted much more than web pages are trusted today.

A

Physical books went through a detailed process involving many experts, whereas anyone can post anything on a web page.

24
Q

Explain the differences between the AND and OR logical operators and when you should use them in web searches.

A

With AND, all of the terms must be on the page; with OR, at least one of the terms must be on the page. Use them whenever you can.
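
One way to picture the difference is as set operations over an index: AND keeps only pages found under every term, OR keeps pages found under any term. The sketch below assumes a hypothetical `INDEX` mapping tokens to page names; the data and the function names `search_and` and `search_or` are invented for illustration.

```python
# Hypothetical index: token -> set of pages that contain the token.
INDEX = {
    "red": {"p1", "p2", "p4"},
    "giant": {"p2", "p3", "p4"},
    "star": {"p2", "p5"},
}

def search_and(index, terms):
    """Pages that contain every term (set intersection)."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()

def search_or(index, terms):
    """Pages that contain at least one term (set union)."""
    sets = [index.get(term, set()) for term in terms]
    return set().union(*sets)

print(sorted(search_and(INDEX, ["red", "giant", "star"])))  # ['p2']
print(sorted(search_or(INDEX, ["red", "giant", "star"])))   # ['p1', 'p2', 'p3', 'p4', 'p5']
```

AND narrows the results as terms are added, while OR widens them.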

25
Q

What is an independent source, and why is it important to use independent sources when researching?

A

An independent source is one that has no direct link to the original publisher or author. It offers a new perspective on the same content and cannot be subject to plagiarism.

26
Q

Give three examples of when you would want to use quotes around your search query.

A

Book titles, quotations, music lyrics

27
Q

What is a cached page and how is it useful?

A

A copy of a web page that is temporarily saved and stored by a web server as a backup. It is useful when the original page cannot be opened because the website is down, and because it is stored on powerful web servers, the data can be retrieved quickly.

28
Q

Provide at least two concrete examples of when you would want to limit the domain of your web search.

A

When you already know which site has the information, and when you know that the organization is an educational institution (so you can limit results to .edu).

29
Q

What are the pros and cons of using Wikipedia to find information?

A

Pros: covers a large number of topics, so you can find just about anything; articles are added quite quickly after something happens or something new comes out.
Cons: not necessarily accurate, authors aren’t necessarily experts, no one validates it.

30
Q

Use the search query “HTML quick reference” in your preferred search engine. Then describe each part of the first result found including page title, snippet, URL, cached page, and site links.

A

Google: the page title is quick HTML reference, the snippet is a tag list with explanations covering HTML version 3.2, the URL is www.htmlgoodies.com/…/refernce/article…/quick-HTML-reference…, and one of the site links is top.

31
Q

What are the rules for Intersecting Alphabetized Lists? How are they used to implement search queries?

A

Rules: (1) put a marker at the start of each token’s index list; (2) if all markers point to the same URL, save it; (3) move the marker(s) to the next position for whichever URL is earliest in the alphabet; (4) repeat steps 2 and 3 until a marker reaches the end of its list.
Use: to answer a multi-word query, the search engine locates the index list for each term and intersects the lists, noting the URLs that appear on all of them.
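
The marker rules in this answer can be sketched in a few lines. This is a minimal illustration, assuming each token's list is already sorted alphabetically; the function name `intersect_alphabetized` and the example URLs are hypothetical.

```python
def intersect_alphabetized(lists):
    """Return the URLs that appear in every alphabetized list."""
    if not lists:
        return []
    markers = [0] * len(lists)  # rule 1: one marker at the start of each list
    hits = []
    # rule 4: keep going until some marker runs off the end of its list
    while all(m < len(lst) for m, lst in zip(markers, lists)):
        current = [lst[m] for m, lst in zip(markers, lists)]
        # rule 2: if all markers point to the same URL, save it
        if all(url == current[0] for url in current):
            hits.append(current[0])
        # rule 3: advance every marker that points to the earliest URL
        earliest = min(current)
        markers = [m + 1 if url == earliest else m
                   for m, url in zip(markers, current)]
    return hits

red = ["a.example", "c.example", "d.example", "f.example"]
giant = ["b.example", "c.example", "f.example"]
star = ["c.example", "e.example", "f.example"]
print(intersect_alphabetized([red, giant, star]))  # ['c.example', 'f.example']
```

Because the lists are alphabetized, each marker only ever moves forward, so every list is scanned at most once.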