ITEC50 Flashcards
a program that searches documents for specified keywords and returns a list of the documents where the keywords were found
search engine
sending out a spider to fetch as many documents as possible
search engine
an indexer that reads documents and creates an index based on the words contained in each document
search engine
uses a proprietary algorithm to create its indices so that, ideally, only meaningful results are returned for each query
search engine
How do web search engines work?
- finding specific information on the vast expanse of the World Wide Web.
- without search engines, it would be virtually impossible to locate anything on the Web without knowing a specific URL.
Algorithm of search engine:
- to determine the relevance of the information in the index
- frequency and location of keywords on a Web page
- the way that pages link to other pages on the Web
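A minimal sketch of the keyword part of this card, assuming a simple scoring rule: count how often each query word appears (frequency) and weight hits in the title more than hits in the body (location). The function name `relevance`, the `title_weight` value, and the sample pages are hypothetical. Real engines use proprietary algorithms, and link analysis is left out here.

```python
import re

def relevance(query, title, body, title_weight=3.0):
    """Score a page for a query: count keyword hits, weighting title hits more."""
    words = query.lower().split()
    title_tokens = re.findall(r"[a-z0-9]+", title.lower())
    body_tokens = re.findall(r"[a-z0-9]+", body.lower())
    score = 0.0
    for w in words:
        score += title_weight * title_tokens.count(w)  # location: title hits count more
        score += body_tokens.count(w)                  # frequency: hits in the body
    return score

# Hypothetical pages: url -> (title, body)
pages = {
    "a.html": ("Search engines", "A search engine uses a spider and an index."),
    "b.html": ("Cooking tips", "Search for recipes by ingredient."),
}

# Rank pages from most to least relevant for the query "search".
ranked = sorted(pages, key=lambda url: relevance("search", *pages[url]), reverse=True)
```

Here `a.html` ranks first because the keyword appears in both its title and its body.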
A search engine consists of three components:
- spider
- index
- search engine mechanism
program that traverses the Web from link to link, identifying and reading pages
spider
database containing a copy of each Web page gathered by the spider
index
software that enables users to query the index and that usually returns results in relevancy ranked order
search engine mechanism
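The index and the search engine mechanism can be sketched as follows. This is an illustration under assumptions, not how any particular engine works: the index here maps each word to the documents containing it (a word-based index, as an indexer would build), and the mechanism returns documents that contain every query word. The names `build_index` and `search` and the sample documents are hypothetical.

```python
from collections import defaultdict

def build_index(docs):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def search(index, query):
    """Return the documents that contain every query word, sorted by name."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

# Hypothetical documents gathered by a spider.
docs = {
    "d1": "spider reads pages",
    "d2": "index stores pages",
    "d3": "spider builds index",
}
index = build_index(docs)
results = search(index, "spider index")
```

A query runs against the index, not against the live Web, which is why results can lag behind the pages themselves.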
backbone or procedure of technology
algorithm
Three types of search engines:
- powered by robots (called crawlers, ants, or spiders)
- powered by human submissions
- hybrid of the two
automated software agents
crawlers
use automated software agents (called crawlers) that visit a Web site
crawler-based search engines
read the information on the actual site
crawler-based search engines
read the site’s meta tags
crawler-based search engines
follow the links that the site connects to and perform indexing on all linked Web sites as well
crawler-based search engines
return all that information to a central depository
crawler-based search engines
check for any information that has changed
crawler-based search engines
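The link-following behavior on these cards can be sketched with a breadth-first traversal. To keep the example runnable without network access, it assumes a hypothetical in-memory "Web" (`WEB`) in place of real HTTP requests. The function name `crawl` is also an assumption.

```python
from collections import deque

# Hypothetical in-memory "Web": page -> list of pages it links to.
WEB = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "story"],
    "story": [],
}

def crawl(start, web):
    """Visit pages breadth-first by following links; return the visit order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)        # a real crawler would read and index the page here
        for link in web.get(page, []):
            if link not in seen:  # skip pages already queued or visited
                seen.add(link)
                queue.append(link)
    return order

visited = crawl("home", WEB)
```

The `seen` set is what keeps the crawler from revisiting pages that link back to each other.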
rely on humans to submit information that is subsequently indexed and cataloged
human-powered search engines
only information that is submitted is put into the index
human-powered search engines
Query on Search Engine
- searching through the index that the search engine has created
- giant databases of information that is collected and stored and subsequently searched.
- may return dead links when the index hasn’t been updated since a Web page became invalid
it enables you to search through various Internet databases
alenka
provides ad-free results
alenka
Some of the Internet Search Engines you can use to search the Web
- yahoo
- baidu
- bing
began in January 1996 as a research project by Larry Page and Sergey Brin
google
99% of its revenue is derived from its advertising programs
google
its web search engine is the company’s most popular service
google
founded by Jerry Yang and David Filo; grew rapidly throughout the 1990s
yahoo
examples of Yahoo! services
Yahoo! Messenger and Yahoo! Mail
it offers services for on-the-go messaging
Yahoo! Mobile
Chinese search engine for websites, audio files, and images
baidu
index of over 740M web pages, 80M images, and 10M multimedia files
baidu
became the first Chinese company to be included in the NASDAQ-100 index
baidu
formerly Live Search, Windows Live Search, and MSN Search
bing
Notable changes include the listing of search suggestions as queries are entered
bing
list of related searches
explorer pane
top sites in the Philippines
- Google.com.ph
- Youtube.com
- Google.com
- Abs-cbn.com
- Facebook.com
first tool for searching the Internet, created in 1990
Archie
downloaded directory listings of all files located on public anonymous FTP servers
Archie
it indexed plain text documents
Gopher
they came along to search Gopher’s index system
Veronica and Jughead
first actual Web search engine was developed by Matthew Gray in 1993 and was called
Wandex
Key Terms To Understanding Web Search Engines
- spider trap
- meta tag
- deep link
- robot
a condition of dynamic Web sites in which a search engine’s spider becomes trapped in an endless loop of code
spider trap
group of pages that have links only to the pages within the group
spider trap
simplest form of spider trap with only one page
E
has two incoming links and only one link going out to itself
E
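The group definition of a spider trap can be checked on a small link graph. The sketch below assumes a hypothetical graph in which pages C and D both link to E, and E links only to itself, matching the "two incoming links and only one link going out to itself" card. The function name `is_spider_trap` and the page names C and D are assumptions.

```python
# Hypothetical link graph: page -> list of pages it links to.
LINKS = {
    "C": ["D", "E"],
    "D": ["C", "E"],
    "E": ["E"],  # E's only outgoing link points back to itself
}

def is_spider_trap(group, links):
    """True if every page in the group has links, and all of them stay in the group."""
    group = set(group)
    if not group:
        return False
    return all(bool(links.get(p)) and set(links[p]) <= group for p in group)

single_page_trap = is_spider_trap({"E"}, LINKS)
not_a_trap = is_spider_trap({"C", "D"}, LINKS)
```

`{"E"}` is a trap because a spider that reaches E can never leave, while `{"C", "D"}` is not because both pages also link out to E.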
a special HTML tag that provides information about a Web page
meta tag
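A short sketch of how a program might read meta tags, using Python's standard `html.parser` module. The class name `MetaTagReader`, the helper `read_meta`, and the sample HTML are hypothetical. Only `<meta>` tags with both `name` and `content` attributes are collected.

```python
from html.parser import HTMLParser

class MetaTagReader(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name, content = attrs.get("name"), attrs.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content

def read_meta(html):
    """Return a dict of meta tag names to their content values."""
    reader = MetaTagReader()
    reader.feed(html)
    return reader.meta

# Hypothetical page source.
PAGE = (
    '<html><head><title>Notes</title>'
    '<meta name="description" content="Notes on search engines">'
    '<meta name="keywords" content="spider, index">'
    '</head><body></body></html>'
)
meta = read_meta(PAGE)
```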
a hyperlink either on a Web page or in the results of a search engine query to a page on a Web site other than the site’s home page
deep link
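One rough way to tell a deep link from a home page link is to look at the URL's path. The sketch below assumes a home page is a URL with an empty path or `/` and no query string. That is a simplification, since some sites serve their home page at paths such as `/index.html`. The function name `is_deep_link` and the `example.com` URLs are hypothetical.

```python
from urllib.parse import urlparse

def is_deep_link(url):
    """True if the URL points past the site's root (assumed to be the home page)."""
    parts = urlparse(url)
    return parts.path not in ("", "/") or bool(parts.query)

home = is_deep_link("https://example.com/")
deep = is_deep_link("https://example.com/articles/spiders.html")
```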
a program that runs automatically without human intervention
robot
collections of reviewed and recommended links that have been created by subject specialists, usually librarians, to support research needs and to pinpoint high-quality sites on the web
library gateways
WHEN DO YOU USE LIBRARY GATEWAYS AND SUBJECT-SPECIFIC DATABASES?
- library gateways - use them when you are looking for high-quality information sites on the Web
- subject-specific databases - use them when you are looking for information on a specific topic; thousands of databases are devoted to specific topics of interest
Examples of Library Gateways
- Academic Information
- Digital Librarian
- Infomine
- Internet Public Library
- Librarians’ Index to the Internet
- PINAKES
- WWW Virtual Library
Examples of Subject-specific Databases
- Educator’s Reference Desk (educational information)
- Expedia (travel)
- Internet Movie Database (movies)
- Jumbo Software (computer software)
- Kelley Blue Book (car values)
- Monster Board (jobs)
- Motley Fool (personal investment)
- MySimon (comparison shopping)
- Roller Coaster Database (roller coasters)
- Voice of the Shuttle (humanities research)
- WebMD (health information)
how closely the page relates to the search
Keyword relevance
engagement metrics of the link
Page authority
performance
User experience
how up to date the page is
freshness
pre-displayed information
snippets/meta description
increasing a site's visibility in search engines
search engine optimization
sends a query to multiple other search engines
meta-search
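A meta-search tool can be sketched as a function that sends one query to several engines and merges the results. The two engine functions below are hypothetical stand-ins that return fixed lists, since no real search API is assumed. The name `meta_search` is also hypothetical.

```python
def engine_a(query):
    """Hypothetical engine: returns a fixed result list for illustration."""
    return ["site1.example/" + query, "site2.example/" + query]

def engine_b(query):
    """Hypothetical engine whose results overlap with engine_a."""
    return ["site2.example/" + query, "site3.example/" + query]

def meta_search(query, engines):
    """Query every engine and merge results, dropping duplicates in order."""
    merged, seen = [], set()
    for engine in engines:
        for url in engine(query):
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

combined = meta_search("spider", [engine_a, engine_b])
```

The duplicate from the second engine is dropped, so each result appears once in the merged list.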
boost keyword rankings
meta tags
following one page to another
spider/crawling
Storing web pages on a database
index/indexing
Ranking by popularity and relevance
search engine mechanism
reads documents and creates an index based on the words contained in each document
indexer