Option C: Web Science Flashcards
Internet
An interconnected set of networks and computers that permits the transfer of data governed by protocols like TCP/IP.
Acts as the physical medium for services such as the World Wide Web.
WWW (World Wide Web)
A set of hypertext-linked resources, identified by URIs, that are transferred between a client and a server via the Internet.
Provides a mechanism to share information.
HTTP (Hypertext Transfer Protocol)
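The application-layer protocol used to transfer and exchange hypermedia between a client (such as a browser) and a web server.
It works on a request-response basis: the client sends a request over the network and the server returns a response.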
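To make the idea of a transfer protocol concrete, here is a minimal sketch that builds the text of an HTTP/1.1 GET request and splits a response status line into its parts. The host `example.com`, the path, and the helper names `build_get_request` and `parse_status_line` are illustrative assumptions; no network connection is made.

```python
def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, target, version
        f"Host: {host}",         # the Host header is required in HTTP/1.1
        "Connection: close",
        "",                      # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 200 OK' into three parts."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


request = build_get_request("example.com", "/index.html")
status = parse_status_line("HTTP/1.1 404 Not Found")
```

The request is plain text with one header per line, which is part of why HTTP is easy to inspect and debug.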
HTTPS (Hypertext Transfer Protocol Secure)
A protocol for secure communication over a computer network.
Consists of HTTP communication within a connection encrypted by TLS (formerly SSL). Digital certificates authenticate the website, and encryption protects the integrity and confidentiality of the communication.
HTML (Hypertext Markup Language)
a semantic markup language that is the standard language used for web pages
URL (Uniform Resource Locator)
a reference to a web resource that specifies its location on a computer network and a mechanism for retrieving it
A URL is a specific type of URI
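As a small illustration of how a URL encodes both a retrieval mechanism and a location, the sketch below splits a made-up URL with Python's standard `urllib.parse` module. The URL itself is an assumption chosen for the example.

```python
from urllib.parse import urlsplit

# A made-up URL used only for illustration.
url = "https://www.example.com:8080/docs/index.html?lang=en#intro"
parts = urlsplit(url)

scheme = parts.scheme      # mechanism for retrieving the resource
host = parts.hostname      # where the resource lives on the network
port = parts.port          # port number on that host
path = parts.path          # location of the resource on the host
query = parts.query        # extra parameters sent to the server
fragment = parts.fragment  # position inside the resource, used by the browser
```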
XML (Extensible Markup Language)
A way of writing data in a tree-structured form by enclosing it in tags. It is human readable AND machine readable and it is used for representation of arbitrary data structures.
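The tree structure can be seen by parsing a small document. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `library` and `book` element names and their contents are invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented sample data: a root element with two child elements.
document = """
<library>
  <book id="1">
    <title>Web Basics</title>
    <year>2020</year>
  </book>
  <book id="2">
    <title>Network Notes</title>
    <year>2021</year>
  </book>
</library>
""".strip()

root = ET.fromstring(document)

# Walk the tree: each <book> is a child node of <library>.
titles = [book.findtext("title") for book in root.findall("book")]
years = {book.get("id"): int(book.findtext("year")) for book in root.findall("book")}
```

The same text is readable by a person and can be navigated by a program as a tree of nodes.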
XSLT (Extensible Stylesheet Language Transformations)
A transformation language for XML. It is used for transforming XML documents into other XML documents or into other formats such as HTML for web pages, plain text, or XSL Formatting Objects, which may subsequently be converted to formats such as PDF, PostScript and PNG.
JavaScript
An object-oriented computer programming language commonly used to create interactive effects within web browsers.
CSS (Cascading Style Sheets)
Contains hierarchical (cascading) rules that describe how the content of a web page will be rendered in a browser.
URI (Uniform Resource Identifier)
Identifies a resource on the Internet by name, by location, or both. More general than a URL: every URL is a URI, but a URI does not have to say how to access the resource.
Describe how a domain name server functions
(steps)
- The user types the domain name into the address bar of the web browser and presses “Enter”.
- The domain name is sent to a Domain Name System (DNS) server.
- The DNS server that the user’s system is configured to use (the primary DNS) checks its own database to see if the domain name is there.
- If it isn’t, the name is passed on to the next DNS server in the hierarchy.
- This continues until the domain name is found or the top-level / authoritative DNS server is reached.
- When the IP address is found, it is sent back to the original DNS server, which returns it to the user’s system.
- If the IP address is not found, an error message is returned.
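The lookup steps above can be sketched as a walk through a list of name tables. This is a simplified simulation, not real DNS: the three tables, the domain names, and the IP addresses (taken from address ranges reserved for documentation) are all assumptions made for the example.

```python
# Hypothetical name tables; each simulated server knows only some names.
PRIMARY = {"intranet.example": "192.0.2.10"}
NEXT_LEVEL = {"www.example.org": "198.51.100.7"}
AUTHORITATIVE = {"shop.example.com": "203.0.113.25"}

# Order in which the simulated servers are asked.
HIERARCHY = [PRIMARY, NEXT_LEVEL, AUTHORITATIVE]


def resolve(domain):
    """Return (ip_address, level) for a domain name.

    level is the position in HIERARCHY of the server that knew the name.
    Raises LookupError if no server knows it, mirroring the error step.
    """
    for level, server in enumerate(HIERARCHY):
        if domain in server:
            return server[domain], level
    raise LookupError(f"no DNS record found for {domain}")


local_answer = resolve("intranet.example")
remote_answer = resolve("shop.example.com")
```

A name known to the primary server is answered immediately, while other names are passed further up the hierarchy before an answer comes back.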
IP (Internet Protocol)
Part of the TCP/IP protocol suite and the main delivery system for information over the Internet. IP also defines the format of a packet.
TCP (Transmission Control Protocol)
A data transport protocol that includes mechanisms for reliably transmitting packets to a destination.
FTP (File Transfer Protocol)
A TCP-based network protocol for passing files from host to host. Files can also be manipulated or modified remotely. Control information (such as log-ins) is sent on a separate connection from the file data, which distinguishes FTP from HTTP.
<head>
Not visible on the page, but contains important information about it in the form of metadata.
<title>
Inside the head; displayed in the browser tab of the web page.
<meta> tags
There are various types of meta tags. They give search engines information about the page, but are also used for other purposes, such as specifying the character set (charset) used.
<body>
The main part of the page document. This is where all the visible content goes.
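To show how the head, title, meta and body elements fit together, the sketch below parses a tiny page with Python's standard `html.parser` and separates the metadata from the visible content. The sample page and the class name `PageSummary` are assumptions made for the example.

```python
from html.parser import HTMLParser

# A tiny invented page with the four elements described above.
PAGE = """<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta name="description" content="Revision notes">
    <title>Web Science</title>
  </head>
  <body>
    <h1>Flashcards</h1>
    <p>Visible content goes here.</p>
  </body>
</html>"""


class PageSummary(HTMLParser):
    """Collect the title, the meta tags and the visible body text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = []
        self.body_text = []
        self._in_title = False
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            self.meta.append(dict(attrs))  # metadata, never displayed
        elif tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title = text              # shown in the browser tab
        elif self._in_body:
            self.body_text.append(text)    # shown on the page itself


summary = PageSummary()
summary.feed(PAGE)
```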
navigation bar
a set of hyperlinks that give users a way to display the different pages in a website
hyperlinks
“Hot spots” or “jumps” to locate another file or page; represented by a graphic or colored and underlined text.
table of contents
An ordered list of the topics in a document, along with the page numbers on which they are found. Usually located at the beginning of a long document. On a web page it is normally placed in a sidebar.
continuation
The area of the web page that prevents the sidebar from extending to the bottom of the page.
protocols
A set of rules governing the exchange or transmission of data between devices.
standards
A set of technical specifications that should be adhered to.
ISO (International Organization for Standardization)
A non-governmental organization that develops and publishes international standards. These standards ensure safety, reliability and quality for products and services.
personal page
A web page created by an individual that contains valid and useful opinions, links to important resources, and significant facts. Usually static. Normally created using some form of website builder such as Wix.
blog
A Web log, which is a journal or newsletter that is updated frequently and published online.
- Only the owner can post an article / open a thread of discussion / start a theme.
- Registered users may be allowed to comment but the owner may moderate the comments before displaying them.
- Users cannot edit or delete posts.
Search Engine Pages
Indexes content from the Internet or an intranet and serves related links based on a user’s queries. Uses web crawlers. The back-end is programmed in an efficient language, e.g. C++.
Forums
An online discussion group, much like a chat room.
- All registered participants can post an article / open a thread.
- All registered users are allowed to comment (without moderation).
- Can have moderators who can edit or delete posts after they have been made.
Wiki
define + evaluate
A collaborative website that can be edited by anyone who can access it.
- can be vandalised by users with ill intent.
+ ability to change quickly.
Static websites
Sites that rely only on the client side and don’t have any server-side programming. The website can still appear dynamic through the use of JavaScript for things like animations.
static websites pros and cons
Pros:
- lower cost to implement
- flexibility
Cons:
- scalability
- hard to update
- higher cost in the long term to update content
dynamic website
A website that generates each web page on the server when it is requested, usually retrieving content dynamically from a database.
This allows for data processing on the server and for much more complex applications.
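A minimal sketch of the idea: the page is assembled on the server from database rows each time it is requested. It uses Python's standard `sqlite3` and `string.Template`; the `articles` table, the template, and the function names are assumptions made for the example.

```python
import sqlite3
from html import escape
from string import Template

# Hypothetical page template; the content is filled in per request.
PAGE_TEMPLATE = Template(
    "<html><body><h1>$heading</h1><ul>$items</ul></body></html>"
)


def make_database():
    """Create an in-memory database with a small invented table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE articles (title TEXT)")
    db.executemany(
        "INSERT INTO articles (title) VALUES (?)",
        [("First post",), ("Second post",)],
    )
    return db


def render_page(db):
    """Build the HTML for the page from the current database content."""
    rows = db.execute("SELECT title FROM articles ORDER BY rowid").fetchall()
    items = "".join(f"<li>{escape(title)}</li>" for (title,) in rows)
    return PAGE_TEMPLATE.substitute(heading="Articles", items=items)


db = make_database()
page = render_page(db)
```

Adding a row to the table changes the next rendered page without editing any HTML file, which is the main difference from a static site.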
dynamic website pros and cons
pros:
- information can be retrieved in an organised way
- allows for content management systems
- low ongoing cost, unless design changes
Cons:
- sites are usually based on templates, so individual sites look less distinctive
- higher initial cost
- usually larger codebase
explain the function of a browser
- interprets and displays information sent over the internet in different formats
- retrieves information from the internet via hyperlinks
server-side scripting
Also called back-end scripting; scripts are executed on the server before the web page is downloaded by a client (e.g. if you log in to an account, your input is sent to the server to be checked before your account page is downloaded). Content produced this way must be requested again from the server whenever it changes. Example: CGI.
Client-side scripting
Happens in the browser of the client. It is used for animations, form validation and also to retrieve new data without reloading the page, e.g. in a live chat.
cookies
Hold data specific to a website or client and can be accessed by either the server or the client. The data in a cookie can be retrieved and used for a website page. Some sites require cookies to function. Cookies are used to carry information from one session to another and remove the need for server machines with huge amounts of data storage, so servers can be smaller and more efficient.
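As a rough illustration of a cookie travelling in both directions, the sketch below uses Python's standard `http.cookies` module. The cookie names `session_id` and `theme` and their values are invented for the example.

```python
from http.cookies import SimpleCookie

# Server side: create a cookie to send in a Set-Cookie response header.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"
outgoing["session_id"]["path"] = "/"         # valid for the whole site
outgoing["session_id"]["max-age"] = 3600     # lifetime in seconds
set_cookie_value = outgoing["session_id"].OutputString()

# Later request: the browser sends stored cookies back in a Cookie header,
# and the server reads the values out again.
incoming = SimpleCookie()
incoming.load("session_id=abc123; theme=dark")
session_id = incoming["session_id"].value
theme = incoming["theme"].value
```

Because the browser stores the data and returns it with each request, the server can recognise a returning client without keeping every detail itself.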
why is XML used in server-side scripting?
XML is a flexible way to structure data and can therefore be used to store data in files or to transport data. It allows data to be easily manipulated, exported, or imported. websites can then be designed separately from data content.
CGI (Common Gateway Interface)
+ how it works
Intermediary between client side and server side.
Provides interactivity to web applications / enables forms to be submitted. It is a standard protocol that acts as an intermediary between the web server and the CGI program. CGI allows the web server to pass a user’s request to an application program, and the program’s output is then sent back to the user’s browser.
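The hand-off between web server and program can be sketched as follows. Under CGI the server passes request details in environment variables such as `QUERY_STRING`, and the program replies with headers, a blank line, and a body. Here the environment is simulated with a plain dictionary, and the function name `cgi_program` and the `name` parameter are assumptions made for the example.

```python
from html import escape
from urllib.parse import parse_qs


def cgi_program(environ):
    """Hypothetical CGI program: read the request, return the response text."""
    # The web server places the submitted form data in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["guest"])[0]
    body = f"<p>Hello, {escape(name)}!</p>"
    # A CGI response is headers, then an empty line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


# Simulated environment that a web server would set up for a GET request.
response = cgi_program({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ada&lang=en"})
```

A real CGI program would read `os.environ` and write the response to standard output; the web server then forwards that output to the browser.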
search engine
Software that interrogates a database of web pages
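One common way to make such a database quick to interrogate is an inverted index, which maps each word to the pages that contain it. The sketch below is a simplified assumption of how this could work, not a description of any particular search engine; the page addresses and texts are invented.

```python
from collections import defaultdict

# Invented pages: address -> text content.
PAGES = {
    "a.example/home": "web science revision notes",
    "b.example/dns": "how the domain name system works",
    "c.example/web": "the world wide web and the internet",
}


def build_index(pages):
    """Map every word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(PAGES)
web_pages = search(index, "web")
the_web_pages = search(index, "the web")
```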
surface web
(Open Internet) Web sites freely accessible to all users over the Internet. The part of the web that can be reached by a search engine. Mostly static and fixed pages. Examples: Google, Facebook, YouTube.
deep web
Proprietary web sites accessible over the Internet only to authorized users, often at a cost. Their content is not reached by search engines.
in-links
links that point to the page
out-links
links from the page that point to a different page
PageRank Algorithm
analyzes links between web pages to rank relevant web pages in terms of importance
Factors:
- quantity, quality (rank), and relevance of inlinks
- number of outlinks on the linking page (fewer outlinks means each link passes on more value)
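The factors above can be illustrated with a simplified PageRank iteration. This is a sketch under stated assumptions: the four-page link graph is invented, and the damping factor of 0.85 and the 50 iterations are conventional illustrative choices rather than values from these notes.

```python
# Invented link graph: page -> list of pages it links to (its outlinks).
LINKS = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}


def pagerank(links, damping=0.85, iterations=50):
    """Return a rank for every page using repeated rank redistribution."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small base amount.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if not outlinks:
                # A page with no outlinks spreads its rank over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
                continue
            # Rank is split evenly, so fewer outlinks means a bigger share each.
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank


ranks = pagerank(LINKS)
best_page = max(ranks, key=ranks.get)
```

In this graph page C collects inlinks from three pages and ends up with the highest rank, while page D has no inlinks and keeps only the base amount.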
HITS algorithm
- identifies a set of web pages relevant to the user’s query
- gives each web page an authority score based on the number and quality of inlinks
- gives each web page a hub score based on the number and quality of outlinks
- combines authority and hub score to generate a combined score for each web page
- ranks web pages based on this score
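The authority and hub scoring steps can be sketched as a repeated update with normalisation. The link graph, page names, and the 20 iterations are assumptions made for the example, and the sketch stops at the two scores because the notes do not say how they are combined.

```python
import math

# Invented link graph for pages already judged relevant to a query.
LINKS = {
    "portal": ["news", "blog", "shop"],
    "directory": ["news", "blog"],
    "blog": ["news"],
    "news": [],
    "shop": [],
}


def normalise(scores):
    """Scale scores so that their squares sum to 1."""
    length = math.sqrt(sum(value * value for value in scores.values())) or 1.0
    return {page: value / length for page, value in scores.items()}


def hits(links, iterations=20):
    """Return (hub, authority) score dictionaries for the given graph."""
    pages = list(links)
    hub = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the pages linking in.
        authority = {page: 0.0 for page in pages}
        for page, outlinks in links.items():
            for target in outlinks:
                authority[target] += hub[page]
        # Hub: sum of the authority scores of the pages linked to.
        hub = {page: sum(authority[t] for t in links[page]) for page in pages}
        authority = normalise(authority)
        hub = normalise(hub)
    return hub, authority


hub, authority = hits(LINKS)
top_authority = max(authority, key=authority.get)
top_hub = max(hub, key=hub.get)
```

Here "news" becomes the strongest authority because every linking page points to it, and "portal" becomes the strongest hub because it links to all of the authorities.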