Module 1 Flashcards
What is Data Mining?
the process of extracting (aka mining) knowledge from data
What is Machine Learning?
a technique or method by which knowledge is extracted from data. The process of applying a machine learning technique to data in order to extract knowledge is referred to as data mining.
What are 3 purposes of data?
1. describing or diagnosing a phenomenon.
2. predicting events or changes based on the available data.
3. creating a system that uses data objects and mimics a cognitive capability of human behavior, e.g., finding a cat in a picture, understanding handwritten text, or chatting with you about something.
How can we describe or diagnose a phenomenon using data?
we use classification, regression, or clustering, i.e., grouping data with similar properties together
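To make "grouping data with similar properties together" concrete, here is a minimal sketch of 1-D k-means clustering; the data points and the two starting centers are invented for illustration.

```python
def one_d_kmeans(points, centers, iterations=10):
    """A minimal 1-D k-means: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = {c: [] for c in centers}
        for p in points:
            nearest = min(centers, key=lambda c: abs(p - c))
            clusters[nearest].append(p)
        centers = [sum(ps) / len(ps) for ps in clusters.values() if ps]
    return sorted(centers)

# Two obvious groups: values near 1 and values near 9-10.
data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
print(one_d_kmeans(data, centers=[0.0, 5.0]))  # two cluster centers near 1.0 and 9.5
```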
How can we predict events or changes based on available data?
prediction, i.e., use of the existing data to describe what will happen in the future.
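A minimal sketch of prediction: fit a straight line to past observations with ordinary least squares and extrapolate the next value. The monthly sales figures are invented toy data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

months = [1, 2, 3, 4]
sales = [10.0, 12.0, 14.0, 16.0]       # perfectly linear toy data
slope, intercept = fit_line(months, sales)
print(slope * 5 + intercept)           # predicted sales for month 5 → 18.0
```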
What is Artificial Intelligence?
creating a system that uses data objects and mimics a cognitive capability of human behavior, e.g., finding a cat in a picture, understanding handwritten text, or chatting with you about something.
What is the standard format for web documents?
HTML (HyperText Markup Language)
What are the two main components of a web page? Describe them.
The header part of the page presents an introduction to, and meta information about, the information that exists on that page.
The body contains the actual text of the page
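The head/body split can be seen in a tiny HTML page, parsed here with Python's standard-library `html.parser`; the page content is invented for illustration.

```python
from html.parser import HTMLParser

# A tiny HTML page: the <head> holds meta information (charset, title),
# the <body> holds the visible text of the page.
PAGE = """
<html>
  <head>
    <meta charset="utf-8">
    <title>Example Page</title>
  </head>
  <body>
    <p>Hello, reader!</p>
  </body>
</html>
"""

class SectionParser(HTMLParser):
    """Record which top-level section (head or body) each text chunk is in."""
    def __init__(self):
        super().__init__()
        self.section = None
        self.chunks = []               # (section, text) pairs
    def handle_starttag(self, tag, attrs):
        if tag in ("head", "body"):
            self.section = tag
    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append((self.section, text))

parser = SectionParser()
parser.feed(PAGE)
print(parser.chunks)  # [('head', 'Example Page'), ('body', 'Hello, reader!')]
```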
What is an API?
An application programming interface, which allows consumers to collect data from websites programmatically
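In practice, a consumer sends an HTTP request to an API endpoint and gets structured data back, most often as JSON. The weather payload below is invented, and a canned response string stands in for the network call so the sketch runs offline.

```python
import json

# Stand-in for the body of an HTTP response from a hypothetical weather API.
response_body = '{"city": "Boston", "temperature_c": 21, "conditions": "clear"}'

data = json.loads(response_body)            # parse the JSON into a dict
print(data["city"], data["temperature_c"])  # Boston 21
```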
How can a company analyze its own system?
It can analyze its own server log files
What is page tagging?
The collection of users’ data via cookies installed by the web page on the user’s device. These can collect data on browser version, operating system, screen size, etc.
What is web scraping?
The process of automatically collecting data from web pages or web resources. It focuses on a single source of information. Another name for it is web knowledge extraction.
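A minimal scraping sketch using Python's standard-library `html.parser`: pull every link (`href`) out of one page. The HTML string is a made-up stand-in for a downloaded page.

```python
from html.parser import HTMLParser

PAGE = '<html><body><a href="/a">A</a> <a href="/b">B</a></body></html>'

class LinkScraper(HTMLParser):
    """Collect the href attribute of every anchor tag on the page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

scraper = LinkScraper()
scraper.feed(PAGE)
print(scraper.links)  # ['/a', '/b']
```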
What is web crawling?
The process of reading and storing all web pages of a site or a number of sites. It is related to gathering pages from the web and indexing them to support a search engine. This downloads the entire website (which is composed of many web pages).
Who/ What primarily uses web crawlers?
Web crawling is heavily used by search engines, which download the documents of a web site and then store them in their local database.
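The core of a crawler is a breadth-first traversal of links. A real crawler fetches pages over HTTP; in this sketch an invented in-memory "site" maps each URL to the URLs it links to, so the traversal logic can run offline.

```python
from collections import deque

# Hypothetical link structure of a small site.
SITE = {
    "/":           ["/about", "/blog"],
    "/about":      ["/"],
    "/blog":       ["/blog/post1", "/blog/post2"],
    "/blog/post1": ["/blog"],
    "/blog/post2": ["/blog", "/about"],
}

def crawl(start):
    """Visit every page reachable from start, breadth-first."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)            # a real crawler would download and index here
        for link in SITE.get(url, []):
            if link not in seen:     # never re-fetch a page
                seen.add(link)
                queue.append(link)
    return order

print(crawl("/"))  # ['/', '/about', '/blog', '/blog/post1', '/blog/post2']
```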
What is an inverted index?
An inverted index is a map from keywords to their locations, used to access the database of web documents efficiently
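A sketch of building an inverted index: map each keyword to the set of document IDs that contain it, so a search engine can look up documents by word instead of scanning every document. The three documents are invented.

```python
# Toy document collection: doc ID -> text.
docs = {
    1: "the cat sat on the mat",
    2: "the dog chased the cat",
    3: "a dog on a mat",
}

# Build the inverted index: word -> set of doc IDs containing it.
index = {}
for doc_id, text in docs.items():
    for word in text.split():
        index.setdefault(word, set()).add(doc_id)

print(sorted(index["cat"]))  # [1, 2]
print(sorted(index["mat"]))  # [1, 3]
```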