Chapter 17 Flashcards
Business Intelligence (BI)
an assortment of software applications used to analyze an organization’s raw data
Decision Support Systems (DSS)
computer-based systems that support an organization’s decision-making activities
How much does poor-quality data cost big businesses?
Around $600 billion annually; it causes them to make poor-quality decisions.
Big Data
All the data organizations collect, even if they don’t actually use it; the data includes information about people, places, things, etc.
It could be text messages, photos someone posts, emails, or anything else that can be collected.
How can marketing, transportation, government and public administration, healthcare, and cybersecurity use big data?
Marketing: the more data marketers have on customers, the better they can advertise products to those customers.
Transportation: more information about what kinds of transportation people use and which they prefer.
Government and public administration: more information about the people in cities helps improve funding goals for each city’s public health.
Healthcare: the more information there is on people’s health and habits, the better the idea of how many doctors a city needs.
Cybersecurity: data is used to improve security online.
How did we get here with collecting data, and how much data is generated?
It started in the 1950s, when organizations began using and processing data and information to support the tactical and strategic decisions they made or were going to make.
Examples include things like personality tests.
Structured data
Data in fixed formats, well labeled and often with traditional fields; it needs a recognizable pattern and a standard format so that it can be queried and searched.
Unstructured data
Disorganized data that cannot be easily read or processed by a computer because it is not stored in rows and columns like traditional data tables.
An example is collecting massive amounts of Facebook messages and Instagram posts to determine future fashion trends.
90% of all data is unstructured
Semi-structured data
A mix of unstructured and structured data; it can possibly be converted into structured data, but not without a lot of work.
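As a rough illustration of that conversion work, the sketch below flattens semi-structured JSON records into rows with a fixed set of columns. The sample records, the column names, and the `to_row` helper are all hypothetical and chosen only for this example.

```python
import json

# Hypothetical semi-structured records: the fields vary from record to record.
RAW = """[
  {"user": "ana", "post": {"text": "love this jacket", "tags": ["fashion"]}},
  {"user": "ben", "likes": 3}
]"""

def to_row(record):
    """Map one loosely shaped record onto a fixed set of columns (assumed schema)."""
    post = record.get("post", {})
    return {
        "user": record.get("user"),
        "text": post.get("text"),                 # None when there is no post
        "tags": ",".join(post.get("tags", [])),   # empty string when no tags
        "likes": record.get("likes", 0),          # default when the field is missing
    }

rows = [to_row(record) for record in json.loads(RAW)]
for row in rows:
    print(row)
```

Every row now has the same four fields, so it could be stored in a table and queried like structured data. The "lot of work" in practice comes from deciding defaults and handling every shape the records can take.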
the 4 V’s of big data
volume
velocity
variety
veracity
Volume
Refers to the amount of data collected by an organization: how much data does your business need, and where do you keep it once you’ve collected it?
velocity
How fast data is collected and how quickly you can analyze it.
90% of the world’s data has been created in the last 2 years.
Every 60 seconds, new videos are uploaded to the internet.
variety
What kinds of data are collected: structured, semi-structured, or unstructured.
veracity
Is the data complete and whole, and is it well structured?
Customer Relationship Management (CRM)
A type of software used to hold customer information; a system that holds information about sales, marketing, and customer service records.
Enterprise Resource Planning (ERP) system
Tracks how resources are used and what the customer’s relationship to the business is; helps businesses plan resources and how to market them.
Data warehouses
Used by big businesses.
Data mart
Used by small and medium-sized businesses and cheaper; it can’t collect as much data, so these businesses don’t have all the data needed to ask questions that are as specific to help them market.
ETL
Stands for Extract, Transform, and Load.
Extract
Extracting information or data: once you determine where your data resides, you can start extracting it, often from Customer Relationship Management (CRM) or Enterprise Resource Planning (ERP) software.
Transform
Once you’ve extracted data, it needs to be normalized.
Normalizing data means organizing it into the fields and records of a relational database; this provides the standard format needed to analyze the data.
Load
Once data is transformed and normalized, it’s ready to be transferred into the data warehouse or data mart.
Loading may happen hourly, daily, or weekly.
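The three ETL steps can be sketched end to end in a few lines. This is a minimal illustration only: the CSV text stands in for a hypothetical CRM export, the `customers` table and its columns are made up, and an in-memory SQLite database plays the role of the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical CRM export; a real one would come from a file or an API.
CRM_EXPORT = """name,email,total_spent
 Ana Lopez ,ANA@EXAMPLE.COM,120.50
Ben Wu,ben@example.com,80
"""

def extract(text):
    """Extract: read the raw records from where the data resides."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(records):
    """Transform: normalize into consistent fields (trim, lowercase, numeric)."""
    return [
        (r["name"].strip(), r["email"].strip().lower(), float(r["total_spent"]))
        for r in records
    ]

def load(rows, conn):
    """Load: write the normalized rows into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers (name TEXT, email TEXT, total_spent REAL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
load(transform(extract(CRM_EXPORT)), conn)
result = conn.execute(
    "SELECT name, email, total_spent FROM customers ORDER BY name"
).fetchall()
print(result)
```

Running the same three functions on a schedule (hourly, daily, or weekly) is what a recurring load amounts to in this simplified picture.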
Hadoop
- has a yellow elephant logo, taken from the creator’s son’s stuffed animal
- open-source project under Apache
- it’s a cluster system built on HDFS (Hadoop Distributed File System)
- lets you store bigger files than you typically could, and very many of them
- HDFS is a distributed file system
- best for large companies like Facebook, eBay, American Express, Google, Target, Walmart, and airfare companies
MapReduce
Processes all the data; it is the processing arm, or engine, of Hadoop and was created in 2005.
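The idea behind MapReduce can be shown with a word count written in plain Python. This is only a single-machine sketch of the map, shuffle, and reduce phases; it does not use Hadoop’s actual API, and the function names and sample documents are made up for illustration.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values for each key into one result."""
    return {key: sum(values) for key, values in groups.items()}

documents = ["big data big insights", "data drives decisions"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts)
```

In a real cluster, the map calls would run on many machines at once, each close to its own piece of the data, and the shuffle would move the pairs between machines before reducing.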
data mining
Also called data discovery; the examination of huge sets of data to find patterns and connections and to identify outliers.
For example, when you watch something and the service starts suggesting similar things.
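Identifying outliers can be illustrated with one simple rule: flag any value that sits far from the median. The sketch below uses the median absolute deviation; the sample order totals and the cutoff of 3 are assumptions chosen for the example, not a standard that data mining tools necessarily follow.

```python
from statistics import median

def find_outliers(values, cutoff=3):
    """Flag values whose distance from the median exceeds cutoff * MAD."""
    center = median(values)
    # Median absolute deviation: the typical distance from the center.
    mad = median(abs(v - center) for v in values)
    return [v for v in values if abs(v - center) > cutoff * mad]

# Hypothetical order totals; one is far away from the rest.
order_totals = [52, 48, 50, 51, 49, 250]
outliers = find_outliers(order_totals)
print(outliers)
```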
topic analytics
Feedback is catalogued into different categories based on topic.
It tries to catalog phrases from an organization’s customer feedback into relevant topics.
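A very simple way to picture this cataloging is keyword matching: each topic has a list of keywords, and a piece of feedback is filed under every topic whose keywords it mentions. The topics, keywords, and `catalog` function below are hypothetical, and real topic analytics tools generally use more sophisticated text analysis.

```python
import re

# Hypothetical topics and the keywords that signal them.
TOPIC_KEYWORDS = {
    "shipping": {"late", "delivery", "shipping"},
    "price": {"price", "expensive", "cheap"},
    "service": {"rude", "helpful", "support"},
}

def catalog(feedback):
    """Return every topic whose keywords appear in the feedback text."""
    words = set(re.findall(r"[a-z']+", feedback.lower()))
    return [topic for topic, keywords in TOPIC_KEYWORDS.items() if words & keywords]

comments = [
    "Delivery was late and the price is too high.",
    "Support was very helpful!",
]
for comment in comments:
    print(comment, "->", catalog(comment))
```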
Business analytics
Tries to predict future trends to give a business a competitive advantage.
Forms of business analytics
Descriptive: uses past research to describe what has already happened.
Predictive: tries to predict future patterns.
Decision: uses both descriptive and predictive analytics, past and present, to make decisions.
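The three forms can be contrasted on one small data set. In this sketch, the monthly sales figures are invented, the predictive step is an ordinary straight-line fit, and the decision rule (stock up when the forecast beats the past average) is only an assumed example of combining the two.

```python
# Hypothetical monthly sales for the last four months.
sales = [100, 110, 120, 130]

def descriptive(values):
    """Descriptive: summarize what already happened (the average)."""
    return sum(values) / len(values)

def predictive(values):
    """Predictive: fit a straight line and extend it one month ahead."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * n  # forecast for the next month

def decision(values):
    """Decision: combine both results into a recommended action."""
    if predictive(values) > descriptive(values):
        return "increase stock"
    return "hold stock"

average = descriptive(sales)
forecast = predictive(sales)
action = decision(sales)
print(average, forecast, action)
```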
How do you display the analytics and/or data you share (data visualization)?
PowerPoint presentations and dashboards.