Data and IS Flashcards
Data and Information
Data are facts, events, and transactions which have been recorded. They are
the raw inputs that are processed further to become information.
When facts are filtered through one or more processes (human or system) and
are ready to provide some specific detail, they become information.
Processed data presented in a useful and meaningful form is the
information we are looking at.
Definitions of Data and Information
Data
Raw facts such as an employee’s name and number of hours worked in a
week, inventory part numbers or sales orders.
Information
A collection of facts organized in such a way that they have additional
value beyond the value of the facts themselves.
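The step from data to information can be illustrated with a minimal sketch. The employee names and hours below are made up for illustration; the point is only that organizing raw facts (hours per day) yields something with added value (total hours per employee for the week).

```python
# Raw data: hypothetical (employee, hours worked) records, one per day.
raw_hours = [
    ("Ana", 8), ("Ben", 6), ("Ana", 9), ("Ben", 7), ("Ana", 8),
]

def weekly_totals(records):
    """Organize raw (name, hours) facts into total hours per employee."""
    totals = {}
    for name, hours in records:
        totals[name] = totals.get(name, 0) + hours
    return totals

# Information: the same facts, organized so they answer a question.
totals = weekly_totals(raw_hours)
print(totals)  # {'Ana': 25, 'Ben': 13}
```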
Data Pyramid
Data (Raw)
Information (Meaning)
Knowledge (Context)
Wisdom (Applied)
What is good information?
Accurate – entering incorrect sales data creates false information.
Timely – knowing that production doesn’t have enough raw materials for
next week’s schedule won’t be useful information three weeks from now.
Relevant – if your boss needs to know how many shipments were late
last month, you shouldn’t give him a list of all items that shipped.
Worth its cost – is it worth the cost to map out the entire U.S. if you only
need one state?
Nature of data
Structured data
Unstructured/textual data
Semi-structured data
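One way to picture the three kinds of data is the short sketch below. The sample records are invented for illustration, and CSV and JSON are used only as familiar stand-ins for structured and semi-structured formats.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields.
structured = list(csv.DictReader(io.StringIO("part_no,qty\nP-100,4\nP-200,7\n")))

# Semi-structured: self-describing keys, but fields may differ per record.
semi_structured = json.loads(
    '[{"part_no": "P-100", "qty": 4},'
    ' {"part_no": "P-300", "note": "backordered"}]'
)

# Unstructured/textual: free text with no predefined fields.
unstructured = "Customer called about part P-100; says the box arrived damaged."

print(structured[0]["part_no"])       # every row is guaranteed a part_no column
print(semi_structured[1].get("qty"))  # this record simply has no qty field
print("P-100" in unstructured)        # only text search is possible here
```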
DBMS (Database Management Systems)
Software for creating, storing, organizing, and accessing data from a database
Separates the logical and physical views of the data
Logical view: how end users view data
Physical view: how data are actually structured and organized
Examples: Microsoft Access, DB2, Oracle Database, Microsoft SQL Server, MySQL
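The separation of logical and physical views can be sketched loosely with Python's built-in sqlite3 module. SQLite is not one of the products listed above and is used here only because it needs no setup; the table and view names are hypothetical. The SQL view stands in for what an end user sees, while the underlying tables (and the storage SQLite manages for them) stand in for how the data are actually organized.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- How the data are actually organized: two separate tables.
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE timesheet (emp_id INTEGER, hours REAL);
    INSERT INTO employee VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO timesheet VALUES (1, 8), (1, 9), (2, 6);

    -- What the end user sees: one simple list of names and totals.
    CREATE VIEW hours_by_employee AS
        SELECT e.name AS name, SUM(t.hours) AS total_hours
        FROM employee e JOIN timesheet t ON t.emp_id = e.emp_id
        GROUP BY e.name;
""")

# The user queries the view without knowing how the tables are laid out.
rows = conn.execute(
    "SELECT name, total_hours FROM hours_by_employee ORDER BY name"
).fetchall()
print(rows)
```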
Business Intelligence
Business Intelligence is a collection of software and tools that are designed to
understand and interpret the vast quantities of data that an organisation accumulates over time.
Business Intelligence tools process vast amounts of data, often with the help of AI, and break it down into individual insights. These insights can then be analysed and potentially acted on in a business decision.
It allows companies to see patterns, trends, areas of growth and also areas of weakness and vulnerability.
Part of digital transformation
Data Warehouses
Database that stores current and historical data that may be of interest to decision makers
Consolidates and standardizes data from many systems, operational and transactional databases
Data can be accessed but not altered
A data warehouse is a collection of data from various heterogeneous sources.
It is the main component of a business intelligence system: data are analysed and managed there, and the results are used to improve decision making.
It involves the process of extraction, transformation, and loading to provide the data for analysis.
ETL(Extract Transform and Load)
ETL stands for extract, transform, load: three database functions that are
combined into one tool to pull data out of one database and place it into another database.
1. Extraction
Collecting data from a variety of sources
Converting data into a format that can be
used in transformation processing
2. Transformation processing
Making sure data meets the data
warehouse’s needs
3. Loading
Process of transferring data to the data
warehouse
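The three steps above can be sketched as a tiny pipeline. This is a minimal illustration, not a real ETL tool: the two source systems, the sales table, and the function names are all hypothetical, and an in-memory SQLite database plays the role of the data warehouse.

```python
import sqlite3

# Hypothetical source systems that store the same facts in different formats.
source_a = [{"sku": "p-100", "amount": "19.99"}, {"sku": "p-200", "amount": "5.00"}]
source_b = [("P-100", 3.5), ("P-300", 12.0)]

def extract():
    """Collect rows from every source into one common (sku, amount) shape."""
    rows = [(r["sku"], r["amount"]) for r in source_a]
    rows += list(source_b)
    return rows

def transform(rows):
    """Standardize SKUs to upper case and amounts to floats."""
    return [(str(sku).upper(), float(amount)) for sku, amount in rows]

def load(conn, rows):
    """Write the cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (sku TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))
print(warehouse.execute("SELECT sku, amount FROM sales ORDER BY sku, amount").fetchall())
```

Keeping each step in its own function mirrors the list above and makes it easy to swap in a different source or target later.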
Data Warehouses Characteristics
Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For example, “sales” can be a particular subject.
Integrated: A data warehouse integrates data from multiple data sources. For example, source A and source B may have different ways of identifying a product, but in a data warehouse, there will be only a single way of identifying a product.
Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even older from a data warehouse. This contrasts with a transaction system, where often only the most recent data is kept. For example, a transaction system may hold only the most recent address of a customer, whereas a data warehouse can hold all addresses associated with a customer.
Non-volatile: Once data is in the data warehouse, it will not change. Historical data in a data warehouse should never be altered.
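The customer-address example can be sketched as follows. This is an illustrative assumption about how the two behaviours might be modelled, with made-up names and dates: the transaction system overwrites in place, while the warehouse only appends, so history is kept (time-variant) and never edited (non-volatile).

```python
from datetime import date

# Transaction system: keeps only the latest address per customer.
current_address = {}
# Warehouse: append-only history; existing rows are never edited.
address_history = []

def record_address(customer_id, address, as_of):
    """Apply one address change to both systems."""
    current_address[customer_id] = address                  # overwrite in place
    address_history.append((customer_id, address, as_of))   # append only

record_address(42, "1 Old Road", date(2023, 1, 15))
record_address(42, "9 New Street", date(2024, 6, 1))

print(current_address[42])                                   # 9 New Street
print([addr for cust, addr, _ in address_history if cust == 42])
```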
Data Warehouse-Multidimensionality
Multidimensional presentation
1. Dimensions: products, salespeople, market segments,
business units, geographical locations, distribution channels,
country, or industry
2. Measures: money, sales volume, head count, inventory profit,
actual versus forecast
3. Time: daily, weekly, monthly, quarterly, or yearly
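A multidimensional presentation amounts to summing a measure along whichever dimensions the user picks. The sketch below assumes a hypothetical set of sales facts with two dimensions (product, region), a time period (quarter), and one measure (sales amount); the rollup helper is illustrative and not taken from any particular product.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, sales_amount)
facts = [
    ("Widget", "East", "Q1", 100),
    ("Widget", "West", "Q1", 150),
    ("Gadget", "East", "Q1", 80),
    ("Widget", "East", "Q2", 120),
]
FIELDS = ("product", "region", "quarter")

def rollup(rows, *dims):
    """Sum the sales measure grouped by the chosen dimensions."""
    idx = [FIELDS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[3]
    return dict(totals)

by_product = rollup(facts, "product")
by_region_quarter = rollup(facts, "region", "quarter")
print(by_product)
print(by_region_quarter)
```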
Data Mart
Subset of a data warehouse that is highly focused and isolated for a
specific population of users
Data marts are often built and controlled by a single department within an
organization.
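As a rough sketch, a data mart can be thought of as a filtered copy of the warehouse for one department. The rows, department names, and build_data_mart helper below are hypothetical.

```python
# Hypothetical warehouse rows covering several departments.
warehouse_rows = [
    {"dept": "sales", "metric": "orders", "value": 120},
    {"dept": "hr", "metric": "headcount", "value": 45},
    {"dept": "sales", "metric": "returns", "value": 7},
]

def build_data_mart(rows, dept):
    """Copy only the rows that one department cares about."""
    return [dict(r) for r in rows if r["dept"] == dept]

sales_mart = build_data_mart(warehouse_rows, "sales")
print(sales_mart)
```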
Data mart
Smaller version of data warehouse
Used by single department or function
Advantages over data warehouses
More limited scope than data warehouses
Differences Between a Data Warehouse and a Data Mart
1. Scope:
Corporate vs line of business
2. Subject:
Multiple vs single subject
3. Data sources:
Many vs few
4. Size:
100+ GB vs less than 100 GB
5. Implementation time:
Months to years vs months
Cloud Data Warehouse
• Eliminates the need to purchase any in-house hardware for data warehousing.
• Offers lower upfront costs compared to traditional warehouses.
• Offers higher scalability as the amount of available data increases.
• Frees up capacity on in-house systems
• Frees up cash flow
• Makes powerful solutions affordable
Additional DW Considerations
Cloud Data Warehouses
Usability:
Moving a document into the cloud storage folder permanently moves the
document from its original folder to the cloud storage location.
Bandwidth:
Several cloud storage services have a specific bandwidth allowance.
Accessibility:
If you have no internet connection, you have no access to your data.
Data Security:
There are concerns about the safety and privacy of important data stored
remotely. The possibility of private data commingling with other organizations' data makes
some businesses uneasy.
Software:
If you want to be able to manipulate your files locally across multiple
devices, you’ll need to download the service on all devices.