Data analytics Flashcards
data analytics
process and practice of analyzing data to answer questions, extract insights, and identify trends
data science
process of building, cleaning, and structuring datasets to analyze and extract meaning
big data
data sets whose size or type is beyond the ability of traditional relational databases to capture, manage and process the data with low latency.
Sources of data are becoming more complex than those for traditional data because they are being driven by artificial intelligence (AI), mobile devices, social media and the Internet of Things (IoT). For example, the different types of data originate from sensors, devices, video/audio, networks, log files, transactional applications, web and social media — much of it generated in real time and at a very large scale.
big data analytics
the analysis of big data, which can ultimately fuel better and faster decision-making, modeling and prediction of future outcomes, and enhanced business intelligence
types of analytics
https://online.hbs.edu/Documents/a-beginners-guide-to-data-and-analytics.pdf
Analytics is used to extract meaningful
insights from data that can drive decision-making and strategy formulation. There
are four types of analytics you can leverage
depending on the data you have and the type
of knowledge you’d like to gain.
1. Descriptive analytics looks at data
to examine, understand, and describe
something that’s already happened.
2. Diagnostic analytics goes deeper
than descriptive analytics by seeking
to understand the “why” behind what
happened.
3. Predictive analytics relies on historical
data, past trends, and assumptions to answer
questions about what will happen in the
future.
4. Prescriptive analytics identifies specific
actions an individual or organization should
take to reach future targets or goals.
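To make the difference between two of these types concrete, here is a minimal sketch in Python. It assumes a hypothetical list of monthly sales figures (the numbers are invented for illustration) and shows a descriptive summary next to a simple predictive extrapolation. A least-squares trend line is only one of many possible predictive methods, and diagnostic and prescriptive analytics are not covered here.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative data only).
monthly_sales = [100, 110, 120, 130]

# Descriptive analytics: summarize what has already happened.
average_sales = mean(monthly_sales)


def fit_trend(values):
    """Fit a least-squares line y = slope * x + intercept, with x = 0, 1, 2, ..."""
    xs = range(len(values))
    x_mean = mean(xs)
    y_mean = mean(values)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    variance = sum((x - x_mean) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Predictive analytics: extend the historical trend one month into the future.
slope, intercept = fit_trend(monthly_sales)
next_month_forecast = slope * len(monthly_sales) + intercept

print("Average monthly sales so far:", average_sales)
print("Forecast for next month:", next_month_forecast)
```

The forecast is only as good as the assumption that the past trend continues, which is why predictive analytics is described above as relying on assumptions as well as historical data.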
data analytics in business
The main goal of business analytics is to extract meaningful insights from
data that an organization can use to inform its strategy and, ultimately,
reach its objectives. Business analytics can be used for:
* Budgeting and forecasting: By assessing a company’s historical
revenue, sales, and costs data alongside its goals for future growth,
an analyst can identify the budget and investments required to make
those goals a reality.
* Risk management: By understanding the likelihood of certain
business risks occurring—and their associated expenses—an analyst
can make cost-effective recommendations to help mitigate them.
* Marketing and sales: By understanding key metrics, such as lead-to-customer conversion rate, a marketing analyst can identify the
number of leads their efforts must generate to fill the sales pipeline.
* Product development (or research and development): By
understanding how customers reacted to product features in the
past, an analyst can help guide product development, design, and
user experience in the future.
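The marketing and sales example above comes down to simple arithmetic: leads needed = target customers / conversion rate. A minimal sketch follows, assuming hypothetical historical counts; the function names and figures are illustrative and not taken from the source.

```python
def conversion_rate(customers_won: int, leads_generated: int) -> float:
    """Share of past leads that became customers."""
    return customers_won / leads_generated


def leads_needed(target_customers: int, customers_won: int, leads_generated: int) -> int:
    """Leads required to reach the target, rounded up to a whole lead.

    Integer ceiling division avoids floating-point rounding surprises.
    """
    if customers_won <= 0 or leads_generated <= 0:
        raise ValueError("historical counts must be positive")
    return -(-target_customers * leads_generated // customers_won)


# Hypothetical history: 1,000 leads produced 40 customers (a 4% conversion rate).
rate = conversion_rate(40, 1000)
required = leads_needed(50, 40, 1000)
print(f"Conversion rate: {rate:.1%}, leads needed for 50 customers: {required}")
```

This assumes the historical conversion rate will hold for future leads, which an analyst would want to check before committing to a pipeline target.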
data ecosystem
The term data ecosystem refers to the
programming languages, packages,
algorithms, cloud-computing services, and
general infrastructure an organization uses to
collect, store, analyze, and leverage data. No
two organizations leverage the same data in
the same way. As such, each organization has
a unique data ecosystem.
data life cycle
While the data ecosystem encompasses
everything that handles, organizes, and
processes data, the data life cycle describes
the path data takes from when it’s first
generated to when it’s interpreted into
actionable insights. This life cycle can be
split into eight steps: generation, collection,
processing, storage, management, analysis,
visualization, and interpretation.
data privacy
Data privacy, also known as information privacy, is a
subcategory of data protection that encompasses the ethical and legal obligation to protect access to personally identifiable information (PII), which is any information that can be linked to a specific individual. Some examples of PII include full name, address, Social Security number, and passport number.
data integrity
Data integrity is the accuracy, completeness, and quality of data as it’s maintained
over time and across formats. Preserving the integrity of your company’s data is a
constant process.
Threats to a dataset’s integrity include:
* Human error: For instance, accidentally deleting a row of data in a spreadsheet.
* Inconsistencies across formats: For instance, a dataset in Microsoft Excel that
relies on cell referencing may not be accurate in a different format that doesn’t
allow those cells to be referenced.
* Collection error: For instance, data collected is inaccurate or lacking
information, creating an incomplete picture of the subject.
* Cybersecurity or internal privacy breaches: For instance, someone hacks into
your company’s database with the intent to damage or steal information, or an
internal employee damages data with malicious intent.
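One common way to notice that a dataset has changed unexpectedly, whether through human error or deliberate damage, is to record a fingerprint of the data and compare it later. The sketch below is one possible approach, not a complete integrity process. The rows are hypothetical, and SHA-256 from Python's standard `hashlib` module is an assumed choice of hash.

```python
import hashlib


def fingerprint(rows):
    """Return a SHA-256 hex digest over the rows, in order."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


# Hypothetical dataset and its recorded baseline fingerprint.
original = [("alice", 30), ("bob", 25), ("carol", 41)]
baseline = fingerprint(original)

# Simulate human error: a row is accidentally deleted.
damaged = original[:2]

unchanged = fingerprint(original) == baseline
tampered = fingerprint(damaged) != baseline

print("Original matches baseline:", unchanged)
print("Damaged copy detected:", tampered)
```

A fingerprint only signals that something changed. It does not say what changed or whether the original data was accurate to begin with, so it addresses the first and last threats in the list above rather than collection errors.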
data analytics skills: critical thinking
- Critical Thinking
If you’re interested in using data to solve business problems, you need to be adept at thinking critically about challenges and solutions. While data can provide many answers, it’s nothing without a human’s discerning eye.
“From the first steps of determining the quality of a data source to determining the success of an algorithm, critical thinking is at the heart of every decision data scientists—and those who work with them—make,” Tingley says in the
Harvard Online course Data Science Principles. “Data science is a discipline that’s built on a foundation of critical thinking.”
data analytics skills: hypothesis formation and testing
- Hypothesis Formation and Testing
At the heart of data and analytics is the desire to answer questions. The proposed explanations for these leading questions are called hypotheses, which must be formed before analysis takes place. An example of a hypothesis is, “I predict that a person’s likelihood of recommending our product is directly proportional to their reported satisfaction with the product.” You predict the data will show this trend and must prove or disprove the hypothesis through analysis. Without a hypothesis, your analysis has no clear direction.
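As a rough illustration of testing the example hypothesis, the sketch below computes the Pearson correlation between satisfaction scores and likelihood-to-recommend scores. The survey data is invented, and correlation is only one assumed way to examine the relationship. It measures linear association, and a fuller test would also assess statistical significance.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    x_spread = sqrt(sum((x - x_mean) ** 2 for x in xs))
    y_spread = sqrt(sum((y - y_mean) ** 2 for y in ys))
    return covariance / (x_spread * y_spread)


# Hypothetical survey responses on a 1-5 scale (illustrative data only).
satisfaction = [1, 2, 3, 4, 5]
recommend = [2, 4, 5, 4, 5]

r = pearson_r(satisfaction, recommend)
print(f"Correlation between satisfaction and recommendation: {r:.2f}")
```

A value near 1 is consistent with the hypothesis, a value near 0 suggests no linear relationship, and a negative value points the other way.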
data analytics skills: data wrangling
- Data Wrangling
Data wrangling is the process of cleaning raw data in preparation for analysis. It involves identifying and resolving mistakes, filling in missing data, and organizing and transferring it into an easily understandable format.
This is an important skill for anyone dealing with data to acquire because it leads to a more efficient and organized data analysis process. You can extract valuable insights from data more quickly when it’s cleaned and in its optimal viewing format.
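The three activities named above (resolving mistakes, filling in missing data, and organizing the result) can be sketched in a few lines of Python. The records, field names, and the choice to fill gaps with the median are all assumptions made for illustration. Real projects choose a filling strategy to suit the data.

```python
from statistics import median

# Hypothetical raw records with inconsistent names and bad or missing ages.
raw_records = [
    {"name": " Alice ", "age": "34"},
    {"name": "BOB", "age": ""},
    {"name": "carol", "age": "forty"},
    {"name": "Dave", "age": "28"},
]


def parse_age(text):
    """Return the age as an int, or None if the text is not a whole number."""
    try:
        return int(text)
    except ValueError:
        return None


def wrangle(records):
    # Resolve mistakes: trim whitespace, normalize capitalization, parse ages.
    cleaned = [
        {"name": r["name"].strip().title(), "age": parse_age(r["age"])}
        for r in records
    ]
    # Fill in missing data: use the median of the known ages.
    # (Assumes at least one record has a valid age.)
    known_ages = [r["age"] for r in cleaned if r["age"] is not None]
    fill_value = median(known_ages)
    for r in cleaned:
        if r["age"] is None:
            r["age"] = fill_value
    # Organize: sort by name so the result is easy to scan.
    return sorted(cleaned, key=lambda r: r["name"])


tidy = wrangle(raw_records)
for record in tidy:
    print(record)
```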
data analytics skills: mathematical ability
- Mathematical Ability
You don’t have to be a mathematician to become data literate, but strong math skills become increasingly important as you deal with more complex analyses.
A seasoned data professional needs a solid understanding of statistics, probability, linear algebra, and multivariable calculus. Data scientists often call on statistical methods to find structure in data and make predictions, and linear
algebra and calculus can make machine-learning algorithms easier to comprehend.
If you’re not a data scientist or analyst, your work may not require you to understand the more complex mathematical concepts, but having a basic understanding of statistics can go a long way.
data analytics skills: data visualization
- Data Visualization
It’s crucial to know how to transform
raw data into compelling visuals that
tell a story. Rather than simply presenting a list of
values to your stakeholders, it’s more
effective to visually communicate data
in a way that’s easily digestible. Some
popular data visualization techniques
that all business professionals should
know include pie charts, bar charts,
and histograms.
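To show why a histogram is easier to digest than a list of values, here is a minimal text-only sketch using Python's standard library. The order values and the bin width of 10 are hypothetical, and the code assumes whole-number values. In practice a charting tool would draw this.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((value // bin_width) * bin_width for value in values)
    return dict(sorted(counts.items()))


def render(bins, bin_width):
    """Draw one row of '#' marks per non-empty bin."""
    lines = []
    for start, count in bins.items():
        label = f"{start}-{start + bin_width - 1}"
        lines.append(f"{label:<7} {'#' * count}")
    return "\n".join(lines)


# Hypothetical order values (illustrative data only).
order_values = [12, 18, 25, 27, 29, 33, 41, 45, 47, 48]

bins = histogram(order_values, 10)
print(render(bins, 10))
```

Bins with no values are simply omitted in this sketch. A fuller version would show them as empty rows so gaps in the distribution stay visible.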