UNIT 2 Flashcards

1
Q

Big data

A

Big Data is a term that describes large and complex data sets.

This data cannot be collected, managed and processed using traditional data processing software within a reasonable period of time.

2
Q

Big data can include:

A

Structured
Unstructured
Semi-structured data

Each of these can be exploited to gain customer insights, leading to better decisions and actions.

3
Q

Big data sources

A
  • media
  • cloud
  • web
  • internet of things
  • databases
  • social network profiles
  • social influences
  • activity generated data
  • data warehouse appliances
  • network & in-stream monitoring technologies
  • legacy documents
4
Q

The four V's of big data

A
  1. Variety- different forms of data
  2. Veracity- uncertainty of data
  3. Volume- scale of data
  4. Velocity- analysis of streaming data
5
Q

Volume

A

Volume refers to the amount of data generated through various sources. On social media sites, for example, there are 2.6 billion Facebook users, 2 billion YouTube users, and 1 billion Instagram users.

6
Q

Velocity

A

Velocity is the speed at which data is made available. The rate of transfer over servers and between users has increased to a point where it is impossible to control the information explosion.

This needs to be addressed with better-equipped tools, which falls within the realm of big data.

7
Q

Variety

A

Data can be structured, semi-structured or unstructured:

  • Different formats: Pictures, videos, emails, tweets, posts, messages, etc. are unstructured.
  • Sensor-collected data from the millions of connected devices (IoT) is what you can call semi-structured.
  • Records maintained by businesses: transactions, storage, and previously unstructured information that has been analyzed are part of structured data.
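
The three forms above can be illustrated with a small Python sketch. It is only a rough illustration: the sample values, the device name and the `describe` helper are hypothetical, and real systems classify data by its storage and schema rather than by a simple check like this.

```python
import json

# Structured: fixed, known fields, like a business transaction record.
transaction = {"order_id": 1001, "amount": 59.90, "currency": "EUR"}

# Semi-structured: a JSON message from a sensor; fields can vary by device.
sensor_message = '{"device": "thermo-7", "reading": {"temp_c": 21.5}}'

# Unstructured: free text with no predefined schema.
tweet = "Loving the new phone, but the battery could be better!"


def describe(item):
    """Toy classifier: label an example by how much structure it carries."""
    if isinstance(item, dict):
        return "structured"
    try:
        parsed = json.loads(item)
    except (TypeError, ValueError):
        return "unstructured"
    return "semi-structured" if isinstance(parsed, dict) else "unstructured"


kinds = [describe(x) for x in (transaction, sensor_message, tweet)]
print(kinds)
```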
8
Q

Sectors generating and using Big data

A
  • banking and finance
  • media & entertainment
  • healthcare
  • education
  • government
  • transportation
  • insurance
  • retail
9
Q

Big data analytics

A

The massive quantities of data contributed by all these users in the form of images, videos, messages, posts, tweets, etc. have pushed data analysis away from Excel sheets, databases, and other traditional tools, which can no longer cope, toward big data analytics.

Big data analytics is the often complex process of examining big data to uncover information, such as hidden patterns, correlations, market trends and customer preferences. This can help organizations to make informed business decisions.

10
Q

Benefits of Big data analytics

A
  • more effective marketing
  • new revenue opportunities
  • customer personalization and improved operational efficiency
  • competitive advantages over rivals
11
Q

4 steps of big data analytics

A
  1. Collect data from a variety of different sources. Often, it is a mix of semi-structured and unstructured data.
  2. Data is processed: After data is collected and stored in a data warehouse, data professionals must organize, configure and partition the data properly for analytical goals.
  3. Data is cleansed for quality:
    - Data professionals clean up the data using scripting tools or enterprise software.
    - They look for any errors or inconsistencies, such as duplications or formatting mistakes, and organize and tidy up the data.
  4. The collected, processed and cleaned data is analyzed with analytics software
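
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real pipeline: the two sources (`web_orders`, `app_orders`) and their records are made up, and a plain list stands in for the data warehouse.

```python
from collections import Counter

# Step 1: collect records from different (hypothetical) sources.
web_orders = [{"customer": " Alice ", "country": "de"},
              {"customer": "Bob", "country": "FR"}]
app_orders = [{"customer": "alice", "country": "DE"},
              {"customer": "Carol", "country": "fr"}]
collected = web_orders + app_orders

# Step 2: process - organize the records into one uniform layout.
processed = [(r["customer"], r["country"]) for r in collected]

# Step 3: cleanse - fix formatting mistakes and drop duplicates.
cleaned = []
seen = set()
for customer, country in processed:
    record = (customer.strip().title(), country.upper())
    if record not in seen:
        seen.add(record)
        cleaned.append(record)

# Step 4: analyze - here, a simple count of customers per country.
customers_per_country = Counter(country for _, country in cleaned)
print(cleaned)
print(customers_per_country)
```

In this toy data, " Alice " and "alice" are treated as the same customer once whitespace and capitalization are normalized, which is the kind of duplication and formatting problem step 3 describes.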
12
Q

Tools for analytics software

A
  • Data mining, which searches for patterns and relationships
  • Predictive analytics, which builds models to forecast customer behavior and other future developments
  • Machine learning, which uses algorithms to analyze large data sets
  • Deep learning, a more advanced subset of machine learning
  • Text mining and statistical analysis software
  • Artificial intelligence (AI)
  • Business intelligence software
  • Data visualization tools
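
As a rough illustration of the predictive analytics item above, the following Python sketch fits a straight trend line to past values and uses it to forecast the next one. The monthly order counts are hypothetical, and ordinary least squares is just one simple modelling choice; real predictive analytics tools use far richer models.

```python
# Hypothetical monthly order counts for one customer segment.
months = [1, 2, 3, 4, 5]
orders = [10, 12, 14, 16, 18]

n = len(months)
mean_x = sum(months) / n
mean_y = sum(orders) / n

# Ordinary least squares for a line: orders = intercept + slope * month.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(months, orders))
         / sum((x - mean_x) ** 2 for x in months))
intercept = mean_y - slope * mean_x

# Forecast the next (sixth) month from the fitted trend.
forecast = intercept + slope * 6
print(f"slope={slope}, intercept={intercept}, forecast={forecast}")
```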
13
Q

Big data uses and examples

A
  1. Customer acquisition and retention
  2. Personalized engines
  3. Targeted ads
  4. Product development
  5. Price optimization
  6. Supply chain and channel analytics
  7. Risk management
  8. Improved decision making
14
Q

Big data analytics benefits

A
  • Quickly analyzing large amounts of data from different sources, in many different formats and types.
  • Rapidly making better-informed decisions for effective strategizing, which can benefit and improve the supply chain, operations and other areas of strategic decision-making.
  • Cost savings, which can result from new business process efficiencies and optimizations.
  • A better understanding of customer needs, behavior and sentiment, which can lead to better marketing insights, as well as provide information for product development.
  • Improved, better-informed risk management strategies that draw from large sample sizes of data.
15
Q

Big data challenges

A

1- Accessibility of data. With larger amounts of data, storage and processing become more complicated. Big data should be stored and maintained properly to ensure it can be used by less experienced data scientists and analysts.

2- Data quality maintenance. With high volumes of data coming in from a variety of sources and in different formats, data quality management for big data requires significant time, effort and resources to properly maintain it.

3- Data security. The complexity of big data systems presents unique security challenges. Properly addressing security concerns within such a complicated big data ecosystem can be a complex undertaking.

4- Choosing the right tools. Selecting from the vast array of big data analytics tools and platforms available on the market can be confusing, so organizations must know how to pick the best tool that aligns with users’ needs and infrastructure.

5- Lack of internal analytics skills. Because of the high cost of hiring experienced data scientists and engineers, some organizations are finding it hard to fill the positions.
