Explore And Analyse Data With Python Flashcards

1
Q

Data exploration and analysis is at the core of data science. Data scientists require skills in programming languages like Python to explore, visualise, and manipulate data.

A
2
Q

Visualise data.
Data scientists visualise data to understand it better.
They might scan the raw data, examine summary measures such as averages, or graph the data.
Graphs are a powerful means of visualising data, and data scientists often use graphs to discern moderately complex patterns quickly.
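
As a minimal sketch of what "summary measures" can look like in Python, the snippet below uses the standard `statistics` module. The list `readings` and its values are invented for illustration and are not data from the course.

```python
import statistics

# Hypothetical sample values, invented purely for illustration.
readings = [2.0, 3.5, 3.5, 4.0, 5.0, 6.0, 11.0]

# A few common summary measures collected in one place.
summary = {
    "count": len(readings),
    "mean": statistics.mean(readings),
    "median": statistics.median(readings),
    "min": min(readings),
    "max": max(readings),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The mean and the median differ here because one large value pulls the mean upward, which is the kind of detail a graph would also make visible.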

A
3
Q

Representing data visually.
Graphing is done to provide a fast qualitative assessment of our data, which can be useful for understanding results, finding outlier values, examining how numbers are distributed, and so on.

A
4
Q

Sometimes we know ahead of time what kind of graph will be most useful; other times we use graphs in an exploratory way.
To understand the power of data visualisation, consider the following data: the location (x, y) of a self-driving car.
In its raw form, it is hard to see any real patterns.
The mean (average) tells us that its path was centred around x = 0.2 and y = 0.3, and the range of the numbers appears to be between about -2 and 2.
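
The contrast between summary numbers and a picture can be sketched with a few lines of standard-library Python. The points below are an assumption made for this sketch (a circle of radius 2 around the stated centre); the card does not say what shape the car's path actually had. `text_scatter` is a hypothetical helper that draws a rough character-grid scatter plot.

```python
import math
import statistics

# Hypothetical (x, y) positions: 24 points on a circle of radius 2 centred
# on (0.2, 0.3). The circular shape is an assumption made for this sketch.
points = [
    (0.2 + 2 * math.cos(2 * math.pi * i / 24),
     0.3 + 2 * math.sin(2 * math.pi * i / 24))
    for i in range(24)
]

# Summary measures alone: a centre and a range, but no hint of the shape.
mean_x = statistics.mean(x for x, _ in points)
mean_y = statistics.mean(y for _, y in points)
print(f"mean x = {mean_x:.1f}, mean y = {mean_y:.1f}")


def text_scatter(pts, width=41, height=21, lo=-2.5, hi=2.5):
    """Render points as '*' on a character grid covering [lo, hi] on both axes."""
    grid = [[" "] * width for _ in range(height)]
    for x, y in pts:
        col = round((x - lo) / (hi - lo) * (width - 1))
        row = round((hi - y) / (hi - lo) * (height - 1))  # row 0 is the top
        if 0 <= col < width and 0 <= row < height:
            grid[row][col] = "*"
    return "\n".join("".join(line) for line in grid)


plot = text_scatter(points)
print(plot)
```

In practice a plotting library would normally be used for this; the text grid only keeps the sketch free of extra dependencies.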

A
5
Q

Graphs are not limited to 2D scatter plots. They can be used to explore other aspects of your data, for example proportions (with pie charts and stacked bar graphs) and how the data is spread (with histograms and box-and-whisker plots).
Often, when we are trying to understand raw data or results, we might experiment with different types of graphs until we come across one that explains the data in a visually intuitive way.
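
To give a rough idea of what histograms and box-and-whisker plots summarise, the sketch below counts values per bin and computes quartiles using only the standard library. The list `values` is invented for illustration, and the text bars are a stand-in for a real chart.

```python
import statistics
from collections import Counter

# Hypothetical values, invented for illustration only.
values = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 7]

# Histogram: count how many values fall into each bin (here, each integer).
counts = Counter(values)
for value in sorted(counts):
    print(f"{value}: {'#' * counts[value]}")

# Box-and-whisker ingredients: minimum, three quartiles, maximum.
# statistics.quantiles uses its default "exclusive" method here.
q1, q2, q3 = statistics.quantiles(values, n=4)
five_number = (min(values), q1, q2, q3, max(values))
print(five_number)
```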

A
6
Q

Real-world data issues.
Real-world data can contain many different issues that can affect the utility of the data and our interpretation of the results.
It's important to realise that most real-world data are influenced by factors that weren't recorded at the time.
For example, we might have a table of race car track times alongside engine sizes, but various other factors that weren't written down, such as the weather, probably also played a role.
If this is problematic, we can often reduce the influence of these factors by increasing the size of the dataset.
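
The effect of dataset size can be illustrated with a small simulation. This is a sketch under a strong assumption: the unrecorded factors behave like independent random noise around a true value. `TRUE_VALUE`, the noise level, and the function names are all hypothetical. Systematic influences (for example, every run taking place in rain) would not average out this way.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_VALUE = 60.0  # hypothetical "true" lap time in seconds


def sample_mean(n):
    """Average of n measurements, each disturbed by unrecorded random factors."""
    return statistics.fmean(TRUE_VALUE + random.gauss(0, 5) for _ in range(n))


def spread_of_means(n, trials=200):
    """How much the sample mean varies from one dataset of size n to another."""
    return statistics.stdev(sample_mean(n) for _ in range(trials))


small = spread_of_means(10)
large = spread_of_means(1000)
print(f"spread of the mean with 10 rows:   {small:.2f}")
print(f"spread of the mean with 1000 rows: {large:.2f}")
```

The larger datasets give averages that vary far less from one dataset to the next, which is the sense in which more data reduces the influence of random unrecorded factors.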

A
7
Q

In other situations, data points that are clearly outside of what is expected, also known as outliers, can sometimes be safely removed from analysis, although we must take care not to remove data points that provide real insights.
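
One common rule of thumb for flagging candidate outliers is the 1.5 × IQR rule, sketched below. The card does not prescribe this method; it is swapped in here as an illustration, and the list `data` is invented. Flagged points still deserve a look before removal, since they may carry real insight.

```python
import statistics

# Hypothetical measurements; 42.0 is deliberately far from the rest.
data = [9.8, 10.1, 10.0, 9.9, 10.3, 10.2, 9.7, 42.0, 10.0, 10.4]

# Interquartile range (IQR): the width of the middle half of the data.
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Values more than 1.5 * IQR beyond the quartiles are flagged as candidates.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < lower or x > upper]
kept = [x for x in data if lower <= x <= upper]

print("flagged:", outliers)
print("kept:", kept)
```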

A
8
Q

Another common issue in real-world data is bias.
Bias refers to a tendency to select certain types of values more frequently than others, in a way that misrepresents the underlying population, or real world.
Bias can sometimes be identified by exploring data while keeping in mind basic knowledge about where the data came from.

A
9
Q

Real-world data will always have issues, but data scientists can often overcome these issues by:
Checking for missing values and badly recorded data.
Considering removing obvious outliers.
Examining what real-world factors might affect the analysis and determining if the dataset size is large enough to reduce the impact of these factors.
Checking for biased data and considering the options to fix the bias, if found.
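
The first item in the checklist above, checking for missing values and badly recorded data, might be sketched as follows. The records, the column meanings, and the rule that a lap time must be positive are all assumptions chosen for illustration.

```python
import math

# Hypothetical records: (engine_size_litres, lap_time_seconds).
# None and NaN stand for missing values; the negative time is a recording error.
rows = [
    (2.0, 71.3),
    (None, 69.8),
    (3.5, float("nan")),
    (3.0, -5.0),
    (4.0, 66.1),
]


def is_missing(value):
    """True for None or a floating-point NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


# Rows with at least one missing field, and rows with every field present.
missing = [r for r in rows if any(is_missing(v) for v in r)]
complete = [r for r in rows if not any(is_missing(v) for v in r)]

# Badly recorded data: a lap time cannot be zero or negative (assumed rule).
implausible = [r for r in complete if r[1] <= 0]
clean = [r for r in complete if r[1] > 0]

print(f"{len(missing)} rows with missing values")
print(f"{len(implausible)} rows with implausible values")
print("clean rows:", clean)
```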

A
10
Q

Machine learning is a subset of data science that deals with predictive modelling. In other words, machine learning uses data to create predictive models in order to predict unknown values.
We might use machine learning to predict how much food a supermarket needs to order, or to identify plants in photographs.

A
11
Q

Machine learning works by identifying relationships between data values that describe the characteristics of something (its features, such as the height and colour of a plant) and the value we want to predict (the label, such as the species of a plant).
These relationships are built into a model through a training process.
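
A very small illustration of features, labels, and training is a nearest-centroid classifier, sketched below. It is one simple technique chosen for this example, not the method the course uses. The plant measurements, the species names, and the encoding of colour as a hue angle are all invented assumptions.

```python
import math
from collections import defaultdict

# Hypothetical training data: features are (height_cm, hue_degrees),
# the label is the species name. All values are invented.
training = [
    ((10.0, 120.0), "fern"),
    ((12.0, 125.0), "fern"),
    ((11.0, 118.0), "fern"),
    ((30.0, 60.0), "sunflower"),
    ((28.0, 55.0), "sunflower"),
    ((32.0, 62.0), "sunflower"),
]


def train(examples):
    """Training: summarise each label by the average of its feature vectors."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Prediction: pick the label whose centroid is closest to the features."""
    return min(model, key=lambda label: math.dist(model[label], features))


model = train(training)
print(model)
print(predict(model, (29.0, 58.0)))
```

Here the "model" is just a dictionary of average feature values per species, and prediction compares a new plant against those averages. Because the two features are on different scales, a real application would usually rescale them first.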

A