Week 2 Describing a location in a distribution and Normal Distributions Flashcards
Percentile
The pth percentile of a distribution is the value that has p percent of the observations less than it. It shows how a particular data point compares to the rest of the data set; for example, a value at the 80th percentile is greater than 80% of the data points.
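A minimal sketch of this definition, counting the observations strictly less than a value. The function name `percentile_of` and the score list are made up for illustration.

```python
def percentile_of(value, data):
    """Return the percent of observations in data that are strictly less than value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

# Hypothetical test scores, for illustration only.
scores = [55, 60, 62, 70, 71, 75, 80, 84, 90, 95]

# 7 of the 10 scores are below 84, so 84 sits at the 70th percentile.
print(percentile_of(84, scores))  # 70.0
```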
Cumulative Relative Frequency
The total percent of a distribution with values less than or equal to a particular value. It is the running total of relative frequencies, calculated by adding up the relative frequencies of all values less than or equal to that value.
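The running total described here can be sketched as follows. The function name and the small data set are assumptions chosen for illustration; the result is expressed as a proportion rather than a percent.

```python
from collections import Counter
from itertools import accumulate

def cumulative_relative_frequency(data):
    """Map each distinct value to the proportion of observations <= that value."""
    counts = Counter(data)
    values = sorted(counts)
    n = len(data)
    # Running total of counts, taken in increasing order of value.
    running = accumulate(counts[v] for v in values)
    return {v: c / n for v, c in zip(values, running)}

# Hypothetical data, for illustration only.
data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
crf = cumulative_relative_frequency(data)
print(crf)  # {1: 0.1, 2: 0.3, 3: 0.6, 4: 1.0}
```

The last entry is always 1.0, because every observation is less than or equal to the largest value.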
Standardized Value (z-score)
Quantifies how far a data point lies from the mean of its distribution, expressed in units of standard deviation: z = (x - μ) / σ, where x is the data point, μ is the mean, and σ is the standard deviation. A positive z-score means the value is above the mean; a negative z-score means it is below.
Probability: when the distribution is approximately normal, a z-score can be used to find the probability of falling above or below a given data point.
Unusual values: a data point can be considered unusual if its z-score lies beyond a chosen threshold in either direction.
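A short worked example of z = (x - μ) / σ. The numbers (a score of 85, mean 70, standard deviation 10) are hypothetical.

```python
def z_score(x, mu, sigma):
    """Standardize x: its distance from the mean mu in units of sigma."""
    return (x - mu) / sigma

# Hypothetical values: a score of 85 in a distribution with mean 70 and SD 10.
z = z_score(85, 70, 10)
print(z)  # 1.5, i.e. 1.5 standard deviations above the mean

# A value below the mean gives a negative z-score.
z_low = z_score(55, 70, 10)
print(z_low)  # -1.5
```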
Transformation
Performing the same mathematical operation(s) on every observation in a distribution, then using the transformed numbers in the analysis or statistical test. Some transformations, such as taking logarithms, change the shape of a distribution or relationship. The scale that is most familiar or easiest to measure is not always the most informative for analyzing and interpreting the data; in such situations, transforming the data can make it more informative.
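As an illustration, the sketch below applies one operation (a base-10 logarithm, chosen here as an example) to every observation. The data values are hypothetical and deliberately span several orders of magnitude.

```python
import math

# Hypothetical, strongly right-skewed data, for illustration only.
incomes = [10, 100, 1000, 10000, 100000]

# Apply the same operation to every observation.
logged = [math.log10(x) for x in incomes]

# On the log scale the values are evenly spaced instead of bunched near zero.
print([round(v, 6) for v in logged])  # [1.0, 2.0, 3.0, 4.0, 5.0]
```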
Density Curve
A curve that is always on or above the horizontal (x) axis and has an area of exactly 1 underneath it. It is a graphical representation of a continuous probability distribution: the area under the curve over a range of values is the probability that the random variable falls in that range, and the total area of 1 represents 100% probability. As a smooth curve, it shows the overall shape of the distribution.
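A rough numerical check of both properties, using a made-up triangular density f(x) = x/2 on the interval from 0 to 2. The density, the helper `area_under`, and the step count are all assumptions for illustration; the area is approximated with the trapezoid rule.

```python
def density(x):
    """Hypothetical density: rises linearly from 0 at x=0 to 1 at x=2, and is 0 elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0

def area_under(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the trapezoid rule."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width

total_area = area_under(density, 0, 2)  # close to 1: the whole curve
p_0_to_1 = area_under(density, 0, 1)    # close to 0.25: P(0 <= X <= 1)
print(round(total_area, 4), round(p_0_to_1, 4))
```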
Normal Distribution / Normal Curve
A particular class of distributions that are symmetric, single-peaked, and have a characteristic bell shape. Most data points cluster around the mean, with progressively fewer falling farther from the mean on either side. Also known as a Gaussian distribution or bell curve.
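The symmetry and single peak can be checked with Python's standard-library `statistics.NormalDist`. The mean of 100 and standard deviation of 15 are hypothetical values picked for the sketch.

```python
from statistics import NormalDist

# Hypothetical normal distribution with mean 100 and standard deviation 15.
dist = NormalDist(mu=100, sigma=15)

height_at_mean = dist.pdf(100)
height_below = dist.pdf(85)   # one standard deviation below the mean
height_above = dist.pdf(115)  # one standard deviation above the mean

# Symmetric: equal heights at equal distances on either side of the mean.
print(abs(height_below - height_above) < 1e-12)  # True
# Single peaked: the curve is highest at the mean.
print(height_at_mean > height_above)  # True
```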
Standard Normal Distribution
A normal distribution with a mean of 0 and a standard deviation of 1. Converting values from any normal distribution to z-scores (the number of standard deviations a value lies from the mean) yields values that follow this distribution, which allows easy comparison of data from different normal distributions.
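A sketch of comparing values from two different normal distributions by converting each to a z-score and looking it up on the standard normal distribution. The two scales and all numbers are hypothetical; `statistics.NormalDist` is from the Python standard library.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)

# Hypothetical scale A: mean 500, SD 100. Hypothetical scale B: mean 21, SD 5.
z_a = (650 - 500) / 100  # 1.5 standard deviations above the mean
z_b = (27 - 21) / 5      # 1.2 standard deviations above the mean

# Proportion of each distribution below the value, read off the standard normal.
p_a = standard.cdf(z_a)
p_b = standard.cdf(z_b)
print(round(p_a, 4), round(p_b, 4))  # 0.9332 0.8849

# The same proportion results from using the original distribution directly.
p_a_direct = NormalDist(mu=500, sigma=100).cdf(650)
```

Under these assumed numbers, the value on scale A is the stronger result, since it sits further above its own mean in standard deviation units.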