Summarize Data Flashcards
Count the number of rows for each unique value of a variable.
df['w'].value_counts()
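A minimal sketch of the card above, assuming a small made-up DataFrame; only the column name `w` comes from the card, the data is hypothetical.

```python
import pandas as pd

# Hypothetical data; the column name "w" follows the card above.
df = pd.DataFrame({"w": ["a", "b", "a", "c", "a", "b"]})

# One row per distinct value, sorted by count in descending order.
counts = df["w"].value_counts()
print(counts)
```

The result is a Series indexed by the distinct values, so `counts["a"]` looks up the count for a single value.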
Number of rows in a DataFrame.
len(df)
Number of distinct values in a column.
df['w'].nunique()
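The row count and the distinct count treat missing values differently, which a short sketch can show. The DataFrame here is hypothetical; `dropna` is a real parameter of `Series.nunique`.

```python
import pandas as pd

# Hypothetical data with one missing value in column "w".
df = pd.DataFrame({"w": ["a", "b", "a", None]})

n_rows = len(df)                                     # counts every row, missing or not
n_distinct = df["w"].nunique()                       # missing values are ignored by default
n_distinct_with_na = df["w"].nunique(dropna=False)   # count the missing value as its own category

print(n_rows, n_distinct, n_distinct_with_na)
```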
Basic descriptive statistics for each column (or GroupBy)
df.describe()
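As an illustration of both uses named on the card (whole DataFrame and GroupBy), here is a sketch on made-up data; the column names `x` and `g` are assumptions.

```python
import pandas as pd

# Hypothetical data: one numeric column and one grouping column.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "g": ["a", "a", "b", "b"]})

# By default only numeric columns are summarized
# (count, mean, std, min, quartiles, max).
summary = df.describe()

# The same statistics computed separately for each group.
by_group = df.groupby("g")["x"].describe()

print(summary)
print(by_group)
```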
Return the sum of the values for the requested axis
DataFrame.sum(axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)
Count non-NA cells for each column or row.
DataFrame.count(axis=0, level=None, numeric_only=False)
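A small sketch of how `sum` and `count` behave around missing values, assuming a hypothetical two-column DataFrame. Note that the signatures on these cards appear to come from an older pandas release; the `level` argument was removed from these reductions in pandas 2.0, so the sketch avoids it.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column "a".
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [10.0, 20.0, 30.0]})

col_sums = df.sum()              # per column; NaN is skipped by default
row_sums = df.sum(axis=1)        # per row
non_na = df.count()              # number of non-NA cells per column
strict = df.sum(min_count=3)     # NaN unless at least 3 non-NA values are present

print(col_sums, row_sums, non_na, strict, sep="\n\n")
```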
Return the median of the values for the requested axis
DataFrame.median(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
Return values at the given quantile over requested axis, a la numpy.percentile.
DataFrame.quantile(q=0.5, axis=0, numeric_only=True, interpolation='linear')
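The median is the 0.5 quantile, and the `interpolation` argument decides what happens when the quantile falls between two data points. A sketch on hypothetical data:

```python
import pandas as pd

# Hypothetical data with an even number of values, so the median
# falls between 2 and 3.
df = pd.DataFrame({"x": [1, 2, 3, 4]})

med = df.median()                                 # midpoint of the two middle values
q50 = df.quantile(0.5)                            # same result with linear interpolation
q50_lower = df.quantile(0.5, interpolation="lower")  # pick the lower neighbor instead
quartiles = df.quantile([0.25, 0.75])             # a list of q returns a DataFrame

print(med, q50, q50_lower, quartiles, sep="\n\n")
```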
This method returns the minimum of the values in the object.
DataFrame.min(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
This method returns the maximum of the values in the object.
DataFrame.max(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
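For `min` and `max`, the `axis` argument selects whether the reduction runs down each column or across each row. A sketch with hypothetical data:

```python
import pandas as pd

# Hypothetical data; column names are illustrative only.
df = pd.DataFrame({"a": [3, 1, 2], "b": [9, 7, 8]})

col_min = df.min()          # smallest value in each column
col_max = df.max()          # largest value in each column
row_max = df.max(axis=1)    # largest value in each row

print(col_min, col_max, row_max, sep="\n\n")
```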
Return the mean of the values for the requested axis
DataFrame.mean(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
Return unbiased variance over requested axis.
DataFrame.var(axis=None, skipna=None, level=None, ddof=1, numeric_only=None, **kwargs)
Return sample standard deviation over requested axis.
DataFrame.std(axis=None, skipna=None, level=None, ddof=1, numeric_only=None, **kwargs)
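"Unbiased" and "sample" on the last two cards both refer to the default `ddof=1`, which divides by n - 1 rather than n. The sketch below contrasts that with `ddof=0` on hypothetical data chosen so the population values come out as round numbers.

```python
import pandas as pd

# Hypothetical data: 8 values with mean 5 and squared deviations summing to 32.
df = pd.DataFrame({"x": [2, 4, 4, 4, 5, 5, 7, 9]})

mean = df["x"].mean()
sample_var = df["x"].var()               # ddof=1: 32 / (8 - 1)
population_var = df["x"].var(ddof=0)     # ddof=0: 32 / 8
sample_std = df["x"].std()               # square root of sample_var
population_std = df["x"].std(ddof=0)     # square root of population_var

print(mean, sample_var, population_var, sample_std, population_std)
```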