Class 03 - 2016-09-26 - Descriptive Stats
Notes on Slides
- Best practice for describing data: report at least one measure of each of:
LOCATION
- The center of the data
- mean: The average. Outliers heavily influence this stat
- x-bar for sample
- mu for population
- median: The middle value of the sorted dataset. Use when a typical, representative value is desired, since outliers barely affect it.
- mode: Most frequent measurement
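A minimal sketch of the three location measures, using Python's standard `statistics` module. The dataset is hypothetical, made up here to show how a single outlier pulls the mean while leaving the median and mode alone.

```python
import statistics

# Hypothetical sample: nine typical values plus one outlier (60).
data = [4, 5, 5, 6, 6, 6, 7, 7, 8, 60]

mean = statistics.mean(data)      # x-bar: pulled upward by the outlier
median = statistics.median(data)  # middle of the sorted data: barely affected
mode = statistics.mode(data)      # most frequent measurement

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

Dropping the 60 from the list moves the mean down to 6.0, while the median and mode stay at 6.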
DISPERSION
- Relative to what? The center? A past time period?
- Range: Delta, max-min
- Variance: Measured in squared units (the square of the data's units)
- Standard deviation: sqrt of the variance
- s for sample
- sigma for population
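The dispersion measures above can be sketched with the standard `statistics` module. The data values are hypothetical. The sample versions (s) divide by n - 1 and the population versions (sigma) divide by n, which is why the two differ.

```python
import statistics

# Hypothetical measurements.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)        # range: max - min

sample_var = statistics.variance(data)    # s^2: divides by n - 1
sample_sd = statistics.stdev(data)        # s: sqrt of the sample variance

pop_var = statistics.pvariance(data)      # sigma^2: divides by n
pop_sd = statistics.pstdev(data)          # sigma: sqrt of the population variance

print("range:", data_range)
print("sample variance, stdev:", sample_var, sample_sd)
print("population variance, stdev:", pop_var, pop_sd)
```

For the same data, s is always a little larger than sigma, and the gap shrinks as n grows.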
Relative Measures of Location
- using x-bar and s
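- Chebyshev Rule: For any distribution, at least 75% of data falls within 2 stdevs of the mean (in general, at least 1 - 1/k^2 within k stdevs, for k > 1)
- Empirical Rule: If Normal distribution (bell shaped), about 95% of data falls within 2 stdevs of the mean
- 1-sigma = about 68% of all data points
- 2-sigma = about 95% of all data points
- 3-sigma = about 99.7% of all data points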
- To visualize, use a histogram, dotscale, or box plot
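A rough sketch comparing the two rules, assuming simulated normal data. The mean of 100, stdev of 15, sample size, and seed are arbitrary choices for illustration, and the helper names `chebyshev_bound` and `fraction_within` are hypothetical. Chebyshev gives a guaranteed minimum for any distribution, and the observed fractions for bell-shaped data should land near the Empirical Rule percentages.

```python
import random
import statistics

def chebyshev_bound(k):
    """Minimum fraction of data within k stdevs of the mean (any distribution, k > 1)."""
    return 1 - 1 / k**2

def fraction_within(data, k):
    """Observed fraction of values within k sample stdevs (s) of x-bar."""
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)
    inside = sum(1 for x in data if abs(x - x_bar) <= k * s)
    return inside / len(data)

random.seed(0)  # fixed seed so the run is repeatable
normal_data = [random.gauss(100, 15) for _ in range(10_000)]

for k in (1, 2, 3):
    print(f"within {k} stdev: {fraction_within(normal_data, k):.3f}")

print("Chebyshev minimum for k=2:", chebyshev_bound(2))
print("Chebyshev minimum for k=3:", round(chebyshev_bound(3), 3))
```

The Chebyshev minimum is much lower than the normal-data fractions because it has to hold for every possible shape, including heavily skewed data.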
Skewness
- Symmetrical, positive (longer right tail), or negative (longer left tail)
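One quick way to check the direction of skew is to compare the mean with the median. The sketch below uses Pearson's second skewness coefficient, 3 * (mean - median) / s, which is a technique brought in here for illustration and not one named in the notes. The datasets and the helper name `pearson_skew` are hypothetical.

```python
import statistics

def pearson_skew(data):
    """Pearson's second skewness coefficient: 3 * (mean - median) / s."""
    x_bar = statistics.mean(data)
    median = statistics.median(data)
    s = statistics.stdev(data)
    return 3 * (x_bar - median) / s

right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 20]  # long tail on the high side
left_skewed = [-x for x in right_skewed]      # mirror image: long tail on the low side
symmetric = [1, 2, 3, 4, 5]

print("right skewed:", round(pearson_skew(right_skewed), 3))  # positive
print("left skewed:", round(pearson_skew(left_skewed), 3))    # negative
print("symmetric:", pearson_skew(symmetric))                  # zero
```

A positive result means the mean sits above the median (positive skew), a negative result means it sits below (negative skew).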
Making Comparisons
- Descriptive Statistics:
- Mean, median, mode, min, max, range, stdev, coef. of variation, count
- Five number summaries / boxplot
- Shows the skewness of the data
- Min, 1st quartile, MEDIAN (not mean), 3rd quartile, maximum
- Coeff. of variation
- measuring relative variability between datasets
- (STDEV / sample mean) * 100 = the stdev as a percent of the sample mean
- Bollinger band: moving average +/- 2 stdev
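A minimal sketch of the five number summary and the coefficient of variation, using the standard `statistics` module. The datasets and helper names are hypothetical. Quartile conventions differ between tools, and the "inclusive" method used here is one assumption among several, so Q1 and Q3 may not match a spreadsheet or textbook exactly.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max). Quartile method is an assumed convention."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)

def coefficient_of_variation(data):
    """Sample stdev as a percent of the sample mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100

# Two hypothetical datasets with the same stdev but very different means.
small_scale = [10, 20, 30]
large_scale = [1000, 1010, 1020]

print("five number summary:", five_number_summary([2, 4, 4, 4, 5, 5, 7, 9]))
print("CV small scale:", round(coefficient_of_variation(small_scale), 2), "%")
print("CV large scale:", round(coefficient_of_variation(large_scale), 2), "%")
```

Both datasets have a stdev of 10, but the coefficient of variation shows that the spread is large relative to a mean of 20 and small relative to a mean of 1010, which is why it suits comparisons between datasets on different scales.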