Class 03 - 2016-09-26 - Descriptive Stats


Notes on Slides

  • Best practice for describing data: report at least one measure of each of the following:

LOCATION

  • The center of the data
  1. mean: The arithmetic average. Heavily influenced by outliers.
    • x-bar for sample
    • mu for population
  2. median: The middle value of the sorted dataset. Preferred when a typical, representative value is desired, because outliers barely affect it.
  3. mode: Most frequent measurement
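
A minimal sketch of the three measures of location, using Python's standard `statistics` module. The dataset is hypothetical and chosen to include one outlier, to show how the mean shifts while the median and mode do not.

```python
import statistics

# Hypothetical sample with one outlier (100)
data = [2, 3, 3, 5, 7, 100]

mean = statistics.mean(data)      # sum is 120, n is 6
median = statistics.median(data)  # average of the two middle values (3 and 5)
mode = statistics.mode(data)      # 3 is the only value that appears twice

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

Here the single outlier pulls the mean to 20, far above most of the data, while the median stays at 4.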

DISPERSION

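  • Spread relative to what? The center? A comparison with a past time period?
  1. Range: The difference (delta) between the largest and smallest values, max - min
  2. Variance: The average squared deviation from the mean. Measured in squared units of the data
  3. Standard deviation: Square root of the variance, so it is in the same units as the data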
    • s for sample
    • sigma for population
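
The measures of dispersion above can be sketched with the standard `statistics` module. The data values are hypothetical. Note the assumption built into the function names: `pvariance`/`pstdev` divide by n (population, sigma), while `variance`/`stdev` divide by n - 1 (sample, s).

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements, mean is 5

data_range = max(data) - min(data)     # max - min

pop_var = statistics.pvariance(data)   # population variance, divides by n
pop_sd = statistics.pstdev(data)       # sigma, square root of pop_var

samp_var = statistics.variance(data)   # sample variance, divides by n - 1
samp_sd = statistics.stdev(data)       # s, square root of samp_var

print("range:", data_range)
print("population variance and sigma:", pop_var, pop_sd)
print("sample variance and s:", samp_var, samp_sd)
```

For the same data, s is always slightly larger than sigma because of the n - 1 divisor.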

Relative Measures of Location

  • using x-bar and s
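  • Chebyshev Rule: For any distribution, at least 75% of data falls within 2 stdevs of the mean (in general, at least 1 - 1/k^2 within k stdevs, for k > 1)
  • Empirical Rule: If the distribution is approximately Normal (bell shaped), about 95% of data falls within 2 stdevs of the mean
    • 1-sigma = about 68% of all data points
    • 2-sigma = about 95% of all data points
    • 3-sigma = about 99.7% of all data points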
  • To visualize, use a histogram, dot-scale diagram (dot plot), or box plot
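
As a rough check of the two rules, the sketch below counts how much of a dataset lies within k standard deviations of the mean. Both datasets are simulated with a fixed seed and are purely illustrative; `fraction_within` and `chebyshev_bound` are hypothetical helper names, not part of any library.

```python
import random
import statistics

def fraction_within(data, k):
    """Fraction of points within k sample stdevs (s) of the sample mean (x-bar)."""
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)
    inside = sum(1 for x in data if abs(x - x_bar) <= k * s)
    return inside / len(data)

def chebyshev_bound(k):
    """Minimum fraction within k stdevs guaranteed by Chebyshev's rule (k > 1)."""
    return 1 - 1 / k**2

random.seed(0)
normal_data = [random.gauss(50, 10) for _ in range(10_000)]      # bell shaped
skewed_data = [random.expovariate(1.0) for _ in range(10_000)]   # not bell shaped

for k in (1, 2, 3):
    print(k, "stdev, normal data:", round(fraction_within(normal_data, k), 3))

print("Chebyshev bound for k=2:", chebyshev_bound(2))
print("skewed data within 2 stdev:", round(fraction_within(skewed_data, 2), 3))
```

The Normal sample should land near the 68 / 95 / 99.7 figures, while the skewed sample only has to satisfy the weaker Chebyshev guarantee.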

Skewness

  • A distribution is symmetrical, positively skewed (long right tail), or negatively skewed (long left tail)
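
The notes do not give a formula for skewness. One common choice, assumed here, is the moment coefficient of skewness: the third central moment divided by the second central moment raised to the power 1.5. The helper name `skewness` and the datasets are hypothetical.

```python
import statistics

def skewness(data):
    """Moment coefficient of skewness (an assumed definition, one of several)."""
    n = len(data)
    x_bar = statistics.fmean(data)
    m2 = sum((x - x_bar) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - x_bar) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 20]       # long right tail
left_skewed = [-20, -3, -2, -2, -1]   # long left tail

print("symmetric:", skewness(symmetric))
print("right skewed:", skewness(right_skewed))
print("left skewed:", skewness(left_skewed))
```

A value near zero suggests symmetry, a positive value a long right tail, and a negative value a long left tail.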

Making Comparisons

  • Descriptive Statistics:
    • Mean, median, mode, min, max, range, stdev, coef. of variation, count
  • Five number summaries / boxplot
    • Indicate the skewness (compare the two halves of the box and the two whiskers)
    • Min, 1st quartile, MEDIAN (not mean), 3rd quartile, maximum
  • Coeff. of variation (CV)
    • Measures relative variability, so datasets with different units or scales can be compared
    • CV = (STDEV / sample mean) * 100 = the stdev as a percent of the sample mean
  • Bollinger bands: a moving average +/- 2 stdev
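
A minimal sketch of the five-number summary and the coefficient of variation, using the standard `statistics` module. Quartile conventions differ between textbooks and tools; this sketch assumes the "inclusive" method of `statistics.quantiles`. The helper names and all data values are hypothetical.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, med, q3, max(data)

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the sample mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100

scores = [1, 3, 5, 7, 9, 11, 13, 15, 40]  # hypothetical, one large value
low, q1, med, q3, high = five_number_summary(scores)
print("five-number summary:", low, q1, med, q3, high)
# The upper whisker (high - q3) is much longer than the lower whisker
# (q1 - low), which suggests positive skew.

small_scale = [10, 20, 30]      # mean 20, s = 10
large_scale = [100, 110, 120]   # mean 110, s = 10
print("CV small scale:", round(coefficient_of_variation(small_scale), 1))
print("CV large scale:", round(coefficient_of_variation(large_scale), 1))
```

Both small datasets have the same standard deviation, yet the first varies far more relative to its mean, which is the comparison the coefficient of variation is meant to expose.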