11  Planned Outline

Here we show some advanced uses of statistics from different domains.

11.1 Advanced statistical use cases

  • TableOne summary statistics
    • Univariate/One-variable case
    • Bivariate / Two-variable case
  • Use of logistic regression
    • Odds ratio
  • Use of multivariable linear regression
    • Interpretation of coefficients

11.2 Understanding data distribution

  • Shape of data
  • Examples of various continuous distributions
  • Visualization of continuous data
  • Visualizing discrete and categorical data

11.3 Describing data distribution graphically

  • Visualizing categorical data
    • Bar plot
  • Visualizing numerical data
    • Dot plot / point plot
    • Histogram
    • Density plot
    • Scatterplot for bivariate numerical data

11.4 Describing data distribution numerically

  • Measures of center
    • Also known as measures of location
    • Also known as measures of central tendency
  • Understanding variation in data
    • Concept of dispersion (the idea of deviation from the center)
    • Measures of dispersion
      • Range
      • Variance
      • Standard deviation
  • Creating data summary with boxplot
  • Measures of relative standing
    • Z-score
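
The numerical summaries listed in this section can be sketched in a few lines. This is a minimal illustration using Python's standard `statistics` module; the data values and the helper name `z_score` are hypothetical, and the population formulas (dividing by n) are assumed here. `statistics.variance` and `statistics.stdev` would give the sample versions (dividing by n - 1) instead.

```python
import statistics

# Hypothetical sample values, chosen only for illustration
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)            # measure of center
data_range = max(data) - min(data)      # range: largest minus smallest value
variance = statistics.pvariance(data)   # population variance (divides by n)
std_dev = statistics.pstdev(data)       # population standard deviation


def z_score(x, mu, sigma):
    """Relative standing of x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma


z_of_nine = z_score(9, mean, std_dev)
print(mean, data_range, variance, std_dev, z_of_nine)
```

A positive z-score marks a value above the mean, a negative one a value below it.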

11.5 Understanding relationship between two or more variables

  • Correlation between two numerical variables
  • Linear regression to assess the relationship of one outcome (a.k.a. dependent) variable with one or more predictor (a.k.a. independent) variables

11.6 Foundations of Probability*

  • Interpretation of probability
  • Calculation of probability
  • Probability of complex events (union, intersection, complements)
  • Conditional probability

11.7 Random variable and its probability distribution

  • What is a random variable?
  • Probability distribution of discrete random variable
    • Examples such as
      • Binomial distribution
      • Geometric distribution
      • Poisson distribution
  • Probability distribution of continuous random variable
    • Examples such as
      • Normal distribution
      • Gamma distribution
      • t-distribution
      • Chi-square distribution

11.8 Making decisions from data (Statistical Inference)

  • Concepts
    • Sample and Population
    • Statistic (a quantity calculated from sample)
    • Parameter (a quantity calculated from the entire population)
  • How do statisticians make decisions by studying only a fraction of the data?
  • A five-step process for making decisions (statistical inference)

11.9 Sampling Distribution

  • Distribution of a statistic (a.k.a. sampling distribution)
  • Distribution of sample mean (simulation study)
  • Distribution of sample proportion (simulation study)
  • Central Limit Theorem (CLT)
  • How sampling distribution supports learning from data
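
A simulation study of the sample mean might look like the following sketch. The population (an exponential distribution with mean 1), the sample sizes, the number of replications, and the helper name `sample_means` are all assumptions made for illustration, not choices taken from the outline.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable


def sample_means(n, reps):
    """Draw `reps` samples of size n from Exponential(1) and return their means."""
    return [
        statistics.mean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]


means_small = sample_means(n=5, reps=2000)
means_large = sample_means(n=50, reps=2000)

# The sample means center on the population mean (1.0), and their spread
# shrinks as n grows, roughly like 1 / sqrt(n).
print(statistics.mean(means_large))
print(statistics.stdev(means_small), statistics.stdev(means_large))
```

Plotting histograms of `means_small` and `means_large` would show the second looking much closer to a bell curve, which is the behavior the Central Limit Theorem describes.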

11.10 Estimation of a Population Proportion

  • Estimation of population proportion
  • Margin of Error
  • Confidence Interval for a population proportion
  • Sample size calculation*
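
The estimation steps in this section can be sketched as below. The sketch assumes the large-sample (Wald) interval with z = 1.96 for 95% confidence; the function names and the counts (60 successes out of 100) are hypothetical.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Point estimate, margin of error, and Wald confidence interval."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)


def sample_size_for_margin(margin, p_guess=0.5, z=1.96):
    """Smallest n whose margin of error is at most `margin`.

    p_guess = 0.5 is the conservative choice: it maximizes p * (1 - p).
    """
    return math.ceil(z ** 2 * p_guess * (1 - p_guess) / margin ** 2)


p_hat, margin, interval = proportion_ci(successes=60, n=100)
n_needed = sample_size_for_margin(0.05)
print(p_hat, margin, interval, n_needed)
```

The Wald interval is only one option; it can behave poorly when n is small or the proportion is close to 0 or 1.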

11.11 Hypothesis testing for population proportion

  • Hypothesis testing
    • Hypotheses and drawing a conclusion after the procedure
    • Errors in hypothesis testing
    • Steps to carrying out a hypothesis test
  • Hypothesis testing (answering questions about a population proportion)
    • Test for a single population proportion
    • Test for the difference between two population proportions
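
As a rough illustration of the test for a single population proportion, the sketch below runs a two-sided large-sample z-test. The function names, the null value 0.5, and the counts are hypothetical; the normal CDF is built from `math.erf` so that no third-party library is needed.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0 against Ha: p != p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value


z, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)
print(z, p_value)
```

A small p-value (for example, below a pre-chosen significance level of 0.05) would lead to rejecting H0.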

11.12 Hypothesis test for population mean

  • Test for a single population mean
  • Test for the difference between two population means

11.13 Making decisions from categorical data

  • Chi-square test for univariate categorical data
  • Test for independence and test for homogeneity in a contingency table

11.14 Comparing Risk in Two Populations

  • Risk difference
  • Relative Risk
  • Odds ratio
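
The three measures above can all be computed from a 2x2 table. In this sketch the cell labels (a, b, c, d), the function name, and the counts are hypothetical: a and b are exposed subjects with and without the event, c and d are unexposed subjects with and without the event.

```python
def risk_measures(a, b, c, d):
    """Compare risk between an exposed group (a, b) and an unexposed group (c, d)."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "risk_difference": risk_exposed - risk_unexposed,
        "relative_risk": risk_exposed / risk_unexposed,
        "odds_ratio": (a * d) / (b * c),  # (a / b) divided by (c / d)
    }


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed had the event
measures = risk_measures(a=20, b=80, c=10, d=90)
print(measures)
```

With these counts the odds ratio is larger than the relative risk; the two are close only when the event is rare in both groups.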

References