
Statistics - Day 6

  • Writer: supriyamalla
  • Jul 2, 2021

Inferential Statistics


In statistics, "distribution" usually means a probability distribution, such as the normal, binomial, or uniform distribution.


Discrete uniform distribution - all outcomes have an equal chance of occurring (for example, a single roll of a fair die).


Caution: A distribution is not the graph itself. The graph is just a visual representation of it.

Fun fact: When you roll two dice together, a sum of 7 is the most likely outcome. 6 of the 36 equally likely combinations add up to 7, so its probability is 6/36 = 1/6.
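
The claim about 7 can be checked by listing every combination of two dice. This is a small sketch that assumes two fair six-sided dice; the variable names are just for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice (assumption).
outcomes = list(product(range(1, 7), repeat=2))

# Count how many outcomes produce each sum (2 through 12).
sum_counts = Counter(a + b for a, b in outcomes)

# Convert the counts to exact probabilities.
sum_probs = {
    total: Fraction(count, len(outcomes))
    for total, count in sum_counts.items()
}

# The sum with the largest probability.
most_likely_sum = max(sum_probs, key=sum_probs.get)

for total in sorted(sum_probs):
    print(total, sum_probs[total])
print("most likely sum:", most_likely_sum)
```

The probabilities rise from 2 up to 7 and fall again towards 12, which is why the bar chart of two-dice sums looks like a triangle.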


Normal distribution (Gaussian distribution/bell curve)

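To standardize a normally distributed variable, convert each value to z = (x - mean) / std deviation (also called the "z-score").

This way the mean becomes 0 and the std deviation becomes 1 (the standard normal distribution), which makes it easier to make predictions. Note that standardizing only shifts and rescales the data; it does not turn a non-normal distribution into a normal one.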
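
Here is a minimal sketch of the (x - mean) / std deviation step. The `standardize` helper and the height values are made up for illustration, and the population std deviation is used.

```python
import statistics


def standardize(values):
    """Return the z-score (x - mean) / std deviation for each x."""
    mean = statistics.fmean(values)
    std_dev = statistics.pstdev(values)  # population std deviation
    return [(x - mean) / std_dev for x in values]


# Made-up illustrative data, not from a real data set.
heights_cm = [150.0, 160.0, 165.0, 170.0, 180.0]

z_scores = standardize(heights_cm)
print("z-scores:", z_scores)
print("mean of z-scores:", statistics.fmean(z_scores))
print("std deviation of z-scores:", statistics.pstdev(z_scores))
```

After standardizing, the z-scores have a mean of 0 and a std deviation of 1 (up to floating point rounding), whatever the original units were.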


Central Limit Theorem

First watch this video on YouTube by Khan Academy:




What it essentially means: if we repeatedly draw samples of a large enough size and plot the frequency distribution of their means, it will approximately form a bell curve (a normal distribution), even if the original data is not normally distributed.


"Central Limit Theorem suggests that if you randomly draw a sample of your customers, say 1000 customers, this sample itself might not be normally distributed. But if you now repeat the experiment say 100 times, then the 100 means of those 100 samples (of 1000 customers) will make up a normal distribution." - article on Medium by Sujeewa Kumaratunga PhD


Standard error: the standard deviation of the distribution formed by the sample means. (std deviation/root n, where n is the sample size)
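
To see the Central Limit Theorem and the standard error together, here is a small simulation sketch. The population (fair die rolls), the sample size, and the number of samples are all assumptions picked for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population: rolls of a fair die, which is uniform, not bell-shaped.
faces = [1, 2, 3, 4, 5, 6]
population_mean = statistics.fmean(faces)
population_std = statistics.pstdev(faces)

sample_size = 50    # n, observations per sample
num_samples = 2000  # how many samples we draw

# Draw many samples and keep only the mean of each one.
sample_means = [
    statistics.fmean(random.choices(faces, k=sample_size))
    for _ in range(num_samples)
]

# Standard error from the formula vs. the spread actually observed.
theoretical_se = population_std / sample_size ** 0.5
observed_se = statistics.pstdev(sample_means)

# Share of sample means within one standard error of the population mean.
within_one_se = sum(
    abs(m - population_mean) <= theoretical_se for m in sample_means
) / num_samples

print(f"mean of sample means: {statistics.fmean(sample_means):.3f}")
print(f"observed standard error: {observed_se:.3f}")
print(f"std deviation/root n: {theoretical_se:.3f}")
print(f"share within one standard error: {within_one_se:.2f}")
```

The observed spread of the sample means should land close to std deviation/root n. For a bell curve roughly 68% of values fall within one std deviation of the mean, so the last number should be in that neighbourhood, although the discrete die values can shift it a little.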


Estimator and Estimates:

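  1. Point estimates - a single number, e.g. the sample mean, variance, or std deviation

  2. Confidence interval estimates - a range of values that is likely to contain the true population value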


How to calculate a confidence interval?

  1. First calculate the sample mean (x).

  2. Then write down the std deviation of the population (this method assumes it is known).

  3. Calculate the std error (std dev/root n).

  4. Then find the z-score. Let's say the confidence level is 90%, so alpha is 10%, i.e. 0.1. Divide that by 2 (0.05) and subtract it from 1: 1 - 0.05 = 0.95. Now look in the z-score table for the cell whose value is about 0.95. It sits at row 1.6 and column 0.05; add those together to get 1.65. (0.95 actually falls between the cells for 1.64 and 1.65, so 1.645 is often used.)

  5. Then calculate the interval: from x - zscore*std error to x + zscore*std error.
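
The steps above can be sketched in a few lines of Python. The helper name `z_confidence_interval` and the input numbers are made up for illustration. Instead of a printed z-score table, the sketch uses `NormalDist().inv_cdf` from Python's standard library (3.8+) to look up the z-score.

```python
from statistics import NormalDist


def z_confidence_interval(sample_mean, population_std, n, confidence=0.90):
    """Confidence interval for the mean when the population std deviation is known."""
    alpha = 1 - confidence
    # z-score whose cumulative probability is 1 - alpha/2 (0.95 for 90% confidence).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    std_error = population_std / n ** 0.5
    margin = z * std_error
    return sample_mean - margin, sample_mean + margin


# z-score for a 90% confidence level, replacing the table lookup.
z_90 = NormalDist().inv_cdf(0.95)
print(f"z-score for 90% confidence: {z_90:.3f}")

# Made-up example inputs: sample mean 100, population std deviation 15, n = 36.
low, high = z_confidence_interval(
    sample_mean=100.0, population_std=15.0, n=36, confidence=0.90
)
print(f"90% confidence interval: ({low:.2f}, {high:.2f})")
```

With these inputs the std error is 15/6 = 2.5, so the interval is the sample mean plus or minus about 1.645 * 2.5.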


Student's T distribution:


More on this tomorrow!

This one is for all The Office fans out there! :)




