## Archive for August, 2010

## It’s A Math, Math World (The Substitution Principle)

This week’s blog post is a little different from those of the past few weeks. It is a little less math-intensive, but still very relevant. I will introduce a concept in mathematics that extends from Algebra and Geometry into higher math. It is a concept that eludes some of my best tutoring students. It is the concept of *The Substitution Principle*.

Basically, the substitution principle states, in plain English, that:

**If quantities have the same value then they are interchangeable.**

Sometimes the wording is different or the concept is clouded by the details of the problem, but it merely states that if two quantities are equal, in numeric or algebraic value, then one can take the place of the other in your calculation or mathematical proof.

Think about it in terms of an equation:

**Example 1:**

Given: x + y = **z**

**z** = 5

Conclusion: substitute 5 for z in the first equation to get

x + y = 5

**Example 2:**

a = b and a = f

We can conclude that **b = f**

**Example 3 (from Geometry):**

A = πr²

A = 18πy

We can conclude that πr² = 18πy, and we can solve for r or y.
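A quick numeric check of Example 3 in Python (the value y = 2 is hypothetical, chosen for illustration): substituting one expression for the area A into the other lets us solve for r.

```python
import math

# Two expressions for the same area A: A = pi*r**2 and A = 18*pi*y.
# By the substitution principle, pi*r**2 = 18*pi*y, so r = sqrt(18*y).
y = 2  # hypothetical value, chosen for illustration
r = math.sqrt(18 * y)

# Check: both expressions for A agree once we substitute.
A1 = math.pi * r**2
A2 = 18 * math.pi * y
print(r, A1, A2)
```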

**Example 4 (from a geometric proof):**

Given: ∠1 complementary to ∠3 (i.e., m∠1 + m∠3 = 90)

∠2 complementary to ∠3 (i.e., m∠2 + m∠3 = 90)

Conclusion: m∠1 + m∠3 = m∠2 + m∠3

m∠1 = m∠2

As you can see, this principle has many useful applications.

*Like what you read? Get blogs delivered right to your inbox as I post them so you can start standing out in your job and career. There is no better way to learn or review college-level stats topics than by reading* It’s A Math, Math World.

## It’s A Math, Math World (Hypothesis Tests)

Today’s blog post is really targeted as a “Statistics 101” for the early statistics student or non-statistician. Last week, we summarized the Central Limit Theorem and Confidence Intervals. This week we look at more inferential statistics, particularly hypothesis tests. Let me know your opinions!

Another way to see whether 2 population means differ is to use a hypothesis test. To begin, we formulate a statement, or hypothesis, that the two means do not differ, and use the sample data to determine whether the collected data is consistent or inconsistent with this hypothesis.

We define the *Null hypothesis* as H0: **u1 = u2** (no difference)

And the *Alternative hypothesis* as H1: **u1 ≠ u2** (two-tailed test)

The corresponding one-tailed tests would use H1: u1 < u2 **OR** H1: u1 > u2.

**We will consider the two tailed test in this case.**

We have to determine the test statistic, which is the calculated value we will use to test the null hypothesis. In this case, we will assume H0 is true:

t-stat = [(ybar_1 – ybar_2) – (**u1 – u2**)] / SE(ybar_1 – ybar_2)

We also choose a significance level, which determines the degree of uncertainty, or error, we allow in the process. This significance level (0.05 in our case) determines, through a probability table, a certain *critical value* that we compare the t-stat against in order to decide whether to reject H0.

For example, if |t-stat| > the critical value, we say that the t-stat falls in the critical, or rejection, region, and we reject the null hypothesis and conclude that the means are not equal.

However, if |t-stat| < the critical value, then the t-stat is not in the critical region; we fail to reject H0 and conclude that there is not enough evidence of a difference.

**Example: **

In this case, we are considering two *independent* samples of size ≥ 30, so that we may use the standard normal distribution tables (smaller sample sizes would use the Student’s t distribution).

α = level of significance = 0.05

A study was conducted on two engines; it was hoped that the new model produced less noise than the old one. We want to test whether there is any difference between the noise levels produced by the engines.

| Sample | Sample Size | Sample Mean (decibels) | Sample Standard Deviation (decibels) |
| --- | --- | --- | --- |
| Standard model | 65 | 84 | 11.6 |
| New model | 41 | 72 | 9.2 |

We define the *Null hypothesis* as H0: **u1 = u2** (no difference)

And the *Alternative hypothesis* as H1: **u1 ≠ u2** (two-tailed test)

Let:

xbar_1 = 84

xbar_2 = 72

u1 – u2 = 0 (we assume H0 is true)

s_1 = 11.6

s_2 = 9.2

n_1 = 65

n_2 = 41

SE = sqrt((s_1)²/n_1 + (s_2)²/n_2) = sqrt((11.6)²/65 + (9.2)²/41) = sqrt(2.0702 + 2.0644) = sqrt(4.1346)

≈ 2.03

t-stat = [(xbar_1 – xbar_2) – (u1 – u2)]/SE

= (84 – 72 – 0)/2.03

= 5.91

The standard normal critical value is z_0.025 in each tail of the distribution. This value, read from the standard normal table, is **1.96**.

Since 5.91 > 1.96, the t-statistic falls in the critical (rejection) region for the H0.

Conclusion: We reject the null hypothesis and conclude that there is a difference in noise output between the 2 engines.
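The arithmetic above can be sketched in Python (a minimal two-sample test from summary statistics; the critical value 1.96 is hard-coded for a two-tailed test at significance level 0.05):

```python
import math

# Summary statistics from the engine-noise study
xbar1, s1, n1 = 84, 11.6, 65   # standard model
xbar2, s2, n2 = 72, 9.2, 41    # new model

# Standard error of the difference of the sample means
se = math.sqrt(s1**2 / n1 + s2**2 / n2)

# Test statistic under H0: u1 - u2 = 0
t_stat = (xbar1 - xbar2 - 0) / se

critical = 1.96  # z_0.025 for a two-tailed test at alpha = 0.05
reject_h0 = abs(t_stat) > critical
print(round(se, 2), round(t_stat, 2), reject_h0)
```

Because the t-stat lands far beyond 1.96, the code reaches the same conclusion as the worked example: reject H0.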


## It’s A Math, Math World (CLT and CI’s)

Today’s blog post is really targeted as a “Statistics 101” for the early statistics student or non-statistician. Last week, we summarized basic probability and the normal distribution. This week we look at inferential statistics. Let me know your opinions!

We start by examining one of the main topics of this section which is the Central Limit Theorem (CLT):

Suppose we have a population of 10,000 measurements and we take a random sample of 500 of these measurements and calculate a mean, ybar_1. We then repeat this process 299 times and get a total of 300 separate estimates of the mean, which we call ybar_1, ybar_2, …, ybar_300. We can consider these 300 measurements of the sample mean to be a sample in its own right, with its own mean and variance. We can calculate these 2 quantities and look at the *distribution* of the sample mean. This is a sampling distribution.

*The variability of random samples from the same population is called sampling variability. A probability distribution that characterizes some aspect of sampling variability is a sampling distribution. We make the following conclusions:*

- The mean of the sampling distribution (u_x) is equal to the population mean (**u**).
- The standard deviation of the sampling distribution, also called the *standard error* of the mean, is equal to the population standard deviation (sigma) divided by the square root of the size of each subsample (in this case n = 500), i.e., Standard Error (SE) = sigma/sqrt(500).

If we *standardize* the sample mean (for instance ybar_1), then

(ybar_1 – u_x)/SE → N(0,1), *the standard normal distribution,*

as the sample size gets very large, for any population distribution. Thus we can use the standard normal table to find probabilities for any distribution we are given.

*We will use this concept in more depth when we look at the next 2 ideas.*
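A small simulation makes the two conclusions above concrete (a sketch, using an illustrative uniform population rather than real data): the sample means center on the population mean, and their spread is roughly sigma/sqrt(n). Sampling here is without replacement from a finite population, so the observed spread runs slightly below sigma/sqrt(n).

```python
import random
import statistics

random.seed(42)  # for reproducibility

# A decidedly non-normal "population" of 10,000 measurements
population = [random.uniform(0, 100) for _ in range(10_000)]
mu = statistics.mean(population)       # population mean
sigma = statistics.pstdev(population)  # population standard deviation

# Draw 300 random subsamples of size n = 500 and record each sample mean
n = 500
sample_means = [statistics.mean(random.sample(population, n))
                for _ in range(300)]

# CLT predictions: sample means center on mu, spread ~ sigma/sqrt(n)
print(mu, statistics.mean(sample_means))
print(sigma / n ** 0.5, statistics.stdev(sample_means))
```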

The process of drawing conclusions about the population, based on observations in the sample, is known as *statistical inference*. To make these conclusions, we have to consider how likely it is that the sample is representative of the population, that is, how closely it resembles it.

These decisions can be divided into 2 categories:

- Confidence Intervals (to be examined next)
- Hypothesis Tests (we will look at next time)

A confidence interval is based on a probability, which, in simplest terms, is the likelihood of something occurring given that it is repeated a great number of times. It is also the relative frequency represented as a percentage. Probabilities have some very intuitive properties, and statistical inference is based on probability. Without getting into a big dissertation regarding probability theory (see last week’s blog post), we can say that every confidence interval has a degree of uncertainty, usually set at 0.05.

The principle underlying a confidence interval is that we want to build a continuous range of values (where the parameter may lie) such that the parameter falls in that range 1 – 0.05 = 0.95, or 95%, of the time with repeated sampling.

Let’s take an example:

Consider the population mean which we will call **u**

We have a collection of sample data represented by y1, y2, y3,…, y1000

We calculate the sample mean which is ybar

Now, ybar is probably not exactly equal to u but we hope it is close. The *standard error of the mean, SE, *is how far ybar tends to be from **u**.

The confidence interval: (ybar – 2*SE, ybar + 2*SE) is an approximate 95% confidence interval for **u**. Don’t worry about where the 2 comes from. It is a constant determined from a probability table (for the Standard Normal distribution) that is rounded up for our purposes. The interpretation of the CI is as follows:

*Under repeated sampling and repeated creation of CI’s from these samples, the true population mean will be contained in 95% of these CI’s.*

*The error rate of 5% means that the true population mean will not be contained in the remaining 5% of CI’s when the sampling is repeated a large number of times.*

Example: Confidence Interval for the difference **u1 – u2** of 2 population means.

CI: (ybar_1 – ybar_2) ± 2*SE(ybar_1 – ybar_2)

*The interpretation is that, if the confidence interval includes the point 0, then there is no difference between the means u1 and u2; that is, u1 = u2.*

Examples:

(-1.023 < u1-u2 < 3.25)

Since the interval includes the point, 0, we conclude that there is **no difference** between the means u1 and u2.

(2.14 < u1-u2 <4.25)

Since this interval does not include the point, 0, we conclude that the means **are not equal** to each other.
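This zero-inclusion check is easy to sketch in Python (the summary values passed in below are hypothetical, chosen for illustration):

```python
def ci_for_difference(ybar1, ybar2, se):
    """Approximate 95% CI for u1 - u2, using the +/- 2*SE rule."""
    diff = ybar1 - ybar2
    return (diff - 2 * se, diff + 2 * se)

def means_differ(ci):
    """The means are judged different only if 0 lies outside the CI."""
    lower, upper = ci
    return not (lower <= 0 <= upper)

# Hypothetical summary values chosen for illustration
ci = ci_for_difference(10.0, 8.5, se=0.5)
print(ci, means_differ(ci))

# The two intervals from the examples above
print(means_differ((-1.023, 3.25)))  # includes 0: no difference
print(means_differ((2.14, 4.25)))    # excludes 0: means differ
```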

Next week, we continue with an examination of hypothesis tests.


## It’s A Math, Math World (Probability)

Today’s blog post is really targeted as a “Statistics 101” for the early statistics student or non-statistician. It is a multi-part post on basic stats. Last week, we summarized descriptive statistics and next week we look at inferential statistics. We have some groundwork to complete in the meantime.

Before we dive into inferential statistics, we need to look at the notion and application of probability. Statistics is based on probabilities, which are used in applications such as margins of error, confidence intervals, p-values, etc. In general terms, probability is how “likely” something is to occur. We often throw probabilities around in a subjective fashion (i.e., “It’s ‘50-50’ that I will go to the store” and “I’m 90% certain that I know the answer”), but probability is a clearly defined concept in mathematics and has certain axioms that we accept and that can be proven.

By definition, *the probability of an event E, P(E), is the relative frequency of an event occurring given an indefinitely long series of repetitions of a chance experiment.*

We start with an “experiment,” which is anything that produces “outcomes.” Using the “rules of probability,” we can assign a probability to each outcome in the sample space (the set of all possible outcomes).

For example, when we toss a die once, the sample space is given by S= {1, 2, 3, 4, 5, 6}, because one of these 6 numbers will appear every time we toss a die.

A random variable is a variable that can take on different values, each with a certain probability. A random variable (r.v.) can be either discrete (having a countable number of values) or continuous (taking uncountably many values on a continuum). Let A and B be events from the die-tossing experiment. Each event can be assigned a value and a probability as follows:

- Let A = the event that a ‘1’ occurs on the die throw

P (A) = (# of favorable outcomes)/ (Total # of outcomes)

P (A) = 1/6

- Let B = the “event” (collection of outcomes) that a die roll is an even number

P (B) = (# of favorable outcomes)/ (total number of outcomes)

Favorable outcomes are {2, 4, 6} so numerator = 3

Total number of outcomes = 6

P (B) = 3/6 = ½
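For a finite, equally likely sample space like the die toss, these probabilities can be computed by simple enumeration. A minimal Python sketch:

```python
from fractions import Fraction

# Sample space for one toss of a fair die
S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(E) = (# of favorable outcomes) / (total # of outcomes)."""
    return Fraction(len(event & S), len(S))

A = {1}        # the event that a '1' occurs on the die throw
B = {2, 4, 6}  # the event that the roll is an even number

print(prob(A), prob(B))  # 1/6 and 1/2
```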

Each event can be assigned a probability as can every point in the sample space.

Let A = an event in sample space S. Then:

- 0 ≤ P(A) ≤ 1
- P(S) = 1
- Let S’ (the *complement* of S) be the event that S does not occur; then P(S’) = 1 – P(S), which implies P(S’) = 0.

This makes sense, because the probability that some outcome in the sample space occurs is 1: every time a die is tossed, one of the 6 outcomes must come up. The *complement*, P(S’), the probability that no outcome in S happens, is therefore zero.

We have looked at probabilities of random variables in the discrete case, where we had a finite number of values that could be enumerated. However, some probabilities arise in the continuous case, when the random variable can take on an infinite number of values on a continuum. For example, if Y = the lifespan of a refrigerator and it is bounded by 0 and 7 years, then the r.v. can take on any value in that interval. Rather than assigning probabilities in a roster or tabular form, we graph the respective probabilities using a function, called a density, which generates the probabilities. This graph is called a frequency distribution or density curve.

The most common density curve is the *Normal Distribution*, which resembles the “bell curve” used in many real-world applications, including the natural sciences. Remember when your teacher/professor graded “on a curve”? That is an application of this distribution. You need the 2 parameters (mean and standard deviation) of a normal distribution to define it. The formula for the density is rather complicated, so we don’t use it to calculate the associated probabilities; instead, we refer to the *Standard Normal Distribution*, which has mean = 0 and standard deviation = 1. We have tables of values for these standard normal probabilities.

That is fine, but what do we do when we have a normal distribution which is NOT standard normal?

We can *standardize* any normal distribution to become standard normal. Then, we can use the Standard Normal tables to look up our values.

Suppose X has a normal distribution (mean = 5, stdev =10). Then we can *standardize* it to become Z* as follows:

Z* = (X – mean)/stdev = (X – 5)/10, which has a standard normal distribution.

Ex. Find P(X > 25)

P(Z* > (25 – 5)/10) = P(Z* > 2) = 1 – P(Z* < 2) = 1 – 0.9772 (looked up from the table) = **0.0228**

The probability that X > 25 is 0.0228.
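The table lookup can be reproduced in Python: the standard normal CDF can be written with the error function, so no table is needed. A sketch:

```python
import math

def phi(z):
    """Standard normal CDF, via the error function (no table needed)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mean, stdev = 5, 10
x = 25

# Standardize, then take the upper tail: P(X > 25) = 1 - P(Z* < 2)
z = (x - mean) / stdev
p = 1 - phi(z)
print(round(p, 4))  # ~0.0228, matching the table value
```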

Next week, we will look at sampling distributions and the Central Limit Theorem (CLT), confidence intervals and hypothesis tests. After next week, we will move on to narrower topics and be a little more laser-focused.

## It’s A Math, Math World (Basic Stats 101)

Today’s blog post is really targeted as a “Statistics 101” for the early statistics student or non-statistician. It is a 2-part post on basic stats. Today, we summarize descriptive statistics and next week we look at inferential statistics. Let me know your opinions!

We will start with some definitions.

Statistics is the science of collecting, organizing, describing, analyzing and presenting data. It has applications in almost all fields such as the natural sciences, social sciences, manufacturing and business. The main reasons for the applications of statistics are two-fold.

- Everyone has data to be analyzed. As a matter of fact, we are sometimes overwhelmed with the volume of data that we have. We need some way to interpret it.
- All aspects of academia and business are concerned with understanding and controlling
*variation*in their data. In business, variation in production causes loss of revenue from sub–standard products being rejected in the factory or recalled from the marketplace. In research and academia, variation in data means that current processes are inconsistent and need to be re-tooled for more accurate results. For example, in medical research, uncontrolled variation in a clinical trial could result in unsafe products being approved due to invalid data.

This leads us to the two branches of statistics, which are:

- Descriptive statistics
- Inferential Statistics

We will look at descriptive statistics today and inferential statistics, briefly, next week. To understand this area of study, we need to understand the basic concept behind the study of statistics.

We consider a *population* of measurements (our data). For example, consider the weight of every US citizen as our population of interest. The average weight of the population is unknown and we call that a *parameter*. Theoretically, we could weigh all 300 million people in the US and average their weights, but this process would be time consuming and very expensive.

Therefore, we randomly select a representative subset of this population of measurements. This is called a *sample*. The sample size is predetermined by statistical methodology and sometimes by the cost restrictions of the collecting organization. The term “random” will be defined later and is the subject of a lot of research and debate.

For example, we select 5,000 measurements, at random, from the population of USA weights and find their average. If this sample is “random” (i.e. picked such that every element of the population is equally likely to get picked) then the sample average, known as a *statistic*, should estimate the population parameter within a certain error range.

Getting back to descriptive statistics, we have the following:

For numeric data, the two most common descriptive statistics (i.e. statistics that describe the data) are the measures of *central tendency* and *dispersion*. The measures of central tendency describe the “center” of the data and include the *mean* and *median*. The measures of dispersion include the *range*, *variance*, and *standard deviation* and describe the “spread” of the data around the “center”.
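Python’s standard `statistics` module computes all of these measures directly (the data values below are made up for illustration):

```python
import statistics

# A small, made-up sample of measurements for illustration
data = [12, 15, 11, 14, 18, 15, 13]

# Measures of central tendency: the "center" of the data
print(statistics.mean(data))
print(statistics.median(data))

# Measures of dispersion: the "spread" around the center
print(max(data) - min(data))      # range
print(statistics.variance(data))  # sample variance
print(statistics.stdev(data))     # sample standard deviation
```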

Descriptive statistics also include the use of *plots* to “graphically” describe the data. For example, we often use the following types of plots:

- Box Plots
- Histograms
- Line Graphs
- Bar Charts
- Pie Charts
- Scatter Plots (for bivariate data)
- etc.

Please let me know your comments. I eventually want to move into Statistical applications, concepts and SAS programming areas, but thought that a solid foundation in basic stats would be a good place to start. Is this background helpful to you? Should I just move on to more advanced concepts? Please advise me. Thank you.