## It’s A Math, Math World – Intro to DOE

In the field of statistics, data collected without the proper “context” is meaningless. Data has to be collected using proper statistical procedures and a proper experimental design for it to be valid and meaningful. In other words, we design experiments in order to (1) set up a direct comparison between treatments of interest, (2) minimize bias, (3) minimize the error in the comparison and (4) make inferences about causation.

There is an entire field of statistics called Design of Experiments (DOE) or Experimental Design that tries to find the best design for a particular situation. Today we will begin a long series of posts dedicated to DOE and we start with an introduction and some terminology.

**Experiment** – a test in which changes are made to the input variables in a system in order to observe and study the changes made to the output variables.

**Explanatory Variables** – the input variables

**Response Variables** – the output variables

**Treatments** – different procedures being compared in an experiment.

**Factor** – a variable whose levels combine with those of the other factors to form the treatments in an experiment.

**Level** – an individual setting of a factor.

Ex. Suppose we have a kiln and we are baking ceramic pieces at different temperatures (500, 600 and 800 degrees F) and different humidity percentages (10%, 20%, 30%).

*Treatments* are the different combinations of temperature and humidity (3 × 3 = 9 in all).

*Factors* are temperature and humidity.

*Levels* of temperature are 500, 600 and 800 degrees F.

*Levels* of humidity are 10%, 20% and 30%.
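The kiln example can be written out as a short Python sketch that enumerates every treatment. The dictionary keys (`temperature_F`, `humidity_pct`) are names chosen here for illustration only; the levels are the ones listed above.

```python
from itertools import product

# Factor levels taken from the kiln example above.
# The key names are illustrative, not standard terminology.
factors = {
    "temperature_F": [500, 600, 800],
    "humidity_pct": [10, 20, 30],
}

# Each treatment is one combination of a temperature level and a humidity level.
treatments = [
    dict(zip(factors, combo)) for combo in product(*factors.values())
]

for treatment in treatments:
    print(treatment)

# 3 temperature levels x 3 humidity levels = 9 treatments
print(len(treatments))
```

Adding a third factor to the dictionary would multiply the number of treatments by that factor's number of levels, which is why the count grows quickly as factors are added.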

**Experimental Units** are the things to which we apply the treatments.

**Response** – an outcome we observe after applying a treatment to an experimental unit.

**Measurement Units** are the actual objects on which the response is measured (these may differ from the experimental units).

Ex. If you are applying a standardized test to a classroom of children, then the classroom is the experimental unit and the children are the measurement units.

**Control** – There are two uses for this word.

An experiment is *controlled* if the experimenter assigns treatments to experimental units; otherwise it is an observational study.

A *control* treatment is a “standard” treatment that is used as a baseline or standard of comparison for other treatments. In clinical research, this could be either a common “gold standard” therapy or a placebo treatment.

**Confounding** – occurs when the effects of one treatment or factor cannot be distinguished from the effects of another treatment or factor. The two items are said to be confounded.

Ex. Consider an experiment in which you plant 2 varieties of corn: variety 1 in NJ and variety 2 in Nebraska. We are unable to distinguish between state effects and variety effects; therefore, the state and variety factors are *confounded*.
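One way to see the problem in the corn example is to check whether each level of one factor only ever appears with a single level of the other. The sketch below does exactly that; the function name `completely_confounded` and the two planting plans are made up for illustration, and the check covers only this complete, one-to-one kind of confounding.

```python
def completely_confounded(runs, factor_a, factor_b):
    """Return True if every level of factor_a always occurs with exactly one
    level of factor_b, and vice versa, across the planned runs."""
    a_to_b = {}
    b_to_a = {}
    for run in runs:
        a, b = run[factor_a], run[factor_b]
        a_to_b.setdefault(a, set()).add(b)
        b_to_a.setdefault(b, set()).add(a)
    one_to_one_ab = all(len(levels) == 1 for levels in a_to_b.values())
    one_to_one_ba = all(len(levels) == 1 for levels in b_to_a.values())
    return one_to_one_ab and one_to_one_ba


# The plan from the example: each variety is planted in only one state.
corn_plan = [
    {"variety": 1, "state": "NJ"},
    {"variety": 1, "state": "NJ"},
    {"variety": 2, "state": "Nebraska"},
    {"variety": 2, "state": "Nebraska"},
]

# A hypothetical alternative: both varieties are planted in both states.
crossed_plan = [
    {"variety": 1, "state": "NJ"},
    {"variety": 2, "state": "NJ"},
    {"variety": 1, "state": "Nebraska"},
    {"variety": 2, "state": "Nebraska"},
]

print(completely_confounded(corn_plan, "variety", "state"))     # True
print(completely_confounded(crossed_plan, "variety", "state"))  # False
```

In the crossed plan, each variety is observed in both states, so a difference between varieties can be separated from a difference between states.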

As we can see, experiments usually involve several factors, and our goal is to discover which factors influence the response. There are several strategies for planning and conducting these experiments.

1. The **best-guess approach** involves choosing a certain subset of factors to test simultaneously, based on theoretical knowledge of the system being studied. It can work reasonably well but has some disadvantages. Suppose the initial guess is incorrect. Then the experiment has to be modified and run again until it is successful, which costs time and money. Also, if the first guess succeeds, the experimenter may stop and incorrectly assume that the best solution has been found.

2. The **one-factor-at-a-time approach** consists of starting with a baseline (a set of starting levels) for each factor and then varying each factor, one at a time, over its range, while holding the other factors constant at their baseline levels. The major disadvantage of this method is that it fails to detect any possible interaction between the factors. An *interaction* is the failure of one factor to produce the same effect on the response at different levels of another factor.

3. **Factorial experiments** are the correct approach to dealing with several factors. A factorial experiment is an experimental design in which several factors are varied *together*, instead of one at a time. We will look at these very soon.
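To make the idea of an interaction concrete, here is a minimal sketch of a 2×2 factorial with two factors, A and B, each at a low and a high level. The response values are invented purely for illustration and do not come from a real experiment.

```python
# Hypothetical responses for a 2x2 factorial (made-up numbers).
# Keys are (level of factor A, level of factor B).
response = {
    ("low", "low"): 20,
    ("high", "low"): 50,
    ("low", "high"): 40,
    ("high", "high"): 12,
}

# Effect of raising A, computed separately at each level of B.
effect_a_at_b_low = response[("high", "low")] - response[("low", "low")]
effect_a_at_b_high = response[("high", "high")] - response[("low", "high")]

# Main effect of A: the average of the two effects above.
main_effect_a = (effect_a_at_b_low + effect_a_at_b_high) / 2

# Interaction: half the difference between the two effects.
# It is zero only when A has the same effect at both levels of B.
interaction_ab = (effect_a_at_b_high - effect_a_at_b_low) / 2

# One-factor-at-a-time from the (low, low) baseline never runs (high, high).
ofat_runs = {levels: y for levels, y in response.items() if "low" in levels}

print(effect_a_at_b_low)   # 30
print(effect_a_at_b_high)  # -28
print(main_effect_a)       # 1.0
print(interaction_ab)      # -29.0
print(sorted(ofat_runs))
```

With these numbers, the one-factor-at-a-time runs suggest that raising A helps (20 to 50) and raising B helps (20 to 40), yet the combination of high A and high B gives the lowest response of all. Only the factorial layout, which includes that fourth run, reveals the interaction.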

Next time, we will begin our look at various types of experimental designs with examples.

*Note: Sources of research for this blog post include:*

*1) Design and Analysis of Experiments (Montgomery), 7th Edition.*

*2) A First Course in Design and Analysis of Experiments (Oehlert).*


Thank you Michael for the accuracy of your presentation which is very interesting and useful as usual.

Thanks for this page and thanks to Michael.

Hi Michael,

I congratulate you on choosing this valuable subject and presenting it in such a simple and basic way that anyone can understand it without much effort. I work on creating designs and experimental designs for FMCG and many other areas. It will be interesting to see how you will tackle the SOV and the interactions between the factors and levels.

I have a question: when you mention experimental design, do you mean orthogonal or randomized experimental designs?

Thanks.

Hi Michael

Thank you for the clear explanations of the Design Of Experiment field.

Your post is very useful as usual. It will inspire a lot of my experiments aimed at exploring the effects of health services interventions.

Sincerely.

Hi Michael,

An excellent and short summary of what a DOE is, with definitions of terms. Thanks. This is not well known in most of the commercial / govt sectors outside experimental settings, and the applications are numerous. I would like you to go deeper into various commercial scenarios (marketing, health care, etc.) if you can and don’t mind.

Thanks,

C.S. Ganti