Elements of an Experiment

 

Yesterday we spent some time discussing the nature of science and the characteristics of a scientific hypothesis.  Once a scientific hypothesis has been formed, the next step is to test it with an experiment.  A crucial step in designing an experiment is identifying the variables involved.  Variables are things that may be expected to change during the course of an experiment.  The investigator deliberately changes the independent variable and measures the dependent variable to learn whether it changes depending on the value of the independent variable.  To eliminate the effect of anything else on the dependent variable, the investigator tries to keep standardized variables constant.

 

Dependent Variables

 

The dependent variable is what the investigator measures.  It is what the scientist thinks will change as a result of the experimental procedure.  For example, she may want to study peanut growth.  One possible dependent variable is the height of peanut plants.  Name some other aspects of peanut growth that might be measured:

 

 

 

All of these aspects of peanut growth may be measured and used as dependent variables of an experiment.  The investigator may choose the one that she thinks is most important, or choose to measure more than one.

 

Independent Variables

 

 The independent variable is what the investigator deliberately varies during the experiment.  It is chosen because the scientist thinks it may affect the value of the dependent variable.  Name some of the factors that may affect the number of peanuts produced by peanut plants:

 

 

 

In many cases the scientist may not manipulate the independent variable directly.  For example, the hypothesis that more crimes are committed during a full moon can be tested scientifically.  Obviously, the scientist cannot cause a full moon to appear in order to observe the effects.  Instead, he can collect data (such as the number of crimes recorded in police reports) during the naturally occurring phases of the moon and compare them.  In this case, phase of the moon is the independent variable and number of crimes committed is the dependent variable.

 

The scientist can measure as many dependent variables as she thinks are important.  However, a good experimental design manipulates only one independent variable at a time.  For example, if the scientist wants to investigate the effects of fertilizer on peanut growth, she will use different amounts of fertilizer on different plants and record the results.  She would not want to also give the plants different amounts of water and light at the same time.  Why is the scientist limited to one independent variable per experiment?

 

Identify the dependent and independent variables in the following examples:

 

Guinea pigs are kept at different temperatures for 6 weeks.  Percent weight gain is recorded.

 

 

 

The number of different algal species is counted for a coastal area before and after an oil spill.

 

 

 

An investigator hypothesizes that the adult weight of a dog is higher when it has fewer litter mates.

 

 

Height of bean plants is recorded daily for 2 weeks.

 

 

 

Standardized Variables

 

A third type of variable is the standardized variable.  Standardized variables are variables that are kept constant in all treatments, so that any changes in the dependent variable can be attributed to the changes the scientist made in the independent variable. 
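One way to see the difference between an independent variable and standardized variables is to list the conditions for each treatment and check which ones change.  The sketch below does this for the fertilizer example.  The function name `classify_factors`, the factor names, and all numeric values are hypothetical and chosen only for illustration.

```python
def classify_factors(treatments):
    """Split factor names into those that vary across treatments
    and those that are held constant (standardized)."""
    varied, constant = [], []
    for name in treatments[0]:
        # Collect the distinct values this factor takes across all treatments.
        values = {t[name] for t in treatments}
        (varied if len(values) > 1 else constant).append(name)
    return varied, constant


# Hypothetical fertilizer experiment: only the fertilizer amount changes.
treatments = [
    {"fertilizer_g": 0, "water_ml": 500, "light_hours": 12},
    {"fertilizer_g": 10, "water_ml": 500, "light_hours": 12},
    {"fertilizer_g": 20, "water_ml": 500, "light_hours": 12},
]

varied, constant = classify_factors(treatments)
print("Independent variable(s):", varied)    # ['fertilizer_g']
print("Standardized variables:", constant)   # ['water_ml', 'light_hours']
```

If more than one name appears in the first list, the design is changing several things at once and should be revised.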

 

Since the scientist wants to study the effect of one particular independent variable, she must try to eliminate the possibility that other variables are influencing the outcome.  This is accomplished by standardizing these variables.  For example, a group of students wants to examine the effects of temperature on bacterial growth:

 

What is the independent variable?

 

 

What is the dependent variable?

 

 

What variables should be standardized?

 

 

 

 

 

 

 

 

 

 

Quantitative Data Presentation

 

 

A student team wanted to test the hypothesis that athletes have better cardiovascular fitness than nonathletes.  The students gathered some athletes and nonathletes and took their pulse before and after exercise.

 

What are the dependent variable(s) in this experiment?

 

 

What are the independent variable(s) in this experiment?

 

 

What prediction would you make about the results the students will see if their hypothesis is correct?

 

 

What variable(s) should the students try to "control" for?

 

 

 

 

I. Tables

 

Suppose your lab team carried out the experiment from above and gathered the following data:

 

           Nonathletes                                    Athletes
           Resting Pulse    After Exercise                Resting Pulse    After Exercise
           Trial            Trial                         Trial            Trial
Subject      1    2    3       1    2    3     Subject      1    2    3       1    2    3
1           72   68   71     145  152  139     1           67   71   70     136  133  134
2           65   63   72     142  144  158     2           73   71   70     141  144  142
3           63   68   70     140  147  144     3           72   74   73     152  146  149
4           70   72   72     133  134  145     4           75   70   72     156  151  151
5           75   76   77     149  152  153     5           78   72   76     156  150  155
6           75   75   71     154  148  147     6           74   75   75     149  146  146
7           71   68   73     142  145  150     7           68   69   69     132  140  136
8           68   70   66     135  137  135     8           70   71   70     151  148  146
9           78   75   80     160  155  153     9           73   77   76     138  152  147
10          73   75   74     142  146  140     10          72   68   65     153  155  155

 

If the data were presented like this, readers would have difficulty discovering any meaning in them.  Data in this unprocessed form are called raw data.  Since the students had each subject perform multiple trials, the data for each subject can be averaged, as in the table below:

 

 

Table: Average Pulse Rate for Each Subject (average of three trials for each subject; pulse taken before and after exercise).

 

           Nonathletes                                   Athletes
Subject    Resting Pulse   After Exercise     Subject    Resting Pulse   After Exercise
           Average         Average                       Average         Average
1          70              145                1          70              134
2          67              148                2          70              142
3          67              144                3          73              149
4          71              139                4          72              151
5          76              151                5          76              155
6          74              150                6          75              146
7          71              146                7          69              136
8          68              136                8          70              146
9          78              156                9          76              147
10         74              143                10         68              155
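The averaging step can be sketched in Python.  The example below averages the three trials for nonathlete subjects 1 to 3 from the raw data.  Rounding to the nearest whole beat is an assumption about how the averages were prepared, and `average_trials` is a name invented for this sketch.

```python
def average_trials(trials):
    """Return the mean of one subject's trials, rounded to the nearest whole beat."""
    return round(sum(trials) / len(trials))


# Raw pulse data for nonathlete subjects 1-3, copied from the raw data table.
nonathletes_raw = {
    1: {"resting": [72, 68, 71], "after": [145, 152, 139]},
    2: {"resting": [65, 63, 72], "after": [142, 144, 158]},
    3: {"resting": [63, 68, 70], "after": [140, 147, 144]},
}

# Average each subject's trials separately for each condition.
averages = {
    subject: {condition: average_trials(values) for condition, values in data.items()}
    for subject, data in nonathletes_raw.items()
}

for subject, avg in averages.items():
    print(subject, avg["resting"], avg["after"])
# 1 70 145
# 2 67 148
# 3 67 144
```

Averaging repeated trials reduces the influence of any single unusual measurement on the value reported for a subject.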

 

Note that the table of subject averages above has a heading that explains what the numbers in the table represent.  This table is still rather unwieldy and hard to interpret.  A summary table could be used to convey the overall average for each part of the experiment.  For example:

 

Table: Overall Averages of Pulse Rate (10 subjects in each group; 3 trials for each subject; pulse taken before and after exercise).

 

                 Pulse (beats/min)
                 Before exercise    After exercise
Nonathletes      71.6               145.8
Athletes         71.9               146.1
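The overall averages in the summary table can be checked by averaging the ten per-subject averages in each column.  The following is a minimal Python sketch of that arithmetic; the function name `overall_average` and the dictionary layout are choices made for this example only.

```python
def overall_average(subject_averages):
    """Mean of the per-subject averages, reported to one decimal place."""
    return round(sum(subject_averages) / len(subject_averages), 1)


# Per-subject averages copied from the table of subject averages.
per_subject = {
    ("Nonathletes", "before"): [70, 67, 67, 71, 76, 74, 71, 68, 78, 74],
    ("Nonathletes", "after"): [145, 148, 144, 139, 151, 150, 146, 136, 156, 143],
    ("Athletes", "before"): [70, 70, 73, 72, 76, 75, 69, 70, 76, 68],
    ("Athletes", "after"): [134, 142, 149, 151, 155, 146, 136, 146, 147, 155],
}

summary = {key: overall_average(values) for key, values in per_subject.items()}

for (group, condition), value in summary.items():
    print(group, condition, value)
# Nonathletes before 71.6
# Nonathletes after 145.8
# Athletes before 71.9
# Athletes after 146.1
```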

 

 

 

 

 

 

 

 

 

 

Tables should be used to present results with relatively few data points.  Tables are also useful to display several dependent variables at the same time.  For example, average pulse rate before and after exercise, average blood pressure before and after exercise, etc. could all be put in one table.

 

 

II. Graphs

 

Numerical results of an experiment are often presented in a graph rather than a table.  A graph is literally a picture of the results, so a graph can often be more easily interpreted than a table.  Generally, the independent variable is graphed on the x-axis (horizontal axis) and the dependent variable on the y-axis (vertical axis).

 

When you draw a graph, keep in mind that you want to show the data in the clearest, most readable form possible.  To achieve this, you should follow the rules below:

 

•  Plot the independent variable on the x-axis and the dependent variable on the y-axis.  For example, if you are graphing the effect of fertilizer on peanut weight, the amount of fertilizer is plotted on the x-axis and peanut weight is on the y-axis.

•  Label each axis with the name of the variable and specify the units used to measure it.  For example, the x-axis might be labeled "Fertilizer applied (g/100 m²)" and the y-axis might be labeled "Weight of peanuts per plant (grams)".

•  The intervals labeled on each axis should be appropriate for the range of data so that most of the area of the graph can be used.  For example, if the highest data point is 47, the highest value label on the axis might be 50.  If you labeled the intervals up to 100, there would be a lot of unused area on the graph.

•  The intervals that are labeled on the graph should be evenly spaced.  For example, you might label the intervals 0, 5, 10, 15, 20, etc.

•  The graph should have a title that, like that of a table, describes the experimental conditions that produced the data.
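The two rules about axis intervals can be expressed as a short calculation: round the highest data point up to the next multiple of the chosen interval, then label every interval from zero to that value.  The Python sketch below assumes a whole-number interval and an axis that starts at zero; `axis_ticks` is a hypothetical helper name.

```python
import math


def axis_ticks(highest_value, interval):
    """Evenly spaced axis labels from 0 up to the first multiple of
    `interval` that is greater than or equal to `highest_value`."""
    top = math.ceil(highest_value / interval) * interval
    return list(range(0, top + interval, interval))


ticks = axis_ticks(47, 5)
print(ticks)  # [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
```

With a highest data point of 47 and an interval of 5, the axis ends at 50, so nearly all of the graph area is used.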

 

The figure below shows a well-executed graph:

Figure 1: Weight of peanuts produced per plant when amount of fertilizer applied is varied.  Average seed weight per plant in 100 m² plots, 400 plants per plot.

 

 

 

 

 

The most commonly used forms of graphs are line graphs and bar graphs.  The choice of graph type depends on the nature of the independent variable. 

 

Continuous variables are those that have an unlimited number of values between points.  Line graphs are used to represent continuous data.  For instance, time is a continuous variable.  Although the units can be minutes, hours, days, months, etc., values can be placed in between any two values.  Amount of fertilizer in the above graph is a continuous variable.  In a line graph, data are plotted as separate points on an axis, and the points are connected to each other. 

 

More than one set of data can be plotted on a graph, to compare one set of data with another.  When this is done, it is necessary to provide a legend as a key to indicate which line corresponds to which data set.

 

Figure 1: Recovery rate of athletes and nonathletes after performing a step test for 5 minutes (average of 10 subjects, each subject performed the test 3 times).

 

 

 

 

 

 

 

 

Discrete variables, on the other hand, have a limited number of possible values.  Values fall into distinct and separate groups.  For example, in our experiment on pulse rates, our test subjects were categorized as either athletes or nonathletes; there were no "in-betweens".  Discrete data are displayed using bar graphs like the one below:

 

Figure 1: Average pulse rates of athletes and nonathletes before and after performing a step test for 5 minutes.  (average of 10 subjects, each subject performed the test 3 times.)

 

 

 

In this example, before and after exercise data are discrete.  There are no intermediate possibilities.  The subjects used are also a discrete variable. A subject is either an athlete or nonathlete.  Note also that pulse rate is the dependent variable, and there are two independent variables: subject type and before/after exercise. 
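The rule for choosing between the two graph types can be summarized as a simple decision.  The Python sketch below is only an illustration of that rule; the function name `choose_graph_type` is hypothetical, and whether each variable is continuous is entered by hand, following the examples in the text.

```python
def choose_graph_type(independent_is_continuous):
    """Line graph for a continuous independent variable, bar graph for a discrete one."""
    return "line graph" if independent_is_continuous else "bar graph"


# Independent variables mentioned in this handout, marked continuous (True) or discrete (False).
independent_variables = {
    "amount of fertilizer": True,
    "time": True,
    "subject type (athlete or nonathlete)": False,
    "before or after exercise": False,
}

for name, is_continuous in independent_variables.items():
    print(f"{name}: {choose_graph_type(is_continuous)}")
```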

 

 

 

 

 

 

 

 

 

The bar graph could also have been constructed as shown here:

 

Figure 1: Average pulse rates of athletes and nonathletes before and after performing a step test for 5 minutes.  (average of 10 subjects, each subject performed the test 3 times.)

 

What is the difference between the two graphs?

 

 

 

Explain why the first way would be a better graph to convey the results of the experiment.

 


Activity: Graphing Practice

 

Use the temperature and precipitation data provided in the table below to complete the following questions.

 

 

 

(T = temperature, P = precipitation)

                               Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
Fairbanks, Alaska          T   -19   -12    -5     6    15    22    22    19    12     2   -11   -17
                           P   2.3   1.3   1.8   0.8   1.5   3.3   4.8   5.3   3.3   2.0   1.8   1.5
San Francisco, California  T    13    15    16    17    17    19    18    18    21    20    17    14
                           P  11.9   9.7   7.9   3.8   1.8   0.3     0     0   0.8   2.5   6.4  11.2
San Salvador, El Salvador  T    32    33    34    34    33    31    32    32    31    31    31    32
                           P   0.8   0.5   1.0   4.3  19.6  32.8  29.3  29.7  30.7  24.1   4.1   1.0
Indianapolis, Indiana      T     2     4     9    16    22    28    30    29    25    18    10     4
                           P   7.6   6.9  10.2   9.1   9.9  10.2   9.9   8.4   8.1   7.1   8.4   7.6

 

 

  1. Compare monthly temperatures in Fairbanks with monthly temperatures in San Salvador.

    a. Identify the dependent and independent variables in this comparison:

       Date (month):

       City:

       Temperature:



    b. Can data for both cities be plotted on the same graph?



    c. What will go on the x-axis?



    d. What should go on the y-axis?



    e. What type of graph should be used?  Why?



  2. Compare the average temperature for September in Fairbanks, San Francisco, San Salvador, and Indianapolis.

    a. What are the dependent and independent variables in this comparison?



    b. Can data for all 4 cities be plotted on the same graph?



    c. What will go on the x-axis?



    d. What should go on the y-axis?



    e. What type of graph should be used?  Why?