Penny Smeltzer began teaching AP Statistics when the course was first offered and quickly built one of the largest programs in the nation in Austin, Texas. Her commitment to the subject continues through decades of experience as an AP Statistics exam grader, grading leader, rubric writer, Test Development Committee member, and Question Leader. She is passionate about creating meaningful interdisciplinary lessons with science, using topics that matter in the lives of the human population and the planet, which she shares on her website, Lessons That Matter. Penny was the Question Leader for 2024 #3.
A car maker produces four different models of cars: A, B, C, and D. A group of researchers is investigating which model of car has the longest distance traveled per gallon of gas (mileage). Higher mileage is considered better than lower mileage. The researchers will conduct a study in which they contact several owners of each model of car and ask them to estimate their mileage.
Part (a)
Is this an observational study or an experiment? Justify your answer in context.
Sample Responses
Response 1:
Observational study because no experiment was done.
Response 2:
Observational study because no variable was manipulated and nothing was done. Plus, the people only estimated their mileage and could have done that wrong.
Response 3:
Observational study because no treatment was imposed.
Response 4:
Observational study because the car models were not randomly assigned to the drivers.
Response 5:
Observational study. The cars were not assigned a treatment and the drivers just drove the ones they had.
WOULD THIS GET CREDIT?
To earn an E (Essentially Correct), a response must include all three of the following components:
(1) Observational study
(2) No treatment imposed OR no random assignment
(3) Context: car model (Note: mileage is not enough)
Response 1 correctly chooses observational study (1) but has no context. Claiming that no experiment was done does not give strong enough justification.
Response 2 correctly chooses observational study (1) but has no statistical justification. The context of “people” is too general and “mileage” is insufficient for credit. The “car models” are the appropriate context for this problem.
Response 3 correctly chooses observational study (1) and correctly justifies the choice (2). But no context is included to show an understanding of this particular study. The typical memorized statement would earn partial credit.
Response 4 correctly chooses observational study (1), includes the proper context (3), and justifies the choice (2) by stating that no random assignment was made. No random assignment or no treatment imposed both earn credit for justification.
Response 5 earns full credit because it chooses observational study (1), justifies the choice by stating no treatment was assigned (2), and includes the context of cars (3). Note that the best context would be “car models.”
Teaching Tips:
For full credit in describing a study, specific context is required. Spend time discussing the context in detail for all problems done in class. This component is required on most parts of all questions.
Use proper statistical vocabulary to justify an answer. In this case, “no treatment imposed” or “no random assignment” is the appropriate justification.
Part (b)
Model D has an autopilot feature, in which the car controls its own motion with human supervision. James owns a Model D car and will investigate whether using the autopilot feature results in higher mileage than not using the autopilot. James will drive his car on 70 different days to and from work, using the same route at the same time each day. James will record the mileage each day.
(b) James will use a completely randomized design to conduct his investigation. Describe an appropriate method James could use to randomly assign the two treatments, driving using the autopilot feature and driving without using the autopilot feature, to 35 days each.
Sample Responses
Response 1:
Number the days from 1 to 70 and then use a random number generator to randomly select 35 numbers for James to use autopilot. Have James drive the car for the remaining days.
Response 2:
Every day James should flip a coin before he leaves for work. If he flips heads, he drives using autopilot. If he flips tails, he drives himself. He should do this for the next 70 work days.
Response 3:
Put 35 yellow marbles and 35 blue marbles in a bag. Let the yellow marbles represent driving with autopilot and the blue marbles represent driving without autopilot. Starting with the first work day, draw a marble out of the bag to determine which driving method will be used that day. Write the method chosen on the calendar for that day. Do not put the marble back in the bag. Pull out another marble for the next day and write that on the calendar. Put that marble aside as well. Continue this until the bag is empty and all 70 days have been labeled with the driving method chosen.
Response 4:
Assign each of the 70 days a unique integer from 1 to 70. Have a random number generator randomly select 35 integers from 1 to 70 with no repeats. Match each of the 35 selected numbers to the day with the corresponding number and have James drive using autopilot on those days. On the remaining 35 days, James should drive without using autopilot.
WOULD THIS GET CREDIT?
To earn an E (Essentially Correct), a response must include all three of the following components:
(1) Appropriate labels
(2) A random process such that every possible random assignment is equally likely
(3) A random process that results in 35 days using autopilot and 35 days not using autopilot
Response 1 does label the days from 1 to 70 (1). It would have been stronger had the response used the word unique here to clarify that each day received a different number. The response did not clearly identify the parameters of the random number generator (1 to 70 with no repeats). The numbers chosen must be linked to the corresponding days for credit. This did not happen in this response.
Response 2 correctly labels the driving methods associated with each coin flip outcome (1). But the problem states that there must be 35 days of each method. Although the expected number of heads in 70 tosses is indeed 35, an exact 35/35 split is not guaranteed, so full credit cannot be earned.
Response 3 correctly labeled the marbles with 35 of each treatment (1), drew out a marble without replacement each day, and noted the driving method for all 70 days (3). Unfortunately, the response neglected to mix the marbles before the selection began. This is very strong but not quite a full credit response. A real heartbreaker.
Response 4 earns full credit because the days are labeled appropriately (1), a correct random process is used that results in all possible samples of 35 being equally likely (2), and the selection process results in 35 of each driving method (3).
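For readers who want to see Response 4's procedure carried out, here is a minimal Python sketch. It is an illustration only and not part of the scoring guidelines; the function name `assign_treatments`, the treatment labels, and the seed value are assumptions made for this example.

```python
import random

def assign_treatments(n_days=70, n_autopilot=35, seed=None):
    """Hypothetical helper: randomly assign autopilot to n_autopilot of n_days."""
    rng = random.Random(seed)
    days = range(1, n_days + 1)  # label each day with a unique integer
    # Select the autopilot days without repeats, so every possible
    # set of n_autopilot days is equally likely.
    autopilot_days = set(rng.sample(days, n_autopilot))
    # Link each numbered day to its treatment.
    return {
        day: "autopilot" if day in autopilot_days else "no autopilot"
        for day in days
    }

assignment = assign_treatments(seed=2024)  # seed chosen arbitrarily
print(sum(t == "autopilot" for t in assignment.values()))  # prints 35
```

Sampling without replacement is what guarantees the 35/35 split, and linking each selected number back to a day mirrors the labeling the scoring components ask for.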
Teaching Tips:
For full credit, physical items used for a randomization process, such as coins, marbles, or slips of paper, must be clearly labeled and tied to a specific treatment. If numbers are used, the response must link the numbers to the treatments and to the experimental units.
The random process must result in all possible samples of the same size being equally likely to be selected.
Read the problem carefully to ensure all criteria are met: in this case, 35 days of each treatment. Although coin tosses and dice rolls are good random processes, the results would not meet the criteria in this case. (Component 3)
Including a stopping rule such as “keep flipping a coin until you have 35 of one driving method and then assign the remaining days the other driving method” does not meet the criteria. In this case, not all possible random assignments are equally likely. The probability that the last few days are assigned to the same treatment is higher than it would be under a properly randomized assignment. (Component 2)
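The last two tips can be checked with a short calculation. The Python sketch below is an illustration under stated assumptions, not part of the scoring guidelines. It first computes the exact chance that 70 fair coin flips give a 35/35 split (a little under 10%), and then works through the stopping rule on a toy version of the problem with only 4 days and 2 days per treatment. The function name `stopping_rule_distribution` and the labels "A" (autopilot) and "N" (no autopilot) are hypothetical.

```python
from fractions import Fraction
from math import comb

# Component 3: chance that 70 fair coin flips yield exactly 35 heads.
p_exact_split = Fraction(comb(70, 35), 2**70)
print(float(p_exact_split))

def stopping_rule_distribution(per_treatment=2):
    """Exact probability of each assignment under the rule 'flip a coin each
    day until one treatment is full, then give the rest the other treatment'.
    Intended for toy-sized inputs only."""
    n_days = 2 * per_treatment
    dist = {}

    def walk(assignment, prob):
        a = assignment.count("A")
        n = assignment.count("N")
        if a == per_treatment or n == per_treatment:
            # One treatment is full, so the remaining days are forced.
            fill = "N" if a == per_treatment else "A"
            full = assignment + fill * (n_days - len(assignment))
            dist[full] = dist.get(full, 0) + prob
            return
        walk(assignment + "A", prob / 2)  # heads: autopilot
        walk(assignment + "N", prob / 2)  # tails: no autopilot

    walk("", Fraction(1))
    return dist

# Component 2: the six possible assignments are not equally likely.
for assignment, prob in sorted(stopping_rule_distribution(2).items()):
    print(assignment, prob)
```

In the toy case, the assignments AANN and NNAA each have probability 1/4, while the other four each have probability 1/8. Under a correct randomization, all six would have probability 1/6. The assignments that end with a run of the same treatment are the ones that are overrepresented, which is the problem described in the stopping-rule tip.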
Part (c)
After the investigation was completed, James verified that the conditions for inference were met and conducted a hypothesis test. He discovered the mean mileage when using the autopilot feature was significantly higher than the mean mileage when not using the autopilot feature.
James is a member of a Model D club with thousands of members who all drive Model D cars. He will give a presentation at a Model D club members’ meeting later this year and would like to state that the results of his hypothesis test apply to all Model D cars in his club. Another member of the club who is a statistician tells James his findings do not apply to all Model D cars in the club. What change would James need to make to his original study to be able to generalize to all Model D cars in the club?
Sample Responses
Response 1:
James needs to get some more drivers of Model D cars to do the same thing.
Response 2:
In order to generalize to all Model D cars in the club, James needs to sample more Model D cars. Then he could do a two-sample t-test to compare the mileages of the two driving methods.
Response 3:
James should randomly select another Model D driver to do the same thing he did for 70 days.
Response 4:
Using the roster of the model D club members, several more members should be randomly selected and their cars should be used in the study.
WOULD THIS GET CREDIT?
To earn an E (Essentially Correct), a response must include all three of the following components:
(1) More cars
(2) Random sampling
(3) Context (Model D cars from his club)
Response 1 does note the need for more cars (1) but does not specify that the cars must come from the Model D cars in the club. The response also lacks random selection of the cars.
Response 2 does note the need for more cars (1) but does not specify that the cars must come from the Model D cars in the club. Here it should be noted that the first phrase of the response is simply a repeat of the prompt and does not reference the population used for car selection. The response also lacks random selection of the cars. The last sentence of this response is extraneous.
Response 3 correctly uses random selection (2) but only for one more driver or car. In addition, the driver is not clearly noted as a member of the Model D club.
Response 4 earns full credit because the need for more cars (or members) is noted (1), random selection is used (2), and the population from which the cars are selected is clearly the Model D club (3).
Teaching Tips:
For full credit, generalization statements require selection from the population of interest.
The selection must be random in order to be representative of the population of interest.
And again, the context must be specific enough to rule out any potential misunderstandings. For this question, either “drivers in the Model D club” or the term “members” would earn credit for the context necessary for the selection process.
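The selection described in Response 4 can also be sketched in Python. This is a hedged illustration only: the roster contents, the roster size of 3000, the sample size of 30, and the function name `select_cars` are all made up for the example, since the problem says only that the club has thousands of members.

```python
import random

def select_cars(roster, n_cars, seed=None):
    """Hypothetical helper: simple random sample of club members' cars."""
    rng = random.Random(seed)
    # Sampling without replacement from the full club roster gives every
    # group of n_cars members the same chance of being chosen.
    return rng.sample(roster, n_cars)

# Made-up roster standing in for the Model D club membership list.
roster = [f"member_{i:04d}" for i in range(1, 3001)]
selected = select_cars(roster, n_cars=30, seed=7)  # arbitrary size and seed
print(len(selected), len(set(selected)))  # prints 30 30
```

The key design point is that the sample is drawn from the roster of the population James wants to generalize to, the Model D cars in his club, rather than from volunteers or from Model D owners in general.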