June 20, 2019
StatPREP Project
Purpose: Provide faculty development opportunities for teaching intro stats.
Why?
- Statistics is seen by many as the gateway to data.
- Data drives many aspects of modern life and employment.
- Most undergraduate statistics instructors have little or no background in statistics.
- Statistics has changed and is changing as stronger associations are made with new fields:
- data science
- machine learning
- remote sensing
- medical records
- genomics
It’s data that’s behind the increased importance of statistics.
Question: What’s the relationship between statistics and data?
Answer …
Answer
Traditionally, statistical methods and concepts are about the lack of data.
Changing faster and faster …
How has the content and application of our field changed?
Statistics:
- Bayes and MCMC (since 1990)
- Databases (since 1965)
- Confounders and causation (since 1960)
- Modern modes of graphics (since approx. 1990)
- Machine learning (since 2010)
College algebra:
- ?
Calculus:
- ?
Trigonometry:
- ?
For examples of change, see Nick Horton’s article on changes in the statistical methods used in the New England Journal of Medicine over the 15 years to 2005.
How teaching calculus might distort a view of statistics
Example of a calculus-like problem … of a sort that I very rarely see in statistical work:
The scores on the SAT verbal test in recent years follow approximately the N(517, 112) distribution.
- What is the proportion of students scoring under 400?
- What is the proportion of students scoring between 400 and 550?
- How high must a student score to place in the top 10% of all students taking the SAT? State answer as a whole number.
- Using the empirical rule, what is the probability that a randomly [selected] SAT test will have a verbal score between 629 and 853?
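For reference, the four answers can be checked in a few lines. This is a minimal sketch, assuming N(517, 112) means a mean of 517 and a standard deviation of 112, and assuming that "top 10%" means rounding the 90th percentile up to the next whole score. The variable names are illustrative only.

```python
from math import ceil
from statistics import NormalDist

# Assumption: N(517, 112) is read as mean 517, standard deviation 112.
sat = NormalDist(mu=517, sigma=112)

# Proportion of students scoring under 400.
p_under_400 = sat.cdf(400)

# Proportion of students scoring between 400 and 550.
p_400_to_550 = sat.cdf(550) - sat.cdf(400)

# Score needed to place in the top 10%: the 90th percentile,
# rounded up to a whole number (an assumption about the rounding rule).
top_10_cutoff = ceil(sat.inv_cdf(0.90))

# 629 and 853 are 1 and 3 standard deviations above the mean, so the
# empirical rule (68% within 1 sd, 99.7% within 3 sd) gives half the gap.
p_empirical = (0.997 - 0.68) / 2

# The same interval computed from the normal CDF, for comparison.
p_exact = sat.cdf(853) - sat.cdf(629)

print(f"P(X < 400)         = {p_under_400:.3f}")
print(f"P(400 < X < 550)   = {p_400_to_550:.3f}")
print(f"Top 10% cutoff     = {top_10_cutoff}")
print(f"P(629 < X < 853)   = {p_empirical:.4f} (empirical rule)")
print(f"P(629 < X < 853)   = {p_exact:.4f} (normal CDF)")
```

The arithmetic is the easy part. None of these lines involves looking at actual data.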
Computing and statistics
- 14 / 16 said proficiency in using computers to handle and manage data should be an important goal of a statistics course.
- 14 / 16 said they disagree that statistical concepts are rooted in algebraic notation.
- 12 / 16 said computing offers a framework for understanding statistical theory that is as legitimate as the theory based on probability rules and algebra.
- 14 / 16 said that technology tools should be used to illustrate most abstract statistical concepts.
Change is hard
Q24: What keeps you from making changes in your introductory statistics course?
- 12 / 15 Limited personal time
- 8 / 15 Student characteristics (e.g., ability, interest, etc.)
- 7 / 15 Technology constraints (e.g., lack of computer lab, cost of software)
- 5 / 15 Departmental or institutional constraints (e.g., choice of textbook, class size, mandated curriculum, etc.)
Goals countdown: 3, 2, 1, …
17 / 18 said learning new methods for teaching statistics is a top or important priority for them at this workshop.
Not so interested in GAISE: 10 / 18 said learning how to implement GAISE is an important priority for them at this workshop.
- GAISE Recommendation 3: Integrate real data with a context and a purpose.
- You: 18 / 18 said learning how to incorporate real data into classes is important or their top priority; 6 / 18 said “top priority.” (So we’ll have to talk about what “real data” is.)
- GAISE Recommendation 2: Focus on conceptual understanding.
- “Pare down content of an introductory course to focus on core concepts in more depth.”
- You: 15 / 16 – The many methods covered in introductory statistics can be reduced to a small set of common principles.
What are those principles? Write them down here:
“Perform most computations using technology to allow greater emphasis on understanding concepts and interpreting results.”
- You: 11 / 17 use graphing calculators in teaching stats.
- You: 12 / 16 Computing offers a framework for understanding statistical theory that is as legitimate as the theory based on probability rules and algebra.
- You: Only 6 / 16 were confident in using computer programs to analyze data or data sets.
GAISE Recommendation 1: Teach statistical thinking.
- Teach statistics as an investigative process of problem-solving and decision-making
“[I]nstructors [should] illustrate the complete investigative cycle with every example/exercise presented.”
- Give students experience with multivariable thinking.
- Characterizing relationships involving multiple (more than 3) variables: Never: 5 / 16, Rarely: 8 / 16
- You: Only 4 / 16 were confident in developing data models.
Our goals for you …
We hope you will leave this workshop able to use real data to teach the topics you currently cover, and interested in doing so.
Ultimately, embracing real data in the ways needed in the workplace and as envisioned by GAISE is going to call for broader changes:
- Streamlining the topics/methods of your course
- Becoming a role model for proper management of data
- Confronting statistical issues that are important to contemporary uses of data but are not part of the historic canon of statistics:
- prediction
- decision-making
- causality
Three components of the workshop
- Things you can directly use in teaching:
- Exploring data-driven activities via Little Apps
- We’ve got about 20 activities on a variety of topics in statistics.
- Faculty development: Statistics topics the textbook doesn’t cover
- Bootstrapping
- Unifying inference with regression
- Faculty development: Data science
- Sources of data, graphing data
- R ecosystem
Levels will range from the easy to the aspirational.