Think Stats 2e (English, Paperback, Allen Downey)
Quick Overview
If you know how to program, you have the skills to turn data into knowledge using the tools of probability and statistics. This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python.

You'll work with a case study throughout the book to help you learn the entire data analysis process, from collecting data and generating statistics to identifying patterns and testing hypotheses. Along the way, you'll become familiar with distributions, the rules of probability, visualization, and many other tools and concepts.

- Develop your understanding of probability and statistics by writing and testing code
- Run experiments to test statistical behavior, such as generating samples from several distributions
- Use simulations to understand concepts that are hard to grasp mathematically
- Learn topics not usually covered in an introductory course, such as Bayesian estimation
- Import data from almost any source using Python, rather than be limited to data that has been cleaned and formatted for statistics tools
- Use statistical inference to answer questions about real-world data

About the Author

Allen Downey is an Associate Professor of Computer Science at the Olin College of Engineering. He has taught computer science at Wellesley College, Colby College and U.C. Berkeley. He has a Ph.D. in Computer Science from U.C. Berkeley and Master’s and Bachelor’s degrees from MIT.

Table of Contents

Chapter 1 Statistical Thinking for Programmers
  Do First Babies Arrive Late?
  A Statistical Approach
  The National Survey of Family Growth
  Tables and Records
  Significance
  Glossary

Chapter 2 Descriptive Statistics
  Means and Averages
  Variance
  Distributions
  Representing Histograms
  Plotting Histograms
  Representing PMFs
  Plotting PMFs
  Outliers
  Other Visualizations
  Relative Risk
  Conditional Probability
  Reporting Results
  Glossary

Chapter 3 Cumulative Distribution Functions
  The Class Size Paradox
  The Limits of PMFs
  Percentiles
  Cumulative Distribution Functions
  Representing CDFs
  Back to the Survey Data
  Conditional Distributions
  Random Numbers
  Summary Statistics Revisited
  Glossary

Chapter 4 Continuous Distributions
  The Exponential Distribution
  The Pareto Distribution
  The Normal Distribution
  Normal Probability Plot
  The Lognormal Distribution
  Why Model?
  Generating Random Numbers
  Glossary

Chapter 5 Probability
  Rules of Probability
  Monty Hall
  Poincaré
  Another Rule of Probability
  Binomial Distribution
  Streaks and Hot Spots
  Bayes’s Theorem
  Glossary

Chapter 6 Operations on Distributions
  Skewness
  Random Variables
  PDFs
  Convolution
  Why Normal?
  Central Limit Theorem
  The Distribution Framework
  Glossary

Chapter 7 Hypothesis Testing
  Testing a Difference in Means
  Choosing a Threshold
  Defining the Effect
  Interpreting the Result
  Cross-Validation
  Reporting Bayesian Probabilities
  Chi-Square Test
  Efficient Resampling
  Power
  Glossary

Chapter 8 Estimation
  The Estimation Game
  Guess the Variance
  Understanding Errors
  Exponential Distributions
  Confidence Intervals
  Bayesian Estimation
  Implementing Bayesian Estimation
  Censored Data
  The Locomotive Problem
  Glossary

Chapter 9 Correlation
  Standard Scores
  Covariance
  Correlation
  Making Scatterplots in Pyplot
  Spearman’s Rank Correlation
  Least Squares Fit
  Goodness of Fit
  Correlation and Causation
  Glossary

Colophon