Think Data Science - Home

Part I. Fundamentals

  • 1. Introduction
    • 1.1. Data Science
    • 1.2. On Programming
  • 2. Python Basics
    • 2.1. Python Syntax
    • 2.2. Control Structures
    • 2.3. Lists
    • 2.4. Dictionaries
    • 2.5. Functions

Part II. Working with Data

  • 3. NumPy Arrays
    • 3.1. Arrays Basics
    • 3.2. Array Computation
  • 4. Pandas
    • 4.1. Pandas Series
    • 4.2. Handling Datasets
    • 4.3. DataFrames
    • 4.4. Missing Data
    • 4.5. Data Operations

Part III. Data Visualization

  • 5. Visualization
    • 5.1. Pandas Visualization
  • 6. Matplotlib Overview
  • 7. Seaborn

Part IV. Inferential Statistics

  • 8. Probability
    • 8.1. Distribution
    • 8.2. Sampling Variability
  • 9. Testing Hypothesis
    • 9.1. Assessing Model 1
    • 9.2. Assessing Model 2
    • 9.3. Hypotheses and p-Value
  • 10. Two Samples
    • 10.1. A/B Testing
    • 10.2. Deflategate
    • 10.3. Causality
  • 11. Estimation
    • 11.1. Percentiles
    • 11.2. The Bootstrap
    • 11.3. Confidence Intervals
  • 12. Regression
    • 12.1. Correlation
    • 12.2. Regression Line
    • 12.3. Least Squares
    • 12.4. Visual Diagnostics

Part V. Machine Learning

  • 13. Multiple Regression
    • 13.5. Linear Regression
    • 13.6. Multiple Regression
  • 14. Classification
    • 14.1. Nearest Neighbors
    • 14.2. Training and Testing
    • 14.3. Rows of Tables
    • 14.4. Implementing the Classifier
    • 14.5. The Accuracy of the Classifier
  • 15. K-Means Clustering

Appendices

  • 16. Tooling
  • 17. Work Environment
    • 17.1. Python installation
    • 17.2. Virtual Environment
    • 17.3. Jupyter Notebook
    • 17.4. Launch Jupyter FAST!
  • 18. Bibliography

Index

E | K | S

E

  • expression

K

  • keywords

S

  • statement

By Tsangyao Chen

© Copyright 2026.