Preface#

These lecture notes are prepared for IST-3420, Introduction to Data Science and Management, at Missouri University of Science and Technology (MS&T) in Rolla, Missouri. The course aims to give you a solid foundation in data science concepts and practices that supports your future academic and career development.

How to Use This Book#

Each chapter section is an interactive Jupyter Notebook. You can:

  • Read through the rendered content

  • Interact with the code directly by launching the notebooks in Binder or by downloading them from GitHub and running them locally

  • Use the Live Code cells to practice

Live Coding#

Some sections include exercises that allow live coding, which you can enable by clicking the Live Code button. Launching a new session takes some time; wait until you see a “ready” message. The live coding cells let you practice the concepts and skills you learn in the section.

Custom Code#

If you download the notebook files into a folder in your project directory, or you write your own notebooks and want to use features from this book such as jupyturtle, you will need some additional files. The download function given in this section fetches the files used specifically for this book. You don’t need to understand the code yet; it simply downloads some *.py files from Dr. Allen Downey’s GitHub repository to enable the required functionality.

Next, create a folder called shared in your project folder, place the downloaded files in it, and add an empty file called __init__.py, which makes the shared folder a package. Later, in any of your project notebooks, you can import a module to use its functions:

from shared import [module]

The download function:

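from os.path import basename, exists
from urllib.request import urlretrieve


def download(url):
    """Download url into the current folder unless the file already exists."""
    filename = basename(url)
    if not exists(filename):
        # Fetch only when there is no local copy, so re-running the cell is cheap.
        local, _ = urlretrieve(url, filename)
        print("Downloaded " + str(local))
    return filename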
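
The folder setup described in this section can also be scripted. The sketch below is only an illustration and is not part of the book's own code: the helper name make_shared_package and its parameters are hypothetical, and it assumes the *.py files have already been downloaded.

```python
from pathlib import Path
import shutil


def make_shared_package(py_files, project_dir="."):
    """Move py_files into <project_dir>/shared and make that folder a package.

    Hypothetical helper for illustration; the name and arguments are
    assumptions, not part of the book's code.
    """
    shared = Path(project_dir) / "shared"
    shared.mkdir(exist_ok=True)

    # An empty __init__.py marks the folder as a regular Python package.
    (shared / "__init__.py").touch()

    targets = []
    for name in py_files:
        source = Path(name)
        target = shared / source.name
        # Move the file only if it exists and is not already in shared.
        if source.exists() and not target.exists():
            shutil.move(str(source), str(target))
        targets.append(target)
    return targets
```

For example, assuming the downloaded files are named after the modules imported later in this section, make_shared_package(["thinkpython.py", "diagram.py", "jupyturtle.py"]) would place them in shared next to the new __init__.py.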

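To use the downloaded files as modules, create a directory called shared in the project root and put the files in it. In each notebook that needs to import the modules, place the code snippet below at the beginning of the notebook. The snippet walks up from the current working directory until it finds the folder that contains _config.yml, treats that folder as the project root, and adds it to sys.path so that the shared package can be imported.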

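import sys
from pathlib import Path

# Walk up from the current working directory to find the project root,
# which is marked by the presence of _config.yml.
current = Path.cwd()
for parent in [current, *current.parents]:
    if (parent / '_config.yml').exists():
        project_root = parent  # add the project root, not the chapters folder
        break
else:
    # No _config.yml found: assume the notebook sits two levels below the root.
    project_root = Path.cwd().parent.parent

# Put the project root first on the import path so that shared is importable.
sys.path.insert(0, str(project_root))

from shared import thinkpython, diagram, jupyturtle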
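
The root-finding loop above can be wrapped in a small function, which makes it easier to reuse and to try out on a scratch folder. This is a sketch under assumptions: the name find_project_root and its parameters are hypothetical, and the marker file _config.yml is taken from the snippet above.

```python
from pathlib import Path


def find_project_root(start=None, marker="_config.yml"):
    """Return the nearest folder at or above start that contains marker.

    Returns None when no such folder exists, leaving the fallback choice
    to the caller. Hypothetical helper, not part of the book's code.
    """
    current = Path(start) if start is not None else Path.cwd()
    current = current.resolve()
    for folder in [current, *current.parents]:
        if (folder / marker).exists():
            return folder
    return None
```

In a notebook, root = find_project_root() followed by sys.path.insert(0, str(root)) has the same effect as the snippet above whenever the marker file is found. Returning None instead of guessing a fallback makes a missing _config.yml visible rather than silently pointing at the wrong folder.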

AI in Jupyter Notebook#

To install AI components in Jupyter Notebook, you may try:

%pip install "jupyter-ai"

  • Don’t forget to comment out the %pip install line after installation, so that the install command does not run again every time you run the notebook.

  • To learn about jupyter-ai, visit the Jupyter AI documentation site.
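
If you are not sure whether the package is already present, you can check before running the install line. The sketch below uses only the standard library; the helper name installed_version is hypothetical, and the distribution name "jupyter-ai" is the one used in the %pip install line.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of dist_name, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("jupyter-ai")
if found is None:
    print("jupyter-ai is not installed; run the %pip install line once.")
else:
    print("jupyter-ai " + found + " is already installed.")
```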

Credits#

Some of the Python parts of these notes are based on Allen Downey’s book Think Python, which is a great textbook for Python. For material from other sources, links or citations are provided. Some of the statistics parts are based on UC Berkeley’s Data8, a textbook that I used in 2024 and 2025 for this course. I have adapted the chapters here and will replace them in response to their copyright policy.

Code license: MIT License

Text license: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International