Summary and Schedule
This is a new lesson built with The Carpentries Workbench.
| Setup Instructions | Download files required for the lesson | |
| Duration: 00h 00m | 1. Python Notebook Introduction | |
| Duration: 02h 00m | 2. Acquiring and Exploration of Census Data | What kinds of datasets are available from the U.S. Census Bureau? How can you visualize and analyze these datasets for your region of interest? How do you combine spatial and tabular Census data? What variables are available in the ACS dataset? |
| Duration: 06h 00m | 3. Census Data Analysis with Python Notebook | How do you clean and prepare raw Census data for analysis? How do you rename columns, sort data, and compute summary statistics? What is data visualization and why does it matter for Census analysis? What makes a visualization effective versus misleading? Which Python tools are best for creating publication-ready plots? |
| Duration: 10h 00m | Finish | |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
Workshop Overview
Enabling the Geospatial Turn through Cyberinfrastructure Training
https://spatialturn.github.io/IGIC2026/
This workshop introduces Jupyter Notebook as a platform for data analysis using U.S. Census data as a practical case study. Participants will learn how to access, clean, analyze, and visualize data while understanding key Census concepts such as variables and geographic units.
Designed for beginners, the session emphasizes hands-on learning and guides participants through a complete workflow, from data acquisition to interpretation. Basic Python experience is required.
Schedule
| Time | Session | Content | Activities |
|---|---|---|---|
| 9:00 - 10:00 | Notebook Basics | Introduction to Jupyter Notebook and Google Colab | Open notebook, run first code cell, create markdown notes |
| 10:00 - 11:00 | Python Libraries | Importing libraries (GeoPandas, pandas, matplotlib, NumPy) | Use libraries, create variables, load sample data, short exercise on data exploration |
| 11:00 - 12:00 | Introduction to Census Data | American Community Survey (ACS) vs. Decennial Census | - |
| 12:00 - 1:00 | Lunch Break | - | - |
| 1:00 - 2:00 | Accessing Census Data | Download data or access via the Census API | Retrieve data, convert to DataFrame, explore dataset structure, variables, tables, columns, and geographic units |
| 2:00 - 3:00 | Data Cleaning & Analysis | Cleaning data, renaming fields, sorting, summary statistics | Clean dataset, identify top counties, basic analysis |
| 3:00 - 4:00 | Visualization | Creating charts and maps of population trends | Maps and charts of population distribution, interpreting results |
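The afternoon steps in the schedule (convert a retrieved response to a DataFrame, rename fields, sort, and identify top counties) can be sketched roughly as follows. This is a minimal illustration, not the workshop's own notebook: the Census API returns a JSON array of arrays with the column names in the first row, and `B01003_001E` is the ACS total-population estimate, but the county names, FIPS codes, and population values below are made-up placeholders, and the helper names `response_to_dataframe` and `top_counties` are hypothetical.

```python
import pandas as pd

# Hypothetical response in the shape the Census API returns: a list of rows
# whose first row holds the column names. The values are made-up
# placeholders, not real ACS estimates.
sample_response = [
    ["NAME", "B01003_001E", "state", "county"],
    ["County A, Example State", "1500", "99", "001"],
    ["County B, Example State", "42000", "99", "003"],
    ["County C, Example State", "7300", "99", "005"],
]


def response_to_dataframe(rows):
    """Turn a header-plus-rows list into a DataFrame with a numeric population column."""
    header, *records = rows
    df = pd.DataFrame(records, columns=header)
    # Rename the ACS variable code to something readable.
    df = df.rename(columns={"B01003_001E": "total_population"})
    # API values arrive as strings, so convert before sorting or summarizing.
    df["total_population"] = pd.to_numeric(df["total_population"], errors="coerce")
    return df


def top_counties(df, n=2):
    """Return the n most populous rows, largest first."""
    ranked = df.sort_values("total_population", ascending=False)
    return ranked.head(n).reset_index(drop=True)


counties = response_to_dataframe(sample_response)
top = top_counties(counties)
print(top[["NAME", "total_population"]])
print(counties["total_population"].describe())
```

Converting the population column with `pd.to_numeric` before sorting matters because string values would otherwise sort alphabetically rather than numerically.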
Setup Requirements
We offer two setup paths:
- Google Colab: recommended for beginners; no installation needed
- Local installation with Anaconda Navigator: for offline work and full control
Option 1: Google Colab (Zero Installation, Recommended)
Google Colab is a free, cloud-based Jupyter notebook environment hosted by Google. It runs entirely in your browser, requires only a Google account, and comes with pandas, matplotlib, seaborn, and many other data science libraries pre-installed.
Steps
- Go to https://colab.research.google.com
- Sign in with your Google account (or create one if needed).
- Click New notebook (or File > New notebook).
- (Optional) Rename it: File > Rename (e.g., `Test_Notebook - YourName`).
- Test the libraries by running this in the first cell (Shift+Enter to execute):
PYTHON
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
print("pandas version:", pd.__version__)
print("matplotlib version:", plt.matplotlib.__version__)
print("seaborn version:", sns.__version__)
## Quick test plot (should appear inline)
tips = sns.load_dataset("tips") # built-in Seaborn dataset
sns.histplot(data=tips, x="total_bill", hue="time")
plt.title("Test: Restaurant Tips Distribution")
plt.show()
- If a package is missing or needs updating, install it by running `!pip install <package-name>` in a cell. The `!` prefix runs shell commands inside a Colab or Jupyter cell.
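Before installing anything, it can help to check which libraries are actually missing. The sketch below is one possible way to do that in plain Python, assuming the import name matches the pip package name (true for pandas, matplotlib, and seaborn, but not for every package). The helper names `missing_packages` and `install` are hypothetical, and the install call is left commented out because it needs network access.

```python
import importlib.util
import subprocess
import sys


def missing_packages(names):
    """Return the names that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def install(names):
    """Install packages with the pip that belongs to the running interpreter."""
    if names:
        subprocess.run([sys.executable, "-m", "pip", "install", *names], check=True)


workshop_packages = ["pandas", "matplotlib", "seaborn"]
to_install = missing_packages(workshop_packages)
print("Missing packages:", to_install or "none")

# Uncomment to install whatever is missing (requires network access):
# install(to_install)
```

Calling pip through `sys.executable -m pip` targets the same Python interpreter the notebook kernel is using, which avoids installing into a different environment by accident.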
Advantages of Colab for this workshop
- No software installation required
- Free GPU/TPU access if needed later
- Easy sharing via File > Share
- Autosaves to Google Drive
- Perfect for following along with instructor demos
Tip: Upload your own data files using the left sidebar (Files > Upload), or mount Google Drive:
PYTHON
from google.colab import drive
drive.mount('/content/drive')
# Then read files like:
# pd.read_csv('/content/drive/MyDrive/penguins.csv')
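If the same notebook is meant to run both in Colab and locally, one option is to pick the file location based on where the code is running. The sketch below uses a common check (whether the `google.colab` module is loaded) and assumes Drive is mounted at `/content/drive` as in the snippet above. The local `data` folder and the helper names `in_colab` and `data_path` are hypothetical.

```python
import sys
from pathlib import Path


def in_colab():
    """True when the google.colab module is loaded, which suggests a Colab runtime."""
    return "google.colab" in sys.modules


def data_path(filename, local_dir="data"):
    """Build the path to a data file for the current environment."""
    if in_colab():
        # Assumes Drive was mounted with drive.mount('/content/drive').
        return Path("/content/drive/MyDrive") / filename
    # Hypothetical local folder; adjust to wherever the files were downloaded.
    return Path(local_dir) / filename


print(data_path("penguins.csv"))
```

The resulting path can be passed directly to `pd.read_csv`.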
Option 2: Local Installation (Anaconda Navigator)
Use this option if you prefer working offline or need a persistent local environment.
- Download and install Anaconda Navigator:
  - https://www.anaconda.com/products/navigator
  - Choose your OS installer (Python 3.x version) and follow the default prompts.
- After installation, launch Jupyter Notebook from the Anaconda Navigator home screen.
- To install any missing packages, run `!pip install <package-name>` in a code cell.
Troubleshooting
| Problem | Fix |
|---|---|
| Colab: plots not showing | Add `%matplotlib inline` at the top of the notebook (usually automatic) |
| Local: `ModuleNotFoundError` | Run `!pip install <package-name>` in a code cell |
| General help | Raise your hand during the workshop |