Spatial Analysis

Last updated on 2025-12-05

Estimated time: 101 minutes

Overview

Questions

What is PySAL and what can it do for spatial analysis?
How do we compute spatial weights and perform spatial autocorrelation?
How do we interpret results like Moran’s I?

Objectives

Understand the purpose of PySAL in spatial data science
Learn how to load spatial data using GeoPandas
Construct spatial weight matrices
Compute Global Moran’s I using PySAL
Visualize spatial clustering and spatial autocorrelation

Why is PySAL Important?

PySAL (Python Spatial Analysis Library) is one of the most widely used toolkits for working with spatial data in Python. Unlike traditional statistical libraries, PySAL is designed specifically for datasets where location matters — where observations influence nearby observations, and spatial patterns may not be random.

Geographers, urban planners, environmental scientists, epidemiologists, and data analysts use PySAL to identify spatial relationships, detect clustering, and build models that incorporate proximity and geography.

What PySAL Helps Us Understand

Where events cluster or disperse across space
Whether high or low values form hotspots or coldspots
How neighborhoods influence one another (spatial dependency)
Spatial inequality patterns in income, population, crime, disease, etc.
Geographic diffusion (wildfire spread, disease transmission, migration flows)
Environmental change and land-use impacts

Why Researchers Use PySAL

Built for spatial statistics — tools that general libraries lack
Easy integration with GeoPandas, raster data, and shapefiles
Provides standard spatial methods such as:
- Spatial weights (Queen, Rook, KNN, Distance-based)
- Global & Local Moran’s I (LISA)
- Spatial clustering & hotspot detection
- Spatial regression models
Enables data-driven decision making in geography
Scales from local studies to large regional/global analyses
Helps test spatial hypotheses scientifically instead of visually

Spatial Analysis in Context

Spatial analysis answers questions like:

Question	PySAL Method
Do areas with high values cluster together?	Moran’s I
Where are hotspots located?	Local Moran / LISA maps
What counts as a neighbor?	Spatial weights matrices
Are patterns random or significant?	Monte Carlo permutation tests
How do variables influence each other across space?	Spatial regression

PySAL makes these methods accessible in Python, allowing analysts to move from maps to statistical evidence — revealing underlying spatial patterns that are not visible from visualization alone.

Introduction

PySAL is the Python Spatial Analysis Library — a powerful, open-source toolkit for working with spatial data. It provides tools for:

spatial weights
spatial autocorrelation
clustering
spatial regression
neighborhood analysis

This tutorial introduces the core PySAL workflow.

We will cover:

Loading polygon or point data
Building spatial weights
Running Global Moran’s I
Visualizing results

This tutorial assumes basic familiarity with pandas, geopandas, and Python.

Instructor Note

Learners may struggle initially with spatial weights (rook, queen, k-nearest). Spend extra time walking through simple diagrams before showing code.

1. Loading Spatial Data

PySAL works seamlessly with GeoPandas.
Here’s a simple example using a polygon shapefile:

PYTHON

import geopandas as gpd

gdf = gpd.read_file("data/shapes.shp")
gdf.head()

Plot the boundaries:

PYTHON

gdf.plot(edgecolor="black", figsize=(6,6))

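A quick plot is a visual check that the geometries loaded as expected. It does not validate them, so inspect gdf.is_valid and gdf.crs as well if the shapes look wrong.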

2. Building Spatial Weights

Spatial weights define who is a neighbor of whom.

PySAL includes:

Rook contiguity
Queen contiguity
K-nearest neighbors
Distance-based neighbors

Example: Queen Contiguity

PYTHON

from libpysal.weights import Queen

w = Queen.from_dataframe(gdf)
w.transform = "R"  # row-standardization

Check neighbors of the first polygon:

PYTHON

w.neighbors[0]
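
To make the difference between rook and queen contiguity concrete, here is a minimal sketch on a 3 x 3 grid that uses only the Python standard library. The helper names grid_neighbors and row_standardize are hypothetical and written for this illustration; they are not part of libpysal, which derives neighbors from the actual polygon geometry.

```python
def grid_neighbors(nrows, ncols, queen=False):
    """Return {cell_id: sorted neighbor ids} for a regular grid numbered row by row."""
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]          # cells sharing an edge (rook)
    if queen:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # also cells sharing only a corner
    neighbors = {}
    for r in range(nrows):
        for c in range(ncols):
            cell = r * ncols + c
            neighbors[cell] = sorted(
                (r + dr) * ncols + (c + dc)
                for dr, dc in steps
                if 0 <= r + dr < nrows and 0 <= c + dc < ncols
            )
    return neighbors


def row_standardize(neighbors):
    """Give each neighbor of i the weight 1 / (number of neighbors of i)."""
    return {i: {j: 1.0 / len(js) for j in js} for i, js in neighbors.items()}


rook = grid_neighbors(3, 3)
queen = grid_neighbors(3, 3, queen=True)

print(rook[4])    # centre cell -> [1, 3, 5, 7]
print(queen[4])   # centre cell -> [0, 1, 2, 3, 5, 6, 7, 8]
print(rook[0])    # corner cell -> [1, 3]
print(queen[0])   # corner cell -> [1, 3, 4]

weights = row_standardize(rook)
print(weights[0])  # {1: 0.5, 3: 0.5}
```

Row-standardization, which is what w.transform = "R" requests, makes each row of weights sum to 1, so a unit with many neighbors does not automatically carry more total weight than a unit with few.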

3. Global Moran’s I

Moran’s I measures global spatial autocorrelation:

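Positive values → similar values are close together (clustering)
Negative values → dissimilar values are close together (dispersion)
Near zero (more precisely, near the expected value −1/(N − 1)) → no pattern distinguishable from spatial randomness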

Assume the dataset has a numeric column named value:

PYTHON

import esda

y = gdf["value"]       # variable of interest, one value per polygon
mi = esda.Moran(y, w)  # global Moran's I using the weights built above

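View the statistic (mi.I) and its pseudo p-value (mi.p_sim), which comes from a permutation test with 999 random reshufflings of the values by default: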

PYTHON

mi.I, mi.p_sim
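
The idea behind a permutation-based pseudo p-value can be sketched with the standard library alone. The functions morans_i and pseudo_p_value below are hypothetical helpers for illustration, and the test shown is a simple one-sided (upper tail) version; esda's own implementation may differ in details such as which tail is counted.

```python
import random


def morans_i(values, neighbors):
    """Global Moran's I with binary weights; neighbors maps i -> list of neighbor ids."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(len(js) for js in neighbors.values())
    cross = sum(dev[i] * dev[j] for i, js in neighbors.items() for j in js)
    return (n / total_weight) * cross / sum(d * d for d in dev)


def pseudo_p_value(values, neighbors, permutations=999, seed=0):
    """Share of random reshufflings whose Moran's I is at least the observed one."""
    rng = random.Random(seed)            # fixed seed so the sketch is repeatable
    observed = morans_i(values, neighbors)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)            # break any link between value and location
        if morans_i(shuffled, neighbors) >= observed:
            at_least_as_large += 1
    # the +1 terms count the observed arrangement as one of the permutations
    return observed, (at_least_as_large + 1) / (permutations + 1)


# 20 units in a line, each neighboring the unit before and after it
n = 20
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

# values that increase steadily along the line are strongly clustered
observed, p = pseudo_p_value(list(range(n)), chain)
print(observed, p)
```

With 999 permutations the smallest possible pseudo p-value is 1/1000 = 0.001, which is why p_sim never reports exactly zero.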

Plot the Moran scatterplot:

PYTHON

import splot.esda as esdaplot

esdaplot.moran_scatterplot(mi)
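
A Moran scatterplot places each observation's value (as a deviation from the mean) on the x-axis and the weighted average of its neighbors' values, the spatial lag, on the y-axis. With row-standardized weights, the slope of the line fitted through those points equals Moran's I. The sketch below checks this on four units arranged in a line; the data and variable names are made up for illustration, and plotting libraries may additionally rescale the axes, which does not change the slope.

```python
# four units in a line: 0 - 1 - 2 - 3
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# row-standardized weights: each neighbor of i gets 1 / (number of neighbors of i)
weights = {i: {j: 1.0 / len(js) for j in js} for i, js in neighbors.items()}

values = [1.0, 2.0, 3.0, 4.0]
mean = sum(values) / len(values)

z = [v - mean for v in values]                        # x-axis: deviation from the mean
lag = [sum(w * z[j] for j, w in weights[i].items())   # y-axis: neighbors' average deviation
       for i in range(len(z))]

# least-squares slope of lag on z (z has mean zero, so no intercept term is needed)
slope = sum(zi * li for zi, li in zip(z, lag)) / sum(zi * zi for zi in z)

print(lag)    # [-0.5, -0.5, 0.5, 0.5]
print(slope)  # 0.4
```

Points in the upper-right and lower-left of the scatterplot (value and lag on the same side of the mean) pull the slope, and therefore Moran's I, upward.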

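4. Local Moran’s I (Cluster and Outlier Analysis)

Local Moran’s I computes a statistic for every observation. It identifies local clusters (hotspots and coldspots) as well as spatial outliers, where a value differs sharply from its neighbors.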

PYTHON

lisa = esda.Moran_Local(y, w)

Add LISA quadrant labels to the GeoDataFrame:

PYTHON

gdf["lisa_cluster"] = lisa.q

Map the clusters:

PYTHON

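gdf.plot(column="lisa_cluster", categorical=True, cmap="Set1", figsize=(8,6), legend=True)

This maps each observation’s quadrant: 1 = high-high, 2 = low-high, 3 = low-low, 4 = high-low. It does not yet account for significance. A conventional LISA cluster map shows only the observations whose lisa.p_sim falls below a chosen threshold, such as 0.05.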
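
The quadrant codes can be reproduced by hand: compare each unit's deviation from the mean with the average deviation of its neighbors. The sketch below uses a hypothetical helper, lisa_quadrants, on six made-up values arranged in a line. It covers only the quadrant labelling and does not assess significance.

```python
def lisa_quadrants(values, neighbors):
    """Label each unit 1=HH, 2=LH, 3=LL, 4=HL from its own and its neighbors' deviations."""
    mean = sum(values) / len(values)
    z = [v - mean for v in values]
    labels = []
    for i, zi in enumerate(z):
        # spatial lag: average deviation of the unit's neighbors
        lag = sum(z[j] for j in neighbors[i]) / len(neighbors[i])
        if zi > 0:
            labels.append(1 if lag > 0 else 4)   # high value: high or low surroundings
        else:
            labels.append(2 if lag > 0 else 3)   # low value: high or low surroundings
    return labels


# six units in a line: 0 - 1 - 2 - 3 - 4 - 5
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
values = [10, 9, 1, 2, 1, 9]

labels = lisa_quadrants(values, neighbors)
print(labels)  # [1, 1, 2, 3, 2, 4]
```

Units labelled 1 or 3 resemble their surroundings (potential hotspots and coldspots), while units labelled 2 or 4 differ from them (potential spatial outliers).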

Challenge

Challenge 1: Create Your Own Spatial Weights

Using the GeoDataFrame loaded above:

Create rook contiguity weights
Print the neighbor list for observation 10
Compare the rook and queen neighbor lists for the same observation

PYTHON

from libpysal.weights import Rook
w_rook = Rook.from_dataframe(gdf)
w_rook.neighbors[10]

Solution

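Queen contiguity treats polygons that share an edge or only a single corner point as neighbors. Rook contiguity requires a shared edge. Every rook neighbor is therefore also a queen neighbor, so each observation has the same number of rook neighbors as queen neighbors, or fewer.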

Challenge

Challenge 2: Compute Moran’s I on a New Variable

Choose any numeric variable in your dataset:

Extract the variable
Compute Moran’s I
Interpret whether clustering exists

Solution

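A positive Moran’s I with a low pseudo p-value (for example, p_sim below 0.05) → statistically significant clustering. A value near zero, or a high p-value → no evidence against spatial randomness. A negative value with a low p-value → significant spatial dispersion. The p-value indicates significance; the size of I indicates how strong the pattern is.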

Math

Global Moran’s I is defined as:

$ I = \frac{N}{W} \frac{\sum_i \sum_j w_{ij}(x_i - \bar{x})(x_j - \bar{x})} {\sum_i (x_i - \bar{x})^2} $

Where:

N = number of observations

W = sum of all spatial weights, i.e. the sum of w_ij over all i and j

w_ij = weight between units i and j

x_i = value of the variable of interest at unit i

x̄ = mean of x over all N units
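
As a worked check of the formula, the sketch below evaluates it term by term on four units arranged in a line with binary weights. The function morans_i is a hypothetical helper written for this illustration, not the esda implementation.

```python
def morans_i(values, weights):
    """Global Moran's I; weights maps i -> {j: w_ij}."""
    n = len(values)                                              # N
    mean = sum(values) / n                                       # x-bar
    dev = [v - mean for v in values]                             # x_i - x-bar
    total_weight = sum(w for row in weights.values() for w in row.values())  # W
    cross = sum(w * dev[i] * dev[j]
                for i, row in weights.items()
                for j, w in row.items())                         # numerator double sum
    return (n / total_weight) * cross / sum(d * d for d in dev)


# four units in a line: 0 - 1 - 2 - 3, weight 1 between adjacent units
line = {0: {1: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1}}

smooth = morans_i([1, 2, 3, 4], line)        # neighbors have similar values
alternating = morans_i([1, 4, 1, 4], line)   # neighbors have opposite values

print(round(smooth, 3))       # 0.333
print(round(alternating, 3))  # -1.0
```

For the first case, N = 4, W = 6, the numerator sum is 2.5 and the denominator is 5, giving I = (4/6) × (2.5/5) = 1/3. The alternating pattern gives I = −1, the dispersion end of the scale.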

Key Points

PySAL provides tools for weights, autocorrelation, clustering, and modeling

Queen and rook weights define spatial neighbors differently

Moran’s I measures global autocorrelation

Local Moran (LISA) identifies local hotspots, coldspots, and spatial outliers

GeoPandas and PySAL together form a powerful spatial analysis workflow

Module Overview

Lesson	Overview
Beginner	Introduction to spatial analysis using the PySAL package.
Advanced	(to be added)