Spatial Analysis

Last updated on 2025-12-05

Estimated time: 101 minutes

Overview

Questions

  • What is PySAL and what can it do for spatial analysis?
  • How do we build spatial weights and test for spatial autocorrelation?
  • How do we interpret results like Moran’s I?

Objectives

  • Understand the purpose of PySAL in spatial data science
  • Learn how to load spatial data using GeoPandas
  • Construct spatial weight matrices
  • Compute Global Moran’s I using PySAL
  • Visualize spatial clustering and spatial autocorrelation

Why is PySAL Important?


PySAL (Python Spatial Analysis Library) is one of the most widely used toolkits for working with spatial data in Python. Unlike traditional statistical libraries, PySAL is designed specifically for datasets where location matters — where observations influence nearby observations, and spatial patterns may not be random.

Geographers, urban planners, environmental scientists, epidemiologists, and data analysts use PySAL to identify spatial relationships, detect clustering, and build models that incorporate proximity and geography.

What PySAL Helps Us Understand

  • Where events cluster or disperse across space
  • Whether high or low values form hotspots or coldspots
  • How neighborhoods influence one another (spatial dependency)
  • Spatial inequality patterns in income, population, crime, disease, etc.
  • Geographic diffusion (wildfire spread, disease transmission, migration flows)
  • Environmental change and land-use impacts

Why Researchers Use PySAL

  • Built for spatial statistics — tools that general libraries lack
  • Easy integration with GeoPandas, raster data, and shapefiles
  • Provides standard spatial methods such as:
    • Spatial weights (Queen, Rook, KNN, Distance-based)
    • Global & Local Moran’s I (LISA)
    • Spatial clustering & hotspot detection
    • Spatial regression models
  • Enables data-driven decision making in geography
  • Scales from local studies to large regional/global analyses
  • Helps test spatial hypotheses statistically rather than by visual inspection alone

Spatial Analysis in Context

Spatial analysis answers questions like:

Question | PySAL Method
Do areas with high values cluster together? | Moran’s I
Where are hotspots located? | Local Moran / LISA maps
What counts as a neighbor? | Spatial weights matrices
Are patterns random or significant? | Monte Carlo permutation tests
How do variables influence each other across space? | Spatial regression

PySAL makes these methods accessible in Python, allowing analysts to move from maps to statistical evidence — revealing underlying spatial patterns that are not visible from visualization alone.

Introduction


PySAL is the Python Spatial Analysis Library — a powerful, open-source toolkit for working with spatial data. It provides tools for:

  • spatial weights
  • spatial autocorrelation
  • clustering
  • spatial regression
  • neighborhood analysis

This tutorial introduces the core PySAL workflow.

We will cover:

  1. Loading polygon or point data
  2. Building spatial weights
  3. Running Global Moran’s I and plotting the Moran scatterplot
  4. Running Local Moran’s I and mapping the results

This tutorial assumes basic familiarity with pandas, geopandas, and Python.

Learners may struggle initially with spatial weights (rook, queen, k-nearest). Spend extra time walking through simple diagrams before showing code.

1. Loading Spatial Data


PySAL works seamlessly with GeoPandas.
Here’s a simple example using a polygon shapefile:

PYTHON

import geopandas as gpd

gdf = gpd.read_file("data/shapes.shp")
gdf.head()

Plot the boundaries:

PYTHON

gdf.plot(edgecolor="black", figsize=(6,6))

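A quick plot is a visual check that the file loaded and the geometries look as expected. It does not validate them; gdf.is_valid reports any invalid geometries.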

2. Building Spatial Weights


Spatial weights define who is a neighbor of whom.

PySAL includes:

  • Rook contiguity

  • Queen contiguity

  • K-nearest neighbors

  • Distance-based neighbors
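
The difference between rook and queen contiguity is easiest to see on a regular grid. The following is a minimal sketch in plain Python, not PySAL's implementation: the helper grid_neighbors and the 3 × 3 grid are hypothetical, chosen only to show which cells count as neighbors under each rule.

```python
# Illustrative only: contiguity on a regular grid, not PySAL's implementation.
def grid_neighbors(nrows, ncols, queen=False):
    """Return {cell_id: sorted neighbor ids} for a grid numbered row by row."""
    # Rook: cells that share an edge. Queen: cells that share an edge or a corner.
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if queen:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    neighbors = {}
    for r in range(nrows):
        for c in range(ncols):
            ids = []
            for dr, dc in steps:
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrows and 0 <= cc < ncols:
                    ids.append(rr * ncols + cc)
            neighbors[r * ncols + c] = sorted(ids)
    return neighbors


rook = grid_neighbors(3, 3)
queen = grid_neighbors(3, 3, queen=True)

print(rook[4])            # center cell: [1, 3, 5, 7]
print(queen[4])           # center cell: [0, 1, 2, 3, 5, 6, 7, 8]
print(rook[0], queen[0])  # corner cell: [1, 3] [1, 3, 4]
```

On real polygons the same idea applies, but whether two shapes "share an edge" or "share a corner" is decided from their geometry rather than from grid positions.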

Example: Queen Contiguity

PYTHON

from libpysal.weights import Queen

w = Queen.from_dataframe(gdf)
w.transform = "R"  # row-standardization

Check neighbors of the first polygon:

PYTHON

w.neighbors[0]
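
Row-standardization (the "R" transform) rescales each observation's weights so that they sum to 1. The sketch below shows the arithmetic in plain Python under stated assumptions: the neighbor lists and the values in y are made up for illustration, and the dictionaries are a stand-in for a weights object, not PySAL's internal representation.

```python
# Hypothetical neighbor lists for four areas, keyed by area id.
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}

# Binary weights: every neighbor gets weight 1.
binary = {i: {j: 1.0 for j in js} for i, js in neighbors.items()}

# Row-standardized weights: divide each weight by its row sum.
row_std = {}
for i, row in binary.items():
    total = sum(row.values())
    row_std[i] = {j: wij / total for j, wij in row.items()}

print(row_std[1])  # each of the three neighbors gets weight 1/3
print(row_std[3])  # the single neighbor gets weight 1.0

# With row-standardized weights, the spatial lag is the mean of the neighbors.
y = [10.0, 20.0, 30.0, 40.0]
lag_1 = sum(wij * y[j] for j, wij in row_std[1].items())
print(lag_1)  # mean of y[0], y[2], y[3]
```

One consequence of this design choice is that areas with many neighbors and areas with few neighbors contribute on the same scale, which is why row-standardized weights are a common default for Moran's I.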

3. Global Moran’s I


Moran’s I measures global spatial autocorrelation:

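  • Positive values → clustering (similar values sit near each other)

  • Negative values → dispersion (neighbors tend to have dissimilar values)

  • Near zero → no detectable spatial pattern (strictly, the expected value under spatial randomness is -1/(N - 1), which approaches zero as N grows)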

Assume the dataset has a numeric column named value:

PYTHON

import esda

y = gdf["value"]       # variable of interest
mi = esda.Moran(y, w)  # runs 999 random permutations by default

View the statistic and its permutation-based pseudo p-value:

PYTHON

mi.I, mi.p_sim
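
A pseudo p-value such as p_sim comes from a permutation test: the values are shuffled across locations many times, Moran's I is recomputed for each shuffle, and the observed statistic is compared with that reference distribution. The sketch below illustrates the idea in plain Python. The helpers morans_i and pseudo_p_value, the chain of 20 areas, and the one-sided counting rule are assumptions made for this example; esda follows the same general approach, but its exact tail handling may differ.

```python
import random


def morans_i(values, neighbors):
    """Global Moran's I with binary weights given as {i: [j, ...]}."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(len(js) for js in neighbors.values())  # sum of all weights
    cross = sum(z[i] * z[j] for i, js in neighbors.items() for j in js)
    return (n / s0) * cross / sum(zi * zi for zi in z)


def pseudo_p_value(values, neighbors, permutations=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = random.Random(seed)
    observed = morans_i(values, neighbors)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # reassign values to locations at random
        if morans_i(shuffled, neighbors) >= observed:
            at_least_as_large += 1
    # The +1 counts the observed arrangement as one of the possible outcomes.
    return observed, (at_least_as_large + 1) / (permutations + 1)


# Hypothetical data: 20 areas in a row, each adjacent to the previous and next.
n = 20
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
gradient = list(range(n))  # a smooth trend from one end to the other

observed, p = pseudo_p_value(gradient, chain)
print(round(observed, 3), p)
```

Because the smallest possible pseudo p-value is 1 / (permutations + 1), increasing the number of permutations is the only way to report a smaller one.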

Plot the Moran scatterplot:

PYTHON

import splot.esda as esdaplot

esdaplot.moran_scatterplot(mi)

4. Local Moran’s I (Cluster and Outlier Analysis)


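Local Moran’s I computes one statistic per observation. It locates hotspots (high values surrounded by high values), coldspots (low values surrounded by low values), and spatial outliers (values that differ from their neighbors).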

PYTHON

lisa = esda.Moran_Local(y, w)

Add LISA quadrant labels to the GeoDataFrame:

PYTHON

gdf["lisa_cluster"] = lisa.q

Map the clusters:

PYTHON

# categorical=True gives a discrete legend for the quadrant codes
gdf.plot(column="lisa_cluster", categorical=True, cmap="Set1", figsize=(8, 6), legend=True)

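This creates a basic LISA quadrant map. The codes in lisa.q are 1 = high-high, 2 = low-high, 3 = low-low, and 4 = high-low. Every observation receives a quadrant regardless of significance, so filter on lisa.p_sim (for example, p_sim < 0.05) before interpreting a location as a hotspot or coldspot.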
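
The quadrant labels come from comparing each observation with the average of its neighbors, both measured relative to the overall mean. The following plain-Python sketch shows that classification under stated assumptions: the function lisa_quadrants, the six values, and the chain of neighbors are hypothetical, and the code reproduces only the labelling step, not the significance test.

```python
def lisa_quadrants(values, neighbors):
    """Label each observation 1=HH, 2=LH, 3=LL, 4=HL (Moran scatterplot quadrant)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    quadrants = []
    for i in range(n):
        # Spatial lag: average deviation from the mean among the neighbors of i.
        lag = sum(z[j] for j in neighbors[i]) / len(neighbors[i])
        if z[i] > 0 and lag > 0:
            quadrants.append(1)  # high value, high neighbors
        elif z[i] <= 0 and lag > 0:
            quadrants.append(2)  # low value, high neighbors
        elif z[i] <= 0 and lag <= 0:
            quadrants.append(3)  # low value, low neighbors
        else:
            quadrants.append(4)  # high value, low neighbors
    return quadrants


# Hypothetical data: six areas in a row, each adjacent to the previous and next.
values = [10, 9, 1, 2, 1, 7]
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < len(values)]
         for i in range(len(values))}

quadrants = lisa_quadrants(values, chain)
print(quadrants)  # [1, 1, 2, 3, 3, 4]
```

Quadrants 1 and 3 correspond to potential hotspots and coldspots, while 2 and 4 flag potential spatial outliers. A label only becomes a finding once the local statistic is also significant.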

Challenge 1: Create Your Own Spatial Weights

Using the GeoDataFrame loaded above:

  • Create rook contiguity weights

  • Print the neighbor list for observation 10

  • Compare the rook and queen neighbor lists for the same observation

PYTHON

from libpysal.weights import Rook
w_rook = Rook.from_dataframe(gdf)
w_rook.neighbors[10]

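Queen neighbors include polygons that touch at a single corner (diagonal touches); rook neighbors must share an edge. Every rook neighbor is also a queen neighbor, so the rook list is never longer than the queen list, and it is shorter wherever polygons meet only at a corner.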

Challenge 2: Compute Moran’s I on a New Variable

Choose any numeric variable in your dataset:

  • Extract the variable

  • Compute Moran’s I

  • Interpret whether clustering exists

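A positive Moran’s I with a low pseudo p-value (for example, p_sim < 0.05) → statistically significant clustering; the size of I, not the p-value, indicates how strong it is. Near zero → no evidence of a spatial pattern. Negative → spatial dispersion.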

Math


Global Moran’s I is defined as:

$$ I = \frac{N}{W} \cdot \frac{\sum_{i} \sum_{j} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^2} $$

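Where:

  • $N$ = number of observations
  • $W$ = sum of all spatial weights, $\sum_{i} \sum_{j} w_{ij}$
  • $w_{ij}$ = weight between units $i$ and $j$
  • $x_i$ = value of the variable of interest at unit $i$
  • $\bar{x}$ = mean of $x$ across all units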
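
The formula can be checked by hand on a tiny example. The sketch below is a direct, unoptimized translation into plain Python; the function name morans_i, the four-area weights matrix, and the two sets of values are assumptions made for illustration.

```python
def morans_i(x, weights):
    """Global Moran's I for values x and an N x N weights matrix (list of lists)."""
    n = len(x)
    mean = sum(x) / n
    z = [xi - mean for xi in x]                      # deviations from the mean
    total_weight = sum(sum(row) for row in weights)  # W in the formula
    numerator = sum(weights[i][j] * z[i] * z[j]
                    for i in range(n) for j in range(n))
    denominator = sum(zi ** 2 for zi in z)
    return (n / total_weight) * numerator / denominator


# Hypothetical example: four areas in a row with binary contiguity weights.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

i_gradient = morans_i([1, 2, 3, 4], weights)     # similar values are adjacent
i_alternating = morans_i([1, 4, 1, 4], weights)  # neighbors always differ

print(round(i_gradient, 3))     # 0.333
print(round(i_alternating, 3))  # -1.0
```

The smooth gradient yields a positive value and the alternating pattern yields a negative one, matching the interpretation of clustering versus dispersion.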

Key Points

  • PySAL provides tools for weights, autocorrelation, clustering, and modeling
  • Queen and rook weights define spatial neighbors differently
  • Moran’s I measures global autocorrelation
  • Local Moran (LISA) identifies hotspots and coldspots
  • GeoPandas and PySAL together form a powerful spatial analysis workflow

Module Overview

Lesson | Overview
Beginner | Introduction to Spatial Analysis using PySAL package.
Advanced | (to be added)