Tag Maps rendering with Python and Mapnik¶
Alexander Dunkel, TU Dresden; Institute of Cartography
TL;DR This notebook illustrates the process for rendering of tag maps shapefiles in Mapnik. This notebook can be run with the jupyter docker for cartography (docker tag jupyterlab:mapnik
), which contains the stable Mapnik installation and Python bindings, but must be accessed through the system Python installation (/usr/bin/python3
), not the conda environment.
The source files for this notebook are available in https://gitlab.vgiscience.de/ad/tagmaps-mapnik-jupyter.
Preparations¶
Import dependencies
from numba.core.errors import NumbaDeprecationWarning, NumbaPendingDeprecationWarning
import warnings
warnings.simplefilter('ignore', category=NumbaDeprecationWarning)
warnings.simplefilter('ignore', category=NumbaPendingDeprecationWarning)
import sys
import math
import fiona
import rasterio
import numpy as np
import geopandas as gp
import contextily as cx
import mapclassify as mc
import libpysal as lps
import geopandas as gp
import matplotlib.pyplot as plt
from IPython import display
from contextily import Place
from typing import Tuple, Optional
from rasterio.plot import show as rioshow
from pathlib import Path
from esda.getisord import G_Local
from shapely.geometry import shape
We are using a number of helper functions from py/modules/tools.py
to classify values.
module_path = str(Path.cwd().parents[0] / "py")
if module_path not in sys.path:
sys.path.append(module_path)
from modules.base import tools
%load_ext autoreload
%autoreload 2
Test Mapnik bindings
!/usr/bin/python3 -c "import mapnik;print(mapnik.__file__)"
Global Parameters
Some parameters are used throughout this notebook. Adjust to your needs.
INPUT: Path = Path.cwd().parents[0] / "input" # define path to input and working directory (shapefiles, stylesheets etc.)
OUTPUT: Path = Path.cwd().parents[0] / "output" # define path to output directory (map, notebook html etc.)
TMP: Path = INPUT / "intermediate"
MAP_X: int = 2500 # x dimensions of the final map, in pixels
MAP_Y: int = 1400 # y dimensions of the final map, in pixels
for folder in [OUTPUT, INPUT, TMP / "bg"]:
folder.mkdir(exist_ok=True, parents=True)
Download font set for Segoe UI Symbol Regular
, Aerial Bold
, and Noto Sans Regular
!wget https://raw.githubusercontent.com/mrbvrz/segoe-ui-linux/master/font/seguisym.ttf \
https://raw.githubusercontent.com/foursquare/foursquair/master/src/assets/fonts/Arial%20Bold.ttf \
https://github.com/openmaptiles/fonts/raw/master/noto-sans/NotoSans-Regular.ttf \
--directory-prefix {TMP}/fonts -nc --quiet
!rm /fonts && ln -s {TMP}/fonts /fonts
Set global font path variable for mapnik.
!export MAPNIK_FONT_PATH=/fonts
Test creating a map¶
- Data source: Shapefiles in
input/shapefiles
, created using TagMaps- Original data from Flickr, Twitter and Instagram for the TUD Campus
- Clustered and aggregated to illustrate collective use of terms (
allTagCluster.shp
) - and overall frequentation of places in the area (
allLocationCluster.shp
)
- see this notebook for how to generate these files
- A mapnik stylesheet
input/tagmap_style.xml
, containing the style information for how the data should be rendered - A python script
input/tagmap_xml.py
, using Mapnik Python bindings to process data and generate the map
The mapnik renderer can be accessed through Python bindings, available in the system python installation. Below, we use a python script that provides specific instructions to render a map in Mapnik.
%%time
!cd {INPUT} && /usr/bin/python3 ../input/tagmap_xml.py
Load the generated image to jupyter
display.Image(f"{INPUT}/intermediate/tagmap_style.png")
Add Basemap¶
Get bounds from shapefile:
CRS_WGS = "epsg:4326"
gdf = gp.read_file(INPUT / "shapefiles_gg" / "allLocationCluster.shp")
BOUNDS = list(gdf.to_crs(CRS_WGS).total_bounds)
Contextily allows to plot existing basemaps in Python.
dresden = Place("Grosser Garten, Dresden", source=cx.providers.Esri.WorldImagery)
ax = dresden.plot()
It also provides options to store results as geotiff, see the docs.
buf = 0.006
# filename = f"{TMP}/bg/dresden_carto.tif"
filename = f"{TMP}/bg/dresden_meingruen.tif"
img, ext = cx.bounds2raster(
dresden.w-buf, dresden.s, dresden.e+buf, dresden.n+0.005,
# filename, source=cx.providers.CartoDB.PositronNoLabels,
filename, source='https://maptiles.geo.tu-dresden.de/tiles/styles/meinGruen/{z}/{x}/{y}.png',
ll=True)
fig, ax = plt.subplots(figsize=(8, 8))
ax.set_axis_off()
with rasterio.open(f"{TMP}/bg/dresden_carto.tif") as r:
rioshow(r, ax=ax)
fig, ax = plt.subplots(figsize=(8, 8))
ax.set_axis_off()
with rasterio.open(f"{TMP}/bg/dresden_meingruen.tif") as r:
rioshow(r, ax=ax)
Get the basemap at a higher resolution, using the zoom
parameter, to get a higher level of detail.
zoom = 17
bg_name = f"{TMP}/bg/grossergarten_carto_{zoom}.tif"
if not Path(bg_name).exists():
img, ext = cx.bounds2raster(
dresden.w-buf, dresden.s, dresden.e+buf, dresden.n+0.005,
bg_name, source=cx.providers.CartoDB.PositronNoLabels,
ll=True, zoom=zoom)
bg_name = f"{TMP}/bg/grossergarten_meingruen_{zoom}.tif"
if not Path(bg_name).exists():
img, ext = cx.bounds2raster(
dresden.w-buf, dresden.s, dresden.e+buf, dresden.n+0.005,
bg_name, source='https://maptiles.geo.tu-dresden.de/tiles/styles/meinGruen/{z}/{x}/{y}.png',
ll=True, zoom=zoom)
Get bbox for map from cluster data
gdf = gp.read_file(INPUT / "shapefiles_gg" / "allTagCluster.shp", encoding='utf-8')
CRS_PROJ = gdf.crs
bbox_map = gdf.total_bounds.squeeze()
minx, miny = bbox_map[0], bbox_map[1]
maxx, maxy = bbox_map[2], bbox_map[3]
print(f'Projected bounds: {(minx, miny, maxx, maxy)}')
def add_buffer_bbox(
bbox: Tuple[float,float,float,float], buffer: int) -> Tuple[float,float,float,float]:
"""Add buffer to bbox tuple (Meters)"""
return (bbox[0]-buffer, bbox[1]-buffer, bbox[2]+buffer, bbox[3]+buffer)
bounds = add_buffer_bbox((minx, miny, maxx, maxy), -200)
print(bounds)
get bbox from raster:
west, south, east, north = BOUNDS
project:
from pyproj import Transformer
transformer = Transformer.from_crs("EPSG:4326", "epsg:32633", always_xy=True)
points = [(west, south), (west, north), (east, north), (east, south)]
points_proj = []
for pt in transformer.itransform(points):
points_proj.append(pt)
west, south, east, north = points_proj[0][0], points_proj[3][1], points_proj[2][0], points_proj[1][1]
print(f'Projected bounds: {(west, south, east, north)}')
Render tagmap with basemap¶
Below, use mapnik-cli
to use map generation with parameters.
stylesheet = "tagmap_bg_style.xml"
output_name = "tagmap_bg_style.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x {MAP_X} \
--map_dimensiony_y {MAP_Y} \
--input_path {INPUT} \
--output_path {OUTPUT}
display.Image(f'{OUTPUT}/{output_name}')
Add Cluster Points¶
TagMaps puts out two shapefiles, one for Emoji+Tag Clusters (allTagCluster.shp
), and one for Location clusters (allLocationCluster
), which can be used to visualize overall frequentation patterns, regardless of the tags/emoji used.
The steps to visualize these clusters in ESRI ArcMap are explained in the docs. The process requires additional tools, such as Incremental Spatial Autocorrelation and the Getis-Ord GI* Statistic.
Below, the same approach is followed using Open Source Software in Python.
stylesheet = "tagmap_points_style.xml"
output_name = "tagmap_points_style.png"
For the actual Point, we need to create an svg
.
Create the svg first:
- the radius
r
defines how 'smooth' the svg curve is (how many vertices are used), adapt to your final output map dimension
point_svg = \
"""<?xml version="1.0" standalone="yes"?>
<svg height="128" width="128" version="2"
xmlns="http://www.w3.org/2000/svg">
<circle cx="0" cy="0" r="100" stroke="white" stroke-width="10" fill="#849EB9" />
</svg>
"""
Write svg to file:
with open(TMP / "circle.svg", "w") as svg_file:
svg_file.write(point_svg)
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x {MAP_X} \
--map_dimensiony_y {MAP_Y} \
--input_path {INPUT} \
--output_path {OUTPUT}
display.Image(f'{OUTPUT}/{output_name}')
The Mapnik PointSymbolizer has little ability to modify size of points. Directly Mapping the [Join_Count]
or [Weights]
attribute from the Location cluster shapefile provides no useful output: The largest cluster dominates in the map above.
We can read in the shapefile using Fiona
and add a field, with values to be used as the svg
scale factor for points in Mapnik's PointSymbolizer. This direction provides more flexibility to influence the style of the location cluster layer. Below is mainly following the docs.
Read allLocationCluster.shp
data_src = Path(INPUT / "shapefiles_gg" / "allLocationCluster.shp")
locations = fiona.open(data_src, encoding='UTF-8', mode="r")
The Join_Count
refers to the number of user that took photos or posted from a particular cluster. Weights
is a normalized version of Join_Count, to the values range 1..1000
.
print(locations.schema)
print(locations.crs)
Copy the source schema and add a new property, for point size.
updated_schema = locations.schema
updated_schema["properties"]["point_size"] = "float"
Write a new shapefile with the same format and coordinate reference system as the source, with the new field.
The formula used for point size below:
4+(math.sqrt(feature["properties"]["Join_Count"])*8)
The use of square root means that the largest values will be affected most, which helps a bit reducing the dominant largest clusters on the map.
with fiona.open(
Path(TMP / "allLocationCluster_Size.shp"), "w",
crs=locations.crs,
driver=locations.driver,
schema=updated_schema,
) as shapefile:
for feature in locations:
_size = 4+(math.sqrt(feature["properties"]["Join_Count"])*8)
feature["properties"].update(point_size=_size)
shapefile.write(feature)
Create map
stylesheet = "tagmap_pointssize_style.xml"
output_name = "tagmap_pointssize_style.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x {MAP_X} \
--map_dimensiony_y {MAP_Y} \
--input_path {INPUT} \
--output_path {OUTPUT}
display.Image(f'{OUTPUT}/{output_name}')
Create a map with all labels
stylesheet = "tagmap_pointssize_style_v2.xml"
output_name = "tagmap_pointssize_style_v2.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x {MAP_X} \
--map_dimensiony_y {MAP_Y} \
--input_path {INPUT} \
--output_path {OUTPUT}
display.Image(f'{OUTPUT}/{output_name}')
Local Autocorrelation: Hot Spots, Cold Spots, and Spatial Outliers¶
The process below is based on the PySal Notebook Exploratory Analysis of Spatial Data: Spatial Autocorrelation and the API docs for esda.G_Local.
Loading location clusters and preparing the point data set for spatial analysis. We are using Join_Count
field as the weight for clusters in our spatial autocorrelation analysis.
points = []
weights = []
with fiona.open(data_src, encoding='UTF-8', mode="r") as shapefile:
for feature in locations:
point = (feature["geometry"]['coordinates'][0], feature["geometry"]['coordinates'][1])
points.append(point)
weights.append(float(feature["properties"]["Join_Count"]))
The points are already projected to a UTM coordinate system (33N
), meaning that the values are given in Meters.
points[:5]
weights[:5]
Below, libpysal
is used to create a weights object from points.
The threshold value referc to the distance threshold and is specified in Meters (based on the UTM Zone 33 Coordinate System). Smaller values will include less points, larger values will include more points in the distance matrix of the statistical calculation. The result is a smooth transition between colors (larger values), or more spatially fine grained color classification (smaller values).
distance_threshold = 350
w = lps.weights.DistanceBand(points, threshold=distance_threshold, binary=False)
Preparing the weights variable
y = np.array(weights)
Applying Getis and Ord local G* test using a binary weights object
There are a number of other examples in the pysal/esda docs for parameters. We are using the Getis Ord Gi*, where the star means focal observation.
%%time
lg_star = G_Local(y, w, transform='B', star=True)
lg_star.Zs
p-value based on standard normal approximation from permutations. Smaller p-values mean less significance.
round(lg_star.p_sim[0], 3)
Classify values¶
PySal mapclassify provides classification schemes such as Quantiles, HeadTailBreaks, or NaturalBreaks.
print(lg_star.Zs.min())
print(lg_star.Zs.max())
scheme_breaks = tools.classify_data(
values=lg_star.Zs, scheme="Quantiles")
scheme_breaks
cat_nr = scheme_breaks.find_bin(
lg_star.Zs)
cat_nr
Update schema and write to shapefile
updated_schema["properties"]["cat_nr"] = "int"
with fiona.open(
Path(TMP / "allLocationCluster_SizeClass.shp"), "w",
crs=locations.crs,
driver=locations.driver,
schema=updated_schema,
) as shapefile:
for ix, feature in enumerate(locations):
_size = 4+(math.sqrt(feature["properties"]["Join_Count"])*8)
feature["properties"].update(point_size=_size)
feature["properties"].update(cat_nr=int(cat_nr[ix]))
shapefile.write(feature)
Create SVG markers for classes¶
The last remaining step is to assign colors to classes. Since Mapnik style sheets provide no parameter for color values, everything must be defined in the svg. We will create a separate svg for each of the 7 classes, each having a different color defined.
Define list of colors (hot and cold spots):
hex_list = ['#4576b5', '#849eba', '#c0ccbe', '#ffffc0', '#fab984', '#ed7551', '#d62f27']
tools.display_hex_colors(hex_list)
def get_svg_circle(color: str) -> str:
"""Returns a circle as svg with the given color"""
return \
f"""<?xml version="1.0" standalone="yes"?>
<svg height="128" width="128" version="2"
xmlns="http://www.w3.org/2000/svg">
<circle cx="0" cy="0" r="100" stroke="white" stroke-width="10" fill="{color}" />
</svg>
"""
for ix, hex_color in enumerate(hex_list):
with open(TMP / f"circle_{ix}.svg", "w") as svg_file:
point_svg = get_svg_circle(color=hex_color)
svg_file.write(point_svg)
Produce Map¶
stylesheet = "tagmap_production.xml"
output_name = "tagmap_production.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x {MAP_X} \
--map_dimensiony_y {MAP_Y} \
--input_path {INPUT} \
--output_path {OUTPUT}
display.Image(f'{OUTPUT}/{output_name}')
Focus regions¶
Commonly, these maps would be interactively explored using a tile server with multiple zoom levels. In order to zoom in to certain clusters, we can also create static close-ups of focus regions.
To locate elements, select a hashtag that is written bold (e.g. PALAIS). Bold tells us that this is the most important cluster of the hashtag PALAIS
. In the data table, this entry is highlighted with himp=1
. We can use this information to get the latitude and longitude of the element and tell Mapnik to zoom into this location.
data_src = Path(INPUT / "shapefiles_gg" / "allTagCluster.shp")
focus_tag = 'PALAIS'
def find_feature(data_src: Path, feature_name: str, add_buffer: Optional[int] = None) -> Tuple[float,float,float,float]:
"""Returns bounding box of a feature (x, y), if found"""
with fiona.open(data_src, encoding='UTF-8', mode="r") as shapefile:
for feature in shapefile:
properties = feature["properties"]
if properties["HImpTag"] == 1 and properties["ImpTag"] == feature_name.lower():
bounds = shape(feature["geometry"]).bounds
if add_buffer:
bounds = add_buffer_bbox(bounds, buffer = add_buffer)
return str(bounds).lstrip("(").rstrip(")").replace(" ","")
focus_bbox = find_feature(data_src=data_src, feature_name=focus_tag)
focus_bbox
Create map and zoom in to bounding box
output_name = f"tagmap_production_{focus_tag.lower()}.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x 500 \
--map_dimensiony_y 250 \
--input_path {INPUT} \
--output_path {OUTPUT} \
--bbox {focus_bbox}
display.Image(f'{OUTPUT}/{output_name}')
Sometimes, the bounding box returned from the cluster is small. We can add a buffer to crop the map to a larger area, for the cluster under investigation.
focus_tag = 'ZOODRESDEN'
focus_bbox = find_feature(data_src=data_src, feature_name=focus_tag, add_buffer=100)
output_name = f"tagmap_production_{focus_tag.lower()}.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x 500 \
--map_dimensiony_y 250 \
--input_path {INPUT} \
--output_path {OUTPUT} \
--bbox {focus_bbox}
display.Image(f'{OUTPUT}/{output_name}')
Experiment¶
Now it is time to experiment. For instance, changing the background map to aerial imagery and filtering only emoji.
focus_tag = 'BOTANISCHERGARTEN'
focus_bbox
zoom = 17
bg_name = f"{TMP}/bg/grossergarten_aerial_{zoom}.tif"
if not Path(bg_name).exists():
img, ext = cx.bounds2raster(
dresden.w-buf, dresden.s, dresden.e+buf, dresden.n+0.003,
bg_name, source=cx.providers.Esri.WorldImagery,
ll=True, zoom=zoom)
Create a white svg circle
with open(TMP / f"circle_white.svg", "w") as svg_file:
point_svg = get_svg_circle(color='#FFFFFF')
svg_file.write(point_svg)
Zoom to Campus
focus_bbox = find_feature(data_src=data_src, feature_name=focus_tag, add_buffer=450)
Visualize
stylesheet = "tagmap_aerial_emoji.xml"
output_name = f"tagmap_production_aerial_{focus_tag.lower()}.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x 1600 \
--map_dimensiony_y 900 \
--input_path {INPUT} \
--output_path {OUTPUT} \
--bbox {focus_bbox}
display.Image(f'{OUTPUT}/{output_name}')
Further work¶
Experiment further, e.g.:
- Emoji Font: Replace
Segoe UI Symbol Regular
ininput/*.xml
Stylesheets with another Emoji font, e.g.Twitter Color Emoji Regular
,Segoe UI Symbol Regular
,Segoe UI Emoji Regular
,Noto Emoji Regular
,Symbola Regular
etc. - Use gray location markers and use color to classify different types of emoji, e.g. all emoji of the unicode group "activity emoji" as red
- Adjust Rules in XML styles to affect label placement
- Create a custom basemap from vector/osm data
- Use of different svg markers to classify different types of places (e.g. based on the majority of posts from Instagram, Twitter, or Flickr)
- Output a tile map with mapnik, for different resolutions and areas
Create Notebook HTML¶
!jupyter nbconvert --to html_toc \
--output-dir=../resources/html/ ./01_tagmaps.ipynb \
--template=../nbconvert.tpl \
--ExtractOutputPreprocessor.enabled=False >&- 2>&- # create single output file and suppress output