Mapnik Generative AI workflow

Alexander Dunkel, Institute of Cartography, TU Dresden

Out[1]:

Last updated: Oct-18-2023, Carto-Lab Docker Version 0.15.7

Mapnik rendering based on stable diffusion generative AI and social media data.

In order to place images on a map, we need simplified illustrations that convey textual information as graphics. These graphics should ideally be isolated on white, so they can be used as an overlay on a map. Here, we use a LoRA (Low-Rank Adaptation) that produces minimalist illustrations in a Japanese line-drawing style from the given prompts ^1.

This is an example output batch for the prompt river, elbe, aerial.

(figure: example output batch)

We will generate graphics based on both collections of tags and single emoji. The central stable diffusion workflow is shown in the following schema for emoji:

(figure: stable diffusion workflow schema for emoji)

Steps:

  1. Emoji from social media cluster
  2. Get the text for the emoji, in addition to the emoji character itself (see the sketch after this list)
  3. Combine with the base positive prompt and the base negative prompt
  4. Generate images
  5. Place the images on the map in Mapnik, scaled based on the frequency of the emoji cluster
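
Below is a minimal sketch of steps 2 and 3. It assumes the emoji package for the emoji-to-text lookup (not part of this repository), and uses a shortened placeholder for the base prompt; the actual base prompts are defined later in this notebook.

# Sketch of steps 2-3: derive a text label for an emoji and build the full prompt.
# Assumption: the `emoji` package is installed.
import emoji

def emoji_prompt(char: str) -> str:
    """Combine the emoji character with its CLDR text name, e.g. '(🌊, water wave)'"""
    name = emoji.demojize(char).strip(":").replace("_", " ")
    return f"({char}, {name})"

BASE_POS = "white background,simple outline,best quality"  # shortened placeholder
print(f'{emoji_prompt("🌊")},{BASE_POS}')
# -> (🌊, water wave),white background,simple outline,best quality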

There is more to it. The major challenges include:

  • connect from JupyterLab via API to Stable Diffusion running in an HPC cluster
  • automatically generate images for all clusters through the API; fetch, name, and store them locally
  • use Mapnik to place the images, scaling their size according to cluster frequency
  • prevent overlap and produce a final map without artifacts at the borders and edges of images/icons
  • combine tag and emoji images to generate a balanced output map with objects, people, activities, and associations

Prepare environment

To run this notebook, you have two options:

1. Create an environment with the packages and versions shown in the following cell.

You may use the latest conda environment_default.yml from our Carto-Lab Docker container as a starting point.

2. If Docker is available to you, we suggest using the Carto-Lab Docker container (the mapnik version).

Clone the repository and edit your .env value to point to the repository where this notebook can be found, e.g.:

git clone https://gitlab.vgiscience.de/lbsn/tools/jupyterlab.git
cd jupyterlab
cp .env.example .env
nano .env
## Enter:
# JUPYTER_NOTEBOOKS=~/notebooks/mapnik_stablediffusion
# TAG=v0.12.3
docker network create lbsn-network
docker-compose pull && docker-compose up -d
List of package versions used in this notebook
package         version
python          3.9.15
Fiona           1.8.20
Shapely         1.7.1
bokeh           2.4.3
colorcet        3.0.1
contextily      1.3.0
esda            2.5.0
geopandas       0.13.2
geoviews        1.9.5
holoviews       1.14.8
hvplot          0.8.4
ipywidgets      8.0.7
libpysal        4.7.0
mapclassify     2.5.0
matplotlib      3.7.1
numpy           1.22.4
pandas          2.1.0
python-dotenv   1.0.0
rasterio        1.2.10
rembg           2.0.50

Load dependencies:

In [3]:
from numba.core.errors import NumbaDeprecationWarning, NumbaPendingDeprecationWarning
import warnings

warnings.simplefilter('ignore', category=NumbaDeprecationWarning)
warnings.simplefilter('ignore', category=NumbaPendingDeprecationWarning)
In [4]:
import sys
import math
import fiona
import rasterio
import numpy as np
import contextily as cx
import mapclassify as mc
import libpysal as lps
import json
import requests
import io
import base64
import matplotlib.pyplot as plt
from PIL import Image, PngImagePlugin
from IPython import display
from contextily import Place
from typing import Tuple, Optional, List, Dict, Any
from rasterio.plot import show as rioshow
from pathlib import Path
from esda.getisord import G_Local
from shapely.geometry import shape

We are using a number of helper functions from py/modules/tools.py to classify values.

In [5]:
module_path = str(Path.cwd().parents[0] / "py")
if module_path not in sys.path:
    sys.path.append(module_path)
from modules.base import tools, raster

Activate autoreload of changed python files:

In [6]:
%load_ext autoreload
%autoreload 2

Test Mapnik bindings

In [7]:
!/usr/bin/python3 -c "import mapnik;print(mapnik.__file__)"
/usr/lib/python3/dist-packages/mapnik/__init__.py

Parameters

Define initial parameters that affect processing

In [38]:
INPUT: Path = Path.cwd().parents[0] / "input"   # define path to input and working directory (shapefiles, stylesheets etc.)
OUTPUT: Path = Path.cwd().parents[0] / "output" # define path to output directory (map, notebook html etc.)
images = OUTPUT / "images"
TMP: Path = INPUT / "intermediate"              # folder to store intermediate data
TMP_AI: Path = INPUT / "intermediate" / "gen-ai"   # folder to store generative ai data
MAP_X: int = 2500                               # x dimensions of the final map, in pixels
MAP_Y: int = 1400                               # y dimensions of the final map, in pixels
In [39]:
for folder in [OUTPUT, TMP_AI, images]:
    folder.mkdir(exist_ok=True, parents=True)

Files

The tagmaps-mapnik workflow for preparing the data for the map below is described in the previous notebook, 01_tagmaps.html.

Show the directory tree:

In [13]:
tools.tree(Path.cwd().parents[0])
Out[13]:
Directory file tree
.
├── .cruft.json
├── py
│ ├── _02_generativeai.py
│ ├── _03_map_processing.py
│ ├── _01_tagmaps.py
│ └── modules
│ └── base
│ ├── tools.py
│ ├── raster.py
│ ├── README.md
│ ├── .gitignore
│ ├── hll.py
│ ├── pkginstall.sh
│ └── preparations.py
├── nbconvert.tpl
├── README.md
├── notebooks
│ ├── 03_map_processing.ipynb
│ ├── 02_generativeai.ipynb
│ ├── 01_tagmaps.ipynb
│ └── .gitkeep
├── pyproject.toml
├── .gitignore
├── resources
│ ├── mapnik_stablediffusion.webp
│ ├── html
│ │ ├── 01_tagmaps.html
│ │ ├── 02_generativeai.html
│ │ └── 03_map_processing.html
│ ├── workflow_schema.webp
│ └── sample_prompt_japanse_elbe.webp
├── jupytext.toml
├── .gitlab-ci.yml
├── input
│ ├── tagmap_style.xml
│ ├── tagmap_points_style.xml
│ ├── tagmap_xml.py
│ ├── tagmap_pointssize_style.xml
│ ├── shapefiles_gg
│ │ ├── allLocationCluster.dbf
│ │ ├── allLocationCluster.prj
│ │ ├── gen_img.dbf
│ │ ├── gen_img.shp
│ │ ├── gen_img_tags.prj
│ │ ├── allLocationCluster.cpg
│ │ ├── gen_img.cpg
│ │ ├── gen_img_tags.shx
│ │ ├── gen_img_tags.cpg
│ │ ├── allLocationCluster.shp
│ │ ├── gen_img.shx
│ │ ├── gen_img_tags.dbf
│ │ ├── allLocationCluster.shx
│ │ ├── gen_img.prj
│ │ ├── allTagCluster.cpg
│ │ ├── allTagCluster.shp
│ │ ├── allTagCluster.shx
│ │ ├── gen_img_tags.shp
│ │ ├── allTagCluster.prj
│ │ └── allTagCluster.dbf
│ ├── tagmap_pointssize_style_v2.xml
│ ├── tagmap_production.xml
│ ├── intermediate
│ │ ├── circle_1.svg
│ │ ├── circle.svg
│ │ ├── allLocationCluster_Size.cpg
│ │ ├── circle_0.svg
│ │ ├── allLocationCluster_Size.prj
│ │ ├── gen-ai
│ │ ├── allLocationCluster_SizeClass.prj
│ │ ├── fonts
│ │ │ ├── Arial Bold.ttf
│ │ │ ├── NotoSans-Regular.ttf
│ │ │ └── seguisym.ttf
│ │ ├── circle_4.svg
│ │ ├── allLocationCluster_Size.dbf
│ │ ├── circle_white.svg
│ │ ├── tagmap_style.png
│ │ ├── allLocationCluster_Size.shp
│ │ ├── circle_5.svg
│ │ ├── bg
│ │ │ ├── grossergarten_aerial_17.tif
│ │ │ ├── grossergarten_carto_17.tif
│ │ │ ├── grossergarten_meingruen_17.tif
│ │ │ ├── dresden_meingruen.tif
│ │ │ └── dresden_carto.tif
│ │ ├── circle_2.svg
│ │ ├── allLocationCluster_SizeClass.dbf
│ │ ├── circle_3.svg
│ │ ├── allLocationCluster_SizeClass.shp
│ │ ├── allLocationCluster_SizeClass.cpg
│ │ ├── allLocationCluster_SizeClass.shx
│ │ ├── allLocationCluster_Size.shx
│ │ └── circle_6.svg
│ ├── tagmap_bg_style.xml
│ └── tagmap_aerial_emoji.xml
├── CHANGELOG.md
├── LICENSE.md
├── .gitmodules
├── md
│ ├── 03_map_processing.md
│ ├── 02_generativeai.md
│ └── 01_tagmaps.md
├── conf.json
├── output
│ ├── tagmap_pointssize_style_v2.png
│ ├── tagmap_points_style.png
│ ├── tagmap_production_palais.png
│ ├── tagmap_production.png
│ ├── tagmap_bg_style.png
│ ├── tagmap_production_zoodresden.png
│ ├── tagmap_pointssize_style.png
│ └── tagmap_production_aerial_botanischergarten.png
└── .version
14 directories, 99 files

Test creating a (tag) map

In [14]:
stylesheet = "tagmap_production.xml"
output_name = "tagmap_production.png"

Set global font path variable for mapnik.

In [15]:
!export MAPNIK_FONT_PATH=/fonts

Download fonts for Segoe UI Symbol Regular, Arial Bold, and Noto Sans Regular

In [19]:
!wget https://raw.githubusercontent.com/mrbvrz/segoe-ui-linux/master/font/seguisym.ttf \
    https://raw.githubusercontent.com/foursquare/foursquair/master/src/assets/fonts/Arial%20Bold.ttf \
    https://github.com/openmaptiles/fonts/raw/master/noto-sans/NotoSans-Regular.ttf \
    --directory-prefix {TMP}/fonts -nc --quiet
!rm /fonts && ln -s {TMP}/fonts /fonts

Create tagmap using mapnik_cli

In [20]:
%%time
!/usr/bin/python3 -m mapnik_cli \
    --stylesheet_name {stylesheet} \
    --output_name {output_name} \
    --map_dimensiony_x {MAP_X} \
    --map_dimensiony_y {MAP_Y} \
    --input_path {INPUT} \
    --output_path {OUTPUT} \
    --bbox '412602.4948451808,5654046.735233658,413809.2816068467,5655787.42471083'
CPU times: user 2.03 s, sys: 236 ms, total: 2.26 s
Wall time: 1min 59s
In [21]:
display.Image(f'{OUTPUT}/{output_name}')
Out[21]:
(image output: rendered tag map)

Connect to stable diffusion API


We are using the API that is made available by automatic1111's stable-diffusion-webui.

The first step is to enable the API with the following variable:

export COMMANDLINE_ARGS="--xformers --api --nowebui"

The second step is to create a reverse SSH tunnel to the host running the JupyterLab Docker container.

The tunneling here can be confusing: our Stable Diffusion CUDA environment likely runs somewhere else. In my case, I use the HPC environment made available by my university.

If you run all of this on your local machine, nothing needs to be done.

The following will establish a reverse tunnel bound to port 7861 on both sides. Port 7861 is only used if you run --nowebui; otherwise it will be port 7860.

ssh user@jupyter.host -o ExitOnForwardFailure=yes \
    -o ServerAliveInterval=120 -f -R \
    :7861:127.0.0.1:7861 -p 22 -N; \
    ./webui.sh

A third, optional step is to use a forward SSH tunnel to connect to your Jupyter host. If this is your local machine, nothing needs to be done.

ssh user@jupyter.host -L :7861:127.0.0.1:7861 -p 22 -N -v

Also see my blog post on tunneling here.

Finally, if you are using Carto-Lab Docker as your Jupyter instance, the notebook needs to be able to connect to port 7861, which is listening on the Docker host. To allow this, edit docker-compose.yml: disable the custom network from Carto-Lab Docker and add network_mode: "host". Afterwards, check with:

In [22]:
!wget -S --spider 127.0.0.1:7861/docs
Spider mode enabled. Check if remote file exists.
--2023-10-18 08:37:07--  http://127.0.0.1:7861/docs
Connecting to 127.0.0.1:7861... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  date: Wed, 18 Oct 2023 08:37:07 GMT
  server: uvicorn
  content-length: 931
  content-type: text/html; charset=utf-8
  x-process-time: 0.0004
Length: 931 [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.

If you see an HTTP/1.1 200 OK, everything works.
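
As an alternative to wget, the same check can be done from Python. This is a small sketch, assuming the API is reachable at the address used throughout this notebook:

import requests

try:
    r = requests.get("http://127.0.0.1:7861/docs", timeout=5)
    print("API reachable" if r.ok else f"Unexpected status: {r.status_code}")
except requests.exceptions.ConnectionError:
    print("API not reachable - check the SSH tunnel and COMMANDLINE_ARGS")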

Test API

Follow the webui guide to test the API:

In [23]:
payload = {
    "prompt": "Big tree on white background",
    "steps": 50
}
In [24]:
%%time
response = requests.post(url=f'http://127.0.0.1:7861/sdapi/v1/txt2img', json=payload)
CPU times: user 8.16 ms, sys: 864 µs, total: 9.03 ms
Wall time: 4.52 s
In [25]:
def process_response(
        response: requests.Response, target_size: Optional[int] = None, show: Optional[bool] = True,
        output: Path = OUTPUT, save_name: Optional[str] = None):
    """Get images from the JSON response, optionally save to disk, and plot a preview"""
    if target_size is None:
        target_size = 250
    r = response.json()
    for ix, i in enumerate(r['images']):
        # images are returned base64-encoded; strip a possible data-URI prefix before decoding
        image = Image.open(io.BytesIO(base64.b64decode(i.split(",",1)[0])))
        if save_name is not None:
            fname = f"{save_name}.png"
            if ix > 0:
                fname = f"{save_name}_{ix:02}.png"
            image.save(output / "images" / fname, "PNG")
        if not show:
            continue
        image.thumbnail(
            (target_size, target_size), Image.Resampling.LANCZOS)
        image.show()
In [26]:
process_response(response)
(image output: generated image preview)

SD Parameters

Have a look at http://localhost:7861/docs#/default/text2imgapi_sdapi_v1_txt2img_post for parameters.

We can also get options, such as the currently loaded checkpoint:

In [27]:
APIURL = "http://127.0.0.1:7861"
In [28]:
response = requests.get(url=f'{APIURL}/sdapi/v1/options')
response.json()["sd_model_checkpoint"]
Out[28]:
'hellofunnycity_V14.safetensors [89c108e712]'

Here, we use a model/checkpoint merge called hellofunnycity, which provides a general cartoon style. The model page recommends setting the CLIP skip parameter to 2. However, I observed more signs of overtraining in the results below, which is why I leave this at 1; feel free to change and test below.
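
To check which value is currently active, the options endpoint can also be queried (APIURL is defined in the cell above; the key matches the one we post below):

requests.get(url=f'{APIURL}/sdapi/v1/options').json()["CLIP_stop_at_last_layers"]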

We also use a VAE (Variational Autoencoder) that helps increase the sharpness of generated images (sd-vae-ft-mse-original). After downloading the VAE, refresh with:

In [29]:
requests.post(url=f'{APIURL}/sdapi/v1/refresh-vae')
Out[29]:
<Response [200]>

Then post options to the API:

In [30]:
payload = {
    "CLIP_stop_at_last_layers": 1,
    "sd_vae":"vae-ft-mse-840000-ema-pruned.safetensors",
}
requests.post(url=f'{APIURL}/sdapi/v1/options', json=payload)
Out[30]:
<Response [200]>

A Response 200 means: Ok.

Other parameters are applied on a job-by-job basis, such as the number of steps. We store these settings in a global dictionary:

In [31]:
SD_CONFIG = {
    "steps": 50,
    }

Turn the generation of images into a method.

In [32]:
def generate(
        prompt: str, save_name: Optional[str] = None, 
        show: Optional[bool] = True, sd_config: Optional[dict] = SD_CONFIG,
        api_url: Optional[str] = None, negative_prompt: Optional[str] = None):
    """
    Generate image via SD prompt, optionally save to disk and show image

    Args:
    prompt: Text prompt to generate image. Required.
    save_name: Name to save image to disk. Without extension. Optional.
    show: Show image. Optional. Default: True
    sd_config: Job parameters such as sampling steps. Default: SD_CONFIG.
    api_url: URL to SDAPI v1. Optional.
    negative_prompt: Negative text prompt. Optional.
    """
    if api_url is None:
        api_url = 'http://127.0.0.1:7861/sdapi/v1/txt2img'
    payload = {
        "prompt": prompt,
        "negative_prompt": negative_prompt
    }
    payload.update(sd_config)
    response = requests.post(url=api_url, json=payload)
    process_response(response, save_name=save_name, show=show)

Many parameters are available for SD image generation. The prompt has one of the biggest impacts on the generated images. Both a positive and a negative prompt can be specified.

Based on the introduction, we use a base template for the positive and negative prompts that defines how we want the image generated (style), and then substitute the part that describes what we actually want to generate.

Short intro on SD prompts (a quick worked example follows the list):
  • Instructions are separated by commas; strictly speaking this is not necessary, but it makes the prompt easier to read
  • (x) emphasis. Multiplies the attention to x by 1.1. Equivalent to (x:1.1)
  • [x] de-emphasis. Divides the attention to x by 1.1. Approximately equivalent to (x:0.91)
  • (x:number) emphasis if number > 1, de-emphasis if < 1. Multiplies the attention by number.
  • <> is for embeddings and LoRA. Decimals modify their weight in the same way.
(see)
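
A quick worked example of how nested emphasis translates into explicit weights:

# "((tree))": two nested emphases multiply the attention by 1.1 twice
print(round(1.1 * 1.1, 2))  # 1.21 -> "((tree))" is equivalent to "(tree:1.21)"
# "[tree]": de-emphasis divides the attention by 1.1
print(round(1 / 1.1, 2))    # 0.91 -> "[tree]" is approximately "(tree:0.91)"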

These are our sample base prompts:
In [33]:
BASE_PROMPT_POS: str = \
    "white background,simple outline,masterpiece,best quality,high quality," \
    "<lora:Japanese_style_Minimalist_Line_Illustrations:0.2>"
BASE_PROMPT_NEG: str = \
    "(bad-artist:1),(worst quality, low quality:1.4),lowres,bad anatomy,bad hands," \
    "((text)),(watermark),error,missing fingers,extra digit,fewer digits,cropped,worst quality," \
    "low quality,normal quality,((username)),blurry,(extra limbs),BadDream," \
    "(three hands:1.6),(three legs:1.2),(more than two hands:1.4),(more than two legs,:1.2)," \
    "label,(isometric), (square)"
In [34]:
def concat_prompt(prompt: str, base_prompt_pos: str = BASE_PROMPT_POS) -> str:
    """Returns base prompt substitute with task prompt"""
    return f'{prompt},{base_prompt_pos}'
In [35]:
PROMPT: str = "(Big tree on white background)"
In [41]:
generate(concat_prompt(PROMPT), negative_prompt=BASE_PROMPT_NEG, save_name="test_image")
(image output: generated test image)

Remove background

For some images, the background is not fully white. We can use rembg to remove the background with the isnet-general-use model (see Highly Accurate Dichotomous Image Segmentation).

This is only necessary if the image is placed with the PointSymbolizer. If it is placed with the RasterSymbolizer, we can use comp-op="multiply" in the Mapnik stylesheet to remove any white background during the overlay operation.

In [42]:
IMG_TEST = OUTPUT / "images" / "test_image.png"
In [43]:
!../py/modules/base/pkginstall.sh "rembg"
rembg already installed.

Get Model

In [44]:
!wget https://github.com/danielgatis/rembg/releases/download/v0.0.0/isnet-general-use.onnx \
    --directory-prefix {TMP} -nc --quiet
In [45]:
!ln -s {TMP}/isnet-general-use.onnx /root/.u2net/isnet-general-use.onnx 2> /dev/null
In [46]:
MODEL_NAME = "isnet-general-use"
In [47]:
from rembg import remove, new_session
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

SESSION = new_session(
    model_name=MODEL_NAME,
    alpha_matting=True) 
    #alpha_matting_foreground_threshold=240,
    #alpha_matting_background_threshold=10,
    #alpha_matting_erode_size=10)

def remove_background(img: Path, session=SESSION):
    """Remove background from image and overwrite"""
    input = Image.open(img)
    output = remove(input, session=session)
    output.save(img)
In [48]:
%%time
remove_background(IMG_TEST)
CPU times: user 8.06 s, sys: 558 ms, total: 8.62 s
Wall time: 729 ms
In [49]:
img = Image.open(IMG_TEST)
img.thumbnail(size=(128,128))
img.show()
(image output: test image with background removed)

Place test image on the map

We can create a GeoTiff (^2) from this image and place it with Mapnik's RasterSymbolizer. Since Mapnik 0.6.0, there are also several Composition Modes available, including transparency.

ToDo: Test MarkersSymbolizer

Get coordinates

We test this with tagmaps cluster data for the TUD campus area.

The data is already projected to UTM coordinates. Get the bounding box for the BISMARCKTURM cluster. We want to place the tree image at this location on the map.

In [50]:
data_src = Path(INPUT / "shapefiles_campus" / "allTagCluster.shp")
focus_tag = 'BISMARCKTURM'
with fiona.open(data_src, encoding='UTF-8', mode="r") as locations:
    map_crs = locations.crs
    driver = locations.driver
    schema = locations.schema
In [82]:
map_crs
Out[82]:
{'init': 'epsg:32633'}
In [51]:
Bbox = Tuple[float, float, float, float]
def add_buffer_bbox(
        bbox: Bbox, buffer: int) -> Bbox:
    """Add buffer to bbox tuple (Meters)"""
    return (bbox[0]-buffer, bbox[1]-buffer, bbox[2]+buffer, bbox[3]+buffer)

def bounds_str(bbox: Bbox) -> str:
    """Format bounds as mapnik-cli str"""
    return str(bbox).lstrip("(").rstrip(")").replace(" ","")
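
For example, bounds_str turns a bbox tuple into the comma-separated string expected by mapnik_cli (rounded sample coordinates):

bounds_str((412602.49, 5654046.73, 413809.28, 5655787.42))
# -> '412602.49,5654046.73,413809.28,5655787.42'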
In [52]:
def find_feature(
        data_src: Path, feature_name: str, add_buffer: Optional[int] = None,
        centroid: Optional[bool] = None) -> Bbox:
    """Return the bounding box of a feature, or its centroid (x, y) if centroid=True"""
    with fiona.open(data_src, encoding='UTF-8', mode="r") as shapefile:
        for feature in shapefile:
            properties = feature["properties"]
            if properties["HImpTag"] == 1 and properties["ImpTag"] == feature_name.lower():
                if centroid:
                    centroid = shape(feature["geometry"]).centroid
                    return centroid.x, centroid.y
                bounds = shape(feature["geometry"]).bounds
                if add_buffer:
                    bounds = add_buffer_bbox(bounds, buffer = add_buffer)
                return bounds

Get bounds for Bismarckturm cluster

In [53]:
focus_bbox = find_feature(data_src=data_src, feature_name=focus_tag)

Get values as float

In [54]:
west, south, east, north = focus_bbox

To avoid skewing the image, we only use the width of the cluster and derive a matching (square) height in projected coordinates.

In [55]:
width = east - west
In [56]:
h_ctr = south + ((north - south) / 2)
south = h_ctr - (width/2)
north = h_ctr + (width/2)
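
The same calculation, wrapped in a small helper for reuse (a convenience sketch, not part of the repository code):

def square_extent(west: float, south: float, east: float, north: float):
    """Return bounds with the height forced to equal the width, centered vertically"""
    width = east - west
    v_center = south + (north - south) / 2
    return west, v_center - width / 2, east, v_center + width / 2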

Display map with buffer

In [57]:
map_buf = 50 # meters
focus_bbox_str = bounds_str((west-map_buf, south-map_buf, east+map_buf, north+map_buf))

Convert the regular PNG to a georeferenced raster by applying these bounds to the image.

In [67]:
raster.georeference_raster(
    # raster_in=IMG_TEST, # raster with background removed with rembg
    raster_in=OUTPUT / "images" / "test_image_white.png", # raster with white background
    raster_out=INPUT / "test_image.tif", bbox=(west, south, east, north), 
    crs_out=map_crs, width_offset=map_buf, height_offset=map_buf)
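
raster.georeference_raster is provided by the repository's py/modules/base/raster.py. As a rough illustration of what such a step can look like with rasterio (an assumption for this sketch; the actual implementation, e.g. the handling of the width/height offsets, may differ):

import numpy as np
import rasterio
from pathlib import Path
from PIL import Image
from rasterio.transform import from_bounds

def georeference_png_sketch(png_in: Path, tif_out: Path, bbox, crs):
    """Write a PNG as a GeoTIFF stretched over bbox = (west, south, east, north)"""
    img = np.array(Image.open(png_in).convert("RGB"))
    height, width = img.shape[0], img.shape[1]
    transform = from_bounds(*bbox, width=width, height=height)
    with rasterio.open(
            tif_out, "w", driver="GTiff", height=height, width=width,
            count=3, dtype=img.dtype.name, crs=crs, transform=transform) as dst:
        dst.write(img.transpose(2, 0, 1))  # rasterio expects (bands, rows, cols)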

Create map with the raster as an overlay and zoom in to the bounding box.

In [68]:
output_name = f"tagmap_production_{focus_tag.lower()}.png"
stylesheet = "tagmap_production_testraster.xml"
In [72]:
%%time
!/usr/bin/python3 -m mapnik_cli \
    --stylesheet_name {stylesheet} \
    --output_name {output_name} \
    --map_dimensiony_x 500 \
    --map_dimensiony_y 250 \
    --input_path {INPUT} \
    --output_path {OUTPUT} \
    --bbox {focus_bbox_str}
CPU times: user 5.92 ms, sys: 32.7 ms, total: 38.6 ms
Wall time: 997 ms
In [81]:
display.Image(f'{OUTPUT}/{output_name}')
Out[81]:
(image output: map with georeferenced raster overlay)

For placing multiple raster images next to each other, we will have to write a few more methods to detect and avoid collisions, and to offset images in these cases; a rough sketch of one possible approach follows.
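
This is a minimal sketch of one way to approach this, assuming shapely for the geometry tests (the helper is not part of the repository code): each placed image is tracked as a bounding box, and a new candidate is shifted until it no longer intersects any of them.

from shapely.geometry import box

placed_boxes = []  # bounding boxes of images already placed on the map

def place_without_overlap(bbox, step: float = 10.0, max_tries: int = 20):
    """Shift a candidate bbox (west, south, east, north) northwards until it is collision-free"""
    candidate = box(*bbox)
    tries = 0
    while any(candidate.intersects(other) for other in placed_boxes) and tries < max_tries:
        # shapely geometries are immutable: rebuild the box shifted by `step` map units
        minx, miny, maxx, maxy = candidate.bounds
        candidate = box(minx, miny + step, maxx, maxy + step)
        tries += 1
    placed_boxes.append(candidate)
    return candidate.bounds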

Use PointSymbolizer

Get centroid:

In [83]:
focus_centroid = find_feature(data_src=data_src, feature_name=focus_tag, centroid=True)
In [84]:
focus_centroid
Out[84]:
(411306.59401811595, 5653072.865833967)

Update the stylesheet and plot:

In [85]:
output_name = f"tagmap_production_{focus_tag.lower()}_pts.png"
stylesheet = "tagmap_production_testraster_points.xml"
In [88]:
%%time
!/usr/bin/python3 -m mapnik_cli \
    --stylesheet_name {stylesheet} \
    --output_name {output_name} \
    --map_dimensiony_x 500 \
    --map_dimensiony_y 250 \
    --input_path {INPUT} \
    --output_path {OUTPUT} \
    --bbox {focus_bbox_str}
CPU times: user 7.28 ms, sys: 32.1 ms, total: 39.4 ms
Wall time: 1.06 s
In [89]:
display.Image(f'{OUTPUT}/{output_name}')
Out[89]:
(image output: map rendered with the PointSymbolizer)

The point symbolizer is both easier to use and produces better rendering (no skewed image; no black artifacts at edges).

Prompt tests

The parameters can be modified. The following cell activates the batch parameter and changes the sampling method. I also reduced the number of steps from 50 to 20 to save computation time.

In [90]:
SD_CONFIG["batch_size"] = 4
SD_CONFIG["sampler_name"] = "DPM++ 2M SDE Exponential"
SD_CONFIG["steps"] = 20

Next, we improve our display method for batches:

In [91]:
def remove_backgrounds(imgs: List[Path]):
    """Remove backgrounds for all images in list, overwrite"""
    for ix, file in enumerate(imgs):
        remove_background(file)
        display.clear_output(wait=True)
        print(f"Removed bg from {ix+1} of {len(imgs)} images.")

def generate_rembg(base_path: Optional[Path] = None, **kwargs):
    """Generate images and remove background"""
    generate(**kwargs)
    if base_path is None:
        base_path = OUTPUT / "images"
    imgs = list(base_path.glob(f'{kwargs["save_name"]}*'))
    remove_backgrounds(imgs)

Also, Stable Diffusion doesn't require many words to focus on a topic. It was trained on a large corpus based on common communication patterns found online. For example, below we use a single "wave" emoji to generate illustrations in the Japanese line-drawing style.

In [92]:
PROMPT = "(🌊)"
In [93]:
output_name = "test_image_emojiwave"
KWARGS = {
    "prompt": concat_prompt(PROMPT),
    "negative_prompt": BASE_PROMPT_NEG,
    "save_name": output_name,
    "sd_config": SD_CONFIG,
    "show": False
}
In [198]:
%%time
generate_rembg(**KWARGS)
Removed bg from 4 of 4 images.
CPU times: user 33 s, sys: 663 ms, total: 33.6 s
Wall time: 7.85 s

Show images in a grid

In [199]:
imgs = list((OUTPUT / "images").glob(f'{output_name}*'))
tools.image_grid(imgs)
(image output: grid of generated images for the 🌊 prompt)
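
tools.image_grid comes from py/modules/base/tools.py. A rough sketch of such a helper with matplotlib (an assumption; the repository implementation may differ) could look like this:

import math
from pathlib import Path
from typing import List

import matplotlib.pyplot as plt
from PIL import Image

def image_grid_sketch(imgs: List[Path], cols: int = 4):
    """Plot a list of image files as a grid of thumbnails"""
    rows = math.ceil(len(imgs) / cols)
    fig, _ = plt.subplots(rows, cols, figsize=(cols * 2, rows * 2))
    for ax, img_path in zip(fig.axes, imgs):
        ax.imshow(Image.open(img_path))
    for ax in fig.axes:
        ax.axis("off")  # hide axes, also for empty grid cells
    plt.show()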

The effect of all these parameters is quite difficult to judge.

The best way to test a parameter's effect is by observation.

Let's gradually change a single parameter, the weight of our Japanese line-style LoRA, to study its effect.

In [94]:
def generate_series(
        prompt: str, part: str, save_name: str, kwargs: Dict[Any, Any],
        base_prompt_pos: Optional[str] = BASE_PROMPT_POS, 
        display: bool = True, rembg: bool = True):
    """Generate a series of images by 
    gradually modifying the weights for a part of the base prompt.
    Weights will be substituted based on 1.0 occurence in the part-string.
    """
    base_pos = base_prompt_pos.replace(part, "")
    weight_steps = ["0.2", "0.4", "0.6", "0.8", "1.0", "1.2"]
    for ix, weight in enumerate(weight_steps):
        base_prompt_pos = f'{base_pos}{part.replace("1.0", weight)}'
        output_name = f"{save_name}_{weight}"
        kwargs["save_name"] = output_name
        kwargs["prompt"] = concat_prompt(prompt, base_prompt_pos=base_prompt_pos)
        if rembg:
            generate_rembg(**kwargs)
        else:
            generate(**kwargs)
        if not display:
            continue
        imgs = list((OUTPUT / "images").glob(f'{output_name}*'))
        tools.image_grid(imgs)

The method will replace the default weight of 1.0 with steps from 0.2 to 1.2 and generate 4 images for each step.

In [95]:
SKWARGS = {
    "part":"<lora:Japanese_style_Minimalist_Line_Illustrations:1.0>",
    "base_prompt_pos":BASE_PROMPT_POS.replace("0.2>", "1.0>"), # reset to default
    "kwargs":KWARGS,
}
In [243]:
generate_series(
    prompt="(🌊)", save_name="test_image_wave", **SKWARGS)
Removed bg from 4 of 4 images.
(image output: six image grids for the 🌊 prompt, one per LoRA weight step from 0.2 to 1.2)

We can see the progression towards the style of the Japanese line drawings. The LoRA appears to be overweighted towards human figures, which start to appear at a weight of 1.0. Below a weight of 0.6, the LoRA has only a slight, minimalist effect on the generated images. In conclusion, weights of 0.2 or 0.4 may be a good choice to reduce the complexity of generated images towards a more minimalist style for the map.

Change the emoji from 🌊 to other emojis.

In [244]:
generate_series(
    prompt="(🌳)", save_name="test_image_tree", **SKWARGS)
Removed bg from 4 of 4 images.
(image output: six image grids for the 🌳 prompt, one per LoRA weight step)
In [245]:
generate_series(
    prompt="(🏡)", save_name="test_image_house", **SKWARGS)
Removed bg from 4 of 4 images.
(image output: six image grids for the 🏡 prompt, one per LoRA weight step)
In [82]:
generate_series(
    prompt="(river, elbe, aerial)", save_name="test_image_scene",
    rembg=False, **SKWARGS)
(image output: six image grids for the scene prompt, one per LoRA weight step)

Process data

The next step is to process the social media metadata (tags, emoji) in descending importance (cluster size), generate images for the clusters, and place the images on the map according to the center of gravity of each cluster shape from the tagmaps package.
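
A rough sketch of this per-cluster loop, using the helpers defined earlier in this notebook. The field names "ImpTag" and "Weights" are assumptions about the tagmaps shapefile schema; the actual processing is documented in the follow-up notebook.

from pathlib import Path
import fiona
from shapely.geometry import shape

def process_clusters(data_src: Path, top_n: int = 50):
    """Generate one image per cluster and collect placement info for Mapnik"""
    placements = []
    with fiona.open(data_src, encoding='UTF-8', mode="r") as shapefile:
        # sort clusters by descending weight (field name assumed)
        features = sorted(
            shapefile, key=lambda f: f["properties"].get("Weights", 0), reverse=True)
    for feature in features[:top_n]:
        label = feature["properties"]["ImpTag"]
        centroid = shape(feature["geometry"]).centroid
        # generate_rembg, concat_prompt, BASE_PROMPT_NEG and SD_CONFIG are defined above
        generate_rembg(
            prompt=concat_prompt(f"({label})"), negative_prompt=BASE_PROMPT_NEG,
            save_name=f"cluster_{label}", sd_config=SD_CONFIG, show=False)
        placements.append(
            (label, centroid.x, centroid.y, feature["properties"].get("Weights", 0)))
    return placements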

Because this notebook got quite long, the processing is described in a follow-up notebook:

03_map_processing.html

Create notebook HTML

In [38]:
!jupyter nbconvert --to html_toc \
    --output-dir=../resources/html/ ./02_generativeai.ipynb \
    --template=../nbconvert.tpl \
    --ExtractOutputPreprocessor.enabled=False >&- 2>&-