Tag Maps Workshop, 01/2023, Alexander Dunkel
January 2023, Institute of Cartography, TU Dresden
##
Tag & Emoji Maps Workshop
Dr.-Ing. Alexander Dunkel
Technische Universität Dresden
Department of Geosciences, Cartographic Communication
--- Short URL to these slides:
### tud.link/do5e
-- **This is the workshop.** Please see these links if you're looking for the presentation slides: - [#1 Facets Lecture](https://ad.vgiscience.org/lbsm-facets-lecture-2022/) - [#2 Python Data Science 101](https://kartographie.geo.tu-dresden.de/ad/python_datascience_2022/)
#
Focus
* Exploratory Visualizations of Big Data * Evaluating collective perception * Subjective, Human-scale values & attribution of meaning *Slide setup:* ➡
Right arrows
: Main steps ⬇
Down arrows
: Additional information
![IR](images/01_emojotagmap_zoo.png?01 "Map (Goal of Workshop)") We'll create a Tag Map for **a self-chosen area**, from spatially referenced Twitter, Instagram and Flickr posts (and emoji). -- * These maps can be created for any scale and area, depending on data availability and a predefined granularity of information. * Some example maps on [maps.alexanderdunkel.com](http://maps.alexanderdunkel.com/). A paper ([Dunkel, 2015](http://dx.doi.org/10.1016/j.landurbplan.2015.02.022)) explains the conceptual and programmatic background. * Check out some additional information provided on [my blog](http://blog.alexanderdunkel.com). * Also see the [tagmaps documentation](https://ad.vgiscience.org/tagmaps/docs/), which includes this tutorial -- [![IR](images/01_emojotagmap_campus.png "Map (Goal of Workshop)")](https://www.flickr.com/photos/64974314@N08/28424941169/in/album-72157628868173205/) Emoji Tag Map TUD Campus -- For Tag Maps, the general idea is that tags & emoji are * placed on the map according to their (collective) area of use and * scaled based on the overall number of people using the respective tags/emoji --
**Tag Maps are scale dependent.**
In other words, data must be clustered again each time a new scale or area is explored.
## Workshop Links * Base Data and tools for Workshop: * [01_Tools.zip](https://cloudstore.zih.tu-dresden.de/index.php/s/zYSZ43ZHWzAyafP/download) (500MB) * includes Flickr, and (some) Instagram and Twitter data for 9 requested areas -- ![IR](images/extractFiles.gif) Extract Files to local Folder.
Avoid space characters in the path
(e.g. _not_ your 'My Documents' folder).
-- ## Tools (and Code) available on GitHub * [ClipGeo](https://github.com/Sieboldianus/ClipGeo): initial clipping of data * [TagMapsPy](https://github.com/Sieboldianus/TagMaps): clustering of data * [ArcMap/ArcPro](https://github.com/Sieboldianus/TagMaps/tree/master/resources): Final map (labeling) * or [Mapnik?](https://ad.vgiscience.org/tagmaps-mapnik-jupyter/01_mapnik-tagmaps.html)
**While sensitive User Information has been removed or encrypted with AES-256,
please do not share or publish the data provided.
** -- The data has been retrieved using the official APIs of Flickr, Twitter and Instagram (deprecated as of 2020). --
This includes only publicly visible information.
Nevertheless, this data is considered user-related ("pseudonymized") information, and must be handled with care.
## Workshop Overview * [**Part 1 (optional)**](#/part1): - Preview and clip large point datasets (Dresden) - Tools: ClipGeo * [**Part 2**](#/part2): - Cluster data, tools: TagMapsPy * [**Part 3**](#/part3): - Visualize data, tools: ArcGIS --
Want to skip Part 1 and directly use the clipped sample area?
Continue with [Part 2](#/part2).
--
Frequently Asked Questions
Before starting, skim through the [FAQ-Section](#/faq) at the end of these slides.
## Part 1: ClipGeo Photo Exploration
[ClipGeo](https://github.com/Sieboldianus/ClipGeo) is a small tool that allows fast exploration and clipping of large spatial photo datasets. It is used here as a first step of exploration, before importing data to ArcGIS or clustering data for Tag Maps. -- ClipGeo does not include data or access to the data itself (e.g. through APIs). For the workshop, data is provided in a subfolder
`\01_ClipGeo\01_Input`
as simple CSV files.
In this workshop, we have provided several datasets for chosen regions. You can use your own data, as long as it is formatted in CSV with lat/lng coordinates available. -- [![IR](images/clip_geo_germany.jpg)](https://flic.kr/p/Tr4tDg) Example output of Flickr photo locations for Europe (2007-2017) from ClipGeo.
## Clipping Data
To start, open
\01_ClipGeo\ClipGeo.exe
and
click on Load folder _once_.
--
Do not change the default path
...\01_Input\
-- See a Singleton-Error (e.g.
The type initializer for ‘SingletonCreator’ threw an exception
)? Or Tiles are not loading? ![IR](images/singleton.png) -- ClipGeo requires [DotNetFramework](https://dotnet.microsoft.com/download/dotnet-framework). -- Try the following workaround, before installing DotNetFramework: * Close ClipGeo; Go to folder
\01_ClipGeo\..
-- Check available .Net Versions with
DotNetVersions/DotNetVersions.exe
--
.. and unzip `ExtractForSingletonError_DOTNET_40_SQL.7z` (e.g.) & Overwrite
![IR](images/04_adjustArea.gif) Each red rectangle represents a dataset. Zoom and right-click-and-drag
or left-click markers to adjust area to be clipped/mapped. -- ## Optional: Preview Map To create a preview map: 1. Click
Show Map
2. zoom-in/pan on the second map-window, 3. finally:
Sync Area
and click
search
selection -- ![IR](images/04_previewArea.gif) -- Sometimes, there are few photos on the map and the locations are difficult to see. -- ![IR](images/equalize.gif) Solution: Click
Equalize
and choose a different
color
for points. -- To
create a new preview map
, close the second window and open it again: ![IR](images/newpreview.gif)
Optionally provide an
*output name*
before exporting data. -- ![IR](images/04_review.gif) Output data is stored in
/Output/01_Clipped_Data/
-- This data can be imported to ArcGIS (import XY-Data) or used as input to *\02_TagMapsPy* to create a Tag Map. Tag Maps will be covered in [Part 2](#/14). -- **To use your own clipped area** First, export photo locations for a specific area and copy results * **from**:
`01_ClipGeo\02_Output\01_ClippedData\YourExport\*.csv`
* **to**:
`02_TagMapsPy\01_Input\*.csv`
-- **To skip clipping of data** Copy any Input data source (e.g.: `TR_Istanbul`): * **from**:
`01_ClipGeo\01_Input\TR_Istanbul\*.csv`
* **to**:
`02_TagMapsPy\01_Input\*.csv`
-- **To use the workshop sample area data for Großer Garten** Select CSVs and copy: * **from**:
`01_ClipGeo\02_Output\01_ClippedData\GrosserGarten_Sample_Clip\*.csv`
* **to**:
`02_TagMapsPy\01_Input\*.csv`
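The three copy variants above differ only in their source folder, so they can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `copy_csvs` is hypothetical and not part of ClipGeo or TagMapsPy.

```python
import shutil
from pathlib import Path


def copy_csvs(src_dir: Path, dst_dir: Path) -> list[str]:
    """Copy all *.csv files from src_dir to dst_dir; return the copied file names."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for csv_file in sorted(src_dir.glob("*.csv")):
        # copy2 also preserves file timestamps
        shutil.copy2(csv_file, dst_dir / csv_file.name)
        copied.append(csv_file.name)
    return copied
```

For the sample area, this could be called from the extracted `01_Tools` folder as `copy_csvs(Path("01_ClipGeo/02_Output/01_ClippedData/GrosserGarten_Sample_Clip"), Path("02_TagMapsPy/01_Input"))`.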
[![IR](images/01_emojotagmap_campus_Loc.png)](https://www.flickr.com/photos/64974314@N08/40797729021/in/dateposted-public/) ## Part 2: Tag & Emoji Maps
## Prepare Tagmaps environment
Open conda prompt inside folder
`\02_TagMapsPy`
--- Create a "tagmaps" environment and install the package:

```bash
conda create -n tagmaps                           # creates an empty environment
conda activate tagmaps                            # activates the environment
conda config --env --set channel_priority strict  # sets env options
conda install tagmaps --channel conda-forge       # installs tagmaps package from conda-forge
```

This is the recommended way to install tagmaps, see [the docs](https://ad.vgiscience.org/tagmaps/docs/quick-guide/).
Environment exists?
Use another name, e.g.
tagmaps_mary
-- Note that
conda_here
shortcut is prepared for the pools at TUD
Start conda at home?
Three steps:
1. Install Miniconda (package manager)
2. Open conda prompt (e.g. Windows: Start Menu > "Conda")
3. Change folder to `01_Tools/02_TagMapsPy` (e.g. `cd ..`)
## Base Data
Have a look at the Base Data (Flickr, Instagram & Twitter) stored in folder
`01_Tools\02_TagMapsPy\01_Input\`
-- This data was extracted using [ClipGeo](https://github.com/Sieboldianus/ClipGeo) (see [first part](#/8)). It can consist of one or multiple CSV files.
## Data Clustering ![IR](images/start_tagmaps2.gif) Start the tool with the command
`tagmaps`
. -- After starting the tool, it will read all files in
*01_Input folder*
. Optionally, specify a range of unique tags to be processed. This is set to 1000 by default, meaning the tool will cluster the top 1000 used tags found. After the tool has cleaned up base data, a window will open with the list of top tags found. --
Found any Bugs?
[Create an issue](https://github.com/Sieboldianus/TagMaps/issues) on GitHub!
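To make the "top tags" list more concrete, here is a minimal sketch of ranking tags by the number of distinct users, which is the quantity Tag Maps use for scaling. It is not TagMapsPy's actual code: the function name, the `(user_id, tags)` input shape and the lowercasing are assumptions made for illustration.

```python
from collections import defaultdict


def top_tags_by_users(posts, max_tags=1000):
    """Rank tags by the number of distinct users who used them."""
    users_per_tag = defaultdict(set)
    for user_id, tags in posts:
        for tag in tags:
            # assumption: tags are compared case-insensitively
            users_per_tag[tag.lower()].add(user_id)
    # sort by user count (descending), then alphabetically for stable ties
    ranked = sorted(users_per_tag.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(tag, len(users)) for tag, users in ranked[:max_tags]]


posts = [
    ("u1", ["zoo", "dresden"]),
    ("u1", ["zoo"]),  # same user again: counted only once for "zoo"
    ("u2", ["dresden", "park"]),
    ("u3", ["dresden"]),
]
print(top_tags_by_users(posts, max_tags=2))
```

Counting users instead of posts keeps a single very active user from dominating the ranking.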
### Optional: Remove Tags ![IR](images/03_remove_items.gif) -- At this point, we can remove unwanted
tags
(+
emoji
, or
locations
) that are uninteresting for the current scale or for the specific context of investigation. Usually, the first few tags refer to general aspects for the area and provide little information. These tags can be removed, to focus on more locally relevant topics.
### Optional: Preview Tags on Map ![IR](images/03c_preview_tags.gif) -- Check "
Map Tags
" and select a tag. A preview map will be generated showing the general distribution of the current tag. Note that
Emoji
are listed with their *Unicode_Name*.
### Adjust cluster granularity ![IR](images/03d_cluster_granularity.gif) -- **Cluster granularity** affects how *detailed* or *clustered* a Tag Map appears. The default distance is calculated from the current scale, but can be adjusted to personal needs. Cluster granularity significantly affects the resulting map. For small datasets (<30,000 photos), larger distances are recommended; otherwise only a few clusters will be found. For large datasets (>100,000), leave as is or reduce cluster granularity. -- Default values are calculated from the given analysis extent (and usually produce acceptable results). Fine-tune only if needed. Cluster Distance can be defined independently for
tags
,
emoji
and
locations
. -- The Cluster distance can be separately defined for Tags, Emoji and Locations
![IR](images/04_cluster_proceed.gif) Click on proceed to cluster data.. -- The tool will now process each unique tag separately. The clustering is implemented using the single-linkage tree that is available from [HDBSCAN](http://hdbscan.readthedocs.io/en/latest/). In short, all cluster trees (Dendrograms) for tags will be *cut* at the same distance, so that their patterns can be compared for the given scale. HDBSCAN is a fast single-linkage clustering technique that is designed in a *bottom-up* manner, meaning that patterns mostly emerge from underlying data (and not the researcher's definition of input criteria).
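The idea of cutting a single-linkage tree at one fixed distance can be illustrated without any clustering library. The sketch below is not TagMapsPy's code (which relies on HDBSCAN); it is a brute-force, pure-Python stand-in that assumes plain Euclidean distances on projected coordinates in meters. Two points end up in the same cluster if a chain of neighbours, each at most `cut_distance` apart, connects them, which is what a cut through a single-linkage dendrogram at that height yields.

```python
import math


def cut_single_linkage(points, cut_distance):
    """Return one cluster label per point for a single-linkage cut at cut_distance."""
    parent = list(range(len(points)))

    def find(i):
        # union-find lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # merge every pair of points that is at most cut_distance apart (O(n^2))
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= cut_distance:
                parent[find(i)] = find(j)

    # relabel cluster roots as 0, 1, 2, ... in order of first appearance
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# hypothetical post locations along a line, in meters
points = [(0, 0), (50, 0), (100, 0), (500, 0), (540, 0)]
print(cut_single_linkage(points, 60))  # two clusters
print(cut_single_linkage(points, 30))  # every point on its own
```

A larger cut distance merges more points into fewer clusters, which corresponds to the effect of the cluster granularity setting.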
Results will be saved to
*02_TagMapsPy\02_Output\*
-- A number of additional files provide some statistics: * Output_cleaned.txt * Output_topemojis.txt * Output_toptags.txt --
Output_cleaned.txt
can be loaded into ArcGIS (CSV/Add XY-Data).
The CSV includes the filtered data that has been used during clustering, e.g. excluding all tags of minor frequency.
-- ![IR](images/08b_viewCleaned.gif)
## Part 3: Visualization in ArcGIS ---
Go to folder
03_MappingArcGIS/
and open the template file. --
Using ArcMap 10.3 to 10.6?
The slides have been updated for ArcGIS Pro 2.5. You can view instructions for old versions of ArcMap [here](https://ad.vgiscience.org/tagmapsworkshop_tud/#/22).
-- For mapping
Emoji
, make sure you have a suitable font installed prior to opening ArcGIS. The font *TwitterColorEmoji* is used in the mxd and is available [here](https://github.com/eosrei/twemoji-color-font/releases). (This _may_ not be necessary for Windows 10, which has native Emoji support.)

--

To install TwitterColorEmoji:

1. Download [TwitterColorEmoji-SVGinOT-Win-*.*.*.zip](https://github.com/eosrei/twemoji-color-font/releases)
2. Extract it somewhere
3. Open `cmd` with elevated permissions (Start > Type `cmd` > run in Administrator Mode)
4. `cd` into the folder where you have extracted the zip, e.g.:
   ```
   cd C:\temp\TwitterColorEmoji-SVGinOT-Win-13.0.1
   ```
5. Run install.cmd by typing `install.cmd`
6. The tool will compile Segoe UI Symbol for your system and offer to install the two ttf files
7. Type `Y` and install the two fonts

--

This will make sure you have the proper font installed. Segoe UI Symbol should be available by default, but `a)` it doesn't include all emoji and `b)` there may be something wrong with the font on your system. The following slide shows a video of the process. --
--- Your data should be automatically linked from
`02_TagMapsPy/02_Output`
-- Important:
Do not use `Data View`
in ArcMap, use the (pre-set) **Layout View** (see [why](#/35/3))
Zoom to area under investigation, based on the extent of our clustered shapefile. -- Check data links. This red `!` is there deliberately, do not fix: ![IR](images/redx.png) These red `!`s are not ok (and shouldn't happen): ![IR](images/redx_fix.png) View instructions below to fix. -- A red "!" means links are missing, fix first:
-- Do not use
`Data > Repair Data source`
, since this will remove any visualization rules contained in data layers. Fixing links by
`Properties > Source > Set Data Source`
will leave any special layer settings untouched and simply repair the reference to underlying shape data. -- All 3 Layers (e.g. *Top10*, *Other*, *HImp*) link to the same shapefile (
*allTagCluster.shp*
). Each layer has a different query defined, so that different classes of data can be visualized slightly differently, e.g. Top10Layer: `"Weights" >= 300 AND "HImpTag" = 1`
Set Data Frame Projection to the projection of clustered
*allTagCluster.shp*
-- Original data is stored in
WGS1984
Projection (Decimal Degrees, EPSG:4326). TagMapsPy will automatically select a suitable
UTM Coordinate System
and project data upon shapefile export. Matching both the *Data Frame* and the *allTagCluster.shp* Projections speeds up ArcGIS significantly.
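How a suitable UTM zone follows from WGS1984 coordinates can be sketched in a few lines. This is an illustration of the standard 6-degree zone rule, not necessarily how TagMapsPy selects its projection; the function name is hypothetical, and the special zones around Norway and Svalbard are ignored.

```python
import math


def utm_epsg(lon, lat):
    """Return the EPSG code of the WGS 84 / UTM zone containing (lon, lat)."""
    # zones are 6 degrees wide, numbered 1..60 eastwards from 180°W
    zone = int(math.floor((lon + 180) / 6)) % 60 + 1
    # EPSG 326xx = northern hemisphere, 327xx = southern hemisphere
    return (32600 if lat >= 0 else 32700) + zone


print(utm_epsg(13.74, 51.05))  # Dresden
```

For Dresden this gives EPSG:32633 (UTM zone 33N). Setting the Data Frame to the projection of the exported shapefile avoids reprojecting on the fly.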
Enable the
**TagEmojiMap**
Layer-Group, Tags will draw.
![IR](images/01_emojotagmap_zoo.png?01) To view
emoji
-clustering, follow additional steps below. -- There's a bug in [Fiona/GDAL](https://github.com/Toblerity/Fiona/issues/549) that currently prevents ArcGIS from displaying Emoji-Unicode that is directly written to Shapefiles from Python. Until this bug is solved, you can import Emoji through CSV and Join to TagCluster Shapefile. Follow the steps below. -- For mapping Emoji, make sure you have a suitable font installed prior to opening ArcGIS. The font *TwitterColorEmoji* is used in the mxd and is available [here](https://github.com/eosrei/twemoji-color-font/releases). In Windows 10, the default emoji font "Segoe UI Symbol" will otherwise be used (but: poor emoji coverage). -- Depending on the emoji font used, your map will look slightly different: ![IR](images/emoji-comp.png) -- ![IR](images/font_replace_02.gif) Optional: To change the emoji font, go to Layer Properties > Labels > Select Class "Emoji" and edit the expression. -- ![IR](images/11_emojiAdd_02.gif) Add
allTagCluster.shp
to ArcGIS. -- ![IR](images/11_emojiTableAdd_02.gif) Add
emojiTable.csv
to ArcGIS. -- We'll now join the
`emojiTable.csv`
to the Tag Cluster Shapefile (based on FID) -- ![IR](images/11_emojiTableJoin_02.gif) -- We'll now select all Clusters that are Emojis (Emoji-Column = 1) ![IR](images/11_emojiTableSel_02.gif) -- Open Table and Copy Emojis from Join to Column "ImpTag" ![IR](images/11_emojiTableCopy_02.gif) -- Remove Join and Polygon Layer ![IR](images/11_emojiTableFinal_02.gif) -- Enable Emoji Tag Map Layer ![IR](images/11_emojiTableFinal_Zoom_02.gif)
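The join, select and copy steps above amount to a simple attribute join. The sketch below shows the logic on in-memory rows; it is an illustration only, not a replacement for the ArcGIS workflow. The field names `FID`, `Emoji` and `ImpTag` are taken from the slides, while the `emoji` column name for `emojiTable.csv` and the function name are assumptions.

```python
def copy_emoji_to_imptag(clusters, emoji_rows):
    """Join emoji_rows to clusters on FID and copy the emoji into ImpTag
    for every cluster that is flagged as emoji (Emoji == 1)."""
    # build a lookup table: FID -> emoji character ("emoji" column is assumed)
    emoji_by_fid = {row["FID"]: row["emoji"] for row in emoji_rows}
    for cluster in clusters:
        if cluster["Emoji"] == 1 and cluster["FID"] in emoji_by_fid:
            cluster["ImpTag"] = emoji_by_fid[cluster["FID"]]
    return clusters


clusters = [
    {"FID": 0, "Emoji": 0, "ImpTag": "zoo"},
    {"FID": 1, "Emoji": 1, "ImpTag": ""},
]
emoji_rows = [{"FID": 1, "emoji": "🐘"}]
print(copy_emoji_to_imptag(clusters, emoji_rows))
```

Tag clusters (Emoji = 0) keep their original `ImpTag` value; only emoji clusters are overwritten.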
[![IR](images/01_emojotagmap_campus_Loc.png)](https://www.flickr.com/photos/64974314@N08/40797729021/in/dateposted-public/) To add an additional layer showing
overall location clustering
, view steps below. -- Visualizing post clusters (e.g. photo locations) in background helps assessing overall frequentation patterns. The data for this layer is generated next to Tag Clusters and can be found in
`/Output/allLocationClusters.shp`
. -- ![IR](images/21_enable_photoLocations_02.gif) Enable Location Cluster Layer: shows base clustering. -- We'll analyze the patterns of photo clustering using the
`Getis-Ord Gi* Statistic`
that is available in the Spatial Statistics toolbox. The results are used to colorize clusters in
hot
and
cold
spots. -- **Optional Step: ISA** For the Getis-Ord Statistic, we can provide a value for
`Distance Threshold`
. We can use the Incremental Spatial Autocorrelation (ISA) tool to find a suitable Distance Threshold for our data. -- The ISA tool visualizes clustering of data at different scales. Its output curve can be used to find a distance, where the clustering is
"most pronounced"
for the given area and data. -- ![IR](images/isa_02.gif) Let's see: Find the tool under
`Spatial Statistics Tools/Analyzing Patterns`
. -- * Input Features: Add
`AllLocationCluster.shp`
* Input Field:
`Weights`
* Select a location to store
`Output Table`
and
`Output Report File`
Click Run. -- Open the results from ISA stored as PDF: ![IR](images/isa2_02.gif) -- Identify
the first peak of your curve
(from left to right). Sometimes, there is no peak, sometimes there is more than one peak. If one or several peaks are shown, a suitable Threshold Distance is usually at the first shown peak. -- ![IR](images/isa_results.png) In the example above, a suitable Threshold Distance would be at around
187 Meters
. Remember this (your) value for the next step. -- ![IR](images/21_hot_spot_02.gif) Find the Hot Spot Analysis (`GI* Statistic`) tool under
`Spatial Statistics Tools/Mapping Clusters`
. -- ![IR](images/HS250.gif) Add Location Clusters, select *Weights* field, optionally add a Distance Threshold (from ISA), specify output location, and click **Run** -- ![IR](images/LoadSymbol_02.gif) Load Symbology (double-click layer, Symbology: import from
`Layer HS250_Sorted`
). Activate "Update Ranges". -- [![IR](images/FinalMap_02.gif)](https://www.flickr.com/photos/64974314@N08/40797729021/in/dateposted-public/) Sort layers and enable Location & Tag Clusters to view the final result. -- ![IR](images/21_symbol_size_02.gif) Optionally, adjust the formula for min/max symbol size to reduce size of smallest and largest cluster dots. --- Optionally add or update: - Title - Legend - North Arrow, Scale Bar etc.
## Export map to pdf or png.
![IR](images/01_emojotagmap_campus.png)
--
If exporting to PDF, make sure to check
"Convert Marker Symbols to Polygons"
+
"Embed Fonts"
or
Export as PNG with (e.g.) 300DPI
[![IR](images/GrosserGarten_Tagmap.png "Map (Goal of Workshop)")](images/GrosserGarten_Tagmap.pdf) Our final map with Tag, Emoji & Photo Location Clustering. Yay! -- ## Submit results Please zip the following contents: * log.txt * allTagCluster.shp (the shapefile, all 5 parts) * allLocationCluster.shp (the shapefile, all 5 parts) * Basemap_World_ArcProV2.aprx (the ArcGIS Map) * optional: exported PDF or PNG of map ... as **
firstname_lastname.zip
** .. -- .. and upload to [Opal Tag Maps folder](https://bildungsportal.sachsen.de/opal/auth/RepositoryEntry/3920855040/CourseNode/1642131038683050011) (or upload to [TU Cloudstore](https://cloudstore.zih.tu-dresden.de/) and send link to [alexander.dunkel@tu-dresden.de](mailto:alexander.dunkel@tu-dresden.de)) -- ## Some evaluation criteria * selected region: - is the map showing a number of interesting locations (good!) or is there only one big cluster that dominates all others? - (not so good! ➡ select smaller region or different area) -- * are globally irrelevant tags excluded? - e.g.: "Germany" can be considered of little interest to Dresden - ➡ should be removed from cluster analysis -- * are single locations (e.g. Instagram "places") with unsuitable granularity excluded? - e.g. location "Dresden" is not suited to evaluate clustering at the city scale - ➡ remove these locations prior to clustering -- * bonus: * scale bar and legend updated * data frame projection matches projection of shapefile *
don't take this too seriously: experiment!
##
FAQ
A collection of frequently asked Tag Maps questions --
Issue: Tag Map looks noisy, there are too many tags placed on the map.
- often caused by selecting too large an area, because there's a limit to the number of labels that can be placed on the map - either choose fewer tags (e.g. 100 instead of 1000) or select a smaller area --
Issue: Tag Map shows one main hot spot surrounded by empty areas.
- this is a limitation of this type of visualization - in areas, where a single hot spot dominates, the Tag Map will simply illustrate this unequal distribution - either zoom in to hotspot and recluster or choose area with more equal distribution of locations --
Issue: Zooming in ArcGIS does not provide higher information resolution.
- the final maps are static and must be reclustered for each scale - this means that it is not possible to zoom in (e.g. in ArcGIS) - therefore: - work in layout mode with locked extent,
do not use data mode
in ArcGIS - reselect data (ClipGeo) and recluster for zooming in
Thanks for participating! Don't forget: please delete your data after the workshop.