Blindfolded Cartography

Andy Woodruff, Axis Maps

OpenVis Conference, April 2015

@awoodruff | axismaps.github.io/blindfolded-cartography

Hello, my name is Andy. I make maps.

I live on the other (better, duh) side of the river and sometimes make maps of Boston.

I also learn things and help other people learn things at Maptime Boston.

But most of the time,
I make interactive web maps with Axis Maps.

For public health, historians, educators, environmentalists, researchers, economists, businesses, &c. &c.

We are often tasked with designing and building interactive maps for data that we can't yet see.

Traditional cartography allows for perfectionism. Crafting beautiful, effective maps and telling stories with them is one thing when you can obsess over every detail; it's quite another when you have to do it as though blindfolded.

A tale of compromises

What follows is a list of such challenges from our experience, be they data messes, design conundrums, or code gotchas. Solutions are never perfect, but they're out there.

¯\_(ツ)_/¯

Data classification

Classification matters. A lot.

John Nelson, Telling the Truth

Data distributions in designs

Data distributions in reality

Every. Damn. Time.

A classification scheme needs to work for any distribution, without our intervention.

Things to consider

Will the map be useful and look good?

Will charts be useful and look good?

Are the breaks meaningful?

Are the breaks understandable?

Are the breaks pretty numbers?

Jenks Optimal Breaks

Maximizes similarity within groups
and difference between groups.

See: Tom MacWright's JavaScript implementation.

BUT breaks will be weird numbers unique to every distribution, which makes it difficult to compare datasets.

Quantile Breaks

Same number of items in each bin.

Guarantees some variety on the map.

See: D3 quantile scales.

BUT quantiles obfuscate magnitudes and can excessively split similar numbers or group disparate numbers.

A compromise

Breaks at unevenly spaced percentiles still mask actual magnitudes, but they can keep skews under control while highlighting the low and/or high ends. As a bonus, the labels are percentiles, so they can always be pretty numbers.

A final trick

Percentiles based on unique values prevent a single value from spanning multiple bins.

// basic example: data must be sorted ascending, percentile is between 0 and 1
function getValueAtPercentile( data, percentile ){
  // Math.floor instead of parseInt, which goes through a string conversion
  return data[ Math.floor( (data.length - 1) * percentile ) ];
}
// using underscore.js
var dataArray = _.sortBy( [ 0, 0, 12, 23, 2, 5, 0, 5, 19, 0, 0, 0, 33, 9, 25, 0 ], Number );
var uniques = _.uniq( dataArray, true ); // true = the array is already sorted
getValueAtPercentile( dataArray, .25 ); // = 0
getValueAtPercentile( uniques, .25 ); //  = 5
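The compromise and the unique-values trick can be combined. What follows is only a minimal sketch in plain JavaScript, not the method used in any particular Axis Maps project: getPercentileBreaks and getPercentileLabels are hypothetical helper names, and the percentiles chosen are purely illustrative.

```javascript
// Same lookup as the basic example above, repeated so this sketch runs on its own
function getValueAtPercentile( data, percentile ){
  return data[ Math.floor( (data.length - 1) * percentile ) ];
}

// Hypothetical helper: class breaks at arbitrary percentiles of the unique values
function getPercentileBreaks( values, percentiles ){
  var sorted = values.slice().sort(function(a, b){ return a - b; });
  // drop repeats; in a sorted array they are always adjacent
  var uniques = sorted.filter(function(v, i){
    return i === 0 || v !== sorted[ i - 1 ];
  });
  return percentiles.map(function(p){
    return getValueAtPercentile( uniques, p );
  });
}

// Hypothetical helper: labels describe rank, not magnitude, so they never change
function getPercentileLabels( percentiles ){
  var edges = [ 0 ].concat( percentiles, [ 1 ] );
  return edges.slice( 0, -1 ).map(function(start, i){
    return Math.round( start * 100 ) + "-" + Math.round( edges[ i + 1 ] * 100 ) + "%";
  });
}

// illustrative percentiles only: one wide bin at the low end, narrower bins at the high end
var percentiles = [ .5, .75, .9 ];
var data = [ 0, 0, 12, 23, 2, 5, 0, 5, 19, 0, 0, 0, 33, 9, 25, 0 ];
var breaks = getPercentileBreaks( data, percentiles ); // [ 12, 23, 25 ]
var labels = getPercentileLabels( percentiles ); // [ "0-50%", "50-75%", "75-90%", "90-100%" ]
```

The break values change with every dataset, but the labels do not, which is what keeps the legend readable when the data can't be seen in advance.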

Density

Geography is messy and crowded

Things overlap

Make sure small things are on top of large things.

function drawMapForMonth(m){
  // sort largest first, so the smallest circles are drawn last and end up on top
  var circle = map.selectAll("circle")
    .sort(function(a,b){
      return Math.abs(b[m]) - Math.abs(a[m]);
    });
  // etc
}

Some things are large and some are small.

Take advantage of known geographies.

null

Expect holes in the data

Design for them and catch them in code. "No data" is data.

See The Design of Nothing: Null, Zero, Blank

Andy Kirk, OpenVis 2014

Know no data

// no data might be...
null
undefined
NaN
""
"NULL"
-9999
// &c. &c.
// a tempting way to catch "no data" in javascript
if ( data ){
  // yay, we have data!
} 

// but watch out!
var data = 0; // zero is real data
if ( data ){
  // zero won't get us in here :(
}
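One way to catch these cases without throwing away zeros is an explicit check. This is a minimal sketch: isNoData is a hypothetical helper, and the sentinel lists simply reuse the examples above. The real list depends on whoever produced the data.

```javascript
// Assumed sentinels, taken from the examples above; adjust per data source
var NO_DATA_STRINGS = [ "", "NULL" ];
var NO_DATA_NUMBERS = [ -9999 ];

// Hypothetical helper: true when a value should be treated as "no data"
function isNoData( value ){
  if ( value === null || value === undefined ) return true;
  if ( typeof value === "number" ){
    return isNaN( value ) || NO_DATA_NUMBERS.indexOf( value ) !== -1;
  }
  if ( typeof value === "string" ){
    // ignore surrounding whitespace and letter case, e.g. " null "
    return NO_DATA_STRINGS.indexOf( value.trim().toUpperCase() ) !== -1;
  }
  return false;
}

isNoData( 0 );      // false: zero is real data
isNoData( "NULL" ); // true
isNoData( -9999 );  // true
```

Checking explicitly costs a few more lines than `if ( data )`, but it makes the decision about what counts as missing visible and easy to change.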

Texture on choropleth maps

See: Textures.js by Riccardo Scalco

Gaps in time series data

Text

An often overlooked element of cartography is the prose that goes along with the map. Sometimes when you are anticipating a small snippet of text, you get a novel's worth instead. It's often a good idea to restrict the size of text boxes with CSS properties such as max-height and overflow:auto so that when a million words are thrown at you, your layout isn't ruined by text that overflows its allotted space and goes on and on and on and on and on and on and on and on and on and on and on and on and on and on and on and on.

Abbreviation

Mind your number formatting

For example:

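// pluralize a unit label (renamed so the two examples don't overwrite each other)
function formatSeats(val){
  return val + " seat" + ( val != 1 ? "s" : "" );
}
// abbreviate large numbers; >= so that exactly 1000 becomes "1 thousand"
function formatMagnitude(val){
  if ( val >= 1000000000 ) return val / 1000000000 + " billion";
  if ( val >= 1000000 ) return val / 1000000 + " million";
  if ( val >= 1000 ) return val / 1000 + " thousand";
  return val;
}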

The Lorem Ipsum Map

Real — or even realistic — data is not always available to designers and developers.

Designing UI around a placeholder lorem ipsum map (Roth and Harrower, 2008) can be detrimental to the overall look and feel, and to user experience.

Fake data, however, is a good test of code.

"Smart dummy data"

Carolyn Fish on using OpenStreetMap to stand in for more complete—but inaccessible—data

The human touch

“The machine does not desire to make you think or feel or learn anything in particular, as the artist does, and this is the heart of what is wrong with so much of cartography today. Only humans can make maps for other humans.”

Daniel Huffman, On Human Cartography

The Essential Geography of the United States of America

by David Imus

“He used a computer (not a pencil and paper), but absolutely nothing was left to computer-assisted happenstance.”

-Seth Stevenson, Slate (2012)

Some approximate math

Suppose he wanted to make a multiscale web map with the same level of care...

1:4,000,000 scale
“Nearly 6,000 hours” of work

Zoom level 0: 47 hours
Zoom level 1: 94 hours
Zoom level 2: 188 hours
Zoom level 3: 375 hours
Zoom level 4: 750 hours
Zoom level 5: 1,500 hours
Zoom level 6: 3,000 hours
Zoom level 7: 6,000 hours
Zoom level 8: 12,000 hours
Zoom level 9: 24,000 hours
Zoom level 10: 48,000 hours
Zoom level 11: 96,000 hours
Zoom level 12: 192,000 hours
Zoom level 13: 384,000 hours
Zoom level 14: 768,000 hours
Zoom level 15: 1,536,000 hours
Zoom level 16: 3,072,000 hours
Zoom level 17: 6,144,000 hours
Zoom level 18: 12,288,000 hours

*Conservatively doubling instead of quadrupling hours with each zoom level.

2,803 years

(and 222 days)

And that's only the United States.
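The arithmetic behind those figures can be sketched as follows. The base values come from the list above (6,000 hours at zoom level 7, doubling per level). Converting hours to years by counting 24 hours a day and an average year of 365.2425 days is an assumption that happens to land on about 2,803 years. The exact day count depends on the calendar convention used.

```javascript
// Figures from the list above: about 6,000 hours at zoom level 7
var BASE_ZOOM = 7;
var BASE_HOURS = 6000;
var MAX_ZOOM = 18;

// hours double with each zoom level in, and halve with each level out
function hoursAtZoom( zoom ){
  return BASE_HOURS * Math.pow( 2, zoom - BASE_ZOOM );
}

var totalHours = 0;
for ( var z = 0; z <= MAX_ZOOM; z++ ){
  totalHours += hoursAtZoom( z );
}

// Assumption: nonstop work, 24 hours a day, with an average year of 365.2425 days
var totalYears = totalHours / 24 / 365.2425;
```

Quadrupling instead of doubling would mean replacing the 2 in hoursAtZoom with a 4, and a far larger total.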

Big design

"Big data" is a design problem

How can we accomplish good, human design when visualizing far more data than a human designer could ever inspect?

Global basemaps

OpenStreetMap is the dataset for most of us. With 2 million users mapping the world, quality and consistency aren't guaranteed.

Your challenge is to come up with design rules that ensure the map looks pretty good everywhere, at all scales.

To summarize Nicki Dlugash of Mapbox (NACIS 2014):

  1. Prioritize areas you care about for your specific use case.
  2. Prioritize the most typical or common examples of a feature.
  3. Look for the most atypical examples of a feature to develop a good compromise.

taginfo and overpass turbo

See a problem? Fix it!

Thanks, and
Good luck!

@awoodruff | andywoodruff.com | axismaps.com
axismaps.github.io/blindfolded-cartography