Day 157 of 180 Days of Data Viz Learning #jfdi

I’m doing some form of data visualization learning for 180 days because I need to #JFDI.

See post explaining how and why I’m doing this.

Code Learning:

Three Takeways

Chapter 2 Information Visualization Data Flow

  • You can sort quantitative data into categories by putting values in a range or binning them.
  • For example if you use d3.scale.quantile() you can use the .range() setting to specify number of parts and labels.  There will be no error if you have a mismatch between number of .domain() values and .range() values because it autosorts and bins the value in domains to the number of values in range.
    • d3.scale.quantile().domain(samplearray).range([“small”, “medium”, “large”]);
    • d3.scale.quantile().domain(sampleArray).range([0, 1, 2])
  • d3.nest() can use shared attributes of data and sort them into discrete categories and subcategories.  The example below combines tweets into arrays under new objects labeled with user attributes
    • d3.json(“tweets.json”, function(data) {
      • var tweetData = data.tweets;
      • var nestedTweets = d3.nest()
        • .key(function(el) { return el.user })
        • .entries(tweetData);
      • });
  • When you have arrays you’ll want to size and position attributes based on values in the array.  Below are common ways to determine distribution
    • d3.min(testArray, function(d) { return d});
    • d3.max
    • d3.mean
    • d3.extent // returns min and max
Chapter 6 Network Visualization
    • Directions of edges can also be represented using curved edges or edges that grow fatter on one end to another.  “To do something like that, you’d need to use a <path> rather than a <line> for the edges like we did with the Sankey layout or the arc diagram. “p 189
    • “If you think a force-directed layout is hard to read, you can pair it with another network visualization, such as an adjacency matrix, and highlight both as the user navigates either visualization” p 190
    • Weight is the strength of connection between two nodes.  The nodes in the system that are more mportant than others is referred to as centrality.  p 191


Reading and Learning Data Visualization Theoretically/Critically:

Edward Tufte Visual Display of Quantitative Information

Chapter One Graphical Excellence
Three Takeaways
  • Stacking of multiple time series from Voyager 2 allowed comparing of multiple intensities p 29
  • “Time-series displays are at their best for big data sets with real variability.  Why waste the power of data graphics on simple linear changes, which can usually be better summarized in one or two numbers?” p 30
  • Combination of NYC weather for 1980 with time series, trend lines, bar charts, and area charge shows multiple narratives of one story effectively p 30

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s