Jan-Feb Learning 2019

What’s This?

I’m trying to give myself at least half an hour during the workdays (or at least two blocked hours or so a week) to learn something new – namely taking classes/reviewing what I know on Treehouse, reading job-related articles, and reading career-related books. I'm tracking notables here on a monthly basis as a self-commitment, to retain things in memory, and as a reference. I fell off posting this the last six months; work and life have been insanely busy and my notes inconsistent across proprietary work versus my own, but it's worth a round-up here. Posting with good intentions for next year. Reminding myself that if I don't capture every bit I did, it's alright. Just keep yourself accountable.

Books Read

Inspired: How To Create Products Customers Love

Some favorite quotes from Kindle highlights:

  • This means constantly creating new value for their customers and for their business. Not just tweaking and optimizing existing products (referred to as value capture) but, rather, developing each product to reach its full potential. Yet, many large, enterprise companies have already embarked on a slow death spiral. They become all about leveraging the value and the brand that was created many years or even decades earlier. The death of an enterprise company rarely happens overnight, and a large company can stay afloat for many years. But, make no mistake about it, the organization is sinking, and the end state is all but certain.
  • The little secret in product is that engineers are typically the best single source of innovation; yet, they are not even invited to the party in this process.
  • To summarize, these are the four critical contributions you need to bring to your team: deep knowledge (1) of your customer, (2) of the data, (3) of your business and its stakeholders, and (4) of your market and industry.
  • In the products that succeed, there is always someone like Jane, behind the scenes, working to get over each and every one of the objections, whether they’re technical, business, or anything else. Jane led the product discovery work and wrote the first spec for AdWords. Then she worked side by side with the engineers to build and launch the product, which was hugely successful.
  • Four key competencies: (1) team development, (2) product vision, (3) execution, and (4) product culture.
  • It’s usually easy to see when a company has not paid attention to the architecture when they assemble their teams—it shows up a few different ways. First, the teams feel like they are constantly fighting the architecture. Second, interdependencies between teams seem disproportionate. Third, and really because of the first two, things move slowly, and teams don’t feel very empowered.
  • I strongly prefer to provide the product team with a set of business objectives—with measurable goals—and then the team makes the calls as to what are the best ways to achieve those goals. It’s part of the larger trend in product to focus on outcome and not output.

In my experience working with companies, only a few companies are strong at both innovation and execution. Many are good at execution but weak at innovation; some are strong at innovation and just okay at execution; and a depressing number of companies are poor at both innovation and execution (usually older companies that lost their product mojo a long time ago, but still have a strong brand and customer base to lean on).

 

Articles Read

Machine learning – is the emperor wearing clothes

  1. “The purpose of a machine learning algorithm is to pick the most sensible place to put a fence in your data.”
  2. Different algorithms, e.g. support vector classifiers, decision trees, and neural networks, use different kinds of fences
  3. Neural networks give you a very flexible boundary, which is why they're so hot right now (see the sketch below)
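
To make the fence metaphor concrete, here's a toy sketch of my own (the data, names, and approach are illustrative, not from the article): a one-dimensional classifier that just picks the threshold – the simplest possible fence – that best separates two labeled groups.

// Toy 1-D classifier: find the threshold ("fence") that best separates two classes.
var points = [
  { x: 1.0, label: "A" }, { x: 1.5, label: "A" }, { x: 2.1, label: "A" },
  { x: 3.8, label: "B" }, { x: 4.2, label: "B" }, { x: 5.0, label: "B" }
];

function bestFence(data) {
  var best = { threshold: null, correct: -1 };
  data.forEach(function (candidate) {
    // Count how many points the rule "x < threshold => A, else B" gets right.
    var correct = data.filter(function (d) {
      return (d.x < candidate.x) === (d.label === "A");
    }).length;
    if (correct > best.correct) {
      best = { threshold: candidate.x, correct: correct };
    }
  });
  return best;
}

console.log(bestFence(points)); // { threshold: 3.8, correct: 6 }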

Some Key Machine Learning Definitions

  1. “A model is overfitting if it fits the training data too well and there is a poor generalization of new data.”
  2. Regularization is used to steer a machine learning model toward a preferred complexity by adding a penalty on the model's parameters; this reduces the freedom of the model so that it generalizes and avoids overfitting and underfitting (see the sketch after this list)
  3. “Hyperparameters cannot be estimated from the training data. Hyperparameters of a model are set and tuned depending on a combination of some heuristics and the experience and domain knowledge of the data scientist.”
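
As a rough sketch of the regularization idea (my own illustration, not from the article): an L2 penalty on the weights, scaled by a hyperparameter lambda, gets added to an ordinary squared-error training loss. All names and data shapes are made up.

// Sketch of L2 regularization: add a penalty on the model's parameters (the squared
// weights, scaled by a hyperparameter lambda) to the training loss.
function mseLoss(weights, examples) {
  var total = examples.reduce(function (sum, example) {
    var prediction = weights.reduce(function (p, w, i) { return p + w * example.features[i]; }, 0);
    return sum + Math.pow(prediction - example.target, 2);
  }, 0);
  return total / examples.length;
}

function regularizedLoss(weights, examples, lambda) {
  var l2Penalty = weights.reduce(function (sum, w) { return sum + w * w; }, 0);
  return mseLoss(weights, examples) + lambda * l2Penalty; // larger lambda constrains the weights more
}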

Audiences-Based Planning versus Index-Based Planning

  • An index is the relative composition of a target audience for a specific program or network compared to the average audience in the TV universe; it gives marketers/agencies a gauge of the value of a program or network relative to others, based on the relative concentration of a specific target audience
  • Audience-based buying does not account for the relative composition of an audience or the context in which the audience is likely to be found; rather, it values the raw number of individuals in a target audience who watch a given program, their likelihood of being exposed to an ad, and the cost of reaching them with a particular spot. Really it's buying audiences versus buying a particular program
  • Index-based campaigns follow the traditional TV planning model: a maximum number of impressions of a given audience at a minimum price -> buy high-indexing media against a traditional age/demo; note this doesn't include precision index targeting
  • A huge issue is that TV is insanely fragmented, so even if campaigns are hitting GRP targets, they're doing so by increasing frequency rather than total reach
  • Note: GRP is a measure of the size of an ad campaign by medium or schedule, not the size of the audience reached. GRPs quantify impressions as a percentage of the target population, so the percentage can be greater than 100. It measures impressions in relation to the number of people and is the metric used to compare the strength of components in a media plan. There are several ways to calculate GRPs, e.g. GRP % = 100 * Reach % * Avg Freq, or simply summing ratings: a spot with a TV rating of 4 placed on 5 episodes = 20 GRPs (see the sketch after this list)
  • Index-based planning is about impressions delivered rather than balancing reach and frequency. Audience-based planning is about reaching likely customers for results
  • DSPs, etc. should use optimized algorithms that assign users a probability of being exposed to a spot in order to maximize reach of a specific target audience
  • Audience-based planning is about maximizing reach in the most efficient way possible, whereas index-based buying values audience composition ratios
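
The GRP arithmetic above reduces to a couple of one-liners. A quick sketch with illustrative numbers:

// GRPs = 100 * reach (as a fraction) * average frequency,
// or equivalently the sum of the ratings of each spot.
function grpFromReachAndFrequency(reachFraction, avgFrequency) {
  return 100 * reachFraction * avgFrequency;
}

function grpFromRatings(ratings) {
  return ratings.reduce(function (sum, rating) { return sum + rating; }, 0);
}

console.log(grpFromReachAndFrequency(0.4, 3)); // 120 GRPs (the total can exceed 100)
console.log(grpFromRatings([4, 4, 4, 4, 4]));  // a rating of 4 across 5 episodes = 20 GRPs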

Finding the metrics that matter for your product

  1. “Where most startups trip up is they don’t know how to ask the right questions before they start measuring.”
  2. HEART framework questions:
    • If we imagine an ideal customer who is getting value from our product, what actions are they taking?
    • What are the individual steps a user needs to take in our product in order to achieve a goal?
    • Is this feature designed to solve a problem that all our users have, or just a subset of our users?
  3. Key points in a customer journey:
    1. Intent to use: The action or actions customers take that tell us definitively they intend to use the product or feature.
    2. Activation: The point at which a customer first derives real value from the product or feature.
    3. Engagement: The extent to which a customer continues to gain value from the product or feature (how much, how often, over how long a period of time etc).

A Beginner's Guide to Finding the Product Metrics That Matter

  1. It's actually hard to find the metrics that matter, and there's a trap in picking too many indicators
  2. Understand where your metrics fall, e.g. under the HEART framework: Happiness, Engagement, Adoption, Retention, Task Success
  3. Don't measure everything you can and don't fall into the vanity metrics trap; instead, examples of good customer-oriented metrics (a few of these are sketched after this list):
    • Customer retention
    • Net promoter score
    • Churn rate
    • Conversions
    • Product usage
    • Key user actions per session
    • Feature usage
    • Customer Acquisition costs
    • Monthly Recurring Revenue
    • Customer Lifetime Value
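
A few of these metrics are simple arithmetic. A sketch with made-up field names, using one common simplification for lifetime value (average revenue per account divided by churn):

// Illustrative formulas only; names are made up.
function churnRate(customersAtStart, customersLost) {
  return customersLost / customersAtStart; // e.g. 50 / 1000 = 0.05, i.e. 5% monthly churn
}

function customerAcquisitionCost(salesAndMarketingSpend, newCustomers) {
  return salesAndMarketingSpend / newCustomers;
}

function monthlyRecurringRevenue(subscribers, averageRevenuePerAccount) {
  return subscribers * averageRevenuePerAccount;
}

// Common simplification: lifetime value as average revenue per account divided by churn.
function customerLifetimeValue(averageRevenuePerAccount, monthlyChurnRate) {
  return averageRevenuePerAccount / monthlyChurnRate;
}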

Algorithms and Data Structures

Intro to Algorithms

  • An algorithm is the series of steps a program takes to complete a task – the key skill to develop is being able to identify which algorithm or data structure is best for the task at hand
  • Algorithm:
    • Clearly defined problem statement, input, and output
    • Distinct steps that need to be in a specific order
    • Should produce a consistent result
    • Should finish in a finite amount of time
  • Evaluating linear vs. binary search example (see the sketch after this list)
  • Correctness:
    • 1) In every run against all possible values in the input data, we always get the output we expect
    • 2) The algorithm should always terminate
  • Efficiency:
    • Time complexity: how long it takes to run
    • Space complexity: the amount of memory it takes up on the computer
    • Best case, average case, worst case
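
A quick sketch of the linear vs. binary search comparison referenced above (my own example data):

// Linear search checks every element; binary search repeatedly halves a sorted list.
// Both return the index of the target, or -1 if it isn't found.
function linearSearch(list, target) {
  for (var i = 0; i < list.length; i++) {
    if (list[i] === target) return i;
  }
  return -1;
}

function binarySearch(sortedList, target) {
  var low = 0, high = sortedList.length - 1;
  while (low <= high) {
    var mid = Math.floor((low + high) / 2);
    if (sortedList[mid] === target) return mid;
    if (sortedList[mid] < target) low = mid + 1;
    else high = mid - 1;
  }
  return -1;
}

var sorted = [1, 3, 5, 7, 9, 11, 13];
console.log(linearSearch(sorted, 11)); // 5, after checking 6 elements
console.log(binarySearch(sorted, 11)); // 5, after checking only 2 midpoints (7, then 11)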

Efficiency of an Algorithm

  • Worst case scenario/Order of Growth used to evaluate
  • Big O: a theoretical definition of the complexity of an algorithm as a function of input size, e.g. O(n) – an order of magnitude of complexity
  • Logarithmic pattern: in general, for a given value of n, the number of tries it takes in the worst case is log2(n) + 1, i.e. O(log n)
  • Logarithmic or sublinear runtimes are preferred to linear because they are more efficient (see the sketch below)
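
A tiny illustration of why logarithmic runtimes scale so well: the worst-case number of binary search guesses is floor(log2(n)) + 1, so doubling the input only adds one more guess.

// Worst-case guesses for binary search grow logarithmically with n.
function worstCaseGuesses(n) {
  return Math.floor(Math.log2(n)) + 1;
}

[10, 100, 1000, 1000000].forEach(function (n) {
  console.log(n, worstCaseGuesses(n)); // 10 -> 4, 100 -> 7, 1000 -> 10, 1000000 -> 20
});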

Google Machine Learning Crash Course

Reducing Loss

  • Update model parameters by computing the gradient – the negative gradient tells us how to adjust model parameters to reduce loss
  • Gradient: the derivative of loss with respect to the weights and biases
  • Taking small steps in the direction of the negative gradient to reduce loss is known as gradient descent
  • Neural nets: strong dependency on initial values
  • Stochastic Gradient Descent: one example at a time
  • Mini-Batch Gradient Descent: batches of 10-1000 – losses and gradients averaged over the batch
  • A machine learning model gets trained by starting with an initial guess for the weights and bias and iteratively adjusting those guesses until the weights and bias have the lowest possible loss
  • Convergence refers to a state reached during training in which training loss and validation loss change very little or not at all with each iteration – additional training on the current data set will not improve the model at this point
  • For regression problems, the resulting plot of loss vs. w1 will always be convex
  • Calculating the loss for every conceivable value of w1 over the entire data set would be an inefficient way of finding the convergence point – gradient descent lets us find the minimum iteratively
  • The first step is to pick a starting value for w1. The starting point doesn't matter much, so many algorithms just use 0 or a random value
  • The gradient descent algorithm calculates the gradient of the loss curve at the starting point as the vector of partial derivatives with respect to weights. Note that a gradient is a vector so it has both a direction and magnitude
  • The gradient always points in the direction of the steepest increase in the loss function and the gradient descent algorithm takes a step in the direction of the negative gradient in order to reduce loss as quickly as possible
  • The gradient descent algorithm then adds some fraction of the gradient's magnitude to the starting point and repeats this process to step closer to the minimum
  • Gradient descent algorithms multiply the gradient by a scalar known as the learning rate (also called the step size)
    • e.g. if the gradient magnitude is 2.5 and the learning rate is 0.01, the gradient descent algorithm will pick the next point 0.025 away from the previous point
  • Hyperparameters are the knobs that programmers tweak in machine learning algorithms. You want to pick a Goldilocks learning rate: too small and training will take too long; too large and the next point will bounce around haphazardly and could overshoot the minimum
  • Batch is the total number of examples you use to calculate the gradient in a single iteration of gradient descent
  • Redundancy becomes more likely as the batch size grows, and there are diminishing returns after a while in smoothing out noisy gradients
  • Stochastic gradient descent uses a batch size of 1 – a single example. With enough iterations this can work, but it is very noisy
  • Mini-batch stochastic gradient descent is a compromise between full-batch and SGD: usually between 10 and 1,000 examples chosen at random; it reduces the noise more than SGD but is more efficient than full batch (see the sketch after this list)
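
A minimal sketch of mini-batch gradient descent for a one-feature linear model with squared loss – my own toy example, not from the course; the data, learning rate, and batch size are illustrative.

// Mini-batch gradient descent for y = w*x + b with squared loss.
function trainLinearModel(examples, learningRate, batchSize, steps) {
  var w = 0, b = 0; // starting point; for a convex loss the start barely matters
  for (var step = 0; step < steps; step++) {
    // Sample a mini-batch at random.
    var batch = [];
    for (var i = 0; i < batchSize; i++) {
      batch.push(examples[Math.floor(Math.random() * examples.length)]);
    }
    // Average the gradient of the squared loss over the batch.
    var gradW = 0, gradB = 0;
    batch.forEach(function (example) {
      var error = (w * example.x + b) - example.y; // prediction minus label
      gradW += 2 * error * example.x;              // d(loss)/dw
      gradB += 2 * error;                          // d(loss)/db
    });
    gradW /= batch.length;
    gradB /= batch.length;
    // Step in the direction of the negative gradient, scaled by the learning rate.
    w -= learningRate * gradW;
    b -= learningRate * gradB;
  }
  return { w: w, b: b };
}

// Noisy data generated from y = 3x + 1.
var data = [];
for (var i = 0; i < 200; i++) {
  var x = Math.random() * 10;
  data.push({ x: x, y: 3 * x + 1 + (Math.random() - 0.5) });
}
console.log(trainLinearModel(data, 0.01, 32, 2000)); // w close to 3, b close to 1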

First Steps with Tensorflow

  • TensorFlow is a framework for building ML models. It provides toolkits that let you construct models at your preferred layer of abstraction.
  • The Estimator class encapsulates logic that builds a TensorFlow graph and runs a TensorFlow session. A graph in TensorFlow is a computation specification: nodes in the graph represent operations, and edges are directed, representing the passing of the result of an operation (a tensor) as an operand to another operation (a toy sketch of the graph idea is below)
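
This isn't TensorFlow code – just a toy sketch of the graph idea: nodes are operations, and directed edges pass each node's output to the next operation as an operand.

// Toy computation graph: each node is an operation; edges pass outputs as operands.
function constantNode(value) {
  return { inputs: [], compute: function () { return value; } };
}
function addNode(a, b) {
  return { inputs: [a, b], compute: function (x, y) { return x + y; } };
}
function multiplyNode(a, b) {
  return { inputs: [a, b], compute: function (x, y) { return x * y; } };
}

// Evaluate a node by recursively evaluating its inputs (the incoming edges).
function run(node) {
  var operands = node.inputs.map(run);
  return node.compute.apply(null, operands);
}

// Graph for (2 + 3) * 4
var graph = multiplyNode(addNode(constantNode(2), constantNode(3)), constantNode(4));
console.log(run(graph)); // 20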


MLS 2015 Team Stat D3 Correlations

Practicing doing some scatterplots and exploring correlations with the 2015 MLS Team Stats Data.  Tooltips created using d3-tip.
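
For reference, a minimal sketch of the kind of setup used (D3 v3 plus d3-tip; the teamStats variable and its field names here are illustrative, not the actual MLS dataset):

// Minimal D3 v3 scatterplot with d3-tip tooltips.
var width = 500, height = 400;

var xScale = d3.scale.linear().domain([0, 60]).range([40, width - 20]);   // e.g. goals for
var yScale = d3.scale.linear().domain([0, 60]).range([height - 30, 20]);  // e.g. goals against

var tip = d3.tip()
  .attr("class", "d3-tip")
  .offset([-10, 0])
  .html(function (d) { return d.team + ": " + d.goalsFor + " GF / " + d.goalsAgainst + " GA"; });

var svg = d3.select("body").append("svg")
  .attr("width", width)
  .attr("height", height)
  .call(tip); // register the tooltip with the SVG

svg.selectAll("circle")
  .data(teamStats) // e.g. [{ team: "LA Galaxy", goalsFor: 56, goalsAgainst: 39 }, ...]
  .enter()
  .append("circle")
  .attr("cx", function (d) { return xScale(d.goalsFor); })
  .attr("cy", function (d) { return yScale(d.goalsAgainst); })
  .attr("r", 5)
  .on("mouseover", tip.show)
  .on("mouseout", tip.hide);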

Screenshot below and see interactive version on bl.ocks.

screenshot.PNG

After 180 Days of Data Viz Learning #jfdi #dataviz #done

I noticed this when I logged into my WordPress account and realized I really need to do this debrief, now that I've more than properly decompressed and feel a surge of inspiration from attending the OpenVis Conf.

afterlearning.png

A summary of what I did:

I read:

  • The Functional Art by Alberto Cairo
  • The Visual Display of Quantitative Information by Edward Tufte
  • Data Points: Visualization That Means Something by Nathan Yau
  • Visualize This: The Flowing Data Guide to Design, Visualization, and Statistics by Nathan Yau
  • The Wall Street Journal Guide to Information Graphics by Dona M. Wong
  • The Best American Infographics 2014 by Gareth Cook
  • Show Me The Numbers by Stephen Few

2016-03-21 09.13.12.jpg

I studied:

  • Knight Center Course on D3.js and Data Visualization
  • Treehouse D3 Course
  • Data Visualization and D3.js on Udacity

I coded/created:

  • Tableau business dashboards on digital marketing attribution, including customized DMA maps and other work beyond the typical drag and drop
  • D3 scatterplots, scatterplot matrices, node-link trees, hierarchical force layouts, Sankey diagrams, bar charts, bubble charts, sunbursts, histograms, and even a pie chart

I accomplished:

  • Gaining a design sensibility for data visualization
  • Understanding data connections and issues around them (eg. Vertica, Cubes, SQL, S3, etc.)
  • Solid foundation of D3
  • Strong skills in Tableau
  • Conceptual understanding of visualization suites in general, such as R libraries, other JavaScript libraries, and other enterprise BI tools (QlikView, Power BI)
  • Being the thought leader in data visualization in my organization

To take a step back, I embarked on this journey because I got a new role with the job title of Data Visualization Manager.  I talked about this in my first post and embarked on 180 Days of Data Viz Learning, inspired by Jennifer Dewalt's Project 180 Websites in 180 Days.  It's been a journey with highs and lows; good, bad, and ugly.  I walked away with a strong design and business sensibility, new hard skills and knowledge, and an understanding of data visualization at both the enterprise and the open source level.

Creating a Design, Business Intelligence, and Product Sensibility

One big thing I set out on as a differentiator was that I didn't just want to learn to code or just be able to make pretty visualizations.  There are many people who can code better than me and will always be able to code better than me.  There are also many people who can make much more beautiful visuals than me.  I'm not naturally inclined toward art or computer science in terms of innate talent or passion, but I recognized how important bridging those two disciplines would be for this endeavor and my career.  I don't consider coding my passion.  I'm also no designer or artist; I've never considered myself one of the "creatives."  I consider communication and storytelling my passion, and code is a means to construct that narrative.  Being a bridger of ideas and practices is a number one priority in my life.

The Good

The process really forced me to learn and focus, even if in the end it took far longer than 180 days (roughly seven months).  Not bad, I think, considering I went on two overseas vacations and did a cross-country move during that time.  I sincerely think I would not have gotten so much done had I not felt compelled to do some work every day.

For my own practical purposes as the "bridger," I wanted to make sure I had a strong background in design concepts related to data visualization and could also gain proficiency in the tools required for my role.  Tying that all together is what I wanted to develop into a strength.  I can talk intelligently about how performance issues in a dashboard can be influenced by projections in HP Vertica, or how the data needs to be prepared in a Hadoop cluster first and then queried into the right format for a visualization.  I can talk about visual encodings and perception from a design perspective, the grammar of graphics and all that.  And I can talk about the strengths and weaknesses of Tableau and other enterprise tools, and what libraries we can use to scale D3.  I can talk about these things, and I slowly get better at doing these things every day.

Doing a little bit of these things every day really pushed me in a way I don't think I would have pushed myself otherwise.  Sometimes it was just five minutes of extra reading at night; other times it was a five-hour work session.  Ironically, doing the 180 Days had a great side effect of making me aware of a larger strategic view at my work, which I realize I lost when I stopped.  I also inadvertently lost 10 lbs and started reading more every day, because this whole endeavor made me much more mindful.

The Bad:

Learning theory and application at once isn't easy from a motivational perspective.  I'm the only person at work using open source tools and doing data visualization beyond dashboards or business-analyst (e.g. Excel-type) graphs, but I had to do a lot of that in my day-to-day as well.  It can really grind on you to read about and see beautiful visualizations and then take over ten hours to produce one successful visualization in D3.  Prepare for that.

There's a flipside to doing something every day: by having to do a little bit every day, it can become a quantity-over-quality game.  I had more nights than I wanted that ended up later than planned because I rushed to read and learn before going to bed.  On some days it might have made more sense to do a longer, focused block than to try to do something EVERY DAY.  I'm now trying to learn JavaScript and front-end development in general, along with a couple of other topics in my life, and I'm not going about it the same way I did with the 180 Days of Data Viz Learning.

Lessons Learned and Tips

  1. Really go for easy wins if you do try to get in some work every day.  My main goal was to have three takeaways from every lecture session where I was watching videos online, or from the reading I did that day, to absorb the information.  Decomposing visuals was especially helpful and is a good process to learn for when you need to build your own.
  2. Find partners in your journey.  Up until OpenVis Conf last week, I had no barometer to see how much I knew and had learned.  I got down on myself more often than I needed to, given how much I did learn and all the balls I had in the air at once.  Having a support group/sounding board would have made the journey better, and I would have learned more.
  3. It was hard to learn theory and application at once, mainly because you'll be making so little progress at first.  It was a bummer, and is still a bummer, to see how far I am from being able to do the work of people I admire.  Also, unlike other skills I have (e.g. I've been doing advertising and project management for years), data viz is still new to me.  I have bigger ambitions than what I can make, and that is reality and that is ok, but it's hard to accept sometimes.  Maybe this is a mental thing for me, but I'm mentioning it as I'm sure it's something someone else may run into in their process.
  4. Procrastination is normal.  Figure out whether you're tired out/burned out or just annoyed and anxious.  If annoyed and anxious, try working for five minutes.  If you can keep working, chances are you aren't tired or burned out and just have some internal dialogue to work through or some process to improve on.
  5. Data viz burnout can happen, and I definitely felt it, which is why I actually advise working on a few projects concurrently and going through books and tutorials at a steadier pace, rather than 180 days straight of trying to pick up bits and pieces every day.  I got to the point where I was doing data viz at work and then going home to do more data viz, and I ended up not liking something I used to enjoy.  On that note, take time to just enjoy other data viz projects like a regular audience member would, rather than always critiquing or studying them.  Burnout is normal; rest accordingly.
  6. Don't give up!  I didn't get where I wanted to in terms of the creative or technical skillset I had hoped for after 180 days, and even now I'm not there, but I progressed significantly in what I know and enjoyed the journey.

 

Day 180 of 180 Days of Data Viz Learning #jfdi #done #dataviz

I’m doing some form of data visualization learning for 180 days because I need to #JFDI.

See post explaining how and why I’m doing this.

Guess what?  It took longer than 180 days, but it's been a pretty cool journey.  I did my daily learning and will post a debrief early next week.  This has been quite the intellectual and emotional exercise for me.  I learned so much about data viz and more.

Elijah Meeks D3.js in Action

Chapter 5 Layouts
Some last takeaways
  • One key with generators, e.g. d3.svg.arc, is that they have particular settings p 144
  • “One of the core uses of a layout in D3 is to update the graphical chart. All we need to do is make changes to the data or layout and then rebind the data to the existing graphical elements” p 146
  • If transitions are distorted because of default data-binding keys, you may need to change the sort, e.g. pieChart.sort(null); in conjunction with exit and update behavior p 147 (see the sketch after this list)
  • Layouts in D3 expect a default representation of the data, usually a JSON object array where the child elements in a hierarchy are stored in a children attribute that points to an array.  p 149
  • You can either munge the data or get used to using accessor functions p 149
  • Pack layouts have a built-in .padding() function to adjust spacing and a .value() function to size out spacing and influence size of parent nodes p 151
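
A small D3 v3 sketch of the rebind-and-update pattern these notes describe, using a pie layout with an arc generator (the data and element id are made up):

// Pie layout computes angles; the arc generator draws them. Rebinding new data updates the paths.
var pieChart = d3.layout.pie()
  .sort(null)                                  // keep slice order stable so transitions aren't distorted
  .value(function (d) { return d.value; });

var arcGenerator = d3.svg.arc().innerRadius(0).outerRadius(100);

function draw(data) {
  var slices = d3.select("#pie")               // assumes a <g id="pie"> already translated to the center
    .selectAll("path")
    .data(pieChart(data));

  slices.enter().append("path");
  slices.exit().remove();

  slices
    .attr("d", arcGenerator)
    .attr("fill", "lightgray")
    .attr("stroke", "black");
}

draw([{ value: 10 }, { value: 20 }, { value: 30 }]);
draw([{ value: 5 }, { value: 45 }, { value: 30 }]); // rebinding updated data redraws the existing paths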

Reading and Learning Data Visualization Theoretically/Critically:

Show Me the Numbers by Stephen Few

Three Takeaways Today

Chapter 5 Visual Perception and Graphical Communication
  • "Our perception of space is primarily two-dimensional.  We perceive differences in vertical position (up and down) and in horizontal position (left and right) clearly and accurately. We also perceive a third dimension, depth, but not nearly as well." p 71
  • “We perceive hues only as categorically different, not quantitatively different; one hue is not more or less than another, they’re just different.  In contrast, we perceive color intensity quantitatively, from low to high” p 71
  • Neither size nor color intensity is the best way to encode quantitative values.  The eye is not good at matching a number to a relative size or color intensity -> use length or position instead if possible p 73

Day 179 of 180 Days of Data Viz Learning #jfdi #dataviz

I’m doing some form of data visualization learning for 180 days because I need to #JFDI.

See post explaining how and why I’m doing this.

Elijah Meeks D3.js in Action

Chapter 5 Layouts

Three Takeaways Today

  • Layouts are D3 functions that help format data so it can be used by a select group of charting methods p 139
  • Layouts do not actually draw the data, nor are they called like components or referred to in the drawing code like generators.  They're actually a preprocessing step that formats data so it's ready to be displayed in the form of the visual.  You can update layouts, and if you rebind the altered data, you can use D3 enter/update/exit syntax. p 139
  • "Many people who get started with D3 think it's a charting library, and that they'll find a function like d3.layout.histogram that creates a bar chart in a <div> when it's run.  But D3 layouts don't result in charts; they result in the settings necessary for charts.  You have to put in a bit of extra work for charts, but you have enormous flexibility (as you'll see in this and later chapters) that allows you to make diagrams and charts that you can't find in other libraries" p 141 (see the sketch below)
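
A tiny sketch of that last point (D3 v3 API, made-up values): d3.layout.histogram returns bins with layout settings rather than drawing anything.

// The histogram layout returns bins (arrays with x, dx, and y attributes);
// you still have to bind them to rectangles yourself to get a chart.
var values = [1, 2, 2, 3, 5, 5, 5, 8, 9, 9];

var bins = d3.layout.histogram()
  .bins(5)        // number of bins; illustrative
  (values);

// Each bin is an array of the values it contains, plus layout settings:
// bins[0].x  -> lower bound of the bin
// bins[0].dx -> width of the bin
// bins[0].y  -> count of values in the bin
console.log(bins.map(function (b) { return { x: b.x, dx: b.dx, count: b.y }; }));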

Day 178 of 180 Days of Data Viz Learning #jfdi #dataviz

I’m doing some form of data visualization learning for 180 days because I need to #JFDI.

See post explaining how and why I’m doing this.

Elijah Meeks D3.js in Action

Chapter 4 General Charting Principles 

Three Takeaways Today

// Callback function breakdown (stacked area chart, p 136)

var n = 0; // counter to increment for each movie drawn
for (x in data[0]) {
  if (x != "day") { // don't draw an area for the day value of each object, because day supplies the x value of each coordinate
    // Generator that iterates through each object corresponding to one of our movies,
    // using the day value for the x coordinate and the stacked movie values for the y coordinates
    var movieArea = d3.svg.area()
      .x(function(d) {
        return xScale(d.day);
      })
      .y(function(d) {
        return yScale(simpleStacking(d, x));
      })
      .y0(function(d) {
        return yScale(simpleStacking(d, x) - d[x]);
      });
  }
}

// Stacking function. Takes the incoming bound data and the name of an attribute and loops through
// the incoming data, adding each value until it reaches the currently named attribute. As a result,
// it returns the total value for every movie during this day up to the movie we've passed in.

function simpleStacking(incomingData, incomingAttribute) {
  var newHeight = 0;
  for (x in incomingData) {
    if (x != "day") {
      newHeight += parseInt(incomingData[x]);
      if (x == incomingAttribute) {
        break;
      }
    }
  }
  return newHeight;
}
// The book also shows a stacking function that alternates the vertical position of each area drawn p 136

Day 176 of 180 Days of Data Viz Learning #jfdi

I’m doing some form of data visualization learning for 180 days because I need to #JFDI.

See post explaining how and why I’m doing this.

Elijah Meeks D3.js in Action

Chapter 4 General Charting Principles 

Three Takeaways Today

  • d3.svg.area has helper functions that bound the lower end of paths to produce filled charts.  You need to define a .y0() accessor that corresponds to the .y() accessor and determines the bottom shape of the area p 132
  • Example from p 133:

for (x in data[0]) {
  if (x != "day") { // iterating through the data attributes with a for loop, where x is the name of each column in the data, which allows us to dynamically create and call generators

    var movieArea = d3.svg.area()
      .x(function(d) {
        return xScale(d.day); // every line uses the day column for its x value
      })
      .y(function(d) {
        return yScale(d[x]); // dynamically sets the y accessor of our area generator to grab the data from the appropriate movie for our y variable
      })
      .y0(function(d) {
        return yScale(-d[x]); // the new accessor defines where the bottom of the path is; here the bottom is the inverse of the top, which mirrors the shape
      })
      .interpolate("cardinal");

    d3.select("svg")
      .append("path")
      .attr("id", x + "Area") // id is an attribute, not a style
      .attr("d", movieArea(data))
      .attr("fill", "darkgray")
      .attr("stroke", "lightgray")
      .attr("stroke-width", 2)
      .style("opacity", .5);
  }
}

// Use d3.svg.line to draw most shapes and lines, whether filled or unfilled, closed or open.
// Use d3.svg.area() when you want the bottom of the shape to be calculated based on the top of the shape.
// Suitable for bands of data, such as stacked area charts or streamgraphs. p 135

Reading and Learning Data Visualization Theoretically/Critically:

Show Me the Numbers by Stephen Few

Three Takeaways Today
Chapter 4 Fundamental Variations of Tables
  • When you think of relationships, think quantitative-to-categorical versus quantitative-to-quantitative p 56
    • E.g. sales by region versus different salespeople across months
  • Unidirectional tables lay categorical items out in one direction (e.g. sales by department, either across columns or down rows), while bidirectional tables lay categorical items out in both directions (cross tabs or pivot tables, e.g. department, expense type, and expenses) p 57
  • Quantitative-to-categorical relationships can be unidirectional or bidirectional p 60