Learning June-Dec 2019

Books Read

Crossing the Chasm

The Memo: What Women of Color Need to Know to Secure a Seat at the Table


Articles Read

The Empathy Delusion

  1. “People in the advertising and marketing industry and the modern mainstream have different ‘moral foundations’ and (unconscious) intuitions about what is right and wrong”
  2. “As a leading social psychologist, Haidt has been at the forefront of popularising the idea of WEIRD (Western, Educated, Industrialised, Rich and Democratic) morality and psychology. Haidt identifies five moral foundations and shows that, although WEIRD morality is dominant in political, cultural, media and professional elites in the United States, WEIRD people are actually statistical outliers whose moral foundations are unrepresentative of the general population.”
  3. The need is to combine empathy with efficiency in advertising


Why We shouldn’t trust our gut instinct

  1. Ad agency employees are “Anywheres” with mobility and exposure to other cultures. Their values differ fundamentally from those of the 50% of the UK who are “Somewheres,” whose identities take firm local root.
  2. TL;DR: Agency employees and MBA employees do not think like the mainstream they sell to
  3. “There is no universal ‘one size fits all’ model of perception and reasoning that goes across cultural differences”



  1. “Operations is the sum of all of the skills, knowledge and values that your company has built up around the practice of shipping and maintaining quality systems and software.  It’s your implicit values as well as your explicit values, habits, tribal knowledge, reward systems.  Everybody from tech support to product people to CEO participates in your operational outcomes, even though some roles are obviously more specialized than others.”
  2. A critical consideration when weighing the trade-offs of going serverless is preserving resiliency from the user’s perspective. Figure out what your core differentiators are, and own those.
  3. You still need to understand your storage systems – there’s still a server underneath so many abstractions.


Orchestration vs. Choreography

  1. Orchestration is a central process that controls different web services and coordinates the execution of different operations on the Web services involved in the operation
  2. Choreography is when each web service knows exactly when to execute its operations and with whom to interact.
  3. Orchestration is the more flexible paradigm
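
To make the contrast concrete for myself, here is a minimal Python sketch. The `charge` and `ship` services, the event names, and the `EventBus` class are all hypothetical stand-ins for real web services; this is an illustration of the two styles, not how any particular framework does it.

```python
from collections import defaultdict

# Hypothetical services; the names are illustrative only.
def charge(order, log):
    log.append(f"charged {order}")

def ship(order, log):
    log.append(f"shipped {order}")

def orchestrate(order):
    """Orchestration: one central process calls each service in order."""
    log = []
    charge(order, log)
    ship(order, log)
    return log

class EventBus:
    """Choreography: services subscribe to events and react on their own."""
    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, event, handler):
        self.handlers[event].append(handler)

    def publish(self, event, order, log):
        for handler in self.handlers[event]:
            handler(order, log)

def choreograph(order):
    bus = EventBus()
    log = []

    def on_order_placed(order, log):
        charge(order, log)
        bus.publish("payment_taken", order, log)  # billing announces it is done

    bus.subscribe("order_placed", on_order_placed)
    bus.subscribe("payment_taken", ship)  # shipping reacts; no coordinator
    bus.publish("order_placed", order, log)
    return log
```

Both functions end up doing the same work; the difference is whether one process owns the sequence or each service only knows which event it reacts to.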


Scheduling a Meeting the Right Way

  1. Know the hierarchy, ask what works best for them first or do the team approach first
  2. Put fences around time – don’t ask blindly for times that work
  3. Keep a paper trail


Serverless Architectures

  1. Serverless architectures are application designs that make use of third-party Backend as a Service (BaaS) offerings and/or run custom code in managed, ephemeral containers on a Functions as a Service (FaaS) platform. Serverless removes the need for a traditional always-on server component, which may reduce operational cost, complexity, and engineering hours, with the trade-off of relying more on vendors and possibly immature supporting services.
  2. Benefits:
    1. More flexible to change; adding features requires fewer changes in architecture
    2. Fundamentally, you can run back-end code without managing your own server systems or server applications. The vendor handles resource provisioning and allocation
    3. FaaS functions do not require coding to a specific framework or library
    4. You can bring entire applications up and down to respond to an event
  3. Drawbacks:
    1. Requires better distributed monitoring capabilities. There are more moving pieces to manage, and many are run by external parties
    2. Vendor management becomes a much more important function in a serverless org
    3. Multitenancy problems creep in
    4. Security concerns and configuration become much more important
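
A rough sketch of what a FaaS function looks like, assuming the `handler(event, context)` shape that platforms such as AWS Lambda use for Python. The event field `name` and the response layout are my own assumptions for illustration, not any vendor's required format.

```python
import json

def handler(event, context=None):
    """Stateless function: everything it needs arrives in the event."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello {name}"}),
    }

# Locally we play the platform's role and invoke the function once per event.
response = handler({"name": "serverless"})
```

There is no server loop in the code at all; the platform decides when to start a container, call the function, and tear it down.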


Managing communications effectively and efficiently

  1. Check in with stakeholders about their understanding of the project
    1. What is working in how we communicate with you about the project?
    2. What is not working or is not effective in our communications?
    3. Where can we improve our communications with you?
  2. How to figure out the fine balance between too much and too little communication
    1. Who needs to know what information?
    2. How often must that information be communicated/shared?
    3. By what means will information be communicated/shared?
  3. Figure out stakeholders and their responsibilities on each project to tailor communications



All the best engineering advice I stole from non-technical people

  1. Most things get broken around the seams. Use the 100:10:1 approach
    1. Brainstorm 100 things that could go wrong
    2. Pick the 10 on that list that feel most likely and investigate them
    3. Find the 1 critical problem you’re going to focus on.
  2. Understand why they hired you – always ask yourself: what am I being asked to be the expert in here? (In my case, translating, ruthlessly prioritizing, and trading off features so that a finished product that makes money gets into the end users’ hands in a form where they can do that for the org)
  3. Figure out an observability approach that works for your team, not just face time, and don’t get into a cycle of over-optimization attempts and trust degradation. “Replacing trust with process is called bureaucracy.”


Why Companies Do “Innovation Theater” Instead of Actual Innovation

  1. “People who manage processes are not the same people as those who create product. Product people are often messy, hate paperwork, and prefer to spend their time creating stuff rather than documenting it. Over time as organizations grow, they become risk averse. The process people dominate management, and the product people end up reporting to them.”
  2. “In sum, large organizations lack shared beliefs, validated principles, tactics, techniques, procedures, organization, budget, etc. to explain how and where innovation will be applied and its relationship to the rapid delivery of new product.”
  3. “Process is great when you live in a world where both the problem and solution are known. Process helps ensure that you can deliver solutions that scale without breaking other parts of the organization… These processes reduce risk to an overall organization, but each layer of process reduces the ability to be agile and lean and — most importantly — responsive to new opportunities and threats.”


How Does The Machine Learning Library TensorFlow Work?

  1. TensorFlow allows for deep neural network models
  2. TensorFlow lets you express your computation as a data flow graph and visualize it using the built-in TensorBoard
  3. You build a graph by defining constants, variables, and operations, and then executing it. Nodes represent operations, and edges are the carriers of data structures (tensors), where the output of one operation (from one node) becomes the input for another operation.
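
To check my understanding of the build-then-execute idea, here is a tiny data flow graph in plain Python. This is not the TensorFlow API; the `Node` class and `constant` helper are made up for illustration, and real tensors are replaced with plain floats.

```python
class Node:
    """One operation in the graph; its inputs are the edges feeding it."""
    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs

    def run(self):
        # Evaluate upstream nodes first, then apply this node's operation.
        return self.op(*(node.run() for node in self.inputs))

def constant(value):
    # A constant is a node with no inputs that always yields the same value.
    return Node(lambda: value)

# Build phase: nothing is computed yet, we only wire nodes together.
a = constant(3.0)
b = constant(4.0)
total = Node(lambda x, y: x + y, a, b)
scaled = Node(lambda x, y: x * y, total, constant(2.0))

# Execute phase: data flows along the edges from node to node.
result = scaled.run()
```

The output of the `total` node becomes an input to the `scaled` node, which is the "edges carry data between operations" point from the article.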


Whatever Happened To The Denominator? Why We Need To Normalize Social Media

  1. The denominator, especially in social media and the data science around it, is almost nonexistent, so there’s no sense of how big a dataset is or if it’s growing or shrinking. We can’t normalize it, and therefore can’t really understand it.
  2. Analyzing Twitter data is a frequent misuse of this, eg. interpreting retweets as a behavioral proxy for engagement with breaking news, which is flawed when many are just forwards
  3. “We tout the accuracy and precision of our algorithms without acknowledging that all of that accuracy is for naught when we can’t distinguish what is real and what is an artifact of our opaque source data.”


Stop the Meeting Madness

  1. Prework, Clearly defined goals, and meeting time managed against agenda
  2. Debrief once in a while about meetings
  3. Institute no-tech meetings when needed


How to choose the right UX metrics for your product

  1. Quality of user experience: Good old HEART framework
    1. Happiness: measures of user attitudes, often collected via survey. For example: satisfaction, perceived ease of use, and net-promoter score.
    2. Engagement: level of user involvement, typically measured via behavioral proxies such as frequency, intensity, or depth of interaction over some time period. Examples might include the number of visits per user per week or the number of photos uploaded per user per day.
    3. Adoption: new users of a product or feature. For example: the number of accounts created in the last seven days or the percentage of Gmail users who use labels.
    4. Retention: the rate at which existing users are returning. For example: how many of the active users from a given time period are still present in some later time period? You may be more interested in failure to retain, commonly known as “churn.”
    5. Task success: this includes traditional behavioral metrics of user experience, such as efficiency (e.g. time to complete a task), effectiveness (e.g. percent of tasks completed), and error rate. This category is most applicable to areas of your product that are very task-focused, such as search or an upload flow.
  2. You don’t necessarily need a metric in every HEART category, but it’s a useful framework to apply to your particular product
  3. Goals Metrics Signals – match to your user experience, grid with a HEART framework
    1. Goals
    2. Metrics
    3. Signals


The secrets to running project status meetings that work!

  1. The agenda needs to be defined and team members need to be prepared; time management, handling input, and topic control all matter
  2. Focus on both the what (the content) and the how (the way the meeting is run)
  3. Look back and forward on the meeting at a two-week or other appropriate interval to make sure it’s not a rehash and things are moving forward


The First Thing Great Decision Makers Do

  1. Commit to your default decision upfront as a habit, meaning frame the context before you seek data. You need a set of decision criteria that is informed by background knowledge and a hypothesis.
    1. simple example: the max price you’re willing to pay, set before you see a price
    2. acknowledging sunk costs upfront
  2. The pitfall otherwise is “data-inspired decision making” that can be riddled with confirmation bias or misinterpretations
  3. Questions to ask: if there is no data, what will my decision be, and what is that based on? If there is data, what magnitude of evidence would sway me from my default decision?


The ultimate guide to remote meetings 2019

  1. Build a virtual water cooler to encourage rapport and build relationships, eg. Slack channels. Make some time for small talk at the beginning
  2. Agenda setting
    1. Key talking points
    2. Meeting structure (for example, when and for how long you plan to discuss each talking point)
    3. Team members/teams that will be in attendance
    4. What each team member/team is responsible for bringing to the meeting
    5. Any relevant documents, files, or research
    6. Actions for next meeting
      1. Deliverables and next steps
      2. Who’s responsible for following up on each item or task
      3. When those deliverables are due
      4. When the next meeting or check-in will be
  3. Make sure everyone has a job and include the introverts


What SaaS Product Managers Need to Know About Customer Onboarding in 2019

  1. Every successful user journey consists of four main parts:
    1. Converting your website visitors
    2. Unleashing the “aha moment”.
    3. User activation – when user feels value
    4. Customer adoption – contact with secondary features and start using product
  2. Personalization is key, along with using the right products to streamline the process; there is also a need for strong customer support/portals/kb
  3. Take advantage of the Zeigarnik effect – people have the tendency to remember uncompleted tasks, eg checklists, breadcrumbs of more tasks


How to ask for a raise

  1. Key question: “What do I need to do to make a bigger difference to the company?”
  2. “If your manager visibly doesn’t believe in your capacity to get to the next level regardless of what you do, find a new manager; your career is at a dead end where you are.”
  3. The only piece of leverage that really matters is a counteroffer



  1. Range of techniques for altering or augmenting the behavior of an OS, apps, or other software components by intercepting function calls, messages, or events passed between components. Code that handles such intercepted function calls, events, or messages is called a hook.
  2. Methods:
    1. Source modification – modifying the source of an executable or library before the app is running
      1. You can also use a wrapper library and make your own version of a library that an application loads
    2. Runtime modification: inserting hooks at runtime, eg. modify system events or app events for dialogs
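
A small runtime-hook sketch in Python, assuming the simplest possible case: swapping a function for a wrapper while the program is running. The `dialogs` object and `install_hook` helper are hypothetical; OS-level hooking uses very different mechanisms, so treat this only as the general idea.

```python
import functools
import types

def install_hook(target, name, before):
    """Replace target.name with a wrapper that calls `before` first."""
    original = getattr(target, name)

    @functools.wraps(original)
    def hooked(*args, **kwargs):
        before(name, args, kwargs)        # the hook sees the intercepted call
        return original(*args, **kwargs)  # then the original behavior runs

    setattr(target, name, hooked)
    return original  # keep this around to uninstall the hook later

# Hypothetical component whose calls we want to intercept.
dialogs = types.SimpleNamespace(open=lambda title: f"opened {title}")

calls = []
original_open = install_hook(
    dialogs, "open", lambda name, args, kwargs: calls.append((name, args))
)
result = dialogs.open("Save As")
```

The caller still gets the original result, but the hook recorded the call on the way through. Restoring `dialogs.open = original_open` removes the hook.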


Webhook vs API: What’s the Difference?

  1. “Both webhooks and APIs facilitate syncing and relaying data between two applications. However, both have different means of doing so, and thus serve slightly different purposes.”
  2. Webhook: Doesn’t need a request; data is sent whenever new data is available
    1. Usually performs smaller tasks, eg. new blog posts for CMS
  3. API: Only does stuff when you ask it to.
    1. Tends to be entire frameworks, eg. Google Maps API that powers other apps
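
The pull-versus-push difference, sketched in Python inside one process. The `Blog` class, the `post.published` event name, and using plain callables in place of subscriber URLs are all assumptions for illustration; a real webhook would be an HTTP POST to a registered URL.

```python
class Blog:
    """Hypothetical CMS used to contrast pull (API) with push (webhook)."""
    def __init__(self):
        self.posts = []
        self.webhooks = []  # callables standing in for subscriber URLs

    def list_posts(self):
        # API style: nothing happens until a client asks.
        return list(self.posts)

    def register_webhook(self, callback):
        self.webhooks.append(callback)

    def publish(self, title):
        self.posts.append(title)
        # Webhook style: the source pushes new data as soon as it exists.
        for callback in self.webhooks:
            callback({"event": "post.published", "title": title})

received = []
blog = Blog()
blog.register_webhook(received.append)
blog.publish("Hello")
```

After `publish`, the subscriber already has the new post without asking, while an API client only sees it when it calls `list_posts()`.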





SQL Functions

  • Concatenation operator || joins two pieces of text
  • Single quotes should be used for string literals (e.g. 'lbs'), and double quotes should be used for identifiers like column aliases (e.g. "Max Weight")
  • Example
    • SELECT first_name || ' ' || last_name || ' ' || '<' || email || '>' AS to_field FROM patrons;
  • Examples of using functions:
    • -- SELECT LENGTH(<column>) AS <alias> FROM <table>;
      • SELECT username, LENGTH(username) AS length FROM customers;
    • -- LOWER(<value or column>), UPPER(<value or column>)
      • SELECT LOWER(title) AS lowercase_title, UPPER(author) AS uppercase_author FROM books;
    • -- SUBSTR(<value or column>, <start>, <length>)
      • SELECT name, SUBSTR(description, 1, 35) || '…' AS short_description, price FROM products;
    • -- REPLACE(<value or column>, <target>, <replacement>)
      • SELECT street, city, REPLACE(state, 'California', 'CA'), zip FROM addresses WHERE REPLACE(state, 'California', 'CA') = 'CA';
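
To try these functions without a database server, here is a sketch using Python's built-in `sqlite3` module with an in-memory database. The table contents are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patrons (first_name TEXT, last_name TEXT, email TEXT)")
conn.execute("INSERT INTO patrons VALUES ('Ada', 'Lovelace', 'ADA@example.com')")

# Concatenation plus LOWER, LENGTH, SUBSTR, and REPLACE in one query.
row = conn.execute(
    """
    SELECT first_name || ' ' || last_name || ' <' || LOWER(email) || '>' AS to_field,
           LENGTH(first_name) AS name_length,
           SUBSTR(last_name, 1, 4) AS short_last,
           REPLACE(last_name, 'Love', 'Luv') AS replaced
    FROM patrons
    """
).fetchone()
```

Note that `SUBSTR` counts from 1, not 0, so `SUBSTR(last_name, 1, 4)` returns the first four characters.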

SQL Counting Results

  • -- COUNT(<column>)
    • SELECT COUNT(DISTINCT category) FROM products;
  • -- SELECT <column> FROM <table> GROUP BY <column>;
    • SELECT category, COUNT(category) AS product_count FROM products GROUP BY category;
  • -- SUM(<column>)
    • -- the HAVING keyword filters on aggregates; it goes after GROUP BY and before ORDER BY
    • SELECT SUM(cost) AS total_spend, user_id FROM orders GROUP BY user_id ORDER BY total_spend DESC;
    • SELECT SUM(cost) AS total_spend, user_id FROM orders GROUP BY user_id HAVING total_spend > 250 ORDER BY total_spend DESC;
  • -- AVG(<column>)
    • SELECT user_id, AVG(cost) AS average_orders FROM orders GROUP BY user_id;
  • -- MAX(<numeric column>) MIN(<numeric column>)
    • SELECT AVG(cost) AS average, MAX(cost) AS maximum, MIN(cost) AS minimum, user_id FROM orders GROUP BY user_id;
  • -- ROUND(<value>, <decimal places>)
    • SELECT name, ROUND(price*1.06,2) AS "Price in Florida" FROM products;
  • -- DATE('now') in SQLite
    • SELECT * FROM orders WHERE status = 'placed' AND ordered_on = DATE('now');
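
A runnable version of the GROUP BY / HAVING pattern, again using Python's `sqlite3` with made-up order data. I repeat `SUM(cost)` in the HAVING clause rather than using the alias; SQLite accepts the alias there, but I am assuming not every database does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INTEGER, cost REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 200.0), (1, 150.0), (2, 50.0), (3, 300.0)],
)

# WHERE filters rows before grouping; HAVING filters groups after aggregation.
big_spenders = conn.execute(
    """
    SELECT user_id, SUM(cost) AS total_spend, COUNT(*) AS order_count
    FROM orders
    GROUP BY user_id
    HAVING SUM(cost) > 250
    ORDER BY total_spend DESC
    """
).fetchall()
```

User 2 drops out because their total of 50 does not pass the HAVING filter.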

Apr-June 2018 Learning

Books Read (related to work/professional development/betterment):


Giving meaning to 100 billion analytics events a day

  1. Tracking events are sent by the browser over HTTP to a dedicated component that enqueues them in a Kafka topic. You can build a Kafka equivalent in BigQuery to use as a data warehouse system
  2. ‘When dealing with tracking events, the first problem you face is the fact that you have to process them unordered, with unknown delays.
    1. The difference between the time the event actually occurred (event time) and the time the event is observed by the system (processing time) ranges from the millisecond up to several hours.’
  3. The key for them was finding the ideal batch duration
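
A toy Python sketch of the event time versus processing time problem. This is not the article's pipeline; the `window_by_event_time` function and the integer timestamps are assumptions, just to show how a late arrival still lands in the window where it actually occurred.

```python
from collections import defaultdict

def window_by_event_time(events, window_seconds):
    """Group (event_time, payload) pairs into fixed windows by event time."""
    windows = defaultdict(list)
    for event_time, payload in events:
        window_start = event_time - (event_time % window_seconds)
        windows[window_start].append(payload)
    return dict(windows)

# Listed in arrival (processing) order; "b" occurred at t=5 but arrived last.
arrivals = [(1, "a"), (12, "c"), (5, "b")]
windows = window_by_event_time(arrivals, window_seconds=10)
```

The catch in a real system is deciding how long to wait for stragglers like "b" before closing a window, which is where the batch duration trade-off comes in.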

What is a Predicate Pushdown?

  1. Basic idea is that certain parts of SQL queries (the predicates) can be “pushed” to where the data lives, reducing query/processing time by filtering out data earlier rather than later. This allows you to optimize your query by doing things like filtering data before it is transferred over a network or loaded into memory, or skipping entire files or chunks of files.
  2. ‘A “predicate” (in mathematics and functional programming) is a function that returns a boolean (true or false). In SQL queries predicates are usually encountered in the WHERE clause and are used to filter data.’
  3. Predicate pushdown works differently in various query environments, eg Hive, Parquet/ORC files, Spark, Redshift Spectrum, etc.
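
A simplified Python sketch of the chunk-skipping idea. The `chunks` layout with min/max statistics is loosely modeled on how columnar files keep per-chunk stats, but the structure and function names here are my own assumptions, not any engine's real format.

```python
# Each chunk carries min/max statistics for the values it holds.
chunks = [
    {"min": 1, "max": 10, "rows": [1, 4, 10]},
    {"min": 11, "max": 20, "rows": [11, 15, 20]},
    {"min": 21, "max": 30, "rows": [21, 30]},
]

def scan_then_filter(chunks, threshold):
    """No pushdown: read every chunk, filter afterwards."""
    chunks_read = len(chunks)
    rows = [r for c in chunks for r in c["rows"] if r > threshold]
    return rows, chunks_read

def scan_with_pushdown(chunks, threshold):
    """Pushdown: skip chunks whose statistics rule out any match."""
    rows, chunks_read = [], 0
    for c in chunks:
        if c["max"] <= threshold:
            continue  # no row here can satisfy `value > threshold`
        chunks_read += 1
        rows.extend(r for r in c["rows"] if r > threshold)
    return rows, chunks_read
```

Both scans return the same rows for `value > 20`, but the pushdown version reads one chunk instead of three.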

I hate MVPs. So do your customers. Make it SLC instead.

  1. Customers hate MVPs: too M and almost never V – simple, lovable, and complete (SLC) is the way to go
  2. The success loop of a product “is a function of love, not features”
  3. “An MVP that never gets additional investment is just a bad product. An SLC that never gets additional investment is a good, if modest product.”

Documenting for Success

  1. Keeping User Stories Lean and Precise with:
    1. User Story Objectives
    2. Use Case Description
    3. User Interaction and Wireframing
    4. Validations
    5. Communication Scenarios
    6. Analytics
  2. Challenges
    1. Lack of participation
    2. Documentation can go sour
  3. Solutions
    1. Culture – tradition of open feedback
    2. Stay in Touch with teams for updates
    3. Documentation review and feedback prior to sprint starts
    4. Track your documents

WTF is Strategy

  1. Strategic thinking is what sets apart seniors from juniors
  2. Strategy needs
    1. Mission: Problem you’re trying to solve and who for
    2. Vision: Idealized solution
    3. Strategy: principles and decisions informed by reality and caveated with assumptions that you commit to ahead of dev to ensure likelihood of success in achieving your vision
    4. Roadmap: Concrete steps
    5. Execution: Day-to-day activities
  3. “Strategy represents the set of guiding principles for your roadmapping and execution tasks to ensure they align with your mission and vision.”

Corporate Culture in Internet Time

  1. “‘The dirty little secret of the Internet boom,’ says Christopher Meyer, who is the author of ‘Relentless Growth,’ the 1997 management-based-on-Silicon-Valley-principles book, ‘is that neither startup wizards nor the venture capitalists who fund them know very much about managing in the trenches.’”
  2. “The most critical factor in building a culture is the behavior of corporate leaders, who set examples for everyone else (by what they do, not what they say). From this perspective, the core problem faced by most e-commerce companies is not a lack of culture; it’s too much culture. They already have two significant cultures at play – one of hype and one of craft.”
  3. Leaders need to understand both craft and hype cultures since they have to rely on teams that come from both to deliver. They need to set up team cultures and infrastructure that support inter-team learning.

Do You Want to Be Known For Your Writing, or For Your Swift Email Responses? Or How the Patriarchy has fucked up your priorities

  1. Women are conditioned to keep proving themselves – our value is contingent on our ability to meet the expectations of others, or we will be discredited. This is often true, but do you want to be known as a reliable source of work or for answering e-mails?
  2. Stop trying to get an A+ in everything, it’s a handicap in making good work. “Again, this speaks most specifically to women, POC, queers, and other “marginalized” folks. I am going to repeat myself, but this shit bears repeating. Patriarchy (and institutional bigotry) conditions us to operate as if we are constantly working at a deficit. In some ways, this is true. You have to work twice as hard to get half the credit. I have spent most of my life trying to be perfect. The best student. The best dishwasher. The best waitress. The best babysitter. The best dominatrix. The best heroin addict. The best professor. I wanted to be good, as if by being good I might prove that I deserved more than the ephemeral esteem of sexist asshats.”

“Listen to me: Being good is a terrible handicap to making good work. Stop it right now. Just pick a few secondary categories, like good friend, or good at karaoke. Be careful, however of categories that take into account the wants and needs of other humans. I find opportunities to prove myself alluring. I spent a long time trying to maintain relationships with people who wanted more than I was capable of giving.”

  3. Stop thinking of no as just a no; it is saying yes to doing your best work

Dear Product Roadmap, I’m Breaking Up with You

  1. A major challenge is setting up roadmap priorities without real market feedback, especially in enterprise software
  2. Roadmaps should be planned with assets in place tied closely to business strategy
    1. A clearly defined problem and solution
    2. Understanding of your users’ needs
    3. User Journeys for the current experience
    4. Vision -> Business Goals -> User Goals -> Product Goals -> Prioritize -> Roadmap
  3. Prioritization should be done through the following lenses: feasibility, desirability, and viability

The 7 Steps of Machine Learning Google Video

  • Models are created via training
  • Training helps create accurate models that answer questions correctly most of the time
  • This requires data to train on
    • Defined features for telling apart beer and wine could be color and alcohol percentage
  • Gathering data: quality and quantity determine how good the model can be
  • Put the data together and randomize it so the order doesn’t affect what the model learns about, for example, what kind of drink something is
  • Visualize and analyze during data prep to check whether there’s an imbalance in the data
  • Data needs to be split: most for training (70-80%) and some left for evaluation to test accuracy (20-30%)
  • A big choice is choosing a model – eg some are better for images versus numerical data -> the beer or wine example has only two features to weigh
  • Weights matrix (m for linear)
  • Biases (b for linear)
  • Start with random values to test – each training step is an iteration that moves the line splitting wine v beer, and you can then evaluate against the data
  • Parameter tuning: how many times we go through the set -> does that lead to more accuracy; eg learning rate is how far we are able to shift the line in each step – tuning hyperparameters is an experimental process, a bit more art than science
  • 7 Steps: Gathering Data -> Preparing Data -> Choosing a Model -> Training -> Evaluation -> Hyperparameter Tuning -> Prediction
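
The steps above can be walked through with a toy example in plain Python. This is a sketch under my own assumptions: it fits a line y = m * x + b to synthetic data with gradient descent rather than classifying beer versus wine, and it starts from zeros instead of random values so the run is reproducible.

```python
import random

# Steps 1-2. Gather and prepare: synthetic points on y = 2x + 1, shuffled.
random.seed(0)
data = [(x / 10, 2 * (x / 10) + 1) for x in range(20)]
random.shuffle(data)
split = int(len(data) * 0.8)          # 80% train, 20% evaluation
train, test = data[:split], data[split:]

# Step 3. Choose a model: a line y = m * x + b.
m, b = 0.0, 0.0

# Step 6 (hyperparameters): these are the knobs you would tune by experiment.
learning_rate = 0.1                   # how far to shift the line each step
epochs = 2000                         # how many passes over the training set

# Step 4. Train: nudge m and b down the gradient of the mean squared error.
for _ in range(epochs):
    grad_m = sum(2 * (m * x + b - y) * x for x, y in train) / len(train)
    grad_b = sum(2 * (m * x + b - y) for x, y in train) / len(train)
    m -= learning_rate * grad_m
    b -= learning_rate * grad_b

# Step 5. Evaluate on data the model never saw during training.
test_error = sum((m * x + b - y) ** 2 for x, y in test) / len(test)

# Step 7. Predict for a new input.
prediction = m * 3.0 + b
```

After training, `m` and `b` should sit very close to the true values 2 and 1, and the held-out error should be near zero because the synthetic data has no noise.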

Qwik Start Baseline Infra Quest: 

  • Cloud Storage Google Console
  • Cloud IAM
  • Kubernetes Engine

Treehouse Learning:  

JavaScript OOP

  • In JavaScript, state is represented by an object’s properties and behaviors are represented by an object’s methods.
    • Radio that has properties like station and volume and methods like turning off or changing a station
  • An object’s states are represented by “property” and its behaviors are represented by “method.”
  • Putting properties and methods into a package and attaching it to a variable is called encapsulation.
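
The radio example, sketched in Python rather than JavaScript to match the other examples in these notes. The `Radio` class and its method names are my own; the point is only that properties hold state and methods hold behavior, bundled together in one object.

```python
class Radio:
    """State lives in properties; behavior lives in methods."""
    def __init__(self, station, volume=5):
        self.station = station
        self.volume = volume
        self.is_on = True

    def change_station(self, station):
        self.station = station

    def turn_off(self):
        self.is_on = False

# Encapsulation: the properties and methods travel together in one variable.
radio = Radio("98.7")
radio.change_station("101.1")
radio.turn_off()
```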

Intro SQL Window Functions

  • Function available in some variations of SQL that lets you analyze a row in context of entire result set – compare one row to other rows in a query, eg percent of total or moving average
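
A percent-of-total example using Python's `sqlite3`, with made-up sales data. This assumes the bundled SQLite is version 3.25 or newer, which is when SQLite added window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 50.0), ("west", 30.0), ("north", 20.0)],
)

# SUM(amount) OVER () computes the grand total while keeping every row,
# so each row can be compared against the whole result set.
rows = conn.execute(
    """
    SELECT region,
           amount,
           100.0 * amount / SUM(amount) OVER () AS percent_of_total
    FROM sales
    ORDER BY amount DESC
    """
).fetchall()
```

A plain `GROUP BY` would collapse the rows; the window function keeps one output row per input row.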

Common Table Expressions using WITH

  • CTE – a SQL query that you name and reuse within a longer query, a temporary result set
  • You place a CTE at the beginning of a complete query using a simple syntax
-- create CTEs using the WITH statement
WITH cte_name AS (
  -- select query goes here
)
-- use CTEs like a table
SELECT * FROM cte_name;
  • CTE name is like an alias for the results returned by the query, you can then use the name just like a table name in the queries that follow the CTE
WITH product_details AS (
  SELECT ProductName, CategoryName, UnitPrice, UnitsInStock
  FROM Products
  JOIN Categories ON Products.CategoryID = Categories.ID
  WHERE Products.Discontinued = 0
)
SELECT * FROM product_details
ORDER BY CategoryName, ProductName;

-- Alternative main query over the same CTE (the WITH clause above must precede it too)
SELECT CategoryName, COUNT(*) AS unique_product_count,
  SUM(UnitsInStock) AS stock_count
FROM product_details
GROUP BY CategoryName
ORDER BY unique_product_count;
  • CTEs make code more readable and organize queries into reusable modules; you can combine multiple CTEs into a single query, and they can better match how we think of result sets in the real world
    • all orders in past month -> all active customers -> all products and categories
    • Each would be a CTE
  • Subqueries create result sets that look just like a table and can be joined to other tables
WITH all_orders AS (
  SELECT EmployeeID, COUNT(*) AS order_count
  FROM Orders
  GROUP BY EmployeeID
),
late_orders AS (
  SELECT EmployeeID, COUNT(*) AS order_count
  FROM Orders
  WHERE RequiredDate <= ShippedDate
  GROUP BY EmployeeID
)
SELECT Employees.ID, LastName,
  all_orders.order_count AS total_order_count,
  late_orders.order_count AS late_order_count
FROM Employees
JOIN all_orders ON Employees.ID = all_orders.EmployeeID
JOIN late_orders ON Employees.ID = late_orders.EmployeeID;
  • Remember one useful feature of CTEs is you can reference them later in other CTEs, eg. revenue_by_employee below pulling from all_sales
  • You can only reference a CTE created earlier in the query, eg first CTE can’t reference the third
WITH all_sales AS (
  SELECT Orders.Id AS OrderId, Orders.EmployeeId,
    SUM(OrderDetails.UnitPrice * OrderDetails.Quantity) AS invoice_total
  FROM Orders
  JOIN OrderDetails ON Orders.Id = OrderDetails.OrderId
  GROUP BY Orders.Id
),
revenue_by_employee AS (
  SELECT EmployeeId, SUM(invoice_total) AS total_revenue
  FROM all_sales
  GROUP BY EmployeeId
),
sales_by_employee AS (
  SELECT EmployeeId, COUNT(*) AS sales_count
  FROM all_sales
  GROUP BY EmployeeId
)
SELECT revenue_by_employee.EmployeeId,
  revenue_by_employee.total_revenue/sales_by_employee.sales_count AS avg_revenue_per_sale
FROM revenue_by_employee
JOIN sales_by_employee ON revenue_by_employee.EmployeeId = sales_by_employee.EmployeeId
JOIN Employees ON revenue_by_employee.EmployeeId = Employees.Id
ORDER BY total_revenue DESC;
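
A cut-down, runnable version of the chained-CTE pattern using Python's `sqlite3`. The `orders` table and its rows are made-up sample data, much simpler than the course database, so the table and column names here are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, employee_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 100.0), (2, 10, 300.0), (3, 20, 50.0)],
)

rows = conn.execute(
    """
    WITH revenue_by_employee AS (
      SELECT employee_id, SUM(total) AS total_revenue
      FROM orders
      GROUP BY employee_id
    ),
    sales_by_employee AS (
      SELECT employee_id, COUNT(*) AS sales_count
      FROM orders
      GROUP BY employee_id
    ),
    averages AS (
      -- a later CTE may reference the ones defined before it
      SELECT r.employee_id,
             r.total_revenue / s.sales_count AS avg_revenue_per_sale
      FROM revenue_by_employee AS r
      JOIN sales_by_employee AS s ON r.employee_id = s.employee_id
    )
    SELECT * FROM averages ORDER BY avg_revenue_per_sale DESC
    """
).fetchall()
```

Only one `WITH` keyword is needed; each additional CTE is separated by a comma, and the main query follows the last closing parenthesis.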

Weekly Data Viz Decomp: The Guardian’s Premier League Transfer Window Summer 2016

Weekly data visualization decomps to keep a look out for technique and learning.

This week’s viz: Premier League: transfer window summer 2016 – interactive

Decomposition of a Visualization:

  • What are the:
    • Variables (Data points, where they are, and how they’re represented):
      • Bubble size for size of transfer
      • Color hue denoting transfer into or out of the team
      • Position for date close to transfer window
    • Data Types (Quantitative, Qualitative, Categorical, Continuous, etc.):
      • Qualitative and categorical
    • Encodings (Shape, Color, Position, etc.):
      • Shape, position, size, color hue
  • What works well here?
    • Showing a small multiples type view for each team and their transfers
  • What does not work well and what would I improve?
    • Having the totals summary numbers on the side of the charts is a little unorthodox and unintuitive
    • Bubbles seem to be placed arbitrarily without thought to the y-axis, even though the x-axis has meaning
    • Not immediately clear why some players are featured and noted in tooltips versus those that are not
  • What is the data source?  Do I see any problems with how it’s cited/used?
    • Seems to be original Guardian data collected about the English Premier League, but not as clearly stated as I’d like to expect
  • Any other comments about what I learned?
    • Example of something pleasing to the eye in terms of color hue and perhaps some flash factor, but perhaps not that functional to explore upon closer examination.
      • Certainly makes sense for the purposes of the Guardian though in putting out this story and is a technique I’d borrow if I had a use case
      • Good for showing a bigger picture view
    • Probably not worth the incremental work, as filters are difficult to build and can be computationally expensive, but the nerd in me would have liked to search for a player

More MLS 2015 Visual Exploration Tools


Previously, I created an interactive view just looking at goal breakdowns by Major League Soccer teams overall for the 2015 season.  I’ve added several more frames to this new view to break down the same dataset in an exploratory way and see which variables correspond or not.


As per verbatim from my previous post:

I’ve been really into soccer in general, especially international play, after my grad school project where I worked at the Annenberg Innovation Lab collaborating with Havas Sports and Entertainment and IBM on a research project studying soccer fans (see Fans.Passions.Brands, Fan Favorites, and Sports Fan Engagement Dashboard). I also did my degree practicum on Marketing to Female Sports Fans.

I’m now in another universe creating data visualization at an advertising agency and am trying to combine the geeky fandom with practical practice related to my daily work.



I deliberately tried to use Tableau components and styling that were out of the ordinary, to some mixed level of success.  I put in a custom color palette in the Tableau repository preferences file.  Also, I tried to take advantage of context filters (when you click on one bar graph of a team, for example, the other charts instantly show stats only about the team you just clicked on), scale filters, and pivoting the data on a third dimension, using both a color gradient and the size of a value in a chart, for instance.


Trying to stretch the design and exploratory strengths of Tableau here.  The one knock I give Tableau in 2016 is that it doesn’t present data in a sexy enough way compared to JavaScript-based visuals.  On the other hand, none of the JavaScript-based tools democratize creating views you can explore with some pretty heavy statistical tools; one could argue that some of the basic functions many people use SPSS for would be better handled in Tableau as one integrated tool.  In particular, I utilized r-squared and p-values to show correlations across different metrics that might matter, or be interesting, to see how they hold for some teams or not.  There’s not much of a correlation between Corner Kicks and Goals for teams overall, for instance, but there is a greater negative correlation for those teams who have more losses.

Esc the City Un-Conference Talk: Simple Slide Design and Data Viz Crash Course

I’m currently attending a program called Escape the City which “helps mainstream professionals make proactive & entrepreneurial career changes.”  We’re part of the “Founding Members” cohort as the first iteration of this program stateside.

I’m not looking to leave my current job, which I’m pretty happy with. I definitely did want to meet other ambitious people and unconventional thinkers.  A mentor who had done the program in London, where Escape started, said I’d benefit.  I have been loving it so far.  I’ve found it beneficial to being more present, proactive, and creative at work and outside of it.

One part of the program we did last night was the Un-Conference, where individuals in the program presented on different topics: everything from Learnings from Training for Endurance Races, Self-Acceptance and How to Love Yourself, Web Development 101, Relaxation with Origami, to name just a few.

As part of building on my knowledge and sharing it, I did a talk on Simple Design with a Data Visualization Crash Course below.  I hope my fellow participants found it useful, especially since many of them are thinking about starting and pitching their own businesses.

After 180 Days of Data Viz Learning #jfdi #dataviz #done

I noticed this when I logged into my WordPress account and realized I really need to do this debrief now that I’ve more than properly decompressed and feel a surge of inspiration from attending the OpenVis Conf.


A summary of what I did:

I read:

  • The Functional Art by Alberto Cairo
  • The Visual Display of Quantitative Information by Edward Tufte
  • Data Points: Visualization That Means Something by Nathan Yau
  • Visualize This: The Flowing Data Guide to Design, Visualization, and Statistics by Nathan Yau
  • The Wall Street Journal Guide to Information Graphics by Dona M. Wong
  • The Best American Infographics 2014 by Gareth Cook
  • Show Me The Numbers by Stephen Few

I studied:

  • Knight Center Course on D3.js and Data Visualization
  • Treehouse D3 Course
  • Data Visualization and D3.js on Udacity

I coded/created:

  • Tableau business dashboards on digital marketing attribution, including customized DMA maps and other views that go beyond the typical drag and drop.
  • D3 scatterplots, scatterplot matrices, node-link trees, hierarchical force layouts, sankey diagrams, bar charts, bubble charts, sunbursts, histograms, and even pie charts

I accomplished:

  • Gaining a design sensibility for data visualization
  • Understanding data connections and issues around them (eg. Vertica, Cubes, SQL, S3, etc.)
  • Solid foundation of D3
  • Strong skills in Tableau
  • Conceptual understanding of visualization suites in general, such as R libraries, other Javascript libraries, and other Enterprise BI tools (QlikView, Power BI)
  • Being the thought leader in data visualization in my organization

To take a step back, I embarked on this journey because I got a new role with the job title of Data Visualization Manager.  I talked about this in my first post and embarked on 180 Days of Data Viz Learning as inspired by Jennifer Dewalt’s Project 180 Websites in 180 Days.  It’s been a journey with highs and lows, good, bad, and ugly.  I walked away with a strong design and business sensibility, new hard skills and knowledge, and an understanding of data visualization at both an enterprise and an open source level.

Creating a Design, Business Intelligence, and Product Sensibility

One big thing I set out on as a differentiator was that I didn’t just want to learn to code or just be able to make pretty visualizations.  There are many people who can code better than me and will always be able to code better than me.  There are also many people who can make much more beautiful visuals than me.  I’m not naturally inclined toward art or computer science in terms of innate talent or passion, but I recognize how important bridging those two disciplines would be for this endeavor in my career.  For me, I don’t consider coding my passion.  I’m also no designer or artist.  I’ve never considered myself one of the “creatives.” I consider communication and storytelling my passion, and code is a means to construct that narrative.  Being a bridger of ideas and practices is a number one priority in my life.

The Good

The process really forced me to learn and focus, even if in the end it took far longer than 180 Days, roughly seven months.  Not bad, I think, considering I went on two overseas vacations and did a cross-country move during that time.  I sincerely think I would not have gotten so much done had I not felt compelled to do some work every day.

For my own practical purposes as the “bridger,” I wanted to make sure I had a strong background in design concepts related to data visualization and also gain proficiency in the tools required for my role.  Tying that all together is what I wanted to develop as a strength.  I can talk intelligently about how performance issues in a dashboard can be influenced by projections in HP Vertica or how the data needs to be prepared in a Hadoop cluster first and then how to query it into the right format for a visualization.  I can talk visual encodings and perception from a design perspective, the grammar of graphics and all that.  And I can talk about the strengths and weaknesses of Tableau/other enterprise tools and what libraries we can use to scale D3.  I can talk about these things, and I slowly get better at doing these things every day.

Doing a little bit of these things every day really pushed me in a way I don’t think I would have been pushed otherwise.  Sometimes it ended up being anything from five minutes of extra reading at night to five-hour work sessions.  Ironically, doing the 180 Days had a great side effect in making me aware of a larger strategic view at my work, one that I realize I lost when I stopped.  I also inadvertently lost 10 lbs and started reading more every day because this whole endeavor made me much more mindful.

The Bad

Learning theory and application at once isn’t easy from a motivational perspective.  I’m the only person at work using Open Source tools and doing data visualization beyond dashboards or business analyst (e.g. Excel-type) graphs, but I had to do a lot of those in my day-to-day as well.  It can really grind on you to read about and see beautiful visualizations and then take over ten hours to do one successful visualization in D3.  Prepare for that.

There’s a flipside to doing something every day: by having to do a little bit every day, it can become a quantity over quality game.  I had more nights that ended up later than I wanted because I rushed to read and learn before going to bed.  It might have made more sense on some days to just do a longer block of focused work than to try to do something EVERY DAY.  I’m trying to learn Javascript and Front-End in general now along with a couple of other topics in my life, and I’m not going about it the same way I did with the 180 Days of Data Viz Learning.

Lessons Learned and Tips

  1. Really go for easy wins if you do try to get in some work every day.  My main goal was to have three takeaways for every lecture session where I was watching videos online, or from the reading I did that day, to absorb the information.  Decomposing visuals was especially helpful and is a good process to learn when you need to turn the tables on your own.
  2. Find partners in your journey.  Up until the OpenVisConf last week, I had no barometer to measure how much I knew and learned.  I got down on myself more often than I needed to given how much I did learn and all the balls I had in the air at once.  Having a support group/sounding board would have made the journey better and I would have learned more.
  3. It was hard to learn theory and application at once, mainly because you’ll be making so little progress at first.  It was a bummer and is still a bummer to see how far I am from being able to do the work of people I admire.  Also, unlike other skills I have (e.g. I’ve been doing advertising and project management for years), Data Viz is still new.  I have bigger ambitions than what I can make, and that is reality and that is ok, but it’s hard to accept sometimes.  Maybe this is a mental thing for me, but I’m mentioning it as I’m sure it’s something someone else may run into in their process.
  4. Procrastination is normal.  Figure out if you’re tired out/burned out or else just annoyed and anxious.  If annoyed and anxious, just try working for five minutes.  If you can keep working, chances are you aren’t tired or burned out and have some internal dialogue to work through or some process to improve on.
  5. Data Viz Burn out.  It can happen, and I definitely felt it, which is why I actually advise against doing 180 days straight.  Working on a few projects concurrently and going through books and tutorials at a steadier pace works better than trying to pick up bits and pieces every day.  I also got to the point where I was doing Data Viz at work and going home to do more Data Viz, so that I didn’t like something I used to enjoy anymore.  On this note, take time to just enjoy other Data Viz projects like a regular audience would, rather than doing critique or learning.  Burnout is normal, rest accordingly.
  6. Don’t give up!  I didn’t get to the creative or technical skillset I had hoped for after 180 days, or even now, but I did progress significantly in what I know and I enjoyed the journey.


Day 180 of 180 Days of Data Viz Learning #jfdi #done #dataviz

I’m doing some form of data visualization learning for 180 days because I need to #JFDI.

See post explaining how and why I’m doing this.

Guess what?  It took longer than 180 days, but it’s been a pretty cool journey.  Did my daily learning and will post a debrief early next week.  This has been quite the intellectual and emotional exercise for me.  Learned so much about data viz + more.

Elijah Meeks D3.js in Action

Chapter 5 Layouts
Some last takeaways
  • One key with generators eg. d3.svg.arc is that they have particular settings p 144
  • “One of the core uses of a layout in D3 is to update the graphical chart. All we need to do is make changes to the data or layout and then rebind the data to the existing graphical elements” p 146
  • If transitions are distorted because of default data-binding keys, you may need to change sorts, eg pieChart.sort(null); in conjunction with exit and update behavior p 147
  • Layouts in D3 expect a default representation of data, usually a JSON object array where the child elements in a hierarchy are stored in a child attribute (children by default) that points to an array.  p 149
  • You can either data munge or get used to using accessor functions p 149
  • Pack layouts have a built-in .padding() function to adjust spacing and a .value() function that sizes the child nodes and so influences the size of parent nodes p 151
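
To make the accessor idea concrete for myself, here is a minimal sketch in plain Javascript with no D3 involved.  The function sumValues and the sample data (with children under "members" and sizes under "count") are hypothetical names I made up.  It mimics what the layout's .children() and .value() accessors are for: telling the layout where the children and the sizes live instead of reshaping the data first.

```javascript
// Sum leaf values up a hierarchy, using accessor functions
// instead of munging the data into a fixed shape first.
function sumValues(node, getChildren, getValue) {
  const children = getChildren(node) || [];
  if (children.length === 0) {
    return getValue(node); // leaf: its own value
  }
  // parent: the total of everything beneath it
  return children.reduce(
    (total, child) => total + sumValues(child, getChildren, getValue),
    0
  );
}

// Hypothetical data: children live under "members", sizes under "count".
const league = {
  name: "league",
  members: [
    { name: "east", members: [{ name: "a", count: 3 }, { name: "b", count: 5 }] },
    { name: "west", members: [{ name: "c", count: 2 }] }
  ]
};

const total = sumValues(league, (d) => d.members, (d) => d.count);
console.log(total);
```

The parent totals come entirely from the leaf values, which is the same reason the value accessor ends up influencing the size of parent nodes in a pack layout.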

Reading and Learning Data Visualization Theoretically/Critically:

Show Me the Numbers by Stephen Few

Three Takeaways Today

Chapter 5 Visual Perception and Graphical Communication
  • “Our perception of space is primarily two dimensional.  We perceive differences in vertical position (up and down) and in horizontal position (left and right) clearly and accurately. We also perceive a third dimension, depth, but not nearly as well.” p 71
  • “We perceive hues only as categorically different, not quantitatively different; one hue is not more or less than another, they’re just different.  In contrast, we perceive color intensity quantitatively, from low to high” p 71
  • Neither size nor color intensity is the best way to encode quantitative values.  The eye is not good at matching a number to a relative size or color intensity -> use length or position instead if possible p 73

Day 179 of 180 Days of Data Viz Learning #jfdi #dataviz

I’m doing some form of data visualization learning for 180 days because I need to #JFDI.

See post explaining how and why I’m doing this.

Elijah Meeks D3.js in Action

Chapter 5 Layouts

Three Takeaways Today

  • Layouts are D3 functions that help format data so it can be used for a select group of charting methods p 139
  • Layouts do not actually draw the data, nor are they called like components or referred to in the drawing code like generators.  They’re actually a preprocessing step that formats data so it’s ready to be displayed in the form of the visual.  You can update layouts, and if you rebind the altered data, you can use D3 enter/update/exit syntax. p 139
  • “Many people who get started with D3 think it’s a charting library, and that they’ll find a function like d3.layout.histogram that creates a bar chart in a <div> when it’s run.  But D3 layouts don’t result in charts; they result in the settings necessary for charts.  You have to put in a bit of extra work for charts, but you have enormous flexibility (as you’ll see in this and later chapters) that allows you to make diagrams and charts that you can’t find in other libraries” p 141
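
The "layouts are a preprocessing step" point clicked for me once I thought of it as plain data in, annotated data out.  Below is a toy sketch in plain Javascript, not D3's actual pie layout.  The function simplePieLayout and the sample data are hypothetical, and it assumes the values are positive.

```javascript
// A toy "layout": it annotates each datum with angles but draws nothing.
// Turning those angles into shapes would be a separate drawing step.
function simplePieLayout(data, getValue) {
  const total = data.reduce((sum, d) => sum + getValue(d), 0);
  let angle = 0;
  return data.map((d) => {
    const startAngle = angle;
    angle += (getValue(d) / total) * 2 * Math.PI; // share of the full circle
    return { data: d, value: getValue(d), startAngle: startAngle, endAngle: angle };
  });
}

// Hypothetical data for illustration only.
const slices = simplePieLayout(
  [{ label: "a", n: 1 }, { label: "b", n: 1 }, { label: "c", n: 2 }],
  (d) => d.n
);
console.log(slices);
```

Nothing here touches the page.  The output is just the settings a generator would need, which is why changing the data and running the layout again is all it takes before rebinding.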

Day 178 of 180 Days of Data Viz Learning #jfdi #dataviz

I’m doing some form of data visualization learning for 180 days because I need to #JFDI.

See post explaining how and why I’m doing this.

Elijah Meeks D3.js in Action

Chapter 4 General Charting Principles 

Three Takeaways Today

// Callback function breakdowns

var n = 0; // counter to increment
for (var x in data[0]) {
  if (x != "day") { // no line is drawn for the day value of each object, because it supplies the x coordinate
    // Generator that iterates through each object corresponding to one of our movies,
    // using the day value for the x coordinate and each movie's values for the y coordinates
    var movieArea = d3.svg.area()
      .x(function(d) {
        return xScale(d.day);
      })
      .y(function(d) {
        return yScale(simpleStacking(d, x));
      })
      .y0(function(d) {
        return yScale(simpleStacking(d, x) - d[x]);
      });
  }
}

// Stacking function. Takes incoming bound data and the name of an attribute and loops through the incoming data, adding each value until it reaches the current named attribute. As a result, it returns the total value for every movie during this day up to the movie we’ve sent.

function simpleStacking(incomingData, incomingAttribute) {
  var newHeight = 0;
  for (var x in incomingData) { // local x, so the outer loop's x is not overwritten
    if (x != "day") {
      newHeight += parseInt(incomingData[x]);
      if (x == incomingAttribute) {
        return newHeight;
      }
    }
  }
}
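
To check my understanding of the stacking logic, here is a small worked example in plain Javascript that runs without D3.  The row of numbers and the movie names are made up, and the bands array is just my own way of printing the result.  It shows the top and bottom of each movie's band for a single day, which is what the y and y0 accessors hand to the scale.

```javascript
// Running total of every non-"day" value up to and including the named attribute.
function simpleStacking(incomingData, incomingAttribute) {
  let newHeight = 0;
  for (const key in incomingData) {
    if (key !== "day") {
      newHeight += parseInt(incomingData[key], 10);
      if (key === incomingAttribute) {
        return newHeight;
      }
    }
  }
}

// Hypothetical row: one day, three movies (made-up numbers).
const row = { day: "1", movie1: "20", movie2: "8", movie3: "3" };

// For each movie, the top of its band is the running total and
// the bottom is that total minus the movie's own value.
const bands = Object.keys(row)
  .filter((key) => key !== "day")
  .map((key) => {
    const top = simpleStacking(row, key);
    return { movie: key, bottom: top - parseInt(row[key], 10), top: top };
  });
console.log(bands);
```

Each band starts where the previous one ended, so the areas sit on top of each other instead of overlapping.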
// Stacking function that alternates vertical position of area drawn p 136