Jul-Dec Learning

What’s This?

I’m trying to give myself at least half an hour each workday (or at least block two hours or so a week) to learn something new – namely taking classes/reviewing what I know on Treehouse, reading job-related articles, and reading career-related books. I’m tracking notables here on a monthly basis as a self-commitment, to retain them in memory, and as a reference. I fell off posting this over the last six months because work and life have been insanely busy and my notes were inconsistent across proprietary work versus my own, but it’s worth a round-up here. Posting with good intentions for next year. Reminding myself that if I don’t capture every bit I did, it’s alright. Just keep yourself accountable.

Books Read:

So Good They Can’t Ignore You

Key Points:

  • It’s not about passion – it’s about gaining career capital so you have more agency over a career you want.
  • Control traps: 1) you don’t have enough career capital to do what you want; 2) employers don’t want you to change/advance/slow down because you have skills valuable to them
  • Good jobs have autonomy, financial viability, and mission – you can’t get there on passion alone.
  • Figure out if the market you wish to succeed in is winner-take-all (one killer skill, eg. screenwriting is all about just getting a script read) or auction-based (a diverse collection of skills, eg. running a complex business).
  • Make many little bets and try different things that give instant feedback to see what is working or not and show what you’re doing in a venue that will get you noticed.
  • On Learning
    • Research bible-routine – summarize what you might work on – description of result and strategies used to do it.
    • Hour-tally and strain – just work on it for an hour and keep track of it
    • Theory-Notebook – brainstorm notebook that you deliberately keep track of info in
    • Carve out time for research and independent projects
  • “Working right trumps finding the right work” p228
  • Good visual summary

The Manager’s Path

Key Points:

  • “Your manager should be the person who shows you the larger picture of how your work fits into the team’s goals, and helps you feel a sense of purpose in the day-to-day work”
  • “Developing a sense of ownership and authority for your work and not relying on your manager to set the tone”
  • “Especially as you become more senior, remember that your manager expects you to bring solutions, not problems”
  • “Strong engineering managers can identify the shortest path through the systems to implement new features”
  • Dedicate 20% of time in planning meetings to sustainability work
  • “Be careful that vocally negative people don’t stay in that mindset on your team for long. The kind of toxic drama that is created by these energy vampires is hard for even the best manager to combat. The best defense is a good offense in this case”
    • You are not their parent – treat them as adults and don’t get emotionally invested in every disagreement they have with you personally.

Articles:

What is a predicate pushdown? In mapreduce

  1. Concept is if you issue a query to run in one place, you’d spawn a lot of network traffic, making that query slow and costly. However, if you push down parts of the query to where the data is stored and thus filter out most of the data, you reduce network traffic.
  2. Filter conditions that evaluate to True or False are the predicates, and you push the query down to where the data resides
  3. For example, you don’t need to pass every single column through every MapReduce job in the pipeline for no reason, so you filter and don’t read the other columns

What is a predicate pushdown?

  1. The basic idea is to push certain parts of SQL queries (the predicates) to where the data lives to optimize the query by filtering out data earlier rather than later so it skips reading entire files or chunks of files to reduce network traffic/processing time
  2. This is usually done with an expression that returns a boolean in the WHERE clause to filter out data
  3. Eg. in the example below, the predicate is the WHERE clause “WHERE a.country = ‘Argentina’”
SELECT
  a.*
FROM
  table1 a
JOIN 
  table2 b ON a.id = b.id
WHERE
  a.country = 'Argentina';
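
To make the idea concrete for myself, here is a minimal Python sketch of what pushdown buys you. It is only an illustration under my own assumptions: the in-memory `ROWS` list stands in for a storage node, and `scan` and `is_argentina` are hypothetical names, not any real engine's API.

```python
# Hypothetical in-memory "storage node"; the rows and names are illustrative only.
ROWS = [
    {"id": 1, "country": "Argentina"},
    {"id": 2, "country": "Brazil"},
    {"id": 3, "country": "Argentina"},
    {"id": 4, "country": "Chile"},
]


def scan(rows, predicate=None):
    """Return the rows the storage layer would send over the network."""
    if predicate is None:
        return list(rows)  # no pushdown: ship everything
    return [row for row in rows if predicate(row)]  # pushdown: filter at the source


def is_argentina(row):
    """The predicate: a condition that is True or False for each row."""
    return row["country"] == "Argentina"


# Without pushdown: transfer all rows, then filter in the query engine.
transferred_all = scan(ROWS)
result_late = [row for row in transferred_all if is_argentina(row)]

# With pushdown: the predicate runs where the data lives.
transferred_filtered = scan(ROWS, predicate=is_argentina)

print(len(transferred_all), len(transferred_filtered))
```

Both paths give the same answer; the difference is how many rows had to leave the storage layer to get there.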

The Leaders Calendar

  1. 6 hours a day of non-work time, half with family and some downtime with hobbies
  2. Setting norms and expectations around e-mail is essential. For example, e-mails sent by the CEO late at night set the wrong example for the company, and a CEO’s time gets spent being cc’d on endless irrelevant items.
  3. Be agenda-driven to optimize limited time, and don’t let only the loudest voices stand out, so that the important work gets done (eg. work on strategy), not just the work that appears the most urgent.
    1. A key way to do this is to limit routine activities that can be given to a direct report

What People Don’t Tell You About Product Management

  1. “Product Management is a great job if you like being involved in all aspects of a product — but it’s not a great job if you want to feel like a CEO.”
    1. You don’t necessarily make the strategy, have resources, or have the ability to fire people. Your job is to get it done by being resourceful and convincing.
  2. Product Managers should channel the needs of the customer and follow a product through conception, dev, launch, and beyond. Be a cross-functional leader coordinating between R&D, Sales, Marketing, Manufacturing, and Operations. Leadership and coordination are key. Your job is to make strategy happen by convincing the people you work with.
  3. “For me, product management is exciting and stressful for the same reason: there’s unpredictability, there’s opportunity to create something new (which also means it may be unproven), and you’re usually operating with less data than you’d like, and everything is always a little bit broken.”

Web Architecture 101

  1. In web dev you almost always want to scale horizontally, meaning you add more machines to your pool of resources, versus vertically, meaning you add more power (eg. CPU, RAM) to an existing machine. This redundancy gives you a backup plan, so your applications keep running if a server goes down, which makes your app more fault tolerant. You can also minimally couple different parts of the app backend so they run on different servers.
  2. Job queues store lists of jobs that need to be run asynchronously – eg Google does not search the entire internet every time you do a search, it crawls the web asynchronously and updates search indexes along the way
  3. Typical data pipeline: firehose that provides a streaming interface to ingest and process data (eg. Kinesis and Kafka) -> raw data as well as final transformed/augmented data saved to cloud storage (eg. S3) -> data loaded into a data warehouse for analysis (eg. Redshift)

Running in Circles – Why Agile Isn’t Working and What We Do Differently

  1. “People in our industry think they stopped doing waterfall and switched to agile. In reality they just switched to high-frequency waterfall.”
  2. “Only management can protect attention. Telling the team to focus only works if the business is backing them up.”
  3. Think of software development as going uphill when you’re finding out the complexity/uncertainty and then downhill when you have certainty.

Product Managers – You Are Not the CEO of Anything

  1. Too many product managers think their role is that of an authoritarian CEO (with no power), which is often disastrous because they think they have all the answers.
  2. You gain credibility through your actions and leadership skills.
  3. “Product management is a team sport after all, and the best teams don’t have bosses – they have coaches who ensure all the skills and experiences needed are present on the team, that everyone is in the right place, knows where the goal is, and then gets out of the way and lets the team do what they do best in order to reach that goal.”

Product Prioritization: How Do You Decide What Belongs in Your Product?

  1. Radical vision with this Mad Libs template: Today, when [customer segment] want to [desirable activity/outcome], they have to [current solution]. This is unacceptable, because [shortcomings of current solutions]. We envision a world where [shortcomings resolved]. We are bringing this world about through [basic technology/approach].
  2. Four components to good product strategy
    1. Real Pain Points means “Who is it for?” and “What is their pain point?”
    2. Design refers to “What key features would you design into the product?” and “How would you describe your brand and voice?”
    3. Capabilities tackles the “Why should we be the ones doing it?” and “What is our unique capability?”
    4. Logistics is the economics and channels, like “What’s our pricing strategy?” and “What’s the medium through which we deliver this?”
  3. Then prioritize between sustainable and good fit

To Drive Business Success Implement a Data Catalog and Data Inventory

  • Companies have a huge gap between simply knowing where the data is located and knowing what to do with it
  • Three types of metadata
    • Business Metadata: Give us the meaning of data you have in a particular set
    • Technical Metadata: Provide information on the format and structure of data – databases, programming envs, data modeling tools natively available
    • Operational Metadata: Audit trail of information of where the data came from, who created it, etc.
  • “Unfortunately, according to Reeve, new open source technologies, most importantly Hadoop, Hive, and other open source technologies do not have inherent capabilities to handle, Business, Technical AND Operational Metadata requirements. Firms cannot afford this lack as they confront a variety of technologies for Big Data storage, noted Reeve. It makes it difficult for Data Managers to know where the data lives.” http://www.dataversity.net/drive-business-success-implement-data-catalog-data-inventory/

Why You Can’t be Data Driven Without a Data Catalog

  1. A lot of data availability in organizations is “tribal knowledge”, which severely limits the impact data has in an organization. Data catalogs should capture tribal knowledge
  2. Data catalogs need to work to have common definitions of important concepts like customer, product, and revenue, especially since different divisions actually will think of those concepts differently.
  3. One company’s solution was a Looker-powered integrated model with a GitBook data dictionary.

What is a data catalog?

  1. At its core, a data catalog centralizes metadata. “The difference between a data catalog and a data inventory is that a data catalog curates the metadata based on usage.”
  2. Different types of data catalog users fall into three buckets
    1. Data Consumers – data and business analysts
    2. Data Creators – data architects and database engineers
    3. Data Curators – data stewards and data governors
  3. A good data catalog must
    1. Centralize all info on data in one place – structure, quality, definitions, and usages
    2. Allow users to self-service
    3. Auto-populate consistently and accurately

Why You Need a Data Catalogue and How to Select One

  1. “A good data catalog serves as a searchable business glossary of data sources and common data definitions gathered from automated data discovery, classification, and cross-data source entity mapping. Automated data catalog population is done via analyzing data values and using complex algorithms to automatically tag data, or by scanning jobs or APIs that collect metadata from tables, views, and stored procedures.”
  2. Should foster search and reuse of existing data in BI tools
  3. Should be close to an open platform that many people can use to see what data exists and what they want to do with it

10 Tips to Build a Successful Data Catalog

  1. Who – understand the owner or trusted steward for asset
  2. What – aim for a basic description of an asset as a minimum: business terminology, report functionality, and basic purpose of a dataset
  3. Where – where the underlying assets are

The Data Catalog – A Critical Component to Big Data Success

  1. Most data lakes do not have effective metadata management capabilities, which makes using them inefficient
    1. Need data access security solutions (role and asset), audit trails of update and access, and inventory of assets (technical and business metadata)
  2. First step is to inventory existing data and make it usable at a data store level – table, file, database, schema, server, or directory
  3. Figure out how to ingest new data in a structured manner, eg. a data scientist wants to incorporate new data in modeling

Data Catalog for the Enterprise – What, Why, & What to look for?

  1. With the growth of enormous data lakes, data sets need to be discovered, tagged, and annotated
  2. Data catalogs can also eliminate database duplicity
  3. Challenges of implementing data catalogs include educating org on the value of a single source of data, dealing with tribalism, and

Bridging the gap: How and why product management differs from company to company

  1. NYC vs SF product management disciplines are different due to key ecosystem factors: NYC is driven by tech enhancing existing industries and is thus sales driven, while the Bay Area creates entire new categories and is vision and collaboration driven. NYC has more stable exits but fewer huge ones
  2. This dichotomy in product management approaches is due to how to bring value to different markets
  3. Successful product managers need six key ingredients
    1. Strategic Thinking
    2. Technical proficiency
    3. Collaboration
    4. Communication
    5. Detail Orientation
    6. User Science

Treehouse Learning

Rest API Basics

  • REST (Representational State Transfer) is really just another layer of meaning on top of HTTP
  • API provides a programmatic interface, a code UI basically, to that same logic and data model. Communication is done through HTTP and the burden of creating an interface is on the users of the API, not the creator
  • Easy way to say it – APIs provide code that makes things outside of our application easier to interact with from inside of our application
  • Resources: usually a model in your application -> these are retrieved, created, or modified through API endpoints representing collections of records
    • api/v1/players/567
  • Client request types to API:
    • GET is used for fetching either a collection of resources or a single resource.
    • POST is used to add a new resource to a collection, eg. POST to /games to create a new game
    • PUT is the HTTP method we use when we want to update a record
    • DELETE is used for deleting a record: you send a DELETE request to a detail record, ie. the URL for a single record
  • Requests
    • We can use different aspects of the requests to change the format of our response, the version of the API, and more.
    • Headers make transactions more clear and explicit, eg. Accept (specifies file format requester wants), Accept-Language, Cache-Control
  • Responses
    • Content-Type: text/javascript – > header to specify what content we’re sending
    • Other headers include Last-Modified, Expires, and Status (eg. 200, 500, 404)
    • 200-299 content is good and everything is ok
    • 300-399 request was understood but the requested resource is now located somewhere else. Use these status codes to perform redirects to URLs most of the time
    • 400-499 Client error codes, eg. the request was wrongly constructed (400) or the resource was not found (404)
    • 500-599 Server-side errors
  • Security
    • Cache: usually a service running in memory that holds recently needed results such as a newly created record or large data set. This helps prevent direct database calls or costly calculations on your data.
    • Rate Limiting: allowing each user only a certain number of requests to our API in a given period to prevent too many requests or DDoS attacks
    • A common authentication method is the use of API tokens – you give your users a token and secret key as a pair and they use those when they make requests to your server so you know they are who they say they are.
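
To see how rate limiting could work per API token, here is a rough fixed-window sketch in Python. Everything in it is my own assumption for illustration: the class name `FixedWindowRateLimiter`, the `status_for` helper, and the choice to answer over-limit requests with 429 (Too Many Requests, which sits in the 400-499 range). Real APIs often use more refined schemes.

```python
import time


class FixedWindowRateLimiter:
    """Allow at most max_requests per token in each window (hypothetical sketch)."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be exercised without sleeping
        self._windows = {}  # token -> (window_start, request_count)

    def allow(self, token):
        now = self.clock()
        start, count = self._windows.get(token, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the old window expired, start a fresh one
        if count >= self.max_requests:
            self._windows[token] = (start, count)
            return False
        self._windows[token] = (start, count + 1)
        return True


def status_for(limiter, token):
    """Map the limiter decision to an HTTP status: 200 OK or 429 Too Many Requests."""
    return 200 if limiter.allow(token) else 429


limiter = FixedWindowRateLimiter(max_requests=2, window_seconds=60)
print([status_for(limiter, "token-abc") for _ in range(3)])
```

Each token gets its own counter, so one noisy client hitting the limit does not block the others.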

Planning and Managing the Database

  • Data Definition Language – language that’s used to define the structure of a database

When Object Literals Are Not Enough

  • Use classes instead of object literals to not repeat so much code over and over again
  • Class is a specification, a blueprint for an object that you provide with a base set of properties and methods
  • In a constructor method, this refers to the object that is being created, which is why it’s the keyword here.
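
The course covers this in JavaScript, but the same idea can be sketched in Python, where `self` plays the role that `this` plays in a JavaScript constructor. The `Player` class and its fields are hypothetical names of my own, just to show one blueprint producing several objects without repeated code.

```python
class Player:
    """Blueprint for player objects; the property names are hypothetical."""

    def __init__(self, name, score=0):
        # `self` is the object being created, much like `this` in a JS constructor.
        self.name = name
        self.score = score

    def add_points(self, points):
        """A method shared by every object built from this class."""
        self.score += points
        return self.score


# Two objects from one blueprint, instead of two hand-written object literals.
alice = Player("Alice")
bob = Player("Bob", score=10)
alice.add_points(5)
print(alice.name, alice.score, bob.name, bob.score)
```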

Google Machine Learning Course (30% through highlights)

ML – reduces time programming

  • Scales making sense of data
  • Makes projects customizable much more easily
  • Lets you solve programming problems that humans can’t do but algos do well
  • Use stats and not logic to solve problems, flips the programming paradigm a bit

Label is the thing we’re predicting, eg. y in linear regression
Features are Xs or way we represent our data, an input variable
– eg. header, words in e-mail, to and from addresses, routing info, time of day
Example: particular instance of data, x, eg. an email

Labeled example has { features, label}: (x, y) – used to train model ( email, spam or not spam)
Unlabeled examples has {features, ?}: (x, ?) – used for making predictions on new data (email, ?)
Model: thing that does predicting. Model maps examples to predicted labels: y’ – defined by internal parameters, which are learned

Framing: What is Supervised Machine Learning? Systems learn to combine input to produce useful predictions on never before seen data
* Training means creating or learning the model.
* Inference means applying the trained model to unlabeled examples to make useful predictions (y’)
* Regression models predict continuous values: eg. value of a house, probability a user will click on an ad
* Classification model: predicts discrete values, eg. is the given e-mail message spam or not spam? Is this an image of a dog, cat, or hamster?

Descending into ML
y = wx + b

w refers to the weight vector, gives slope

b gives bias

Loss: loss means how well line is predicting example, eg. distance from line
* loss is on a 0 through positive scale
* Convenient way to define loss for linear regression
* L2 Loss, also known as squared error = square of the difference between prediction and label: (observation – prediction)^2 = (y – y’)^2
* We care about minimizing loss all across datasets
* Measure of how far a model’s predictions are from its label – a measure of how bad the model is

Feature is an input variable of x value – something we know

Bias: b or An intercept or offset from an origin. Bias (also known as the bias term) is referred to as b or w0 in machine learning models.

Inference: process of making predictions by applying trained models to unlabeled examples. In statistics, inference refers to the process of fitting the parameters of a distribution conditioned on some observed data

Unlabeled example
An example that contains features but no label. Unlabeled examples are the input to inference. In semi-supervised and unsupervised learning, unlabeled examples are used during training.

Logistic regression
Model that generates probability for each possible discrete label value in classification problems by applying a sigmoid function to a linear prediction. Can be used for binary or multi-class classifications

Sigmoid function
function that maps logistic or multinomial regression output (log odds) to probabilities, returning a value between 0 and 1. The sigmoid function converts the log odds into a probability

K-means: clustering algorithm from signal analysis

Random Forest
Ensemble approach to finding the decision tree that best fits the training data by creating many decision trees and then determining the average – the random part of the term refers to building each of the decision trees from a random selection of features, the forest refers to the set of decision trees

Weight
Coefficient for a feature in a linear model or edge in a deep network. Goal of training a linear model is to determine the ideal weight for each feature. If a weight is 0, then its corresponding feature does not contribute to the model

Mean squared error (MSE): average squared loss per example over the whole data set -> sum the squared losses for all the individual examples and divide by the # of examples

Although MSE is commonly-used in machine learning, it is neither the only practical loss function nor the best loss function for all circumstances.
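
A small Python sketch of squared loss and MSE for the linear model y’ = wx + b, to check my understanding. The data points and the values of w and b are made up for illustration.

```python
def predict(x, w, b):
    """Linear model: y' = w * x + b."""
    return w * x + b


def squared_loss(y, y_pred):
    """L2 loss for one example: (y - y')^2."""
    return (y - y_pred) ** 2


def mean_squared_error(examples, w, b):
    """Sum the squared losses over all examples and divide by the number of examples."""
    losses = [squared_loss(y, predict(x, w, b)) for x, y in examples]
    return sum(losses) / len(losses)


# Made-up (x, y) pairs; with w=2 and b=1 the predictions are 3, 5, and 7.
examples = [(1.0, 3.0), (2.0, 5.0), (3.0, 8.0)]
mse = mean_squared_error(examples, w=2.0, b=1.0)
print(mse)
```

Only the last example is off (8 observed versus 7 predicted), so the squared losses are 0, 0, and 1, and the MSE is 1/3.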

empirical risk minimization (ERM): Choosing the function that minimizes loss on the training set.

sigmoid function: A function that maps logistic or multinomial regression output (log odds) to probabilities, returning a value between 0 and 1. In other words, the sigmoid function converts the raw output (log odds) of logistic regression into a probability between 0 and 1.
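
A minimal Python sketch of the sigmoid applied to a linear prediction. The feature and weight values are made up, and this plain version is only meant for moderate inputs (a very large negative z would overflow `math.exp`).

```python
import math


def sigmoid(z):
    """Map a log-odds value z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    """Logistic regression: sigmoid of the linear prediction w . x + b."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


print(sigmoid(0.0))  # log odds of 0 means even odds
print(predict_probability([1.0, 2.0], [0.5, -0.25], 0.0))
```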

binary classification: classification task that outputs one of two mutually exclusive classes, eg. hot dog or not a hot dog

Logistic Regression
* Prediction method that gives us probability estimates that are calibrated
* Sigmoid something that gives bounded value between 0 and 1
* useful for classification tasks
* regularization important as model will try to drive losses to 0 and weights may go crazy
* Linear logistic regression is fast and efficient to train, efficient at making predictions, scales to massive data, and is good for low-latency use cases
* A model that generates a probability for each possible discrete label value in classification problems by applying a sigmoid function to a linear prediction. Although logistic regression is often used in binary classification problems, it can also be used in multi-class classification problems (where it becomes called multi-class logistic regression or multinomial regression).

Many problems require a probability estimate as output. Logistic regression is an extremely efficient mechanism for calculating probabilities. Practically speaking, you can use the returned probability in either of the following two ways:
* “As is”
* Converted to a binary category.

Suppose we create a logistic regression model to predict the probability that a dog will bark during the middle of the night. We’ll call that probability:
* p(bark | night)
* If the logistic regression model predicts a p(bark | night) of 0.05, then over a year, the dog’s owners should be startled awake approximately 18 times:
* startled = p(bark | night) * nights
* 18 ~= 0.05 * 365

In many cases, you’ll map the logistic regression output into the solution to a binary classification problem, in which the goal is to correctly predict one of two possible labels (e.g., “spam” or “not spam”).

Early Stopping
* Regularization method that ends model training before training loss finishes decreasing. You stop when loss on the validation dataset starts to increase

Key takeaways:
* Logistic regression models generate probabilities. In order to map a logistic regression value to a binary category, you must define a classification threshold (also called decision threshold), eg. the value above which you categorize something as hot dog versus not a hot dog. (Note: Tuning a threshold for logistic regression is different from tuning hyperparameters such as learning rate)
* Log Loss is the loss function for logistic regression.
* Logistic regression is widely used by many practitioners.
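
Here is a short Python sketch of Log Loss to go with the takeaways above. Averaging over the examples and clipping probabilities with `eps` are my own choices for the sketch, and the labels and probabilities are made up.

```python
import math


def log_loss(labels, probabilities, eps=1e-15):
    """Average of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


# Confident and right gives a small loss; confident and wrong gives a large one.
print(log_loss([1, 0], [0.9, 0.1]))
print(log_loss([1, 0], [0.1, 0.9]))
```

The loss grows quickly as a confident prediction turns out wrong, which is part of why regularization matters for logistic regression.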

Classification
* We can use logistic regression for classification by using fixed thresholds for probability outputs, eg, it’s spam if it exceeds .8.

You can evaluate classification performance by
* Accuracy: fraction of predictions we got right. It has key flaws, eg. when there are class imbalances, such as when positives or negatives are extremely rare (for example when predicting CTRs). A model with no features and just a bias term that causes it to always predict false would be highly accurate but have no value

Better is to look at True Positives and False Positives
* True Positives: Correctly Called
* False Positives: Called but not true
* False Negatives: Not predicted and it happened
* True Negatives: Not called and did not happen
* A true positive is an outcome where the model correctly predicts the positive class. Similarly, a true negative is an outcome where the model correctly predicts the negative class.
* A false positive is an outcome where the model incorrectly predicts the positive class. And a false negative is an outcome where the model incorrectly predicts the negative class.

Precision: True positive/all positive predictions, how precisely was positive class right
Recall: True Positives/All Actual Positives: out of all the possible positives, how many did the model correctly identify
* If you raise classification threshold, reduces false positives and raises precision
* We might not know in advance what best classification threshold is – so we evaluate across many possible classification thresholds – this is ROC curve

Prediction Bias
* Compare the sum of all the things we predict to the sum of all the things we observe
* ideally – average of predictions == average of observations
* Logistic predictions should be unbiased
* Bias is a canary, zero bias does not mean all is good but it’s a sanity check. Look for bias in slices of data to guide improvements and debug model

Watch out for class-imbalanced sets, where there is a significant disparity between the number of positive and negative labels. Eg. 91% accurate predictions but only 1 TP and 8 FN, ie. 8 out of 9 malignant tumors end up undiagnosed.

Calibration Plots Show Bucketed Bias
* Mean observation versus mean prediction

Precision = TP/(TP + FP) = what proportion of positive predictions were actually correct
Recall = TP/(TP + FN) = what proportion of actual positives were identified correctly
* To evaluate the effectiveness of models, you must examine both precision and recall which are often in tension because improving precision typically reduces recall and vice versa.
* When you increase the classification threshold, the number of false positives decreases, but false negatives increase, so precision increases while recall decreases.
* When you decrease the classification threshold, false positives increase and false negatives decrease, so recall increases while precision decreases.
* eg. If you have a model with 1 TP and 1 FP, precision = 1/(1+1) = 50%: when it predicts a tumor is malignant, it is correct 50% of the time
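
A quick Python check of the tumor numbers from these notes. The notes give 1 TP, 1 FP, and 8 FN; the 90 TN is my assumption so that the total is 100 examples and the accuracy matches the 91% figure.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that were right."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    """Of the positive predictions, the fraction that were actually positive."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Of the actual positives, the fraction the model found."""
    return tp / (tp + fn)


# TP, FP, FN from the notes; TN = 90 is assumed to make 100 examples in total.
tp, fp, fn, tn = 1, 1, 8, 90

print(accuracy(tp, fp, fn, tn))
print(precision(tp, fp))
print(recall(tp, fn))
```

Accuracy looks great at 91%, yet recall is only 1 out of 9, which is the class imbalance trap in numbers.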

Precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances, while recall (also known as sensitivity) is the fraction of relevant instances that have been retrieved over the total amount of relevant instances.
* Suppose a computer program for recognizing dogs in photographs identifies 8 dogs in a picture containing 12 dogs and some cats. Of the 8 identified as dogs, 5 actually are dogs (true positives), while the rest are cats (false positives). The program’s precision is 5/8 while its recall is 5/12. When a search engine returns 30 pages only 20 of which were relevant while failing to return 40 additional relevant pages, its precision is 20/30 = 2/3 while its recall is 20/60 = 1/3. So, in this case, precision is “how useful the search results are”, and recall is “how complete the results are”.
* In an information retrieval scenario, the instances are documents and the task is to return a set of relevant documents given a search term; or equivalently, to assign each document to one of two categories, “relevant” and “not relevant”. In this case, the “relevant” documents are simply those that belong to the “relevant” category. Recall is defined as the number of relevant documents retrieved by a search divided by the total number of existing relevant documents, while precision is defined as the number of relevant documents retrieved by a search divided by the total number of documents retrieved by that search.
* In information retrieval, a perfect precision score of 1.0 means that every result retrieved by a search was relevant (but says nothing about whether all relevant documents were retrieved) whereas a perfect recall score of 1.0 means that all relevant documents were retrieved by the search (but says nothing about how many irrelevant documents were also retrieved).

ROC
* Receiver Operating Characteristics Curve
* Evaluate every possible classification threshold and look at true positive and false positive rates
* Area under that curve has probabilistic interpretation
* If we pick a random positive and random negative, what’s the probability my model ranks them in the correct order – that’s equal to area under ROC curve

Gives aggregate measure of performance aggregated across all possible classification thresholds
FP Rate on the X-axis, TP Rate on the Y-axis
AUC = area under curve
* Probability that the model ranks a random positive example more highly than a random negative example:
* One way of interpreting AUC is as the probability that the model ranks a random positive example more highly than a random negative example. A model whose predictions are 100% wrong has an AUC of 0.0; one whose predictions are 100% correct has an AUC of 1.0.
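
The ranking interpretation can be computed directly by comparing every positive with every negative. This is a brute-force Python sketch for small data, not how libraries usually compute AUC; counting ties as half a win is a common convention I am assuming, and the labels and scores are made up.

```python
def auc_by_ranking(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumed convention: a tie counts as half
    return wins / (len(positives) * len(negatives))


labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(auc_by_ranking(labels, scores))
```

Here 8 of the 9 positive/negative pairs are ranked correctly (only the 0.6 positive falls below the 0.7 negative), so the AUC is 8/9. Rescaling all scores would not change it, which is the scale-invariance point.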

Characteristics of AUC to note:
* AUC is scale-invariant. It measures how well predictions are ranked, rather than their absolute values. Note: this is not always desirable: sometimes we really do need well calibrated probability outputs, and AUC does not provide that
* AUC is classification-threshold-invariant. It measures the quality of the model’s predictions irrespective of what classification threshold is chosen.
* Classification-threshold invariance is not always desirable. In cases where there are wide disparities in the cost of false negatives vs. false positives, it may be critical to minimize one type of classification error. For example, when doing email spam detection, you likely want to prioritize minimizing false positives (even if that results in a significant increase of false negatives). AUC isn’t a useful metric for this type of optimization.
Logistic regression predictions should be unbiased.
* That is: “average of predictions” should ≈ “average of observations”. Good models should have near-zero bias.
* Prediction bias is a quantity that measures how far apart those two averages are. That is:
* prediction bias = average of predictions – average of labels in data set
* Note: “Prediction bias” is a different quantity than bias (the b in wx + b)

A significant nonzero prediction bias tells you there is a bug somewhere in your model, as it indicates that the model is wrong about how frequently positive labels occur.
* For example, let’s say we know that on average, 1% of all emails are spam. If we don’t know anything at all about a given email, we should predict that it’s 1% likely to be spam. Similarly, a good spam model should predict on average that emails are 1% likely to be spam. (In other words, if we average the predicted likelihoods of each individual email being spam, the result should be 1%.) If instead, the model’s average prediction is 20% likelihood of being spam, we can conclude that it exhibits prediction bias.
* Causes are: incomplete feature set, noisy data set, buggy pipeline, biased training sample, overly strong regularization
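
The spam example can be checked with a few lines of Python. The data is made up to match the notes: 100 emails of which 1% are spam, and a model that predicts a 20% spam likelihood for every email.

```python
def prediction_bias(predictions, labels):
    """Average of predictions minus average of labels in the data set."""
    return sum(predictions) / len(predictions) - sum(labels) / len(labels)


# 100 emails, 1 of them spam (label 1), so the average label is 1%.
labels = [1] + [0] * 99
# A model that says every email is 20% likely to be spam.
predictions = [0.20] * 100

print(prediction_bias(predictions, labels))
```

The result is about +0.19, a large nonzero bias that would send me looking for one of the causes listed above.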

You might be tempted to correct prediction bias by post-processing the learned model—that is, by adding a calibration layer that adjusts your model’s output to reduce the prediction bias. For example, if your model has +3% bias, you could add a calibration layer that lowers the mean prediction by 3%. However, adding a calibration layer is a bad idea for the following reasons:
* You’re fixing the symptom rather than the cause.
* You’ve built a more brittle system that you must now keep up to date.
* If possible, avoid calibration layers. Projects that use calibration layers tend to become reliant on them—using calibration layers to fix all their model’s sins. Ultimately, maintaining the calibration layers can become a nightmare.

March 2018 Learning

Less than normal last month due to business travel

Books Read (related to work/professional development/betterment):

Articles:

Agile Died While You Were Doing Your Standup

  1. Consultancies have rolled Agile out to enterprises wholesale and poorly, in ways that mechanize and dehumanize teams and don’t respect the craft – causing teams to deliver outputs instead of outcomes that drive value for customers
  2. The problem: product management, UX, engineering, dev-ops, and other core competencies need to be one team under one leader, with the autonomy and accountability to solve problems together. If implemented correctly – it empowers teams to work toward shared outcomes with both velocity and accuracy.
  3. Embrace discovery – discovery data matched with shipped experiences creates real customer value, and builds trust that teams can work autonomously with accountability and ship something that meets both company and user objectives.

 

Avoiding the Unintended Consequences of Casual Feedback

  • Your seniority casts a shadow over the org, so your casual feedback may be interpreted as a mandate – make sure it’s clear whether it’s an opinion, a strong suggestion, or a mandate
    1. Opinion: “one person’s opinion” – your title and authority should not enter into the equation
    2. Strong suggestion: falls short of telling the team what to do – the senior executive draws on experience but leaves the team feeling empowered to take risks. This is the difficult balance to strike and requires taming of egos to do what’s best – you also have to trust the people you’ve empowered to have the final say.
    3. Mandate: issued to avoid prohibitively costly mistakes – but issued too often or without the right justification, it signals a demotivating lack of trust

 

Ask Women in Product: What are the Top 3 things you look for when hiring a PM?

  1. Influence without authority – figuring out what makes you tick, your team, your customers. Read in between lines. How did you deal with past conflicts
  2. Intellectual curiosity – how did you deal with an ambiguous problem or one that intimidated you
  3. Product sense – name compelling product experience you built
  4. Empathy – unmet needs and pain points – how would you design an alarm clock for the blind
  5. Product intuition – assess a product, feature, or user flow
  6. Listening and communication skills – read rooms for implicit and explicit

 

Why Isn’t Agile Working?

  1. Waiting time isn’t addressed properly
  2. Doesn’t account well for unplanned work, multitasking, and impacts from shared services
  3. Even though dev goes faster in agile, it has no bearing on making the right product decisions and working to realize benefits. Agile is useful when it serves as a catalyst for continuous improvement and the rest of the org structure is in line – eg. DevOps, right management culture, incremental funding v project-based funding, doing less and doing work that matters, looking at shared services, mapping value streams, etc.

 

Treehouse Learning:  

Changing the object literal in the dice rolling application into a constructor function that takes in the number of sides as an argument. Each instance created has its own roll method.

function Dice(sides) {

            this.sides = sides;
            this.roll = function() {

                        var randomNumber = Math.floor(Math.random() * this.sides) +1;
                        return randomNumber;

            }

}

var dice = new Dice(6); // new instance of a 6-sided die

 

Watch out for applications running code again and again unnecessarily, like in the code above, where every new instance creates its own copy of the roll function. The JavaScript prototype property is like an object literal that a roll property can be added to; when we assign a function to it, it becomes a method and is no longer needed in the constructor function. Prototypes can be used as templates for objects, meaning values and behavior can be shared between instances of objects.

Dice.prototype.roll = function diceRoll() {

            var randomNumber = Math.floor(Math.random() * this.sides) +1;
            return randomNumber;

} // shared between all instances in template/prototype


function Dice(sides) {

            this.sides = sides;

}
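To convince myself the prototype version really shares one function, here is a small side-by-side sketch. The constructor names DiceOwnRoll and DiceSharedRoll are made up so both versions can live in the same file; they are not from the course.

```javascript
// Hypothetical name: constructor that defines roll inside itself
function DiceOwnRoll(sides) {
  this.sides = sides;
  this.roll = function() {
    return Math.floor(Math.random() * this.sides) + 1;
  };
}

// Hypothetical name: constructor that relies on the prototype for roll
function DiceSharedRoll(sides) {
  this.sides = sides;
}

DiceSharedRoll.prototype.roll = function() {
  return Math.floor(Math.random() * this.sides) + 1;
};

var a = new DiceOwnRoll(6);
var b = new DiceOwnRoll(6);
console.log(a.roll === b.roll); // false: each instance built its own function

var c = new DiceSharedRoll(6);
var d = new DiceSharedRoll(10);
console.log(c.roll === d.roll); // true: both use the one function on the prototype
console.log(c.hasOwnProperty("roll")); // false: roll lives on the prototype
```

Both versions roll the same way from the caller's point of view; the difference is only in how many copies of the function get created.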

 

 

 

Feb 2018 Learning

Books Read (related to work/professional development/betterment):

Creativity, Inc.

The Mythical Man Month

Articles:

pm@olin 10 Most Likely to Succeed and pm@olin 11 Capstone

  1. “ A lot of being a PM is rolling with what doesn’t cost very much, and helps make the team happy. You don’t always get the most done by optimizing.”
  2. “For a PM, It’s figuring out how to find a little extra time for the easter egg. It’s doing the extra work to get a cool side project into the product. It’s helping someone else learn a new skill. It’s the thank you cards or the day off after shipping.”
  3. Sometimes something as simple as colored markers to annotate pros and cons helps whiteboarding

Manager Energy Drain

  1. You can color-code your calendar based on what mental energy you will need (eg. 1-on-1 brain, teaching brain, planning brain) to manage that piece and defrag accordingly
  2. The best gift you can give direct reports is a messy unscoped project with a bit of safety net to teach them -> give them guidance
  3. Say no to focus energy – don’t be afraid to go back and say no

The MVP is dead. Long live the RAT.

  1. RAT = Riskiest Assumption Test – after all, an MVP is not a product, but a way of testing whether you’ve found a problem worth solving. RAT emphasizes building only what’s required to test your largest unknown
  2. All about rapid testing rather than creeping toward perfect code and design, and the danger of the test becoming a product
  3. It’s about maximizing discovery and removing temptations of putting resources on creating a more polished product

Scaling Agile At Spotify: An Interview with Henrik Kniberg

  1. “Autonomy is one of our guiding principles. We aim for independent squads that can each build and release products on their own without having to be tightly coordinated in a big agile framework. We try to avoid big projects altogether (when we can), and thereby minimize the need to coordinate work across many squads.”
  2. “By avoiding big projects, we also minimize the need to standardize our choice of tools.”
  3. “The technical architecture is hugely important for the way we are organized. The organizational structure must play in harmony with the technical architecture. Many companies can’t use our way of working because their architecture won’t allow it.
    • We have invested a lot into getting an architecture that supports how we want to work (not the other way around); this has resulted in a tight ecosystem of components and apps, each running and evolving independently. The overall evolution of the ecosystem is guided by a powerful architectural vision.
    • We keep the product design cohesive by having senior product managers work tightly with squads, product owners, and designers. This coordination is tricky sometimes, and is one of our key challenges. Designers work directly with squads, but also spend at least 20% of their time working together with other designers to keep the overall product design consistent.”

Product Management Is Not Project Management

  1. Product management is not about making sure products ship on time – it’s about knowing the customer needs and defining the right product and evangelizing that internally
  2. Too often, Product Managers spend time writing specs, Gantt charts, and workflows instead of on customer problems, customer data, and articulating that to the company.
  3. Measuring Religiously means both analytics + talking to customers

When should you hire a Product Manager?

  1. Toxic things to a Product Management team: when it is too large and has overlaps in responsibility, it results in politics, land grabs for credit, and no clear owner on how to make decisions
  2. Don’t hire until there’s a pain point – eg can’t prioritize backlog, slow shipping bc of mismatched priorities and poor communication between teams, people don’t know why they’re building what they’re building
  3. “My least favorite way to slice a Product team is “I’ll do the high level strategy and they’ll do details” — it makes it hard for the detail-level person to make good calls. It also makes it harder for the high level person to connect with the rest of the team.”

Continuous Improvement + Quality Assurance

  1. Minimum viable feature set: releasing a feature is decoupled from deploying code. Large features deployed piecemeal over time.
  2. Debugging is twice as hard as writing code in the first place. Focus less on the mitigation of large, catastrophic failures – optimize for recovery rather than failure prevention. Failure is inevitable.
  3. Exploratory testing requires an understanding of the whole system and how it serves a community of users. Customer Experience is as much about technology as it is about product requirements

Building Your Personal Brand Where You Work

  1. Make your boss aware of what you’re doing – women are often doers who don’t make it a point to highlight their accomplishments or how busy they are at work. A great tool is informal email reports. Template can be: weekly wins, areas of improvement for my team, what is coming next week, what you need from your boss.
  2. Build brand equity with coworkers, because you will need people to defend you. Being liked matters more sometimes. You want an ally at every level, your boss should respect you but it’s also important entry level employees respect you too.
  3. Keep track of your success, remember your wins. Eg. tracking weekly, monthly, bi-annual, annual wins

Product Manager versus Product Owner

  1. “Product Owner is a role you play on a Scrum team. Product Manager is the job”
  2. A Product Owner spending half the time talking to customers and half working with the team is an ideal, but the split should vary. External v internal work will shift depending on maturity and success of product
  3. Product Managers in senior roles should concentrate on defining vision and strategy for teams based on market research, company goals, and current state of products. The ones without Scrum teams or with smaller teams can help validate or contribute to strategy for future products.

How to Run an Effective Meeting

  1. Set the agenda so there is a compass for conversation. Start on time and end on time.
  2. End with an action plan that has next steps.
  3. Be clear, light bulb or gun – you have an idea or you want people to do it. “Your job as a leader is to be right at the ending of the meeting, not the beginning of the meeting.” Let people speak so you’ve heard all facts and opinions.

Managing Software Engineers *This is totally an article clearly from 2002 and all problematic attitudes therein about not considering people might have things like families

  1. Create work environment where best programmers will be satisfied enough to stay and where average programmers become good
  2. “One of the paradoxes of software engineering is that people with bad ideas and low productivity often think of themselves as supremely capable. They are the last people whom one can expect to fall in line with a good strategy developed by someone else. As for the good programmers who are in fact supremely capable, there is no reason to expect consensus to form among them.”
  3. Ideals to steal
    1. people don’t do what they are told
    2. all performers get the right consequences every day
    3. small, immediate, certain consequences are better than large future uncertain ones
    4. positive reinforcement is more effective than negative reinforcement
    5. ownership leads to high productivity

The What, Why, and How of Master Data Management

  1. Five kinds of data in corporations:
    1. “Unstructured—This is data found in e-mail, white papers like this, magazine articles, corporate intranet portals, product specifications, marketing collateral, and PDF files.
    2. Transactional—This is data related to sales, deliveries, invoices, trouble tickets, claims, and other monetary and non-monetary interactions.
    3. Metadata—This is data about other data and may reside in a formal repository or in various other forms such as XML documents, report definitions, column descriptions in a database, log files, connections, and configuration files.
    4. Hierarchical—Hierarchical data stores the relationships between other data. It may be stored as part of an accounting system or separately as descriptions of real-world relationships, such as company organizational structures or product lines. Hierarchical data is sometimes considered a super MDM domain, because it is critical to understanding and sometimes discovering the relationships between master data.
    5. Master—Master data are the critical nouns of a business and fall generally into four groupings: people, things, places, and concepts. Further categorizations within those groupings are called subject areas, domain areas, or entity types. For example, within people, there are customer, employee, and salesperson. Within things, there are product, part, store, and asset. Within concepts, there are things like contract, warrantee, and licenses. Finally, within places, there are office locations and geographic divisions. Some of these domain areas may be further divided. Customer may be further segmented, based on incentives and history. A company may have normal customers, as well as premiere and executive customers. Product may be further segmented by sector and industry. The requirements, life cycle, and CRUD cycle for a product in the Consumer Packaged Goods (CPG) sector is likely very different from those of the clothing industry. The granularity of domains is essentially determined by the magnitude of differences between the attributes of the entities within them.”
  2. Deciding what to manage and how it should be managed depends on some of the following criteria: behavior (how it interacts with other data, eg customers buy product- which may be a part of multiple hierarchies describing how they’re sold), life cycle (created, read, updated, deleted, searched – a CRUD cycle), cardinality, lifetime, complexity, value, volatility, and reuse
  3. Master Data Management is the tech, tools, and processes required to create and maintain consistent and accurate lists of master data, including identifying sources of master data, analyzing metadata, appointing data stewards, setting up a data-governance program, developing a master data model, choosing a toolset, designing infrastructure, generating and testing master data, modifying producing and consuming systems, implementing maintenance processes, and creating a Master List in steps similar to ETL below:
    1. Normalize data formats
    2. Replace Missing values
    3. Standardize Values
    4. Map Attributes
    5. Needs versioning and auditing
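
The first four steps above can be sketched roughly as below. This is only an illustration under my own assumptions: the source field names (cust_name, country, phone), the lookup table, and the toMasterRecord function are all made up, and the article does not prescribe any particular implementation.

```javascript
// Hypothetical source records from two systems with inconsistent formats
var sourceRecords = [
  { cust_name: "  Acme Corp ", country: "usa", phone: "(212) 555-0100" },
  { cust_name: "Bolt LLC", country: "U.S.", phone: null }
];

// Hypothetical lookup table used to standardize country values
var countryCodes = { "usa": "US", "u.s.": "US", "us": "US" };

function toMasterRecord(record) {
  // 1. Normalize data formats: trim whitespace, keep only digits in phone
  var name = record.cust_name.trim();
  var phone = record.phone ? record.phone.replace(/\D/g, "") : null;

  // 2. Replace missing values with an explicit default
  if (phone === null) {
    phone = "UNKNOWN";
  }

  // 3. Standardize values against the lookup table (fall back to the raw value)
  var country = countryCodes[record.country.toLowerCase()] || record.country;

  // 4. Map source attributes onto the master data model's names
  return { name: name, countryCode: country, phone: phone };
}

var masterList = sourceRecords.map(toMasterRecord);
console.log(masterList);
```

Versioning and auditing (step 5) are left out here; in a real setup each change to a master record would need to be tracked.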

Treehouse Learning:  

Object-Oriented-Javascript

  • An object is a container for values in the form of properties and functionality in the form of methods
    • Methods on values can return objects, but they don’t have to return anything at all
  • Accessing or assigning properties is known as getting and setting
  • Native Objects: no matter where your JavaScript programs are run, it will have these objects eg. number, string, object, boolean
  • Host Objects: provided by the host environment, eg. the browser, such as document, console, or element
  • Own Objects: created in own programming eg. characters in a game
  • Objects hide complexity and organize code – known as encapsulation
  • An object literal holds information about a particular thing at a given time – it stores the state of a thing.

Eg.

var person = {
            name: "Lauren",
            treehouseStudent: true,
            "full name": "Lauren Smith"
};

Access using dot notation or square brackets

person.name;
person.treehouseStudent;
person["name"];
person["treehouseStudent"];
person["full name"];
  • Each key is actually a string; the JavaScript interpreter treats keys as strings even when they’re written without quotes
  • Encapsulating code into a single block allows us to keep state and behaviors for a particular thing in one place and code becomes more maintainable
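
A small check I find helpful for the point about keys being strings, reusing the person object from above (repeated here so the snippet runs on its own):

```javascript
var person = {
  name: "Lauren",
  treehouseStudent: true,
  "full name": "Lauren Smith"
};

// Every key comes back as a string, quoted or not
console.log(Object.keys(person)); // ["name", "treehouseStudent", "full name"]

// Dot notation and square brackets reach the same property
console.log(person.name === person["name"]); // true

// Square brackets also work with a key held in a variable,
// which is the only way to reach a key containing a space
var key = "full name";
console.log(person[key]); // "Lauren Smith"
```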

Adding method to an object

var contact = {
  fullName: function printFullName() {
    var firstName = "Andrew";
    var lastName = "Chalkley";
    console.log(firstName + " " + lastName);
  }
}

Anonymous Function

var contact = {
  fullName: function() {
    var firstName = "Andrew";
    var lastName = "Chalkley";
    console.log(firstName + " " + lastName);
  }
}

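We don’t always know the name of the variable an object is stored in, so a method can’t rely on that name to access the object’s properties. That’s what the this keyword is for. Depending on where and how a function is called, this can be different things. Think of this as the owner of the function, eg. the object that the method is called on.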

Eg.

var dice = {
            sides: 6,
            roll: function() {
                var randomNumber = Math.floor(Math.random() * this.sides) + 1; // this means object literal of dice in this case
                console.log(randomNumber);
            }
}

var dice10 = {
            sides: 10,
            roll: function() {
                 var randomNumber = Math.floor(Math.random() * this.sides) + 1; // refers to dice10 variable
                 console.log(randomNumber);

            }

}
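
To see that this depends on how a function is called rather than where it is written, here is a minimal sketch. The describeSides function is made up for illustration, and it returns a string instead of a random roll so the output is predictable.

```javascript
// Hypothetical shared function: it does not belong to any object yet
function describeSides() {
  return "I have " + this.sides + " sides";
}

// The same function is attached to two different object literals
var dice = { sides: 6, describe: describeSides };
var dice10 = { sides: 10, describe: describeSides };

console.log(dice.describe());   // "I have 6 sides"  (this is dice)
console.log(dice10.describe()); // "I have 10 sides" (this is dice10)

// call() lets us choose the owner explicitly
console.log(describeSides.call({ sides: 20 })); // "I have 20 sides"
```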

Object literals are great for one-off objects. If you want to make multiple objects of one type, you need constructor functions:

  • Constructor functions describe how an object should be created
  • Create similar objects
  • Each object created is known as an instance of that object type

Constructor function example and new contact instances (an instance is the specific realization of a particular type or object)

function Contact(name, email) {
    this.name = name;
    this.email = email;
}

var contact = new Contact("Andrew", "andrew@andrew.com");
var contact2 = new Contact("Bob", "bb@andrew.com");
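
A quick sketch of what "instance" means in practice, using the same Contact constructor (repeated so the snippet runs on its own). The replacement email address is made up.

```javascript
function Contact(name, email) {
  this.name = name;
  this.email = email;
}

var contact = new Contact("Andrew", "andrew@andrew.com");
var contact2 = new Contact("Bob", "bb@andrew.com");

// Both objects are instances of the same type
console.log(contact instanceof Contact);  // true
console.log(contact2 instanceof Contact); // true

// Each instance keeps its own state: changing one leaves the other alone
contact2.email = "bob@example.com"; // made-up address
console.log(contact.email);  // "andrew@andrew.com"
console.log(contact2.email); // "bob@example.com"
```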

You can create as many objects of the same type as you like, eg. real world example of:

Media Player

  • Playlist object (initialized by constructor function)
  • Song objects

PM Hack Panel Notes

Two weeks ago, I got to go PM Hack for a hot second, a hackathon for PMs and aspiring PMs put together by Jason Shen and Johanna Beyenbach and hosted by Wayup. I’m really bummed I actually only got to stay for maybe half the day because my actual PM job called me in on a Sunday, but it was definitely unique and one of the cooler initiatives I’ve seen to get people’s hands dirty on Product Management work. In a previous life, I’ve gone to hackathons as a developer, and there is something really inspiring, educational, and rewarding about working with a group of strangers to create something workable in a matter of hours or days.

One thing I did get to stay for and enjoy was a panel by some esteemed folks in the business so to speak – so I thought I’d put down my notes here to keep top of mind:


Some awesome Product Managers: Elan Miller (Midnight), Inga Chen (Squarespace), Lauren Ulmer (Dormify), and Joan Huang (Flatiron Health)

  • Emotional intelligence > IQ in PM roles
  • You need to understand yourself and your vision first
  • Constant tension at work between tending to firedrills v longer range thinking -> one key to working on this is working internal marketing for buy-in on longer term strategy
  • Good PMs are always obsessing over communicating and listening well
  • Give status updates at the right level of context – know how to communicate with everyone from junior devs to executives
  • Saying no is a part of your job
  • Your job is to also bring the team and org together
  • Be cognizant of which step of the product life cycle you are able to work in, and think about what is actually possible to change
  • Team Cultures (build it out) + Users (joy)
  • Managing different dependencies across teams is key
  • Your job is to also define and interpret metrics correctly
  • The bigger the org the more stakeholder communication versus direct time to users
  • Be careful not to over optimize for the negative vocal batch of users versus the majority of users
  • As with everything, it’s right place right time with the right skill set, so you gotta angle to make it happen
  • GV Design Sprint can be a useful problem solving process
  • When you’re interviewing for a PM job, communicate that you know the company’s business:
    • Mini deck to intro yourself, how you can solve company’s problem, and show you’ve done your hw and are more than your resume
    • Understand the levers of the business model (how does the business make money)
    • Apply to fewer jobs and make sure you’re interested in problems the product is trying to solve
    • Find side projects outside of your typical product development life cycle
    • Treat yourself as a product
    • Having a POV and being polarizing can be an advantage
    • Remember you can help them with particular problem you’re trying to solve even if you aren’t from that vertical – you could be bringing a fresh perspective to their problems

Oct Learning

Just my “three key points” notes from various reading I thought was helpful for work this month:

PSFK Advertising Playbook Overview

  1. Experiential marketing now is the most critical tool
  2. Shift from ads to customer relationships and decline of online ads
  3. Emotional connections realign brands -> engineered enjoyment, contextual calibration, and third space communities are opportunities

 

Knowns vs Unknowns — Are you building a successful company or just typing?

  1. First known unknown is that you envision a product that solves a problem that a small group of users have
  2. An engineer’s primary job isn’t really writing code per se, but improving the product for your users
  3. “What I often hear from CEOs is that “my CTO thinks we need to rebuild the backend so it’s scaleable.” The reality is that if you haven’t yet solved for the product’s scaleable and repeatable growth, you don’t know what the backend needs to be. If you’ve hired people that care more about the programming languages/frameworks and not the KPIs of your product, you’ll constantly have this internal battle. Remind them that writing software is the easy part. Building a company that scales isn’t.”

6 lessons learned about technical debts management in Silicon Valley

  1. The product always needs to be improved while tech debt is being paid down at the same time (80/20 rule)
  2. Top Down vision on the importance of these debts “It is not about the money you can make, it is about the money you won’t lose”
  3. Before you kill a feature, identify who is using it, find an alternative, and explain why you are killing it

IGNORE EVERYTHING BETWEEN THE CLOUDS AND DIRT

  • “This is because the vast majority of people tend to play the middle—they focus on the vague minutiae that doesn’t matter”
  • Two things happen when you’re too focused on the middle:
    • You’re only successful to a certain level and then hit a plateau
    • You get stuck in one of two extremes: you get stuck either because you become too romantic on ideals and neglect the skills you need to execute or you get tied up in minutiae or politics and lose sight of the bigger picture.

Unit Economics

  1. “Unit economics are the direct revenues and costs associated with a particular business model expressed on a per unit basis.” Eg Lifetime Value (LTV), Customer Acquisition Cost (CAC)
  2. What you want to do as a product manager is increase average rev per user (ARPU), increase customer lifetime, and drive expansion revenue from existing customers
  3. Make sure you know what your most profitable segment is and what share of the user base it makes up
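
To make the per-unit arithmetic concrete, here is a back-of-the-envelope sketch. Every number is made up, and the lifetime value formula (ARPU times average customer lifetime) is a common simplification that ignores margins and discounting, not something taken from the article.

```javascript
// All figures below are made up for illustration
var monthlyRevenue = 50000;    // total revenue in one month
var activeUsers = 1000;        // paying users in that month
var avgLifetimeMonths = 24;    // how long a customer stays, on average
var acquisitionSpend = 30000;  // sales and marketing spend in that month
var newCustomers = 100;        // customers acquired with that spend

var arpu = monthlyRevenue / activeUsers;    // average revenue per user per month
var ltv = arpu * avgLifetimeMonths;         // simplified lifetime value
var cac = acquisitionSpend / newCustomers;  // customer acquisition cost
var ltvToCac = ltv / cac;                   // how many times over a customer pays back

console.log("ARPU: " + arpu);        // ARPU: 50
console.log("LTV: " + ltv);          // LTV: 1200
console.log("CAC: " + cac);          // CAC: 300
console.log("LTV:CAC " + ltvToCac);  // LTV:CAC 4
```

Raising ARPU or customer lifetime pushes LTV up, which is why those are the levers called out above.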

pm@olin: Building (Class 5)

  1. Understand your personal work and productivity style
  2. Understand the style of your team and tailor your project management to the team – being cognizant of your personal style
  3. Understand your software processes (eg. Waterfall or Agile) and bug triage

Offshore Development: Pluses and Minuses for Product Managers

  1. Hard part is to learn and understand the team and learn what makes them tick and how you can leverage all this and control for issues such as different work cultures and different accents over conference phones
  2. Get to know them and make sure they know you
  3. Keep them informed, establish routines (especially communicating with remote team lead and holding them accountable, hold all-team meetings semi-frequently), and leverage tools

How we develop great PM / Engineering relationships at Asana

  1. Semi-formalized way for sharing leadership and credit
  2. Remember the mantra: product owns the problems and engineering owns the solutions
  3. ‘Clarify roles and reinforce them with mutual respect’