Apr-June 2018 Learning

Books Read (related to work/professional development/betterment):

Articles:

Giving meaning to 100 billion analytics events a day

  1. Tracking events were sent by browser over HTTP to a dedicated component and enqueues them in a Kafka topic. You can build a Kafka equivalent in BigQuery to use as Data Warehouse system
  2. ‘When dealing with tracking events, the first problem you face is the fact that you have to process them unordered, with unknown delays.
    1. The difference between the time the event actually occurred (event time) and the time the event is observed by the system (processing time) ranges from the millisecond up to several hours.’
  3. The key for them was finding ideal batch duration

What is a Predicate Pushdown?

  1. Basic idea is certain parts of SQL queries (the predicates) can be “pushed” to where the data lives and reduces query/processing time by filtering out data earlier rather than later. This allows you to optimize your query by doing things like filtering data before it is transferred over a network, loading into memory, skipping reading entire files or chunks of files.
  2. ‘A “predicate” (in mathematics and functional programming) is a function that returns a boolean (true or false). In SQL queries predicates are usually encountered in the WHEREclause and are used to filter data.’
  3. Predicate pushdowns filters differently in various query environments, eg Hive, Parquet/ORC files, Spark, Redshift Spectrum, etc.

I hate MVPs. So do your customers. Make it SLC instead.

  1. Customers hate MVPS, too M and almost never V – simple, complete, and lovable is the way to go
  2. The success loop of a product “is a function of love, not features”
  3. “An MVP that never gets additional investment is just a bad product. An SLC that never gets additional investment is a good, if modest product.”

Documenting for Success

  1. Keeping User Stories Lean and Precise with:
    1. User Story Objectives
    2. Use Case Description
    3. User Interaction and Wireframing
    4. Validations
    5. Communication Scenarios
    6. Analytics
  2. Challenges
    1. Lack of participation
    2. Documentation can go sour
  3. Solutions
    1. Culture – tradition of open feedback
    2. Stay in Touch with teams for updates
    3. Documentation review and feedback prior to sprint starts
    4. Track your documents

WTF is Strategy

  1. Strategic teaming is what sets apart seniors from juniors
  2. Strategy needs
    1. Mission: Problem you’re trying to solve and who for
    2. Vision: Idealized solution
    3. Strategy: principles and decisions informed by reality and caveated with assumptions that you commit to ahead of dev to ensure likelihood of success in achieving your vision
    4. Roadmap: Concreate steps
    5. Execution: Day-today activities
  3. “Strategy represents the set of guiding principles for your roadmapping and execution tasks to ensure they align with your mission and vision.”

Corporate Culture in Internet Time

  1. “”The dirty little secret of the Internet boom,” says Christopher Meyer, who is the author of “Relentless Growth,” the 1997 management-based-on-Silicon-Valley-principles book, “is that neither startup wizards nor the venture capitalists who fund them know very much about managing in the trenches.”
  2. “ The most critical factor in building a culture is the behavior of corporate leaders, who set examples for everyone else (by what they do, not what they say). From this perspective, the core problem faced by most e-commerce companies is not a lack of culture; it’s too much culture. They already have two significant cultures at play – one of hype and one of craft.”
  3. Leaders need to understand both craft and hype cultures since they have to rely on teams that come from both to deliver. They need to set-up team cultures and infrastructure that supports inter-team learning.

Do You Want to Be Known For Your Writing, or For Your Swift Email Responses? Or How the Patriarchy has fucked up your priorities

  1. Women are conditioned to keep proving themselves – our value is contingent on ability to meet expectation of others or we will be discredited. This is often true, but do you want to a reliable source of work or answering e-mails?
  2. Stop trying to get an A+ in everything, it’s a handicap in making good work. “Again, this speaks most specifically to women, POC, queers, and other “marginalized” folks. I am going to repeat myself, but this shit bears repeating. Patriarchy (and institutional bigotry) conditions us to operate as if we are constantly working at a deficit. In some ways, this is true. You have to work twice as hard to get half the credit. I have spent most of my life trying to be perfect. The best student. The best dishwasher. The best waitress. The best babysitter. The best dominatrix. The best heroin addict. The best professor. I wanted to be good, as if by being good I might prove that I deserved more than the ephemeral esteem of sexist asshats.”

Listen to me: Being good is a terrible handicap to making good work. Stop it right now. Just pick a few secondary categories, like good friend, or good at karaoke. Be careful, however of categories that take into account the wants and needs of other humans. I find opportunities to prove myself alluring. I spent a long time trying to maintain relationships with people who wanted more than I was capable of giving

  1. Stop thinking no as just no but saying yes to doing your best work

Dear Product Roadmap, I’m Breaking Up with You

  1. A major challenge is setting up roadmap priorities without real market feedback, especially in enterprise software
  2. Roadmaps should be planned with assets in place tied closely to business strategy
    1. A clearly defined problem and solution
    2. Understanding of your users’ needs
    3. User Journeys for the current experience
    4. Vision -> Business Goals -> User Goals -> Product Goals -> Prioritize -> Roadmap
  3. Prioritization should be done through the following lens: feasibility, desirability, and viability

The 7 Steps of Machine Learning Google Video

  • Models are created via training
  • Training helps create accurate models that answers questions correctly most of the time
  • This require data to train on
    • Defined features for telling apart beer and wine could be color and alcohol percentage
  • Gathering data, quality and quantity determine how good model can be
  • Put data together and randomize so order doesn’t affect how that determines what is a drink for example
  • Visualize and analyze during data prep if there’s a imbalance in data in the model
  • Data needs to be split, most for (70-80%) and some left for evaluation to test accuracy (20-30%)
  • A big choice is choosing a model – eg some are better for images versus numerical -> in the beer or wine example is only two features to weigh
  • Weights matrix (m for linear)
  • Biases metric (b for linear)
  • Start with random values to test – creates iterations and cycles of training steps and line moves to split wine v beer where you can evaluate the data
  • Parameter tuning: How many times we through the set -> does that lead to more accuracies, eg learning rate how far we are able to shift each line in each step – hyperparameters are experimental process bit more art than science
  • 7 Steps: Gathering Data -> Preparing Data -> Choosing a Model -> Training -> Evaluation -> Hyperparameter Tuning -> Prediction

Qwik Start Baseline Infra Quest: 

  • Cloud Storage Google Consolae
  • Cloud IAM
  • Kubernetes Engine

Treehouse Learning:  

Javascript OPP

  • In JavaScript, state are represented by objects properties and behaviors are represented by object methods.
    • Radio that has properties like station and volume and methods like turning off or changing a station
  • An object’s states are represented by “property” and its behaviors are presented by “method.”
  • Putting properties and methods into a package and attaching it to a variable is called encapsulation.

Intro SQL Window Functions

  • Function available in some variations of SQL that lets you analyze a row in context of entire result set – compare one row to other rows in a query, eg percent of total or moving average

Common Table Expressions using WITH

  • CTE – a SQL query that you name and reuse within a longer query, a temporary result set
  • You place a CTE at the beginning of a complete query using a simple context
--- create CTES using the WITH statement
WTH cte_name AS (
  --- select query goes here
)

--- use CTEs like a table
SELECT * FROM cte_name
  • CTE name is like an alias for the results returned by the query, you can then use the name just like a table name in the queries that follow the CTE
WITH product_details AS (
  SELECT ProductName, CategoryName, UnitPrice, UnitsInStock
  FROM Products
  JOIN Categories ON PRODUCTS.CategoryID = Categories.ID
  WHERE Products.Discontinued = 0
)

SELECT * FROM product_details
ORDER BY CategoryName, ProductName
SELECT CategoryName, COUNT(*) AS unique_product_count, 
SUM(UnitsInStock) AS stock_count
FROM product_details
GROUP BY CategoryName
ORDER BY unique_product_count
  • CTE makes code more readable, organizes queries into reusable modules, you can combine multiple CTEs into a single query, it can better match of how we think of results set in the real world
    • all orders in past month-> all active customers -> all products and categories
    • Each would be a CTE
  • Subqueries create result sets that look just like a table that can be joined to another tables
WITH all_orders AS (
  SELECT EmployeeID, Count(*) AS order_count
  FROM Orders
  GROUP BY EmployeeID
),
late_orders AS (
    SELECT EmployeeID, COUNT(*) AS order_count
    FROM Orders
    WHERE RequiredDate <= ShippedDate
    GROUP BY EmployeeID
)
SELECT Employees.ID, LastName,
all_orders.order_count AS total_order_count,
late_orders.order_count AS late_order_count
FROM Employees
JOIN all_orders ON Employees.ID = all_orders.EmployeeID
JOIN late_orders ON Employees.ID = late_orders.EmployeeID
  • Remember one useful feature of CTES is you can reference them later in other CTEs, eg. revenue_by_employee below pulling from all_sales
  • You can only reference a CTE created earlier in the query, eg first CTE can’t reference the third
WITH
all_sales AS (
  SELECT Orders.Id AS OrderId, Orders.EmployeeId,
  SUM(OrderDetails.UnitPrice * OrderDetails.Quantity) AS invoice_total
  FROM Orders
  JOIN OrderDetails ON Orders.id = OrderDetails.OrderId
  GROUP BY Orders.Id
),
revenue_by_employee AS (
  SELECT EmployeeId, SUM(invoice_total) AS total_revenue
  FROM all_sales
  GROUP BY EmployeeID
),
sales_by_employee AS (
  SELECT EmployeeID, COUNT(*) AS sales_count
  FROM all_sales
  GROUP BY EmployeeID
)
SELECT revenue_by_employee.EmployeeId,
Employees.LastName,
revenue_by_employee.total_revenue,
sales_by_employee.sales_count,
revenue_by_employee.total_revenue/sales_by_employee.sales_count AS avg_revenue_per_sale
FROM revenue_by_employee
JOIN sales_by_employee ON revenue_by_employee.EmployeeID = sales_by_employee.EmployeeID
JOIN Employees ON revenue_by_employee.EmployeeID = Employees.Id
ORDER BY total_revenue DESC
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s