**Articles Read**

Engineering Management Philosophies and Why They Matter Even if You are Not a Manager

- Internal Team Success, External Team Collaboration, Company-wide Responsibilities & Culture, and Strategic Direction and Impact are key buckets of focus
- Do not take things at face value, and learn to overcommunicate
- Being an effective leader is helping the team make decisions rather than making decisions for them

How to choose the right UX metrics for your product

- Two-prong approach of the quality of user experience (using HEART framework) and goals of product/project (using Goals-Signals-Metrics)
- HEART Framework
- Happiness
- Engagement
- Adoption
- Retention
- Task Success

- Goals-Signals-Metrics layered on top

What is Data Egress? Managing Data Egress to Prevent Sensitive Data Loss

- Data egress refers to data leaving a network in transit to an external location. Examples include outbound email messages, cloud uploads, files moved to external storage, copying to a USB drive, FTP/HTTP transfers, etc. Data ingress refers to data outside the network traveling into the network.
- Egress filtering involves monitoring egress traffic for signs of malicious activity.
- Data exfiltration refers to techniques that can result in the loss, theft, or exposure of sensitive data, e.g., stealing USB drives, encrypting or modifying data prior to exfiltration, or using services to mask location or traffic.

- A deal identifier (Deal ID) is the unique number of an automated ad buy. This is the identifier used to match buyers and sellers individually. It implies a previously agreed-upon set of parameters, giving more narrow criteria to programmatic and private marketplaces.
- Deal ID allows publishers to specify the terms and kind of inventory available to different types of advertisers.
- Deal ID can be thought of as an automated insertion order: better flexibility while still controlling the parameters of an ad deal.

Responses to Negative Data: Four Senior Leadership Archetypes.

- Most senior leaders in the org came up when data wasn't so accurate and available
- You have Bubble Kings, who ignore the data, and Attackers on the other end
- Deal with Bubbles: form relationships and justify decisions.
- Deal with Attackers: get out, or provide solutions and not just data

- Rationalizers, who sow doubt, and Curious ones, who ask the why
- Deal with Rationalizers: need to bring overwhelming analytical competence
- Deal with Curious: be joyous and work hard

The Engineering Manager: Working with Product Marketing

- Great marketing makes the code you're writing something people need to have
- “A world-class engineer, designer, product manager and product marketer can really change the world.”

- Teach them your features, work with your team, and let them practice their narrative on your team
- Build in feature toggling – batching sets of features as a campaign. Can do targeted or percentage-based rollouts for select feedback as well.
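The percentage-based rollout idea above can be sketched in a few lines. This is a minimal, hypothetical illustration: the flag names, percentages, and `is_enabled` helper are all made up, not from any particular feature-flag library. Hashing the (feature, user) pair gives each user a stable 0-99 bucket, so a user stays in or out of the rollout as the percentage ramps up.

```python
import hashlib

# Hypothetical flags and rollout percentages (0-100), purely illustrative.
ROLLOUT_PERCENTAGES = {"new_dashboard": 25, "dark_mode": 100}

def is_enabled(feature, user_id):
    """Stable percentage-based rollout: same user always gets the same answer."""
    pct = ROLLOUT_PERCENTAGES.get(feature, 0)  # unknown flags default to off
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket per (feature, user) pair
    return bucket < pct

print(is_enabled("dark_mode", "user-1"))  # True: rolled out to 100%
print(is_enabled("missing", "user-1"))    # False: unknown flags are off
```

Batching a campaign would just mean flipping a set of these flags together; targeted rollouts would swap the hash bucket for an allowlist check.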

Building Customer Churn Models for Business

- “In its simplest form, churn rate is calculated by dividing the number of customer cancellations within a time period by the number of active customers at the start of that period. Very valuable insights can be gathered from this simple analysis — for example, the overall churn rate can provide a benchmark against which to measure the impact of a model. And knowing how churn rate varies by time of the week or month, product line, or customer cohort can help inform simple customer segments for targeting as well.”
- Churn can be characterized as
- Contractual
- Customers buy at intervals or otherwise observed. Eg subscriptions

- Non Contractual
- Free to buy or not anytime. Churn is not explicit, eg. ecommerce

- Voluntary
- Customers chose to leave service

- Involuntary
- Customers forced to discontinue, e.g., due to failed payments

- Good churn models should factor in things like different risk scores, predict probabilities of churn for different use cases, and have metrics that stakeholders will understand and respond to


Getting started with AI? Start here!

- Write down the labels you'll accept, how you'd know if the answer is right for one of them, and what mistakes might look like. It will save you trouble downstream and put you in the right paradigm.
- Remember: the goal of analytics is to generate inspiration and inform the decision maker.
- ML/AI is for projects where the goal is to use data to automate thing-labeling.
- Data mining is about maximizing the speed of discovery, while ML/AI is about performance in automation.

Once You’re in the Cloud, How Expensive Is It to Get Out?

- Negotiate a good egress rate or account for it just in case
- Ingress of course is usually free

How to Build an Amazing Relationship Between Product Management and Marketing

- Figure out how to align Product’s metrics/goals with Marketing
- User feedback vs. lead gen

- Start early on products/planning across divisions
- Have transparent strategic goals and defined roles off the bat

How Product Marketers Want to Work With Product Managers

- Show Product Marketers the full plan from the start
- Work with the Product Marketer to connect your product, or part of it, to the full system experience
- Share data and customer stories, as the perspective from either side is different but important for collaboration and positioning

Second-Order Thinking: What Smart People Use to Outperform

- Don't seize the first available option, no matter how good it seems, before you've asked questions and explored. It's asking "and then what?"
- Think about how others in ecosystem will respond to business decisions, your suppliers, regulators, etc.
- Think in terms of ten minutes, ten weeks, ten months, ten years, etc.

Why a High Performing Product Marketing Team Is the Key to Growth

- Product marketers should champion the voice of the customer, more akin to a sociologist or psychologist than a product manager or technologist
- Product marketers can perform win/loss analysis, customer profiles, segmentation, buyer personas, etc.
- Knows how to bundle features; understands market and category insights vis-à-vis the competitive environment

Why Modeling Churn is Difficult

- Churn = CustomersLostDuringPeriod/CustomersAtBeginningOfPeriod
- The difficulty is uncovering the true rate of churn, accounting for differences from period to period as you move toward accounting for seasonality, etc.
- A stochastic model is one way to approach this problem, as it allows for random variation of the inputs over time.
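The churn formula above, plus the stochastic point, can be illustrated with a toy simulation. All numbers here are made up: treating each customer's cancellation as an independent coin flip shows how the observed period-to-period churn rate fluctuates even when the true underlying rate is constant.

```python
import random

def churn_rate(customers_lost, customers_at_start):
    """Simple period churn: cancellations divided by customers at period start."""
    return customers_lost / customers_at_start

print(churn_rate(50, 1000))  # 0.05

# Toy stochastic view: a constant 5% true rate still yields varying
# observed rates from one period to the next.
random.seed(42)
true_rate = 0.05
customers = 1000
for period in range(3):
    lost = sum(1 for _ in range(customers) if random.random() < true_rate)
    print(period, churn_rate(lost, customers))
```

A real stochastic churn model would go further (e.g., fitting a distribution over individual churn propensities), but this captures why a single period's rate can mislead.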

Content Targeting Driving Brand Growth Without Collecting User Data

- There are still effective avenues to create meaningful content targeting strategies using channels and demographics, without relying on user data
- Integrated content targeting still improves the overall media experience
- It also increases trust and length of time engaged

LetsLearnAI: What Is Feature Engineering for Machine Learning?

- “Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work. If feature engineering is done correctly, it increases the predictive power of machine learning algorithms by creating features from raw data that help facilitate the machine learning process. Feature Engineering is an art.”
- Combining two columns, like lat and long, together into one feature is known as a Crossed Column and can help the model learn better.
- Bucketized columns are sometimes useful, eg like pooling age ranges, 25-35, etc.
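A minimal, library-free sketch of the two ideas above: a bucketized column (pooling values into ranges) and a crossed column (combining two bucketized values into a single feature id). The bucket boundaries and helper names are arbitrary choices for illustration, not from the article.

```python
def bucketize(value, boundaries):
    """Return the index of the bucket a value falls into, e.g. age ranges."""
    for i, boundary in enumerate(boundaries):
        if value < boundary:
            return i
    return len(boundaries)  # value is past the last boundary

def cross(lat, lon, boundaries):
    """Crossed column: combine two bucket indices into one feature id."""
    n_buckets = len(boundaries) + 1
    return bucketize(lat, boundaries) * n_buckets + bucketize(lon, boundaries)

age_buckets = [25, 35, 50]
print(bucketize(30, age_buckets))  # 1 -> falls in the 25-35 range

# Lat/long crossed into a single grid-cell feature.
print(cross(37.7, -122.4, [-90, 0, 90]))  # 8
```

Frameworks like TensorFlow expose the same ideas as built-in bucketized and crossed feature columns; the point is that the model sees one categorical feature instead of two raw numbers.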

What Is Regularization In Machine Learning?

- Regularization is used to solve the problem of overfitting in machine learning models – that is, when a model learns so much from the noise in the training data that its performance on new data suffers.
- There are two types of Regularization
- L1: Lasso regularization: Adds a penalty to the error function. The penalty is the sum of the absolute values of weights.
- L2: Ridge regularization: Adds penalty using the sum of squared values of the weights

- Generally, good models do not give more weight to a particular feature – the weights are evenly distributed using regularization to solve for overfitting.

What Are L1 and L2 Loss Functions? L1 vs L2 Loss Function

- The L1 loss function minimizes the error in ML models using the sum of all absolute differences between true and predicted values. Called LAD, or Least Absolute Deviations
- The L2 loss function, Least Square Errors or LS, minimizes the error using the sum of all squared differences between predicted and true values
- L2 is generally preferred, but it does not work well if the data set has outliers, because the squared differences lead to a much larger error.
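A quick check of the outlier claim above, using made-up numbers: the same predictions scored with L1 (sum of absolute errors) and L2 (sum of squared errors), where one data point is a bad outlier.

```python
def l1_loss(y_true, y_pred):
    """LAD: sum of absolute differences between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred))

def l2_loss(y_true, y_pred):
    """LS: sum of squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

y_true = [1.0, 2.0, 3.0, 100.0]  # the last point is an outlier
y_pred = [1.1, 2.1, 2.9, 4.0]

print(round(l1_loss(y_true, y_pred), 1))  # 96.3 -- outlier adds linearly
print(round(l2_loss(y_true, y_pred), 2))  # 9216.03 -- outlier dominates when squared
```

The three good predictions contribute almost nothing under L2; the single outlier swamps everything, which is exactly why L1 is the more robust choice for noisy data.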

**Learning to explain gains/lift better as an outcome of machine learning models**

Cumulative Gains and Lift Charts

- A cumulative gains chart shows the percentage of the overall number of cases in a given category "gained" by targeting a percentage of the total number of cases.
- For each point on the curve, the x-axis is the percentage of total cases targeted and the y-axis is the percentage of the category "gained"
- The diagonal line is the baseline: if you select 20% of cases from the scored dataset at random, you would expect to "gain" 20% of all cases in the category
- What makes a desirable gain depends on the cost of errors, e.g., Type I and Type II errors, as you move up the curve

- A lift chart is derived from the cumulative gains chart
- Values on the y-axis correspond to the ratio of the cumulative gain for the curve to the baseline
- It's another way of looking at the gains chart
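The gains/lift calculation described above can be hand-rolled in a few lines: sort cases by model score (descending), then for each top slice compute what share of all positives it captures. The scores and labels here are made up for illustration.

```python
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1,   1,   0,   1,   0,   0,   1,   0,   0,   0]  # 4 positives total

# Sort cases by score, best first (already sorted here, but be explicit).
pairs = sorted(zip(scores, labels), reverse=True)

total_pos = sum(labels)
gains = []      # gains[i] = share of all positives captured in the top i+1 cases
captured = 0
for _, label in pairs:
    captured += label
    gains.append(captured / total_pos)

# Targeting the top 20% of cases (2 of 10) captures 50% of positives.
print(gains[1])             # 0.5
# Lift = gain / baseline: 0.5 / 0.2 = 2.5x better than random targeting.
print(gains[1] / (2 / 10))  # 2.5
```

Plotting `gains` against the fraction of cases targeted gives the cumulative gains curve; dividing each point by the diagonal baseline gives the lift curve.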

**Treehouse**

- O(1): Constant: takes constant time regardless of n; doesn't change. Ideal because input size doesn't matter
- O(log n): Logarithmic (sometimes called sublinear) runtime; as n grows large, the number of operations grows slowly and flattens out
- O(n): Linear time, e.g., reading every item in a list
- O(n^2): Quadratic time, e.g., for any given value of n, we carry out n^2 operations
- Cubic runtimes: n^3 operations
- Quasilinear runtimes: O(n log n)
- For every value of n, we execute log n operations: n times log n
- Lies between a linear runtime and a quadratic runtime
- Sorting algorithms are where you see this
- Merge sort is a classic example of a quasilinear runtime

- Polynomial runtime O(n^k): the running time is n raised to some power k
- Anything bounded by a polynomial runtime is considered efficient

- Exponential runtime O(x^n) algorithms are too expensive to be used, e.g., brute-force algorithms, analogous to manually testing each combination on a lock to break it: a three-digit lock has 1,000 possible values, and a four-digit lock has 10,000
- Traveling Salesman analogy (e.g., multiple routes), or factorial O(n!)
- Knowing off the bat that a problem is effectively unsolvable in realistic time means you can focus your efforts on other aspects of the problem.

- Worst Case Complexity
- When evaluating the run time of an algorithm, we say that the algorithm has, as its upper bound, the same run time as its least efficient step.
- The run time of binary search in the worst case is O(log n), i.e., logarithmic.

linear_search.py

```python
def linear_search(list, target):
    """Returns the index position of the target if found, else returns None"""
    for i in range(0, len(list)):
        if list[i] == target:
            return i
    return None

def verify(index):
    if index is not None:
        print("Target found at index: ", index)
    else:
        print("Target not found in list")

numbers = [1, 2, 3, 4, 5, 6, 7, 8]
result = linear_search(numbers, 12)
verify(result)  # Target not found in list
```

In the worst-case scenario, this loop has to go through the entire range of values and read every element in the list. That gives a Big O value of O(n): it runs in linear time.

Binary Search

```python
def binary_search(list, target):
    first = 0
    last = len(list) - 1
    while first <= last:
        midpoint = (first + last) // 2
        if list[midpoint] == target:
            return midpoint
        elif list[midpoint] < target:
            first = midpoint + 1  # target is after the midpoint
        else:
            last = midpoint - 1  # target is before the midpoint
    return None

numbers = [1, 2, 3, 4, 5, 6, 7, 8]
print(binary_search(numbers, 6))  # 5
```

recursive_binary_search.py

```python
def recursive_binary_search(list, target):
    if len(list) == 0:
        return False
    else:
        midpoint = len(list) // 2
        if list[midpoint] == target:
            return True
        elif list[midpoint] < target:
            # new list using the slice operation
            return recursive_binary_search(list[midpoint + 1:], target)
        else:
            return recursive_binary_search(list[:midpoint], target)

def verify(result):
    print("Target found: ", result)

numbers = [1, 2, 3, 4, 5, 6, 7, 8]
result = recursive_binary_search(numbers, 12)
verify(result)  # Target found:  False
result = recursive_binary_search(numbers, 6)
verify(result)  # Target found:  True
```


- A recursive function is one that calls itself
- When writing a recursive function, you always need a stopping condition, often called the base case
- E.g., the empty list in the example above, or finding the midpoint

- The number of times a recursive function calls itself is called Recursive Depth
- An iterative solution is generally implemented using a loop of some kind, whereas a recursive solution involves a set of stopping conditions and a function that calls itself
- In functional languages, we avoid changing data that is given to a function
- Python, on the flip side, prefers iterative solutions and has a Maximum Recursion Depth (how many times a function can call itself)
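Python's recursion limit can be seen directly. This small sketch (function name and depths are made up) shows a recursive function with a base case, and what happens when the recursion depth exceeds Python's cap:

```python
import sys

def countdown(n):
    # Each call adds a frame to the call stack, so recursive depth grows with n.
    if n == 0:  # base case: the stopping condition
        return "done"
    return countdown(n - 1)

print(sys.getrecursionlimit())  # Python's default cap, typically 1000
print(countdown(100))           # done

try:
    countdown(10**6)  # far past the limit
except RecursionError:
    print("hit Python's maximum recursion depth")
```

This is why, without tail call optimization, the iterative version of an algorithm is the safer choice in Python for large inputs.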

- Space Complexity
- Space Complexity is a measure of how much more working storage or extra storage is needed as an algorithm grows
- For example, recursive binary search has O(log n) space complexity in Python
- Tail call optimization, in some programming languages such as Swift, reduces the space and computing overhead of recursive functions when the recursive call is the last line of code in the function. Python does not implement tail call optimization, so the iterative version is the safer, more optimal choice

Reporting with SQL (Review time!)

- Ordering
- SELECT <columns> FROM <table> ORDER BY <column>;

SELECT * FROM customers ORDER BY last_name ASC, first_name ASC;

- Limiting
- SELECT * FROM <table> LIMIT <# of rows>;
- SELECT * FROM campaigns ORDER BY sales DESC LIMIT 3;

- Offset
- The OFFSET keyword is used with SELECT and LIMIT to skip a number of rows, providing a range of records to select
- SELECT * FROM <table> LIMIT <# of rows> OFFSET <skipped rows>;
- SELECT * FROM orders LIMIT 50 OFFSET 100;
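These ORDER BY / LIMIT / OFFSET patterns can be tried end-to-end with Python's built-in sqlite3 module. The `orders` table and its contents here are made up for the demo:

```python
import sqlite3

# In-memory database with a toy "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, sales REAL)")
conn.executemany(
    "INSERT INTO orders (sales) VALUES (?)",
    [(i * 10.0,) for i in range(1, 11)],  # ten rows: sales 10.0 .. 100.0
)

# Top 3 orders by sales (Limiting + Ordering).
top3 = conn.execute("SELECT id FROM orders ORDER BY sales DESC LIMIT 3").fetchall()
print(top3)  # [(10,), (9,), (8,)]

# Skip the first 3 rows, then take the next 2 (Offset): rows 4 and 5.
page = conn.execute("SELECT id FROM orders ORDER BY id LIMIT 2 OFFSET 3").fetchall()
print(page)  # [(4,), (5,)]
```

OFFSET only makes sense together with an ORDER BY; without one, SQL gives no guarantee about which rows get skipped.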

- Manipulating Text
- Aggregation
- Date times
