Ask Women in Product: When and how do I add Machine Learning to a product roadmap?

Photo by Bea Rose via Twenty20

Answer from Kenlyn Terai

Author’s note: While working as a global product manager at Agilent, I became enamored with advanced statistics and its business applications during my MBA at UCLA Anderson, where I studied modeling and prediction with Dr. Richard Stern and One-to-One Marketing and Marketing Research with Dr. Anand Bodapati. When I later worked as an Engineering instructor and Developer, I sought best-practice insights from Machine Learning (ML) Product Management practitioners to understand how ML is utilized in building applications. Now that I’ve returned to product management work, I have integrated these experiences into a far better understanding of how to work ML into my product roadmaps.

Table of Contents

Introduction
— What is Machine Learning?
— Artificial Intelligence > Machine Learning > Deep Learning
When is a Machine Learning-based solution appropriate?
— Problems that are well suited for ML-based solutions
— Problems that don’t need ML-based solutions
What do I need to succeed?
How do I fit Machine Learning into my roadmap?
— 1. Understand the business need
— 2. Formulate the problem hypothesis
— 3. Define clear measures of success
— 4. Assess your candidate projects
— 5. Confirm that you have the data you need to succeed
— 6. Establish healthy data collection practices
— 7. Tackle the risks and challenges
— 8. Expect to iterate
Conclusion
Resources and References

Introduction

Are you looking for ways to improve personalization, natural language processing (NLP), or search customization? Machine Learning (ML) could be the tool you need. This article will describe what Machine Learning is, what problems it can solve for you, and how to incorporate this toolkit into your product roadmaps. More importantly, this article also explains when machine learning is not the right tool for the job.

What is Machine Learning?

Arthur Samuel, a pioneer of AI research, provides us with a concise definition: “Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed.” Where traditional programming requires us to provide data and rules to generate answers, Machine Learning requires us to provide desired answers and data to generate rules.
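To make the contrast concrete, here is a minimal sketch in Python. The order amounts, the labels, and both function names are made up for illustration. The first function is a hand-written rule; the second derives a comparable rule from labeled examples, which is the "answers and data in, rules out" idea in miniature.

```python
# Traditional programming: a person writes the rule.
def is_large_order_rule(amount: float) -> bool:
    return amount >= 100.0  # threshold chosen by hand (illustrative value)


# Machine learning (toy version): the rule is derived from labeled examples.
def learn_threshold(amounts: list[float], labels: list[bool]) -> float:
    """Pick the cut-off that misclassifies the fewest labeled examples."""
    candidates = sorted(set(amounts))
    best_threshold, best_errors = candidates[0], len(amounts) + 1
    for t in candidates:
        # Count examples where "amount >= t" disagrees with the given label.
        errors = sum((a >= t) != y for a, y in zip(amounts, labels))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


# Hypothetical data: order amounts and whether a person called them "large".
amounts = [20.0, 35.0, 50.0, 80.0, 120.0, 150.0]
labels = [False, False, False, True, True, True]
learned = learn_threshold(amounts, labels)
```

Real ML models learn far richer rules than a single threshold, but the division of labor is the same: people supply the examples and the desired answers, and the algorithm finds the rule.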

Image credit: https://towardsdatascience.com

Artificial Intelligence > Machine Learning > Deep Learning

Machine Learning (ML) is sometimes confused with Artificial Intelligence (AI) and Deep Learning (DL), which are related but distinct. Artificial Intelligence is the broad term for any “intelligent” behavior in which a machine completes tasks by following a set of problem-solving rules (i.e., algorithms); it includes both ML and DL. ML is a subset of AI, and DL is a subset of ML. Moving from AI through ML to DL, the rules are defined with less and less human involvement and, as a result, more and more data is required. The diagram below shows where ML and DL fall within the AI discipline.

Image credit: https://blog.stateofthedapps.com/

When is a Machine Learning-based solution appropriate?

Problems that are well suited for ML-based solutions

Good use cases for Machine Learning include problems that require personalization, ranking, classification, regression, clustering, or identifying anomalies. Note that the type of problem you want to solve will drive the choice of algorithm to use (e.g. for clustering, you would use k-means).

  • Requires complex logic that’s impractical to solve with human-defined rules, or heuristics. For example, search engines often have multiple phases of ranking that happen in series, such as initial retrieval, primary ranking, contextual ranking, and personalized ranking. This is a great application for Machine Learning.
  • Will scale up very fast. If you expect that your problem will scale to thousands of users or more, then it could be a good use case for ML. Let’s say you have an online retail platform and expect thousands of customers to take advantage of your new offer within three to six months. Retail customer expectations being what they are, you need a personalized experience, which is a good justification for ML.
  • Requires personalization at scale. Unless you can reduce the complexity of the problem space by, for example, creating solutions for a particular segment or category rather than for each individual, you’re better off using ML to define the rules you use.
  • Requires rules that change quickly over time. If your rules generally remain static year after year, then a heuristic solution is preferred. However, if your business’s success depends upon quick adaptation and rule changes, then ML is a good route. For example, if you have a search product and Ed Sheeran drops a new album, your ranking needs to adapt in real time, which makes the problem amenable to an ML solution.
  • Has a known, pre-defined end result. For example, in online retail, you want your model to provide recommendations which result in a sale. In search, typing “shirt” should return results with lists of shirts that are most likely to lead to a purchase.
  • Does not require 100% accuracy. If business success can be achieved with a high probability of accuracy rather than with perfection, then ML is a good option. For example, recommendation systems will not be considered faulty if users don’t always want what is served. Users can still have a great experience and the program can learn from the lack of sales to deliver improved recommendations in the future.
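As an illustration of how the problem type drives the algorithm, here is a bare-bones sketch of k-means for the clustering case. It works on one-dimensional data only, and the customer spend figures and starting centers are invented for the example; a production system would normally rely on a tested library implementation.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on one-dimensional data, starting from the given centers."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical monthly spend per customer, in dollars.
spend = [12, 15, 14, 90, 95, 100]
centers, clusters = kmeans_1d(spend, centers=[0, 50])
```

No one told the algorithm where "low spenders" end and "high spenders" begin; the two groups come out of the data.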

Problems that don’t need ML-based solutions

Some features of Google’s Gmail are great examples of when a heuristic-based solution is preferred. In the screenshot below, Gmail looks for phrases including words like “attachment” or “attached” to pop up a reminder when someone may have forgotten to attach a file. Although an ML system would most likely catch more potential mistakes, it would be far more costly to build. The heuristic solution provides a good user experience and allows organizational resources to be used for more impactful projects.

A successful heuristic-based solution (Image by Google/Gmail)
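A heuristic of this kind can be only a few lines long. The sketch below is a guess at the general shape, not Gmail's actual implementation; the keyword pattern and the function name are assumptions made for illustration.

```python
import re

# Hypothetical keyword pattern; the real product's rules are not described here.
ATTACHMENT_HINT = re.compile(r"\battach(ed|ment|ments|ing)?\b", re.IGNORECASE)


def should_warn_about_attachment(body: str, attachment_count: int) -> bool:
    """Warn when the text mentions an attachment but no file is attached."""
    return attachment_count == 0 and ATTACHMENT_HINT.search(body) is not None
```

A rule like this will miss phrasings such as "here is the file", and that is the trade-off: it costs almost nothing to build and maintain, and it catches the most common case.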

What do I need to succeed?

A successful Machine Learning project needs the following:

  • A clearly defined business problem. Machine Learning shifts engineering work from a deterministic process to an experimental one, so there is an even greater need to know what you want to achieve upfront.
  • The right team. You’ll need a team with skills in both Data Science and Engineering. In Data Science, roles can include an ML Scientist, Applied Scientist, Data Scientist, and/or Research Scientist. In Engineering, roles can include a Business Intelligence Engineer, Data Engineer, Software Engineer, Development Manager, and/or Technical Program Manager.
  • A grasp of the project’s potential risks and returns. Even if a project is feasible from an ML perspective, the level of effort needed to develop and maintain an ML-based solution may make such an approach impractical. For example, if your model needs to be updated very frequently in production, to the point where it requires a lot of maintenance, it may not be worth it.
  • Enough data. You need at least thousands of rows of data for linear models and hundreds of thousands of rows for a neural network. If you don’t have the data, you’ll need strategies to acquire it, and may need to stick to heuristic-based approaches until you do.
  • Data with a clear pattern. Since algorithms require patterns to learn from, you need to have a sense of what patterns exist in your data, even if you don’t know the precise pattern before you get started. You should be able to articulate these patterns qualitatively or have a gut feeling at the minimum.
  • High-quality data. As the adage goes, “garbage in, garbage out.” Lean on your Data Scientists to help you determine your data’s status and/or how to acquire data that are:
    a. sufficient in completeness and simplicity
    b. relevant to the problem that you are solving
    c. recent, reflecting users’ current behaviors
    d. representative of the segment and timeframe you’re addressing
    e. unbiased
    f. respectful of user privacy and secure
  • The right technologies. There are several tools and platforms, both open-source and commercial, such as Amazon’s AI services, TensorFlow (originally developed by Google), and many others that make machine learning accessible to virtually any company today.
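Some of the data requirements above can be checked with very little code before a project is committed to the roadmap. The sketch below is one possible first pass, not a substitute for your Data Scientists' review. The row minimums are a literal reading of the rule of thumb above, and the field names, the one-year recency window, and the function name are assumptions.

```python
from datetime import date, timedelta

# Lower bounds read off the rule of thumb above; treat them as rough guides.
MIN_ROWS = {"linear": 1_000, "neural_network": 100_000}


def data_readiness(rows, required_fields, model_family, today, max_age_days=365):
    """Return simple volume, completeness, and recency checks for a dataset."""
    fields = list(required_fields) + ["event_date"]
    # Complete rows have a value for every required field and a date.
    complete = [r for r in rows if all(r.get(f) is not None for f in fields)]
    # Recent rows fall inside the chosen time window.
    cutoff = today - timedelta(days=max_age_days)
    recent = [r for r in complete if r["event_date"] >= cutoff]
    return {
        "enough_rows": len(recent) >= MIN_ROWS[model_family],
        "share_complete": len(complete) / len(rows) if rows else 0.0,
        "share_recent": len(recent) / len(complete) if complete else 0.0,
    }


# Hypothetical purchase records.
rows = [
    {"user_id": 1, "amount": 20.0, "event_date": date(2024, 5, 1)},
    {"user_id": 2, "amount": None, "event_date": date(2024, 5, 2)},
    {"user_id": 3, "amount": 35.0, "event_date": date(2021, 1, 1)},
    {"user_id": 4, "amount": 50.0, "event_date": date(2024, 4, 1)},
]
report = data_readiness(rows, ["user_id", "amount"], "linear", today=date(2024, 6, 1))
```

Checks for relevance, representativeness, bias, and privacy do not reduce to a row count and still need human judgment.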

How do I fit Machine Learning into my roadmap?

While the specific details may vary from one project and company to another, these are the general steps that you’d need to consider.

1. Understand the business need

Start by avoiding the mistake made by many companies and product teams: don’t jump right into product strategies that start with Machine Learning as a solution. Instead, focus first on a meaningful problem to solve.

2. Formulate the problem hypothesis

For each of the business needs identified in the preceding step, formulate and document the hypothesis that you intend to test. In general, your hypothesis statement will have the following parts:

  • A change that you are testing (“Improving the search ranking with ML will…”)
  • A desired outcome (“… allow our customers to find the correct product…”)
  • Success metrics (“…in 15% less time.”)
  • A description of the Model’s output (“The model will score each possible product…”)
  • Predictors (“…by using products recently viewed by the customer, the types of products previously purchased, the monetary value of previous purchases, and our own understanding of what products are frequently bought together …”)
  • Target (“…to predict the product that the customer ultimately selects for purchase.”)
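If your team tracks hypotheses alongside code or experiment configs, the parts listed above map naturally onto a small record type. The sketch below is one possible layout; the class name and field names are assumptions, and the example values are shortened from the search-ranking example in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemHypothesis:
    change: str
    desired_outcome: str
    success_metric: str
    model_output: str
    predictors: tuple[str, ...]
    target: str

    def statement(self) -> str:
        """Join the parts into one reviewable hypothesis statement."""
        return (
            f"{self.change} {self.desired_outcome} {self.success_metric}. "
            f"{self.model_output} by using {', '.join(self.predictors)} "
            f"{self.target}."
        )


search_ranking = ProblemHypothesis(
    change="Improving the search ranking with ML will",
    desired_outcome="allow our customers to find the correct product",
    success_metric="in 15% less time",
    model_output="The model will score each possible product",
    predictors=(
        "products recently viewed by the customer",
        "the types of products previously purchased",
        "the monetary value of previous purchases",
    ),
    target="to predict the product that the customer ultimately selects for purchase",
)
```

Writing the hypothesis down in a fixed shape makes it easy to spot a missing part, such as a success metric or a target, before any modeling work starts.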

3. Define clear measures of success

What does success look like if we were to solve the problem and meet the business need? You and your stakeholders need to get to a shared understanding of what that means, based on the problem hypothesis.

4. Assess your candidate projects

With a defined business need, customer problems to solve, and a clear measure of success, assemble a team of UX/UI professionals, ML experts, and data scientists early in your roadmap definition process.

Left: Impact/Effort Matrix (created by the author). Right: User Impact/ML Impact matrix (Image credit: Google)

5. Confirm that you have the data you need to succeed

Evaluate your team’s data needs and whether the available data meets the criteria set forth in the previous section. ML models require a lot of data, are complex, and can take a long time to develop and test before achieving production-readiness.

6. Establish healthy data collection practices

With regard to the actual task of data collection, here are some excellent recommendations on how to gather high-quality, standardized data from users, based on Peter Skomoroch’s recent Strata Data Conference presentation.

  • Guide user input when you can
  • Use auto-suggest fields
  • Validate user inputs, emails
  • Collect user tags, votes, and ratings
  • Track impressions, queries, and clicks
  • Sessionize logs
  • Disambiguate and annotate entities (company names, locations, etc.)
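Of the practices above, "sessionize logs" is perhaps the least self-explanatory: it means grouping each user's raw events into visits. A minimal sketch follows, assuming events arrive as (user_id, timestamp) pairs and that 30 minutes of inactivity ends a session. Both the event shape and the 30-minute gap are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def sessionize(events, gap=timedelta(minutes=30)):
    """Group (user_id, timestamp) events into per-user sessions.

    A new session starts whenever a user has been inactive for longer than `gap`.
    """
    sessions = {}
    for user_id, ts in sorted(events, key=lambda e: (e[0], e[1])):
        user_sessions = sessions.setdefault(user_id, [])
        if user_sessions and ts - user_sessions[-1][-1] <= gap:
            # Close enough to the previous event: same session.
            user_sessions[-1].append(ts)
        else:
            # First event for this user, or the gap was exceeded: new session.
            user_sessions.append([ts])
    return sessions


# Hypothetical click log.
events = [
    ("u1", datetime(2024, 1, 1, 9, 0)),
    ("u1", datetime(2024, 1, 1, 9, 10)),
    ("u1", datetime(2024, 1, 1, 11, 0)),
    ("u2", datetime(2024, 1, 1, 9, 5)),
]
sessions = sessionize(events)
```

Here the first user ends up with two sessions (a morning visit with two events and a later single-event visit) and the second user with one.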

7. Tackle the risks and challenges

ML projects come with their own sets of risks and challenges. You’ll want to address these risks head-on to keep your project on track.

  • Be aware of the types of bias that might impact your model. Mitigate these biases by measuring them in your model, then taking steps to counteract them.
  • Be aware that seemingly small UI changes may result in significant backend ML engineering work that may put the overall project at risk. For example, if the wording of a question is changed on your website or in your app, it can change the responses from users and render your historical data useless. Be fully aware of the impact of any recommended changes, run A/B tests, and get user feedback!
  • Even with the right data, you may still not end up with a working model. For example, if your model overfits the training data, over-learning the data’s details and noise, the model will fail to make correct predictions on new data. Data scientists can address this by “regularizing” the model.
  • Asking for too much user data without any visible benefit can cause the user to abandon your product. Provide value to your users as early as possible before asking them for more data.
  • You need to have the appropriate security and privacy precautions in place any time your model includes or relies on personal data. Proactively consult with your company’s legal, safety, and security teams for advice in this area.
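For readers curious about what “regularizing” means in practice, here is a toy sketch of one common form, an L2 (ridge) penalty, applied to the simplest possible model: a straight line through the origin. The data points and the penalty value are invented. The penalty discourages large weights, so the model has less freedom to chase noise in the training data.

```python
def fit_slope(xs, ys, l2_penalty=0.0):
    """Least-squares slope for y ≈ w * x, with an optional L2 (ridge) penalty.

    Minimizes sum((y - w*x)^2) + l2_penalty * w^2, which has the closed form below.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + l2_penalty)


# Hypothetical training data.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
plain = fit_slope(xs, ys)                         # no penalty
regularized = fit_slope(xs, ys, l2_penalty=14.0)  # slope is pulled toward zero
```

The penalty strength is a tuning knob: too little and the model can still overfit, too much and it underfits. Data scientists typically choose it by checking performance on data held out from training.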

8. Expect to iterate

Once your model is deployed, continue to iterate and improve on it. Easily 80% of the work happens after the first version of an ML model is shipped to production. This work includes model improvements as well as adding new signals and features into the model as more data becomes available and new insights come to light.

Conclusion

Machine Learning provides massively scalable solutions for recommendations, ranking, classification, anomaly identification, and more. Experimental in nature and time-consuming, ML projects require clear goal definitions and measures of success for evaluating the end results that are expected from your ML model.

About the Author

Kenlyn Terai is passionate about solving problems through an entrepreneurial lens, seeking to deeply understand customers, find product/market fit, and build products that delight users. Whether she’s at her own firm, a startup, or a Fortune 500 company, she loves the challenges of leading and coaching teams to reach shared goals and deliver great products. You can reach her through LinkedIn.
