D2M Blog

ML 101: Talking Machine Learning (to a non-data scientist)

How do you describe ML to a business lead?

One of the problems we encounter again and again is how best to describe machine learning to the decision makers within a business. Often, these leaders come with sky-high but vague expectations, and they need to understand what specific advantages machine learning can bring to their line of business. They also need to understand the limitations of machine learning and some of the risks that go along with it. In this first entry, we’re going to describe machine learning in its most basic terms and give a few examples of how to look for use cases within your organization.

What is it good for?

At its core, machine learning is great at answering a set of very specific questions. If you have a business problem that can be phrased as one of these questions, and a set of quality data (we’ll get to that later…) then machine learning may be able to help.

These questions are:

How much / how many?
What kind of a thing is this?
Are there groups in my data?
Is this data point unusual or weird at all?

Let’s touch on each of these in order.

How much / How many? (Regression)

This one is simple. I know how many times something happened in the past, and now I want to make predictions about the future. The most common tool for answering these kinds of ‘how much / how many’ questions is regression modeling. Here are a few examples of ‘how many’ questions that a business can ask:

How many service desk tickets am I likely to receive next Wednesday? (Ticket forecasting)
How much revenue are we likely to see in March? (Revenue forecasting)
How many days until the next likely equipment failure? (Predictive maintenance, using Time to Failure)
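To make the “how many” idea concrete, here is a minimal sketch of a ticket forecast using a straight-line fit (ordinary least squares). The weekly ticket counts and the `fit_line` helper are made up for illustration; a real forecast would use far more history and would likely account for seasonality.

```python
# Hypothetical history: tickets received on each of the last five Wednesdays.
weeks = [1, 2, 3, 4, 5]
tickets = [102, 108, 121, 129, 140]


def fit_line(xs, ys):
    """Ordinary least squares with one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(weeks, tickets)
forecast = slope * 6 + intercept  # extend the trend to the sixth Wednesday
print(f"Forecast for week 6: {forecast:.1f} tickets")
```

The model learns a trend from past counts and extends it one step forward. The output is a number, which is what separates regression from the other three questions.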

What kind of a thing is this? (Classification)

Imagine you’ve got a bowl of fruit and want to teach a child the names of the apples, oranges, and bananas in the bowl. How would you go about it? Well, you might point to an apple and say, “This is an apple.” Then point to an orange and say, “This is an orange.” You might do that about 70 times, then pick up a fruit and ask, “What is this?” another 30 times to make sure the child has the names down.

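That is essentially how classification in machine learning works. You provide labeled examples, each mapped to one of a fixed set of categories, train the model on most of them (the 70 above), and hold the rest back (the 30) to check that it can answer “what kind of a thing is this?” for examples it has not seen.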
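The fruit-bowl routine can be sketched in a few lines. This is an illustrative toy, not a production approach: the two features (length and hue), every measurement, and the helper names `train` and `classify` are assumptions, and the “model” is a simple nearest-centroid rule that averages each fruit’s examples and picks the closest average.

```python
import math

# Hypothetical labeled examples: (length in cm, hue in degrees) -> fruit name.
training = [
    ((7.5, 5), "apple"), ((8.0, 10), "apple"), ((7.0, 2), "apple"),
    ((7.5, 30), "orange"), ((8.0, 33), "orange"), ((7.2, 28), "orange"),
    ((18.0, 55), "banana"), ((20.0, 58), "banana"), ((19.0, 52), "banana"),
]
# Held-out examples, used only to check what the model has learned.
held_out = [((7.8, 7), "apple"), ((7.4, 31), "orange"), ((18.5, 56), "banana")]


def train(examples):
    """Average the features of each label to get one centroid per fruit."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(axis) / len(rows) for axis in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Answer "what kind of a thing is this?" with the nearest centroid's label."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


centroids = train(training)
correct = sum(classify(centroids, f) == label for f, label in held_out)
accuracy = correct / len(held_out)
print(f"Held-out accuracy: {accuracy:.0%}")
```

Scoring the model on examples it never saw during training is the code equivalent of asking the child “What is this?” after the naming session.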

Let’s think of another example. What do all of these statements have in common?

Hi!
Hello
What’s up?
Hi there.
Hey

The answer is they’re all mapping to the same intent. They’re all different ways of saying hello.

This is how natural language understanding in chatbots works. The bot takes a statement (or utterance) and asks the question, “what kind of an intent is this?” The statement is then mapped to an intent so that the bot knows how to respond.
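As a rough sketch of utterance-to-intent mapping, the snippet below labels a new utterance with the intent of the most similar known example, using word overlap (Jaccard similarity) as the similarity measure. The “goodbye” examples, the “unknown” fallback, and the helper names are assumptions added for illustration; real chatbot platforms use much richer language models.

```python
import re

# Labeled utterances: the greetings come from the list above,
# the goodbyes are hypothetical additions so there are two intents to choose from.
EXAMPLES = [
    ("Hi!", "greeting"), ("Hello", "greeting"), ("What's up?", "greeting"),
    ("Hi there.", "greeting"), ("Hey", "greeting"),
    ("Bye", "goodbye"), ("Goodbye", "goodbye"),
    ("See you later", "goodbye"), ("Talk to you later", "goodbye"),
]


def tokens(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Shared words divided by total distinct words (0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_intent(utterance):
    """Return the intent of the most similar example, or "unknown" if none overlap."""
    words = tokens(utterance)
    score, intent = max((jaccard(words, tokens(text)), label) for text, label in EXAMPLES)
    return intent if score > 0 else "unknown"


print(classify_intent("Hey there!"))
print(classify_intent("bye for now"))
print(classify_intent("refund my order"))
```

The “unknown” fallback matters in practice: a bot that forces every utterance into some intent will answer confidently and wrongly.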

How can “what kind of thing is this?” questions help a business? Here are a few examples:

What kind of a form is this? (Form classification and routing)
What kind of a customer is this? (Marketing bucketing)
What kind of data is in this form? (ICR/OCR)
What kind of tone is the customer showing? (Sentiment analysis)
What problem is the user having? (Ticket classification in help desk)

Are there groups in my data? (Clustering)

Whereas the first two questions rely on labeled training data to build a model (supervised learning), this question starts from unlabeled data and looks for natural groupings in it (unsupervised learning). How can this help a business? It turns out to be a powerful way to find out which things are alike, and you see it in use every day. Your favorite streaming site may use clustering to suggest your next show. National retailers use clustering to find geo-specific buying patterns. Epidemiologists use clustering to predict disease outbreaks.

Some examples include:

Are there groups in my marketing data? (Market segmentation)
Are there groups in consumption data? (Recommendation engines)
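Here is a minimal sketch of finding groups without labels, using k-means, one common clustering method. The customer figures, the choice of two groups, and the naive starting guess (the first `k` points) are all assumptions made to keep the example small and repeatable.

```python
import math

# Hypothetical customers: (store visits per month, average basket in dollars).
customers = [(2, 20), (3, 25), (2, 22), (10, 80), (11, 85), (12, 78)]


def k_means(points, k, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    centroids = list(points[:k])  # naive, deterministic starting guess
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return groups


groups = k_means(customers, k=2)
for number, group in enumerate(groups, start=1):
    print(f"Segment {number}: {group}")
```

Nobody told the algorithm which customers are “occasional” and which are “frequent, big-basket” shoppers. It only groups similar rows together, and a person still has to interpret and name the segments.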

Is this data point different from the others? (Anomaly detection)

Outside of my house sits an analog electricity meter that is read monthly by an energy company employee. A few years ago, here were, roughly, my monthly bills:

February: $55
March: $62
April: $64
May: $1564
June: $58
July: $61

Does anything in there seem out of place? If anomaly detection had been in place for billing, there is a good chance that bill would never have gone out, and customer dissatisfaction would have been avoided.
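A check like this can be sketched with the bills listed above. The version below uses a median-based “modified z-score” rather than the mean and standard deviation, because with only six values a single huge bill inflates the standard deviation enough to hide itself. The 0.6745 constant and 3.5 cutoff are a common convention, not something a billing system is known to use, and `find_anomalies` is a hypothetical helper.

```python
from statistics import median

# Monthly bills from the example above, in dollars.
bills = {"February": 55, "March": 62, "April": 64, "May": 1564, "June": 58, "July": 61}


def find_anomalies(values, threshold=3.5):
    """Flag entries whose median-based (modified) z-score exceeds the threshold."""
    center = median(values.values())
    # Median absolute deviation: the typical distance from the median.
    mad = median(abs(v - center) for v in values.values())
    if mad == 0:
        return []  # all values (nearly) identical; nothing stands out
    return [
        name for name, v in values.items()
        if 0.6745 * abs(v - center) / mad > threshold
    ]


print(find_anomalies(bills))
```

On this data the check flags May only. A flagged bill would then go to a person for review before it is mailed, rather than being rejected automatically.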

Other examples of anomaly detection include:

Fraud detection
IT intrusion detection
Condition monitoring in health care
Log monitoring for machine maintenance

Conclusion

That’s it! If you know nothing else about machine learning, know that these four questions can help business line owners find the use cases that will let their organizations start their machine learning journey effectively. To recap, the questions are:

How much / how many?
What kind of a thing is this?
Are there groups in my data?
Is this data point unusual or weird at all?
