A Beginner's Guide to Machine Learning [expert interview]

Machine learning, deep learning, AI…everybody seems to be discussing it, but what is machine learning actually about and what value does it bring? We have asked an independent AI and machine learning expert to help us understand this complex topic a bit better. Jonathan Prom Scharff is currently working on his own project with the aim of reducing the national energy consumption by leveraging the newest developments in machine learning. As a machine learning consultant, Jonathan can also be hired for machine learning projects in other sectors.

1. How do companies leverage machine learning for business purposes today and how fast is the development in this field?

This is the standard question that people like to address at business-related conferences. Typically, it is followed by a slide saying “AI is the new electricity” and probably some quotes from a McKinsey report saying that a huge percentage of jobs are going to be automated in the coming years. The speaker will then most likely look gravely at the audience and utter the feared words “if you don’t act now, then it is a matter of time before you get disrupted”. While this is probably a good sales tactic (there’s nothing like fear to get people going), in my personal experience the everyday reality is quite different. Yes, Google, Facebook and the like are deploying machine learning solutions at a large scale. And, yes, a lot of jobs are undeniably going to be automated but, in my experience, most of the big firms use machine learning to a surprisingly limited degree. In cases where a machine learning solution is used, it is usually either extremely simple (read: borderline wrong) or extremely complex and not working very well. So, either some intern quickly did something without having a lot of knowledge, or the firm hired a bunch of PhDs who implemented a state-of-the-art research paper from scratch. Both of these extremes tend not to work very well: the first for obvious reasons, and the second because getting a machine learning solution to work is generally not a theoretical exercise. It’s really about getting the right data, rephrasing the problem so it fits into the standard machine learning toolbox and then just using the standard tools. So, to sum up, yes, machine learning is probably going to have a huge impact on many industries but, with a few exceptions, things aren’t moving that fast. And disrupting industries is really hard. Even if a startup had everything figured out with regard to the machine learning part of things, it would typically not have access to the data needed to get the machine learning algorithms to work.

2. So where does machine learning actually already generate value?

The answer to this question is probably going to vary quite a bit depending on whom you ask. Personally, I think machine learning is starting to generate a lot of value when it comes to automating somewhat “simple” tasks. The key to identifying whether it would make sense to use machine learning to automate a certain task is to consider whether the problem can be decomposed into “well-defined boxes”. Let’s say you would like to automate the process of determining whether to provide a person with insurance. This could be turned into a well-defined problem by saying “given these data points, predict the number 1 if the person should get insurance and the number 0 if the person should not get insurance”. Such a problem description would make it likely that a machine learning solution would be successful. If, on the other hand, the problem was phrased as “given these data points, write an insurance contract”, then it would be almost impossible to automate the process using machine learning. In my experience one can, however, often turn a very difficult problem into a well-defined problem by splitting it into smaller sub-problems and compromising a bit. So instead of phrasing the problem as “write an insurance contract”, one could phrase the first sub-problem as “predict which of these seven standard contracts would be the right one to use”. The next sub-problems could then be to predict appropriate numbers in certain pre-specified places in the chosen standard contract. In this way, by splitting a complex problem into somewhat standardized sub-problems, one can automate large quantities of work by using machine learning. The important thing to consider is whether the task you want to automate can be turned into a well-defined prediction: “Which of these seven possible standard contracts is the right contract?”, “What should the price be for insuring this specific circumstance?”, and so on. I think another important point is to not try to automate too much of the workflow. Often you will be in a situation where you can get 70 to 80 percent of the workflow automated relatively quickly. Instead of trying to completely remove the human from the equation, stop at that 70 to 80 percent of automation. Covering the last 20 to 30 percent is usually very costly, and often also quite error-prone.
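
One way to picture stopping short of full automation is a simple routing rule: let the model handle the cases it is confident about and hand the rest to a person. The sketch below is only an illustration. The `Prediction` class, the `route` function, the 0.8 threshold and the example values are all assumptions made up for this purpose, not part of any real system.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str         # e.g. which standard contract the model picked
    confidence: float  # the model's estimated probability for that label (0 to 1)


def route(prediction: Prediction, threshold: float = 0.8) -> str:
    """Automate confident cases; send everything else to a human reviewer."""
    if prediction.confidence >= threshold:
        return f"auto:{prediction.label}"
    return "human_review"


# Hypothetical outputs of a contract-selection model for three applications.
predictions = [
    Prediction("contract_3", 0.95),
    Prediction("contract_1", 0.55),
    Prediction("contract_7", 0.88),
]

decisions = [route(p) for p in predictions]
automated_share = sum(d.startswith("auto:") for d in decisions) / len(decisions)

print(decisions)
print(f"automated share: {automated_share:.0%}")
```

Raising the threshold sends more cases to the human reviewer; lowering it automates more at the cost of more mistakes.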

3. Try to explain Machine Learning using a specific example

All right, imagine you are a judge and you are considering automating your job. Who cares about criminals anyway, right? The problem with automating the process of sentencing people is essentially that “to every rule there is an exception”. So, if a person had to write a computer program that automatically sentenced people, it would require a huge amount of code to take every imaginable special case into account. Furthermore, even the slightest change of the law would render the program useless.

But what if, instead of having a human write all the rules into a computer program, we let a machine do it? Can’t we just give a computer a lot of criminal cases along with the resulting sentences and let the computer figure out the rules by itself? It turns out we can, and this process is commonly referred to as machine learning. So machine learning can be thought of as a bunch of algorithms that can learn to copy observed behavior. Hence, to create an automatic judge, you would compile a dataset of previous cases and the accompanying verdicts. The historical cases would then constitute the observed behavior, and a machine learning algorithm would try to come up with an elaborate rule system that allows it to copy the judges’ sentencing behavior in the historical cases.
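
To make “the computer figures out the rules by itself” concrete, here is a deliberately tiny sketch. It searches for the single rule of the form “feature > threshold” that reproduces the most historical verdicts. The cases, feature names and numbers are invented for illustration, and real algorithms learn far more elaborate rule systems than this.

```python
# Each historical case: (features, verdict), where verdict 1 = prison, 0 = no prison.
# All values are invented for illustration.
cases = [
    ({"prior_convictions": 0, "severity": 2}, 0),
    ({"prior_convictions": 1, "severity": 3}, 0),
    ({"prior_convictions": 0, "severity": 7}, 1),
    ({"prior_convictions": 4, "severity": 6}, 1),
    ({"prior_convictions": 2, "severity": 8}, 1),
    ({"prior_convictions": 3, "severity": 1}, 0),
]


def learn_rule(cases):
    """Find the 'feature > threshold' rule that copies the most historical verdicts."""
    best = None  # (number of verdicts reproduced, feature, threshold)
    for feature in cases[0][0]:
        for threshold in sorted({features[feature] for features, _ in cases}):
            correct = sum(
                (features[feature] > threshold) == bool(verdict)
                for features, verdict in cases
            )
            if best is None or correct > best[0]:
                best = (correct, feature, threshold)
    return best[1], best[2]


feature, threshold = learn_rule(cases)


def predict(features):
    """Apply the learned rule to a new, unseen case."""
    return int(features[feature] > threshold)


print(f"learned rule: prison if {feature} > {threshold}")
print(predict({"prior_convictions": 5, "severity": 9}))
```

Nobody wrote the rule by hand; it was picked because it best copies the verdicts in the data, which is also why it will copy whatever patterns, good or bad, that data contains.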

So, what would happen if you tried this in real life? Well, actually it has already been done in the US. The first problem people encountered when deploying the AI judge was that it seemed to be slightly racist in its sentencing. This behavior was a result of the historical court rulings having a tendency to give black people longer sentences. Hence, the AI judge copied that behavior. So, the notion that machine learning algorithms are not subject to human biases is mostly wrong. But let us say we are interested in creating a non-racist AI judge. How do we remove its tendency to give black people longer sentences? This is where creating a machine learning system becomes more of an art than a science. The solution is to either remove racist rulings from the dataset or give the machine learning algorithm information that allows it to learn that this sentence is slightly racist and this sentence is not. One solution could be to try to remove all rulings that are unfavorable towards black people. Another would be to just train the AI judge on the racially biased data and, when the AI judge is used in practice, always tell it that it is a white person who is up for trial. In that way all people would be judged as if they were white.
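
The second approach, training on the biased data but fixing the sensitive attribute when the model is used, can be sketched as follows. This is a toy under stated assumptions: the “model” is just an average per group and severity level, the groups are labeled "a" and "b", and every number is invented.

```python
from collections import defaultdict
from statistics import mean

# Invented historical rulings: (group, severity, sentence in months).
# Group "b" received longer sentences for the same severity, which is the bias.
history = [
    ("a", 1, 10), ("a", 1, 12), ("a", 2, 20), ("a", 2, 22),
    ("b", 1, 14), ("b", 1, 16), ("b", 2, 26), ("b", 2, 28),
]


def train(rows):
    """Toy 'model': the average historical sentence per (group, severity) pair."""
    buckets = defaultdict(list)
    for group, severity, months in rows:
        buckets[(group, severity)].append(months)
    return {key: mean(values) for key, values in buckets.items()}


model = train(history)


def predict(group, severity, reference_group=None):
    """Predict a sentence; optionally ignore the real group and use a fixed one."""
    effective_group = reference_group if reference_group is not None else group
    return model[(effective_group, severity)]


biased = predict("b", 2)                             # copies the biased history
neutralised = predict("b", 2, reference_group="a")   # judged as if in group "a"

print(biased, neutralised)
```

Note that this only removes the direct effect of the attribute. If other variables in the data are correlated with it, they can still carry some of the bias into the predictions.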

4. But what algorithms should you use to create your AI judge?

So, let’s say you have assembled your dataset of historical cases and corresponding rulings that showcases the behavior you want to copy. You are now ready to get that AI judge up and running. Most people would immediately turn to the much-talked-about neural networks and deep learning techniques. (Ed.: Neural networks are a specific class of machine learning algorithms that have lately been re-branded as deep learning.) But I would argue that the inexperienced user would easily be able to get state-of-the-art results using a technique known by the impressively unsexy name gradient boosting decision trees. Outside machine learning competitions, no one has really heard of gradient boosting decision tree methods, but when it comes to winning machine learning competitions this algorithm is one of the all-time greatest performers.
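
For readers curious about what gradient boosting does, the core idea fits in a few lines: start from a simple guess, then repeatedly fit a very small decision tree to the errors that remain and add a fraction of it to the prediction. The sketch below is a bare-bones, single-feature version for squared error, written for illustration only. In practice one would use an established library rather than this code, and the data, function names and settings (`n_trees`, `learning_rate`) are assumptions.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (a 'stump') to the residuals."""
    best = None  # (squared error, threshold, left value, right value)
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        error = sum((r - left_value) ** 2 for r in left)
        error += sum((r - right_value) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def fit_boosting(xs, ys, n_trees=50, learning_rate=0.3):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = sum(ys) / len(ys)  # initial guess: the mean outcome
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the initial guess and the shrunken contribution of every stump."""
    base, stumps, learning_rate = model
    return base + sum(
        learning_rate * (left if x <= threshold else right)
        for threshold, left, right in stumps
    )


# Invented toy data: offence severity score -> sentence length in months.
severity = [1, 2, 3, 4, 5, 6, 7, 8]
months = [2, 3, 3, 10, 12, 30, 32, 35]
model = fit_boosting(severity, months)

baseline = sum(months) / len(months)
baseline_error = sum((y - baseline) ** 2 for y in months) / len(months)
model_error = sum(
    (y - predict(model, x)) ** 2 for x, y in zip(severity, months)
) / len(months)
print(f"mean squared error: baseline {baseline_error:.1f}, boosted {model_error:.1f}")
```

Each individual stump is a very weak rule; the strength comes from adding many of them, each one correcting what the previous ones got wrong.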

5. Does that mean that deep learning is completely overrated?

Good question. You might be tempted to remark: “Why is everyone talking about this deep learning thing, then? They cannot all be idiots?” The simple version of the answer is related to what is called unstructured data. When you are creating your dataset of historical cases, you will usually create a structured representation of them. This essentially means that you create a spreadsheet summary of each case. The spreadsheet summary might contain variables such as race, number of previous convictions, a number indicating how severe the offence was, etc. For each historical case you then fill in the correct number for all the variables in your spreadsheet summary. Hence, you convert a bunch of historical cases into a spreadsheet, where each row corresponds to a historical case. This structured representation is then fed to a machine learning algorithm, which learns the underlying rules connecting the historical cases with the corresponding sentences. The problem is that sometimes it is very difficult to create a spreadsheet summary of your input data. The most prominent examples of such data are pictures and text. So, even though it is very easy for humans to recognize the objects present in a picture, this task was very difficult for machine learning algorithms. That was, however, before the rise of deep learning, which completely transformed machine learning systems’ ability to interpret image data and text data. Note that deep learning also works very well on structured data, but for structured data there exist much easier-to-use algorithms with comparable performance.
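
As an illustration of the “spreadsheet summary” idea, the sketch below turns a couple of cases into a table with one row per case and one column per variable. The `CaseSummary` class, its field names, the severity scale and the values are hypothetical and chosen only to show the shape of structured data.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class CaseSummary:
    prior_convictions: int
    offence_severity: int  # hypothetical scale from 1 (minor) to 10 (severe)
    sentence_months: int   # the outcome the algorithm should learn to predict


# Invented example cases; each one becomes a single row in the table.
cases = [
    CaseSummary(prior_convictions=0, offence_severity=2, sentence_months=3),
    CaseSummary(prior_convictions=4, offence_severity=7, sentence_months=36),
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow([field.name for field in fields(CaseSummary)])  # header row
writer.writerows(astuple(case) for case in cases)               # one row per case
table = buffer.getvalue()

print(table)
```

A photo or a free-text document has no such obvious set of columns, which is what makes it unstructured.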

6. How do you recommend getting started with Machine Learning?

Believe it or not, it is actually surprisingly easy to get started with machine learning quickly. Some wonderful courses on how to quickly get state-of-the-art results with machine learning are freely available at course.fast.ai. These courses are mostly centered around deep learning, but a new course with a focus on more traditional machine learning techniques is being released soon. Another great starting point is the website kaggle.com, where machine learning practitioners share their approaches to solving machine learning competitions.

