Why Machine or Deep Learning, and How to Choose

Two incidents happened recently that prompted me to write this article. First, I was writing a proposal for a potential partner for my startup on how we could work together to bring AI into their products and applications. They turned around and asked us why they should want to develop AI applications in the first place.

Second, I was chatting about my startup with an investor friend. Once he understood that we had built a platform to accelerate the adoption of AI techniques by enterprises, he told me: “Shekhar, you are two steps removed from the problem. Enterprises today don’t even understand what AI is, how it differs from traditional techniques. They need to get this first.”

Of course I knew this, and had been talking about it, but I realized I needed to write it down. A truly comprehensive case will be customized to the specific customer and problem, but I do have some general points, and here they are.

Analytics can of course be done without AI, and has been for decades. In many cases, though, the non-AI approach is severely limiting, for the following reasons. This is not intended to be an exhaustive list; it covers some of the main reasons why many enterprises are moving to AI models for data analysis.

  • The traditional data analytics approach is reactive. Data is collected from various sources, and analyzed with tools that display it in various ways, with dashboards, graphs, logs, etc. The patterns that these tools look for are based on existing knowledge. This is akin to monitoring, and the response to triggers requires human intervention anytime something new is seen. With AI, the responses are proactive. AI is used to predict the triggers, and automatically apply the remediation, hence it is a further level of automation with a significantly reduced need for human intervention.
  • AI makes prediction decisions based on past and current data fed into the model. The model learns from this input, and the more realistic data it sees, the better it learns. This improves the predictions and decisions the model makes – a prediction made in the past will generally differ from one made later, even with the same input, because in the intervening time the model has learnt more and become more accurate. A non-AI predictive model, by contrast, always produces the same prediction for the same input, no matter when, because it is not learning.
  • Traditional models were designed to work with less data. Nowadays, every organization collects volumes of data that were simply not available before, and the newest deep learning algorithms perform better the more data they are given – amounts that would often overwhelm traditional techniques.
  • With unsupervised learning, the newer AI techniques can detect patterns in the data that would not be obvious to humans. For example, it was recently determined that the plague, or Black Death, in 14th-century Europe was more likely spread by humans via lice rather than by rats, as was earlier thought. This was determined by simulating outbreaks in various cities with different models (rats, airborne, fleas/lice) and finding which fit best. Had the data been fed to an unsupervised deep learning model, it might have discovered the same patterns faster, without the need for hand-built simulations.
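To make the unsupervised-learning point concrete, here is a minimal sketch of an algorithm discovering structure in unlabeled observations. The data is entirely synthetic and invented for illustration – two made-up "regimes" of outbreak dynamics – and the clustering method (k-means) is just one common choice, not the technique used in the plague study:

```python
# Minimal sketch: unsupervised pattern detection via clustering.
# All data here is synthetic and invented for illustration.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Two hypothetical regimes of outbreak dynamics, e.g. (spread rate, mortality)
fast = rng.normal(loc=[0.8, 0.2], scale=0.05, size=(50, 2))
slow = rng.normal(loc=[0.2, 0.7], scale=0.05, size=(50, 2))
X = np.vstack([fast, slow])  # no labels are ever provided

# k-means separates the two regimes purely from the data's structure
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
```

The point is that the algorithm recovers the two groups without being told they exist – the kind of structure a human analyst might otherwise only find by building and comparing explicit simulations.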

Within the broad category of AI, there are multiple categories of learning. The latest, and the one most cited these days, is deep learning. The main difference between deep learning and traditional machine learning lies in how the features required to solve the problem are identified. With traditional machine learning, feature selection is done manually. With deep learning, the system figures out which features are important for making the prediction and automatically weights them when making decisions.

For example, suppose our problem is weather prediction, and we have collected data on various weather phenomena from the past. To be more specific, let’s say we want to predict the likelihood of a natural fire in wooded areas. We may define the features required for this prediction as humidity, temperature, and the density of trees in a given area, among others. For traditional machine learning, we would need to create algorithms that predict the possibility of a fire as a function of these features. A deep learning algorithm would figure out for itself which features matter for the prediction. Suppose we added a feature such as the population of rabbits in the given area. The traditional machine learning model would use this data in its prediction if it were programmed to, whereas the deep learning model would learn from its training that the rabbit data is not relevant and would effectively ignore it.
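The fire example can be sketched with synthetic data. Even a simple learned model (logistic regression here, standing in for the learning process – not a deep network) ends up assigning near-zero weight to an irrelevant input, which is the intuition behind letting the model decide what matters. All feature names and data below are invented:

```python
# Illustrative sketch with synthetic data: a model trained on
# hand-picked fire-risk features plus one irrelevant feature
# ("rabbit population"). Everything here is invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
humidity = rng.uniform(0, 1, n)
temperature = rng.uniform(0, 1, n)
tree_density = rng.uniform(0, 1, n)
rabbits = rng.uniform(0, 1, n)  # has no effect on fire risk

# Synthetic ground truth: risk depends only on the first three features
risk = 2.0 * temperature + 1.5 * tree_density - 2.5 * humidity
fire = (risk + rng.normal(0, 0.3, n) > 0.5).astype(int)

X = np.column_stack([humidity, temperature, tree_density, rabbits])
model = LogisticRegression(max_iter=1000).fit(X, fire)

# The learned weight on the rabbit feature is much smaller than the others
weights = model.coef_[0]
```

With enough training data, the weight on the rabbit column shrinks toward zero while the genuinely predictive features keep large weights; a deep network does the analogous thing internally across layers of learned features.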

Deep learning also keeps improving as more training data is fed to it, whereas the performance of traditional models tapers off after a while, as seen in the graph below. Hence it is a good technique to use where a lot of training data is available.

Some good examples of where to use deep learning are threat detection, where there are many samples of attacks (such as network intrusions), and predictive maintenance for IoT devices, where a lot of failure data is available (Hitachi uses this on its remote earthwork machines).

This also means that where very little training data is available, deep learning is not a good fit. An example is spearphishing attacks: the number of successful attacks is very small, so there is not much training data available for deep learning models, and the number of false positives is so high that this is not a good technique for predicting these specific types of attacks (see this paper on spearphishing).
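The false-positive problem with rare events comes down to base rates, and a back-of-the-envelope calculation shows why. All the numbers below are invented for illustration, not taken from the spearphishing paper:

```python
# Why rare events hurt: a base-rate calculation with invented numbers.
# Even a detector that catches 99% of attacks and flags only 1% of
# benign mail drowns in false alarms when real attacks are very rare.
true_positive_rate = 0.99   # fraction of real attacks caught
false_positive_rate = 0.01  # fraction of benign emails wrongly flagged
attack_rate = 1e-5          # 1 in 100,000 emails is a real spearphish

emails = 10_000_000
attacks = emails * attack_rate                            # ~100 real attacks
caught = attacks * true_positive_rate                     # ~99 caught
false_alarms = (emails - attacks) * false_positive_rate   # ~100,000 false alarms

# Of all alerts raised, fewer than 0.1% are real attacks
precision = caught / (caught + false_alarms)
```

With roughly a thousand false alarms for every real detection, analysts cannot act on the alerts, regardless of how the detector was trained – which is why low-data, rare-event problems need different techniques.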

Here is a cheat sheet for deciding what type of machine learning technique to use for a given problem. The original has links for each individual technique, to explore more details on that technique, and can be found here.