Website : WordPress = Machine Learning : KuberLab

HTML site

Over 15 years ago, I decided I needed to do a website for my photography hobby, and I wanted to do one for travel as well. Being a software engineer and a geek, I wrote it in HTML. Websites then were very simple, so what I had wasn’t fancy either: just a couple of pages with a blurb about me and a bunch of photos.

CGI-Perl

The next version was fancier – I wanted it to pick up photos automatically from a directory structure, so I wrote a bunch of Perl scripts and had an automated website. If I uploaded photos into the structure in a particular way, the scripts would automatically pick them up and display them.

Website Template

The third version (and partly my current fourth) was much fancier. I purchased a nice template, and customized it by hand. I was still fluent with Emacs and HTML, so it was easy.

WordPress

My current version is still a template, but I’m not so good with Emacs and HTML any more, so for the parts I cannot manage, such as my blog, I use WordPress. WordPress (or Wix, Hugo, etc.) is what anyone would use these days to build and run a website. This has become the norm: people expect slick websites, and nobody wants to actually code them. And WordPress is not just for building; it is used to manage the website forever after, with monitoring, updates, and everything else required for production.

The Relationship to Machine Learning

So, what WordPress did for websites is mostly what KuberLab is doing for machine learning (KuberLab does more, but this is a good analogy). In the early days, companies like Google and Facebook, which have already crossed the machine learning application deployment chasm, probably did the equivalent of what I did with my first HTML website. Then, of course, they developed tools and frameworks, and it became like my CGI-Perl website. With the current movement towards the democratization of machine learning, there is now a lot of code available via Kaggle competitions, GitHub repositories, and so on. This is like the template website: one can pick up somebody’s code, tinker with it, and use it for oneself. The same is happening in the cloud, with AWS, GCP, Azure, and others providing many tools, frameworks, and templates for building machine learning applications.

Enter KuberLab

KuberLab is like the WordPress of machine learning. If you are an enterprise wanting to deploy a machine learning application, this is what you want. Sure, you could put in time and effort, spend lots of money, build it yourself, and perhaps even feel good about it. But that effort doesn’t scale, distracts from your core business, and will not be production grade for every use case. For that, you need KuberLab. KuberLab is the platform that helps enterprises accelerate their adoption of AI applications. See http://www.kuberlab.com, or try it out at https://go.kuberlab.io

Why Machine or Deep Learning, and How to Choose

Two recent incidents prompted me to write this article. First, I was writing a proposal for a potential partner for my startup on how we could work with them to bring AI into their products and applications. They turned around and asked us why they should want to develop AI applications in the first place.

Second, I was chatting about my startup with an investor friend. After he understood that we had built a platform to accelerate the adoption of AI techniques by enterprises, he told me: “Shekhar, you are two steps removed from the problem. Enterprises today don’t even understand what AI is, how it differs from traditional techniques. They need to get this first.”

Of course I knew this, and had been talking about it, but I realized I needed to write it down. A truly comprehensive rationale would be customized to the specific customer and problem, but I do have some general points, and here they are.

Analytics has been, and can be, done without AI, of course. In many cases, though, the non-AI approach can be severely limiting, for the following reasons. This is not intended to be an exhaustive list; it contains some of the main reasons why many enterprises are moving to AI models for data analysis.

  • The traditional data analytics approach is reactive. Data is collected from various sources, and analyzed with tools that display it in various ways, with dashboards, graphs, logs, etc. The patterns that these tools look for are based on existing knowledge. This is akin to monitoring, and the response to triggers requires human intervention anytime something new is seen. With AI, the responses are proactive. AI is used to predict the triggers, and automatically apply the remediation, hence it is a further level of automation with a significantly reduced need for human intervention.
  • AI makes predictions and decisions based on past and current data fed into the model. The model learns from that data, and the more realistic the data, the better it learns. This improves the predictions and decisions the model makes: a prediction made in the past will generally differ from one made later, even for the same input, because in the time in between the model has learned more and become more accurate. With non-AI predictive models, the same input always produces the same prediction, no matter when it is made, since the model is not learning. (A minimal incremental-learning sketch follows this list.)
  • Traditional models were designed to work with limited data. Nowadays, every organization collects amounts of data that simply were not possible before, and the newest deep learning algorithms perform better the more data they are given. The volume of data these techniques thrive on would often overwhelm traditional techniques.
  • With unsupervised learning, the newer AI techniques can detect patterns in the data that would not be obvious to humans. For example, it was recently determined that the plague, or Black Death, in 14th-century Europe was more likely spread by humans via lice, rather than by rats as was earlier thought. This was determined by simulating outbreaks in various cities with different models (rats, airborne, fleas/lice) and finding which fit best. Had the data been fed to a deep learning model using unsupervised learning, it could have discovered the patterns that led to the conclusion faster, without the need for simulations. (A small unsupervised clustering sketch also follows this list.)
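
To make the second bullet concrete, here is a minimal sketch of a model that keeps learning as data streams in. It is my own illustration on synthetic data, using scikit-learn’s SGDClassifier and its partial_fit method as a stand-in for any incrementally trained model: the same input can receive a different prediction after the model has seen more data.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# A toy two-feature problem: the label is 1 when the features sum past a threshold.
def make_batch(n):
    X = rng.normal(size=(n, 2))
    y = (X.sum(axis=1) > 0.5).astype(int)
    return X, y

model = SGDClassifier(random_state=0)
probe = np.array([[0.3, 0.3]])  # the same input, scored before and after more learning

# Train on a small initial batch first.
X0, y0 = make_batch(20)
model.partial_fit(X0, y0, classes=[0, 1])
print("early prediction:", model.predict(probe)[0])

# As more data streams in, the model keeps updating its weights, so its
# prediction for the very same input can change as it learns.
for _ in range(50):
    Xb, yb = make_batch(100)
    model.partial_fit(Xb, yb)
print("later prediction:", model.predict(probe)[0])
```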

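And for the last bullet, a minimal sketch of unsupervised pattern discovery. It uses k-means clustering from scikit-learn on synthetic data as a simple stand-in for the unsupervised techniques mentioned above; the model is never given labels and has to find the groups on its own.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Unlabeled observations drawn from three hidden groups; the model is never
# told the groups exist and must discover that structure by itself.
centers = np.array([[0, 0], [5, 5], [0, 5]])
X = np.vstack([rng.normal(loc=c, scale=0.6, size=(100, 2)) for c in centers])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Each point is assigned to one of the discovered clusters.
print("cluster sizes:", np.bincount(kmeans.labels_))
print("discovered centers:\n", kmeans.cluster_centers_)
```
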
Within the broad category of AI, there are multiple categories of learning. The latest, and the one most cited these days, is deep learning. The main difference between deep learning and traditional machine learning lies in how the features required to solve the problem are identified. With traditional machine learning, feature selection and engineering are done manually. With deep learning, the system figures out which features are important for making the prediction and evaluates them automatically when making decisions.

For example, suppose our problem is weather prediction, and we have collected data on various weather phenomena from the past. To be more specific, let’s say we want to predict the likelihood of a natural fire in wooded areas. We might define the features required for this prediction as humidity, temperature, and the density of trees in a given area, among others. For traditional machine learning, we would need to create algorithms that predict the possibility of a fire as a function of these hand-picked features. A deep learning algorithm would figure out on its own which features are needed to make the prediction. Suppose we added a feature such as the population of rabbits in the given area. The traditional machine learning model would use this data in its prediction if it were programmed to do so; the deep learning model, however, would learn from its training that the rabbit data is not relevant to the prediction and would effectively ignore it, as the sketch below illustrates.
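
Here is a hedged sketch of that fire example, with synthetic data and made-up feature scales (purely illustrative, not a real fire model): a small scikit-learn neural network is trained on humidity, temperature, tree density, and an irrelevant rabbit-population column, and permutation importance is then used to see how much each feature actually contributes to its predictions.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 2000

# Synthetic, illustrative data: fire risk depends on humidity, temperature,
# and tree density, but NOT on the rabbit population.
humidity = rng.uniform(10, 90, n)        # percent
temperature = rng.uniform(5, 45, n)      # degrees Celsius
tree_density = rng.uniform(0, 1, n)      # normalized trees per unit area
rabbits = rng.uniform(0, 500, n)         # irrelevant feature

risk_score = 0.04 * temperature - 0.03 * humidity + 2.0 * tree_density
fire = (risk_score + rng.normal(0, 0.3, n) > 0.5).astype(int)

X = np.column_stack([humidity, temperature, tree_density, rabbits])
feature_names = ["humidity", "temperature", "tree_density", "rabbits"]
X_train, X_test, y_train, y_test = train_test_split(X, fire, random_state=0)

# A small neural network; it receives all four columns and must learn
# on its own which of them matter for the prediction.
model = make_pipeline(StandardScaler(),
                      MLPClassifier(hidden_layer_sizes=(16, 8),
                                    max_iter=2000, random_state=0))
model.fit(X_train, y_train)

# Permutation importance: shuffling a useful feature hurts accuracy,
# while shuffling an irrelevant one barely changes it.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
for name, imp in zip(feature_names, result.importances_mean):
    print(f"{name:>12}: {imp:.3f}")
```

On data like this, the irrelevant column would be expected to show near-zero importance while the physically relevant features dominate, which is the behavior described above; with traditional feature engineering, a human would have had to make that call up front.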

Deep learning also keeps improving as more training data is fed to it, whereas traditional models taper off after a while, as seen in the graph below. Hence deep learning is a good technique to use in cases where a lot of training data is available.
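
One way to check this on a specific problem (a sketch assuming scikit-learn and a stand-in synthetic dataset, to be replaced with your own data) is to compare learning curves: train the same candidate models on increasing fractions of the data and see whose validation score keeps climbing.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in dataset; replace with your own features and labels.
X, y = make_classification(n_samples=5000, n_features=20,
                           n_informative=10, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "small neural network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)),
}

# Evaluate each model on growing fractions of the training data.
train_sizes = np.linspace(0.1, 1.0, 5)
for name, model in models.items():
    sizes, _, val_scores = learning_curve(model, X, y, cv=3,
                                          train_sizes=train_sizes)
    print(name)
    for n, score in zip(sizes, val_scores.mean(axis=1)):
        print(f"  {n:5d} samples -> validation accuracy {score:.3f}")
```

Whether the more flexible model keeps improving while the simpler one flattens out depends on the problem and the data, which is exactly what this comparison is meant to reveal.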

Some good examples of where to use deep learning are threat detection, where there are many samples of attacks (such as network intrusions), and predictive maintenance for IoT devices, where a lot of failure data is available (Hitachi uses this on their remote earthwork machines).

This also means that where very little training data is available, deep learning is not a good technique. An example is spearphishing attacks: the number of successful attacks is very small, so there is not much training data available for deep learning models, and the number of false positives is so high that this is not a good technique for predicting these specific types of attacks (see this paper on spearphishing).

Here is a cheat sheet for deciding what type of machine learning technique to use for a given problem. The original, which includes links to more details on each individual technique, can be found here.