How not to machine learn
Last Sunday (December 1st), we updated our app with usage and spending prediction:
Then on Wednesday, we added prediction to the feed graph:
We’d forgive you for thinking, “Gosh, these Zevvle folks really are pushing the limits of telecoms.” You wouldn’t be wrong, but for a different reason.
Today I thought we’d go into some of the details. Starting off…
What we should have done
First, we take the graph of your usage and create a high-resolution image before printing it on the super fine setting. You know the one:
We then take some photos of the graph at different angles (perspective ≈ accuracy) and load them back into the computer. These go through a well-trained machine-learning algorithm (a Convolutional Neural Network in this case, or CNN, which is excellent for classifying images – is it a cat, a dog, a graph?) so we can be 98% certain this is in fact a graph of your usage. We also “digitise” the graph at the same time (i.e., calculate the actual values) — a list of data points is more useful than the image.
With this information, we’re ready to predict: using another machine learning model (this time a Recurrent Neural Network, or RNN, which is better-suited for predicting future events based on past data), we input historical usage and watch as the future unfolds.
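To give a flavour of the recurrent idea, here's a toy, untrained sketch: past usage goes in one value at a time, a hidden state carries "memory" forward, and the final state produces a guess at the next value. The names, sizes and random weights are ours for illustration only, nothing like a production model.

```python
# Toy RNN forward pass: feed historical usage in step by step,
# carrying a hidden state. Untrained, random weights; illustrative only.
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 4
W_in = rng.normal(size=(HIDDEN, 1))      # input -> hidden
W_h = rng.normal(size=(HIDDEN, HIDDEN))  # hidden -> hidden (the "memory")
W_out = rng.normal(size=(1, HIDDEN))     # hidden -> predicted next value

def rnn_predict(history):
    """Run one forward pass over past usage and emit a next-step guess."""
    h = np.zeros((HIDDEN, 1))
    for x in history:
        # Update the hidden state with each new data point.
        h = np.tanh(W_in * x + W_h @ h)
    return (W_out @ h).item()

print(rnn_predict([1.0, 1.2, 0.9, 1.1]))  # meaningless until trained
```

Training (adjusting the weights so the guesses match reality) is the hard part, and exactly what we haven't done yet.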
For good measure, we distribute this on a global cluster of computers across every continent – a lot of computers doing a little bit each, as opposed to a few computers doing a lot.
What we actually do
Unfortunately, due to common sense, we didn't get the above working. The reality isn't nearly as exciting:
With your usage over the last 2 months, we take an average and use that to estimate your future usage. It's a weighted average, so we give more precedence to recent events — your usage last week is a better indicator of what you'll use next week than your usage 2 months ago.
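In code, a weighted average like this can be sketched in a few lines. The decay rate and the weekly totals below are made-up values, not our actual parameters:

```python
# Sketch of a weighted-average predictor: recent weeks count for more.
# The decay rate and sample data are assumptions for illustration.

def predict_next_week(weekly_usage, decay=0.7):
    """Estimate next week's usage from past weekly totals.

    weekly_usage: list of totals, oldest first.
    decay: how quickly older weeks lose influence (assumed value).
    """
    # Oldest week gets the smallest weight, most recent week gets 1.
    weights = [decay ** i for i in range(len(weekly_usage) - 1, -1, -1)]
    weighted_sum = sum(w * u for w, u in zip(weights, weekly_usage))
    return weighted_sum / sum(weights)

# 8 weeks (~2 months) of data, oldest first: usage jumped recently,
# so the estimate leans towards the recent ~4 GB weeks.
usage_gb = [1.0, 1.2, 0.9, 1.1, 4.0, 4.2, 3.9, 4.1]
print(round(predict_next_week(usage_gb), 2))
```

Because the weights decay geometrically, the last few weeks dominate the estimate while older data still nudges it a little.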
And that’s it. We’re humans, not magicians.
What we will do
Satire aside, prediction like this is a prime candidate for machine learning (“AI” is a rogue term). It’s really good — better than humans, good — at finding weird and complicated relationships within datasets, like someone’s usage or spending history.
In the future we will implement this, likely with the previously mentioned Recurrent Neural Network. But that's for another day, and certainly nothing convoluted like the above scenario.
Have a great weekend,