Julia Evans wrote a post recently titled “Machine learning isn’t Kaggle competitions”.
It was an interesting post because it pointed out an important truth: if you want to solve business problems using machine learning, doing well in Kaggle competitions is not a good indicator of those skills. The rationale is that the work required to do well in a Kaggle competition is only one piece of what is required to deliver a business benefit.
This is an important point to consider, especially if you are just starting out and find yourself struggling to do well on the leaderboards. In this post we ruminate on how competitive machine learning relates to applied machine learning.
Competitions vs the “Real World”
Julia made an attempt at a Kaggle competition and did not do well. The surprising part is that she does machine learning as part of her role at Stripe. It was this disconnect between what makes her good at her job and what it takes to do well in a machine learning competition that sparked the post.
Scope must be limited if skill is to be assessed at all. You know this if you have ever taken a test at school.
Think of a job interview. You can get the candidate to hack on the production codebase or you can get them to work through an abstract standalone problem. Both approaches have their merits. The benefit of the latter is that it is simple enough to parse and work through in an interview environment; the former may require hours, days, or weeks of context.
You can hire a candidate based purely on their test scores, hire a programmer based on their TopCoder ranking, or hire a machine learning engineer based on their Kaggle score, but you must have confidence that the skills demonstrated in those assessments translate to the tasks required of them on the job.
That last part is hard. That's why you throw live questions at candidates to see how they think on the fly.
You can be awesome at nailing competitions and poor at machine learning on the fly, or in the context of the broader set of expectations of an engineer in the workplace. You can also be great at machine learning in practice and do poorly in competitions, as Julia reasonably claims is true in her case.
Broader Problem Solving Process
The key to Julia’s argument is that the machine learning required in a competition is but a piece of the broader process required to deliver a result in practice.
Julia uses predicting flight arrival times as the problem context for driving this point home. She highlights the facets of the broader problem as follows:
- Understand the business problem
- Choose a metric to optimize
- Decide what data to use
- Clean up your data
- Build a model
- Put the model into production
- Measure model performance
Julia points out that a Kaggle competition is only point 5 (build a model) in the above list.
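To make the proportions concrete, here is a minimal sketch of that whole process in Python with scikit-learn. The flight features, the synthetic data, and the choice of mean absolute error as the metric are all assumptions for illustration, not taken from Julia's post; the point is simply how small the "build a model" line is relative to everything around it.

```python
# A sketch of the broader process on synthetic stand-in data.
# Assumptions: fabricated features and MAE as the metric, purely for illustration.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# 1-2. Business problem and metric: predict arrival delay (minutes), optimize MAE.
rng = np.random.default_rng(42)
n = 1000
data = pd.DataFrame({
    "scheduled_duration": rng.uniform(30, 300, n),   # 3. decide what data to use
    "departure_delay": rng.normal(10, 15, n),
})
data["arrival_delay"] = data["departure_delay"] * 0.9 + rng.normal(0, 5, n)
data.loc[rng.choice(n, 50, replace=False), "departure_delay"] = np.nan  # messy data

# 4. Clean up the data: here, just drop rows with missing values.
clean = data.dropna()
X, y = clean[["scheduled_duration", "departure_delay"]], clean["arrival_delay"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 5. Build a model -- the only step a Kaggle competition exercises.
model = LinearRegression().fit(X_train, y_train)

# 7. Measure performance (step 6, production, has no one-liner at all).
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```

Even in this toy version, the model-building step is a single line; data selection, cleaning, and evaluation surround it, and putting the model into production does not appear in the script at all.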
It’s a great point and I totally agree. I would point out that I do think what we do in a Kaggle competition is machine learning (hence the title of this post) and that the broader process is called something else. Maybe it was once called data mining, maybe it is applied machine learning, and maybe it is what people mean when they throw around data science. Whatever.
Machine Learning is Hard
The broader process is critical and I stress this all of the time.
Now think about the steps in the process in terms of the technical skills and experience required. Data selection, cleaning, and model building are hard technical tasks that require great skill to do well. To some degree, a data analyst or even a business analyst can perform many of these duties, except the model-building step.
I may be out on a limb here, but perhaps that is why machine learning is put on such a high pedestal.
It is hard to build great models. Very hard. But the great models as defined by a machine learning competition (a score against a loss function) are almost never the same as the great models required by the business. Such finely tuned models are fragile: they are hard to put into production, hard to reproduce, and hard to understand.
In most business cases you want a model that is “good enough” at picking out the structure in the domain rather than the very best model possible.
Julia makes this point by referencing the failure to deploy the winning models from the Netflix Prize.
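To make that trade-off concrete, here is a hedged sketch comparing a plain linear model against a heavily tuned ensemble. The data and the specific hyperparameters are assumptions for illustration; the pattern of diminishing returns for added complexity is the point.

```python
# A sketch of "good enough" vs leaderboard-grade: the simple model is easier
# to deploy, reproduce, and explain; the tuned ensemble squeezes out a little
# more accuracy. Data and settings are synthetic assumptions for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = X[:, 0] * 3 + X[:, 1] ** 2 + rng.normal(0, 1, 2000)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

simple = LinearRegression().fit(X_train, y_train)
tuned = GradientBoostingRegressor(
    n_estimators=500, max_depth=4, learning_rate=0.05, random_state=0
).fit(X_train, y_train)

for name, model in [("simple", simple), ("tuned ensemble", tuned)]:
    mae = mean_absolute_error(y_test, model.predict(X_test))
    print(f"{name}: MAE={mae:.2f}")
# The ensemble usually wins on the metric; whether that margin is worth the
# extra fragility in production is the business call, not the leaderboard's.
```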
Competitions Are Great
Kaggle competitions, like conference competitions before them, can be great fun for participants.
Traditionally, they have been used by academics (mostly grad students) to test out algorithms and discover and explore the limits of specific methods and methodologies. Algorithm bake-offs are common in research papers, but of little benefit in practice. This is known.
The key point, and the point I believe Julia set out to make, is not to despair if you find yourself struggling to do well in Kaggle competitions.
It is very likely because the competition environment is hard and the evaluation of your skill is disproportionately biased toward one facet of what is required to do well in practice: model building.