Having one or two algorithms that perform reasonably well on a problem is a good start, but sometimes you need the best result you can get given the time and resources you have available.
In this post, you will review methods you can use to squeeze out extra performance and improve the results you are getting from machine learning algorithms.
When tuning algorithms you must have high confidence in the results given by your test harness. This means using techniques that reduce the variance of the performance measure you use to assess algorithm runs. I suggest cross validation with a reasonably high number of folds (the exact number depends on your dataset); a minimal sketch of such a harness follows the list of strategies below. The three strategies you will learn about in this post are:
- Algorithm Tuning
- Ensembles
- Extreme Feature Engineering
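For example, a minimal harness along these lines might look like the sketch below, assuming scikit-learn; the generated dataset and the random forest are placeholders, so substitute your own data and models:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder dataset; substitute your own features X and labels y.
X, y = make_classification(n_samples=1000, n_features=20, random_state=7)

model = RandomForestClassifier(random_state=7)

# 10-fold cross validation gives a lower-variance estimate of performance
# than a single train/test split.
scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
print("accuracy: %.3f (+/- %.3f)" % (scores.mean(), scores.std()))
```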
Algorithm Tuning
The place to start is to get better results from algorithms that you already know perform well on your problem. You can do this by exploring and fine tuning the configuration for those algorithms.
Machine learning algorithms are parameterized and modification of those parameters can influence the outcome of the learning process. Think of each algorithm parameter as a dimension on a graph with the values of a given parameter as a point along the axis. Three parameters would be a cube of possible configurations for the algorithm, and n-parameters would be an n-dimensional hypercube of possible configurations for the algorithm.
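To make that geometry concrete, the short sketch below enumerates every point in a three-parameter grid; the parameter names are illustrative and not tied to any particular algorithm:

```python
from itertools import product

# Each parameter is an axis; every combination of values is one point
# in the configuration space.
param_grid = {
    "learning_rate": [0.01, 0.1, 0.3],  # axis 1: 3 values
    "max_depth": [3, 5, 7],             # axis 2: 3 values
    "n_estimators": [100, 500],         # axis 3: 2 values
}

# Three axes give a 3 x 3 x 2 grid of 18 candidate configurations.
names = list(param_grid)
for values in product(*param_grid.values()):
    print(dict(zip(names, values)))
```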
The objective of algorithm tuning is to find the best point or points in that hypercube for your problem. You will be optimizing against your test harness, so again, you cannot overstate the importance of spending the time to build a trusted test harness.
You can approach this search problem by using automated methods that impose a grid on the possibility space and sample regions where good algorithm configurations might lie. You can then use those points in an optimization algorithm to zoom in on the best performance.
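A sketch of this coarse-grid-then-zoom approach, again assuming scikit-learn; the SVM and its parameter ranges are placeholder choices:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Placeholder dataset; substitute your own X and y.
X, y = make_classification(n_samples=1000, n_features=20, random_state=7)

# A coarse grid over two SVM parameters; widen or narrow it for your problem.
coarse_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1]}

# Every grid point is scored with the same 10-fold harness as before.
search = GridSearchCV(SVC(), coarse_grid, cv=10, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, search.best_score_)

# To zoom in, repeat with a finer grid centered on the winning point,
# e.g. if C=10, gamma=0.01 wins, try C in [5, 10, 20] and
# gamma in [0.005, 0.01, 0.02].
```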
You can repeat this process with a number of well-performing methods and explore the best you can achieve with each. I strongly advise automating the process and keeping it reasonably coarse grained, as you can quickly reach points of diminishing returns (fractional percentage performance increases) that may not translate to the production system.
The more tuned the parameters of an algorithm are, the more biased the algorithm will be to the training data and test harness. This strategy can be effective, but it can also lead to more fragile models that overfit your test harness and don't perform as well in practice.
Ensembles
Ensemble methods are concerned with combining the results of multiple methods in order to get improved results. Ensemble methods work well when you have multiple “good enough” models that specialize in different parts of the problem.
There are many ways to achieve this. Three ensemble strategies you can explore are listed below, with a short code sketch after the list:
- Bagging: Known more formally as Bootstrap Aggregation, where the same algorithm is given different perspectives on the problem by being trained on different subsets of the training data.
- Boosting: Models are trained in sequence on the same training data, with each new model focusing on the examples that the previous models got wrong.
- Blending: Known more formally as Stacked Generalization or Stacking, where the predictions of a variety of models are taken as input to a new model that learns how to combine them into an overall prediction.
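Here is a minimal sketch of all three strategies, assuming scikit-learn; the dataset, base models, and meta-model are illustrative choices only:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Placeholder dataset; substitute your own X and y.
X, y = make_classification(n_samples=1000, n_features=20, random_state=7)

# Bagging: one algorithm, many bootstrap samples of the training data.
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                            random_state=7)

# Boosting: models built in sequence, each correcting its predecessors.
boosting = GradientBoostingClassifier(random_state=7)

# Stacking (blending): a meta-model learns to combine base predictions.
stacking = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier()),
                ("logreg", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression(max_iter=1000),
)

for name, model in [("bagging", bagging), ("boosting", boosting),
                    ("stacking", stacking)]:
    scores = cross_val_score(model, X, y, cv=10)
    print("%s: %.3f" % (name, scores.mean()))
```

Note that the stacking meta-model here is deliberately simple; a linear model is a common choice because it combines predictions without adding much extra capacity to overfit.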
It is a good idea to get into ensemble methods after you have exhausted more traditional methods. There are two good reasons for this: ensembles are generally more complex than traditional methods, and the traditional methods give you a good baseline from which you can improve and draw models for your ensembles.
Summary
In this post you learned about three strategies for getting improved results from machine learning algorithms on your problem:
- Algorithm Tuning, where discovering the best models is treated like a search problem through model parameter space.
- Ensembles, where the predictions made by multiple models are combined.
- Extreme Feature Engineering, where the attribute decomposition and aggregation seen in data preparation is pushed to the limits.
Resources
If you are looking to dive deeper into this subject, take a look at the resources below.
- Machine Learning for Hackers, Chapter 12: Model Comparison
- Data Mining: Practical Machine Learning Tools and Techniques, Chapter 7: Transformations: Engineering the input and output
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Chapter 16: Ensemble Learning