1. Support Vector Machine

The Support Vector Machine algorithm is one of the most powerful ones out there in terms of classification. It is based on the idea of finding the largest margin (distance) between the points of the dataset (in particular a subset of them, called support vectors) and the separating hyperplane. Behind this idea there is a lot of math and some really powerful tricks, like the kernel trick and the soft margin/hard margin distinction. It would take far more than a report to explain this properly, but for our specific goal let's say that there are two hyperparameters that we can tune to get the best classification algorithm:

- Kernel type: 'linear', 'rbf', 'poly', 'sigmoid'.
- Regularization parameter C: how strongly misclassified points are penalized (the soft margin/hard margin trade-off).

Hey, I almost forgot: SVMs are damn expensive. For this reason only two features, the two most informative components of the PCA, have been used. A small (but still consistent) portion of the original data has been taken as the training set (10%), with the remaining 90% used as the test set. The training set has been split in half (training and validation) to perform the hyperparameter tuning explained above, and the training and validation sets have been reshuffled iteratively 5 times. You can see the hyperparameter tuning code in this Jovian link if you want, but it is not that interesting, so I'm going to show the results on the validation set.
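To make the setup concrete, here is a minimal sketch of what that tuning stage could look like with scikit-learn. It is an illustration under stated assumptions, not the notebook's actual code: the dataset is synthetic, and the split sizes and grid values simply mirror the description above.

```python
# Minimal sketch of the SVM tuning stage (illustrative only; the real code
# lives in the linked Jovian notebook). Data and grid values are assumptions.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, ShuffleSplit, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the real dataset: 7 features, 3 classes.
X, y = make_classification(n_samples=5000, n_features=7, n_informative=5,
                           n_classes=3, random_state=0)
y = y - 1  # shift labels to -1 / 0 / +1, mirroring the article's classes

# 10% training, 90% test, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.10, random_state=0)

# SVMs are expensive: keep only the two most informative PCA components.
pca = PCA(n_components=2)
X_train_2d = pca.fit_transform(X_train)

# The two hyperparameters discussed above; the C values are assumptions.
param_grid = {"kernel": ["linear", "rbf", "poly", "sigmoid"],
              "C": [0.1, 1, 10]}

# Split the training set in half and reshuffle 5 times, as in the article.
cv = ShuffleSplit(n_splits=5, test_size=0.5, random_state=0)
svm_search = GridSearchCV(SVC(), param_grid, cv=cv)
svm_search.fit(X_train_2d, y_train)
print(svm_search.best_params_, svm_search.best_score_)
```

Restricting the SVM to two PCA components is what keeps the grid search tractable: the cost of kernel SVMs grows quickly with both sample count and dimensionality.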
Now that we have the best hyperparameters, the performance of the algorithm has been tested while slightly increasing the training set size (30% of the dataset). The entire process is described here line by line.

The 0 (null) class is actually well classified by the Support Vector Machine, and even the -1 zone is pretty precise. Anyway, the algorithm has huge problems with the recall of the negative points: a lot of negative points are actually classified as positive. Concretely speaking, the "positive" zone is a mess.

You may think "Ok, but you only used 2 features out of 7", and this is right. But the main problem is that increasing the features means stressing the computational power of our computer, so it is not the best scenario ever. What can we do to increase the performance of our algorithm without using days of computational power?

2. Decision Trees

Decision trees are powerful algorithms that are cheaper than the Support Vector Machine, but still able to get really good performances. In disgustingly simple terms, these algorithms check whether applying a certain threshold to a specific feature, and splitting the points into two groups accordingly, gives you groups where a high number of points belong to a single class.

You can build a massive amount of trees out of a dataset, so you can add randomness to this process and build not a single tree but an entire forest, where each tree sees only a specific portion of the dataset or a specific subset of the features. This is per se an ensemble learning algorithm, and it is called a "random forest".

Even random forests have their hyperparameter tuning, which can be applied to a bunch of different hyperparameters. I don't want to kill you, so I'm going fast on this one: you can see all the work on these hyperparameters in this notebook, and a rough sketch follows below. The interesting thing here is that you can use the entire dataset and all the features for your classification, as this method is not so expensive! The performance of this model is actually even better than that of the SVM model (80%).
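As a hedged illustration of that tuning stage (the real grid lives in the linked notebook; the parameter names are scikit-learn's, but the values are assumptions), reusing X_train and y_train from the SVM sketch:

```python
# Illustrative sketch of the random forest tuning; the actual grid is in the
# linked notebook, these values are assumptions. Reuses X_train / y_train
# from the SVM sketch, but with all 7 features: the forest can afford them.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {"n_estimators": [100, 300],      # trees in the forest
              "max_depth": [None, 10, 30],     # how deep each tree may grow
              "max_features": ["sqrt", None]}  # features considered per split

forest_search = GridSearchCV(RandomForestClassifier(random_state=0),
                             param_grid, cv=5)
forest_search.fit(X_train, y_train)
best_forest = forest_search.best_estimator_
print(forest_search.best_params_)
```

Note that the forest sees all 7 features: unlike the SVM, it does not need the PCA reduction to stay affordable.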
3. Ensemble Learning

So why don't we use the Random Forest optimum model on the messy zone of the dataset? In order to help the SVM algorithm, let's train the Random Forest optimum model (best hyperparameters) again on a portion of the test set of the SVM algorithm. The two Machine Learning methods "explore" the same dataset from different perspectives. In this scenario the best algorithm was obviously the Random Forest, which gave an 80% three-class classification by itself. Nonetheless, it is often possible to improve an algorithm with some extra help (e.g. by pairing it with another model, as we do here). The result is better than you may think, as the SVM uses only 30% of the dataset and has to predict the remaining 70%.
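Here is a minimal sketch of this hybrid step, continuing from the two sketches above. The 30% cut and the definition of the "messy zone" (the low-precision positive region) are my assumptions, not the author's actual procedure:

```python
# Hedged sketch of the hybrid SVM + random forest step described above; the
# exact split and the "messy zone" definition are assumptions.

# Retrain the tuned forest on a portion (here 30%) of the SVM's test set.
cut = int(0.3 * len(X_test))
best_forest.fit(X_test[:cut], y_test[:cut])

# The SVM predicts the rest on its two PCA components (pca from the first sketch).
svm_pred = svm_search.best_estimator_.predict(pca.transform(X_test[cut:]))

# Assumption: the messy zone is the low-precision positive (+1) region,
# so the forest re-decides those points using all 7 features.
final_pred = svm_pred.copy()
messy = svm_pred == 1
final_pred[messy] = best_forest.predict(X_test[cut:][messy])
```

The design intuition is simple: keep the SVM's reliable decisions for the 0 and -1 zones, and delegate only the region where its precision collapses.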
The dataset is pretty huge and it imposed some practical restrictions, but you could apply this approach to a smaller dataset, and maybe use the entire dataset for the SVM algorithm too. The entire data analysis and Machine Learning project is reported in this GitHub repository.

If you liked the article and you want to know more about Machine Learning, or you just want to ask me something, you can:

A. Follow me on Linkedin, where I publish all my stories.
B. Subscribe to my newsletter: it will keep you updated about new stories and give you the chance to text me with all the corrections or doubts you may have.