Whilst testing classifier performance, it is helpful to compare a number of classification accuracy visualisations. There are many ways to do that in R, but SSAS Data Mining and Azure Machine Learning do not both support the same broad set of diagnostic visualisations. You can download this simple R code and run it to validate a classifier built in SSAS, Azure ML, or any other environment.
It plots ROC, precision-recall, cost and lift curves, calculates the optimum probability threshold, and prints a confusion matrix and additional metrics for any two-class machine learning classifier. The only required inputs are a vector of known outcomes and a vector of predicted probabilities. As a bonus, this code will look up the optimum prediction probability threshold given the ratio of the cost of a False Positive to the cost of a False Negative.
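To make the idea of a cost-based threshold concrete, here is a minimal sketch in Python rather than R. It is not the downloadable script, and every function name in it is hypothetical. It assumes outcomes are coded 1 (positive) and 0 (negative), that a case is predicted positive when its probability is at or above the threshold, that total cost is the cost ratio times the number of False Positives plus the number of False Negatives, and that only probabilities actually observed in the data are tried as candidate thresholds.

```python
from typing import List, Sequence, Tuple


def confusion_counts(actual: Sequence[int], prob: Sequence[float],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN), predicting positive when prob >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(actual, prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_points(actual: Sequence[int],
               prob: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) pairs, sweeping the threshold from high to low."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]  # threshold above every probability
    for t in sorted(set(prob), reverse=True):
        tp, fp, _, _ = confusion_counts(actual, prob, t)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def optimum_threshold(actual: Sequence[int], prob: Sequence[float],
                      fp_to_fn_cost_ratio: float) -> float:
    """Return the observed probability that minimises ratio * FP + FN.

    Assumption: ties are broken in favour of the lowest threshold.
    """
    best_threshold, best_cost = None, None
    for t in sorted(set(prob)):
        _, fp, _, fn = confusion_counts(actual, prob, t)
        cost = fp_to_fn_cost_ratio * fp + fn
        if best_cost is None or cost < best_cost:
            best_threshold, best_cost = t, cost
    return best_threshold


if __name__ == "__main__":
    # Made-up illustration data, not taken from the demo files.
    known = [1, 0, 1, 1, 0, 0, 1, 0]
    predicted = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    t = optimum_threshold(known, predicted, fp_to_fn_cost_ratio=1.0)
    print("threshold:", t)
    print("TP, FP, TN, FN:", confusion_counts(known, predicted, t))
    print("AUC:", auc(roc_points(known, predicted)))
```

Raising the cost ratio makes False Positives more expensive, so the chosen threshold moves up and fewer cases are flagged as positive; lowering it has the opposite effect.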
These sample data sets have been prepared and used by Rafal on his Practical Machine Learning and Data Science classroom-style courses. They include a working SQL Server 2017/2019 Machine Learning Services mortgage analysis R script and the SQL Server 2016+ .bak file containing the 10 million rows of data for this demo, as well as code showing how to write DMX for predicting recommendations using the Association Rules algorithm in SSAS Data Mining, with and without buyer-level demographic details.
Please note that we do not provide any support for the demo files available below, and they are provided as-is. By downloading them you hereby accept the terms of the Apache License 2.0 under which they are being distributed.
Enjoy!