Additional Code and Data Samples (R, ML Services, SSAS)

Whilst testing classifier performance, it is helpful to compare several classification accuracy visualisations. There are many ways to do that in R, but SSAS Data Mining and Azure Machine Learning do not both support the same broad set of diagnostic visualisations. You can run this simple R code to validate a classifier built in SSAS, Azure ML, or any other environment, by downloading it from:

Rafal’s GitHub classifier-performance repository

It plots ROC, precision-recall, cost, and lift curves, calculates the optimum probability threshold, and prints a confusion matrix and additional metrics for any two-class machine learning classifier. The only required inputs are a vector of known outcomes and a vector of predicted probabilities. As a bonus, the code finds the optimum prediction probability threshold for a given ratio of the cost of a False Positive to the cost of a False Negative.
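To make the idea of a cost-based threshold concrete, here is a minimal sketch in Python. It is not Rafal's R script and does not reproduce its method: the function names `confusion_counts` and `best_threshold`, the toy data, and the cost formula (total cost = ratio × False Positives + False Negatives) are all assumptions made for illustration. Like the script, it takes only a vector of known outcomes and a vector of predicted probabilities.

```python
def confusion_counts(actual, probs, threshold):
    """Return (tp, fp, tn, fn), predicting positive when prob >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(actual, probs):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def best_threshold(actual, probs, fp_to_fn_cost_ratio=1.0):
    """Return (threshold, cost) with the lowest total cost.

    Assumed cost model: one False Negative costs 1 unit and one
    False Positive costs `fp_to_fn_cost_ratio` units.
    """
    # Candidate thresholds: every distinct score plus the two extremes.
    candidates = sorted(set(probs) | {0.0, 1.0})
    best = None
    for t in candidates:
        _, fp, _, fn = confusion_counts(actual, probs, t)
        cost = fp_to_fn_cost_ratio * fp + fn
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Hypothetical toy data: known outcomes (1 = positive) and predicted probabilities.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

for ratio in (0.2, 1.0, 5.0):
    threshold, cost = best_threshold(actual, probs, ratio)
    print(f"FP:FN cost ratio {ratio}: threshold {threshold}, total cost {cost:.1f}")
print("Confusion counts (tp, fp, tn, fn) at 0.6:", confusion_counts(actual, probs, 0.6))
```

In this sketch, raising the ratio makes False Positives more expensive and pushes the chosen threshold upwards, whilst lowering it pushes the threshold downwards. A real evaluation would use held-out data and many more observations.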

These sample data sets have been prepared and used by Rafal on his Practical Machine Learning and Data Science classroom-style courses. They include a working SQL Server 2017/2019 Machine Learning Services mortgage analysis R script and the SQL Server 2016+ .bak file containing the 10 million rows of data for this demo, as well as code showing how to write DMX for predicting recommendations using the Association Rules algorithm in SSAS Data Mining, with and without buyer-level demographic details.

Please note that the demo files available below are provided as-is, and we do not provide any support for them. By downloading them you accept the terms of the Apache License 2.0 under which they are distributed.

Classifier performance in R (ROC, lift, cost, precision-recall, prediction probability threshold), also available from GitHub
SQL Server SSAS Association Rules recommendation/cross-sell predictions in DMX (with and without demographics)
Mortgage default logistic regression script in Microsoft R for SQL Server ML Services
Mortgages.bak SQL Server 2016+ database backup with 10 million rows (approx. 86 MB download). Please note that this data has been derived from a Microsoft-owned demo available on MSDN. Although Microsoft do not provide this data as a SQL Server database, they retain the rights to it, so this file, although created by Tecflix Ltd or Project Botticelli Ltd, is not owned by them.
If you are looking for our educational machine learning data set, HappyCars, it is available here.

Enjoy!

Data Mining with SQL Server SSAS

Introduction to Data Mining with Microsoft SQL Server 24-min Watch with Free Subscription
Data Mining Concepts and Tools 50-min
Data Mining Model Building, Testing and Predicting with Microsoft SQL Server and Excel 1-hour 20-min
What Are Decision Trees? 10-min Free—Watch Now
Decision Trees in Depth 1-hour 54-min
Why Cluster and Segment Data? 9-min Watch with Free Subscription
Clustering in Depth 1-hour 50-min
What is Market Basket Analysis? 10-min Watch with Free Subscription
Association Rules in Depth 1-hour 35-min
HappyCars Sample Data Set for Learning Data Mining
Additional Code and Data Samples (R, ML Services, SSAS) Get with Free Subscription
