Machine Learning With SQL Server 2017 And Python

Introduction

 
In this article, we will cover the steps for installing SQL Server 2017 in the Windows system with Machine Learning Server (In-Database) with Python & R language. Then we call the Python scripts from the SQL server and get SQL data from Python also. Later we will create a machine learning model for SQL server and train the model with Python scripts. Finally, we will predict a result from sample data.
 
SQL Server 2017 Machine Learning Services is an add-on to a database engine instance, used for executing R and Python code on SQL Server. The code runs in an extensibility framework, isolated from core engine processes, but fully available to relational data as stored procedures, as T-SQL script containing R or Python statements, or as R or Python code containing T-SQL.
 
If you previously used SQL Server 2016 R Services, Machine Learning Services in SQL Server 2017 is the next generation of R support, with updated versions of base R, RevoScaleR, MicrosoftML, and other libraries introduced in 2016. The key value proposition of Machine Learning Services is the power of its enterprise R and Python packages to deliver advanced analytics at scale, and the ability to bring calculations and processing to where the data resides, eliminating the need to pull data across the network. SQL Server 2017 supports R and Python. 
 
revoscalepy is the primary library for scalable Python with functions for data manipulation, transformation, visualization, and analysis.
 
Step 1 - Installing SQL Server 2017 with Machine Learning Services (R & Python)
 
Download the free Developer edition (free) of SQL Server 2017 from here - SQL 2017
 
It is an MSI setup downloader file. Please open it and click Download Media.
 
Download Media 
 
This will show you the screen to download the media file. Please choose the ISO option and click download.
 
Download Media 
 
Run the downloaded setup file and choose the first option. (New SQL Server stand-alone installation)
 
Installer Starting
 
You can choose Machine learning services. Here I chose both R & Python services.
 
Feature Selection 
 
There are four services selected now. Please click next.
 
Server Configuration 
 
Microsoft is using Anaconda distribution for installing Python libraries. So please click the consent to Install Python. (Please click R consent also).
 
Consent to Install Python 
 
After some time, our installation will be completed. It will show the below screen.
 
Installation Successfull 
 
Please check if all the services have succeeded properly.
 
Please note that Python Services are installed in the below folder.
 
C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017ML\PYTHON_SERVICES
 
SQL2017ML is the instance name of the SQL server. We must add this path in the Windows system environment variables list to access Python services.
 
As I mentioned earlier, there are four services installed with SQL server 2017. You can check the windows services and find the below services are running properly.
 
Services Running
 
Only SQL Server service and SQL Server Launchpad service automatically run. We can see the SQL Service CEIP service is also running automatically. This service is used for the Customer Experience Improvement Program. Our SQL server is ready now. We can install SSMS for connecting the SQL server.
 
Please download the SSMS from here SSMS and install it.
 
SSMS Installation 
 
After installing SSMS (Version 17) we can connect the SQL server from SSMS.
 
Step 2 - Enabling External Scripts
  1. EXEC sp_configure  'external scripts enabled', 1  
  2. RECONFIGURE WITH OVERRIDE  
The above scripts will enable the external scripts and we are now ready to execute Python scripts from the SQL server. Please restart the SQL server now. Otherwise, this change will not take effect.
 
Please execute the below Python script in SSMS to test our Python service.
  1. EXEC sp_execute_external_script @language = N'PYTHON',   
  2. @script = N'year=2000   
  3. name="Mahesh Chand "  
  4. print("C# Corner was launched by " + name + " in " + str(year))'  
The first time, it will take some time to give the result.
 
Python Script Execution 
 
We have successfully executed our Python script. In the above script, we initialized the year and name of two variables and printed these variables in a statement. If you get any error message, please check your Python code. SQL server will correctly show the error message.
 
Python Error 
 
In the above statement, I made a syntax error by adding two variables in the same line. Python will not support this and shows the error as above.
 
Step 3 - Connect SQL data from Python Code
 
We have already seen executing Python from SQL. Now, we are going to connect SQL data from Python code. You can connect any SQL version using this method.
We are going to create a new database and new table with some sample records in that table.
  1. CREATE DATABASE PythonSQLTest  
  2. GO  
  3. USE PythonSQLTest  
  4.   
  5. DROP TABLE IF EXISTS PythonConnect;  
  6. CREATE TABLE [dbo].[PythonConnect](  
  7.     [id] [intNULL,  
  8.     [name] [nvarchar](50) NULL,  
  9.     [age] [intNULL)   
  10.   
  11. INSERT INTO PythonConnect(id,name,age) values (1,'Sarath Lal',37)  
  12. INSERT INTO PythonConnect(id,name,age) values (2,'Aradhya',4)  
  13. INSERT INTO PythonConnect(id,name,age) values (3,'Suresh Kumar',25)  
  14. INSERT INTO PythonConnect(id,name,age) values (4,'Anil Soman',42)  
  15. INSERT INTO PythonConnect(id,name,age) values (5,'Mohan Das Gupta',68)  
Step 4 - Create a Python file using any code editor.
 
I am using Visual Studio Code for creating Python file.
 
PythonSQLConnector.py
  1. import pandas   
  2.   
  3. from revoscalepy import RxComputeContext, RxInSqlServer, RxSqlServerData  
  4. from revoscalepy import rx_import  
  5.   
  6. #Connection string to connect to SQL Server named instance  
  7. conn_str = 'Driver=SQL Server;Server={YOUR SERVER NAME};Database=PythonSQLTest;Trusted_Connection=True;'  
  8.   
  9. #Define the columns we wish to import  
  10. column_info = {  
  11.          "id"   : { "type" : "integer" },  
  12.          "name" : { "type" : "string" },  
  13.          "age"  : { "type" : "integer" },  
  14.                }  
  15.   
  16. #Get the data from SQL Server Table  
  17. data_source = RxSqlServerData(table="dbo.PythonConnect",  
  18.                                connection_string=conn_str, column_info=column_info)  
  19. computeContext = RxInSqlServer(  
  20.      connection_string = conn_str,  
  21.      num_tasks = 1,  
  22.      auto_cleanup = False  
  23. )  
  24.   
  25. RxInSqlServer(connection_string=conn_str, num_tasks=1, auto_cleanup=False)  
  26.   
  27.  # import data source and convert to pandas dataframe  
  28. df = pandas.DataFrame(rx_import(input_data = data_source))  
  29. print("Data frame:\n", df)  
In the above code we have used RxComputeContext, RxInSqlServer, RxSqlServerData methods from revoscalepy library which are used for connecting SQL from Python. We also used Panda library to get the data frame.
 
After completing the code please run the Python file with the Python command. You will get a result as shown below.
 
Python SQL connector
 
Please note we got all five  records from our SQL table.
 
Step 5 - Create a model and predict a result from a model.
 
We are going to see how to create a model in SQL server and predict data from sample data using python scripts.
 
Create database and tables,
  1. CREATE DATABASE SQLPythonPrediction  
  2. GO  
  3. USE SQLPythonPrediction  
  4. GO  
Add some empty tables: one to store the data, and one to store the models you train. Later you will populate the tables using Python. 
  1. DROP TABLE IF EXISTS iris_data;  
  2. GO  
  3. CREATE TABLE iris_data (  
  4.   id INT NOT NULL IDENTITY PRIMARY KEY  
  5.   , "Sepal.Length" FLOAT NOT NULL"Sepal.Width" FLOAT NOT NULL  
  6.   , "Petal.Length" FLOAT NOT NULL"Petal.Width" FLOAT NOT NULL  
  7.   , "Species" VARCHAR(100) NOT NULL"SpeciesId" INT NOT NULL  
  8. );  
Run the following code to create the table used for storing the trained model. To save Python models in SQL Server, they must be serialized and stored in a column of type varbinary (max).
  1. DROP TABLE IF EXISTS iris_models;  
  2. GO  
  3.   
  4. CREATE TABLE iris_models (  
  5.   model_name VARCHAR(50) NOT NULL DEFAULT('default model'PRIMARY KEY,  
  6.   model VARBINARY(MAXNOT NULL  
  7. );  
  8. GO  
Populate the table
 
To move the training data from Python into a SQL Server table is a multistep process:
  • You design a stored procedure that gets the data you want.
  • You execute the stored procedure to get the data.
  • You use an INSERT statement to specify where the retrieved data should be saved.
Create the following stored procedure that includes Python code.
  1. CREATE PROCEDURE get_iris_dataset  
  2. AS  
  3. BEGIN  
  4. EXEC sp_execute_external_script @language = N'Python',   
  5. @script = N'  
  6. from sklearn import datasets  
  7. iris = datasets.load_iris()  
  8. iris_data = pandas.DataFrame(iris.data)  
  9. iris_data["Species"] = pandas.Categorical.from_codes(iris.target, iris.target_names)  
  10. iris_data["SpeciesId"] = iris.target  
  11. ',   
  12. @input_data_1 = N'',   
  13. @output_data_1_name = N'iris_data'  
  14. WITH RESULT SETS (("Sepal.Length" float not null"Sepal.Width" float not null"Petal.Length" float not null"Petal.Width" float not null"Species" varchar(100) not null"SpeciesId" int not null));  
  15. END;  
  16. GO  
To populate the table, run the stored procedure and specify the table where the data should be written. When run, the stored procedure executes the Python code, which loads the Iris dataset from the built-in Python sample data. 
  1. INSERT INTO iris_data ("Sepal.Length""Sepal.Width""Petal.Length""Petal.Width""Species""SpeciesId")  
  2. EXEC dbo.get_iris_dataset;  
After successful execution you may get a message that 150 rows have been affected. That means we have created 150 records from Python dataset in the SQL server.
Please check the data using the below statement.
  1. SELECT TOP(10) * FROM iris_data;  
  2. SELECT COUNT(*) FROM iris_data;  
Step 6 - Create the stored procedure and train a Python model
 
Run the following code in SQL Server Management Studio to create the stored procedure that builds a model.
  1. CREATE PROCEDURE generate_iris_model (@trained_model varbinary(maxOUTPUT)  
  2. AS  
  3. BEGIN  
  4. EXEC sp_execute_external_script @language = N'Python',  
  5. @script = N'  
  6. import pickle  
  7. from sklearn.naive_bayes import GaussianNB  
  8. GNB = GaussianNB()  
  9. trained_model = pickle.dumps(GNB.fit(iris_data[[0,1,2,3]], iris_data[[4]]))  
  10. '  
  11. , @input_data_1 = N'select "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "SpeciesId" from iris_data'  
  12. , @input_data_1_name = N'iris_data'  
  13. , @params = N'@trained_model varbinary(max) OUTPUT'  
  14. , @trained_model = @trained_model OUTPUT;  
  15. END;  
  16. GO  
To generate the model with the required inputs and save it to a table requires some additional statements.
  1. DECLARE @model varbinary(max);  
  2. DECLARE @new_model_name varchar(50)  
  3. SET @new_model_name = 'Naive Bayes'   
  4. SELECT @new_model_name   
  5. EXEC generate_iris_model @model OUTPUT;  
  6. INSERT INTO iris_models (model_name, model) values(@new_model_name, @model);  
To view the models, run a simple SELECT statement.
  1. SELECT * FROM iris_models;  
Step 7 - Generate scores from the model
 
Run the following code to create the stored procedure that performs scoring.
  1. CREATE PROCEDURE predict_species (@model varchar(100))  
  2. AS  
  3. BEGIN  
  4. DECLARE @nb_model varbinary(max) = (SELECT model FROM iris_models WHERE model_name = @model);  
  5. EXEC sp_execute_external_script @language = N'Python',   
  6. @script = N'  
  7. import pickle  
  8. irismodel = pickle.loads(nb_model)  
  9. species_pred = irismodel.predict(iris_data[[1,2,3,4]])  
  10. iris_data["PredictedSpecies"] = species_pred  
  11. OutputDataSet = iris_data.query( ''PredictedSpecies != SpeciesId'' )[[0, 5, 6]]  
  12. print(OutputDataSet)  
  13. '  
  14. , @input_data_1 = N'select id, "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "SpeciesId" from iris_data'  
  15. , @input_data_1_name = N'iris_data'  
  16. , @params = N'@nb_model varbinary(max)'  
  17. , @nb_model = @nb_model  
  18. WITH RESULT SETS ( ("id" int"SpeciesId" int"SpeciesId.Predicted" int));  
  19. END;  
  20. GO  
The stored procedure gets the "Naive Bayes" model from the table and uses the functions associated with the model to generate scores. In this example, the stored procedure gets the model from the table using the model name. However, depending on what kind of metadata you are saving with the model, you could also get the most recent model, or the model with the highest accuracy.
 
Run the following lines to pass the model name "Naive Bayes" to the stored procedure that executes the scoring code.
  1. EXEC predict_species 'Naive Bayes';  
 Prediction Result
 
We have successfully executed our model and got the prediction. This example has been made simple by using the data from the Python iris dataset for scoring. (See the line iris_data[[1,2,3,4]]).)
 
In this article, we covered all the steps for installing SQL Server 2017 with Machine Learning Services (R & Python) and we executed Python scripts from SQL and connected SQL data from Python code. Finally, we created a model in the SQL server using Python script and trained the model. We successfully got the prediction result.
This article covered only the basic features of Machine learning. We will cover more complex details in upcoming articles. I have attached the Python file and SQL scripts. Please share your valuable feedback.


Similar Articles