Uploading Files To Snowflake Stage Using SnowSQL

Introduction

 
In this article, we are going to learn the uploading process of the CSV and Json file into a Snowflake stage using SnowSQL client. Once the file is uploaded into the Stage, then we can perform bulk copy operation to load the data from the file to a Snowflake table. Before working on the problem statement, we should have knowledge of SnowSQL and Snowflake Stage. SnowSQL is just a connector whereas a Snowflake stage is a location where we are loading our files. We can create an Internal and external stage in Snowflake. The external stage for AWS or Azure or GCP can be created in Snowflake. We should have the required keys respective to the cloud provider. In this case, if we have stored our CSV/JSON/XML any semi-structured supported files into the S3 bucket or Azure Blob, we can simply stage it, and it will be available in Snowflake for data loading into the table.
 
That is why sometimes we say this process as ELT instead of ETL. In ELT (Extraction, Loading, and transformation) process, first, we extract data from any source, then we load into storage location to stage it on Snowflake. Once we have the list of files in Stage, we can transform those data as per our requirement. We are going to discuss the following things in detail
  1. SnowSQL
  2. Snowflake Stage
  3. Problem statement (uploading file to stage and load the data from stage to snowflake table)

SnowSQL

 
It is a command-line client that is used to connect Snowflake. Instead of Snowflake Web Interface, we can use the SnowSQL tool to perform/execute SQL query, DDL, DML command, including loading and unloading data. To install SnowSQL, please download the executable from the following link.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Let us set up the SnowSQL in our system to work on our problem statement. Perform the following steps to install and configure.
 
Step 1
 
Download the executable from the given link and double-click once it is downloaded (refer to the below screen).
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 2
 
Click on Next until you reach to the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
If you closely look into the screen, then you can see some instructions. We need to configure the account detail in ~/.snowsql/config file.
 
Step 3

Click on the Finish button.
 
To check whether SnowSQL is installed or not. Window key + R or Run command
 
Step 4
 
Open the Run command. You will get the below screen:
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 5
 
Type %USERPROFILE%\.snowsql\ and then click on OK
 
Step 6
 
You can see the below screen. Now open the config file in notepad or notepad++
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 7
 
Add the Snowflake account information based on the below screen.
 
You need to provide an accountname, region, username, password, dbname, schemaname, warehousename, and rolename. This information you can get in your Snowflake web Interface.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 8
 
You can see [connections.my_example_connection] in the above screen. Here, my_example_connection we use to connect Snowflake in command prompt. You can give any name.
 
Step 9
 
Open the command prompt
 
Step 10
 
Type ‘snowsql -v’, like shown below.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 11
 
Type snowsql -c my_example_connection and press the enter key
 
This command connects the snowsql with Snowflake as per the given configuration. In this case, it will come as ,nitesh#COMPUTE_WH@TEST_DB.MANAGEMENT refer below screen. Now you perform any SQL query, DML or DDL command.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Now SnowSQL is installed and configured in your system properly. Let us discuss the Snowflake Stage.
 

Snowflake Stage

 
It specifies the stored data files location, so we can load the data into the table. You can explore on the Snowflake documentation.
 
The following types of Stages can be created in Snowflake:
  1. User stage
  2. Table Stage
  3. Internal Named Stage
While uploading a file into the internal stage, we must specify the internal stage name in PUT command, and while loading data into a table from staged files we must specify the stage name in COPY INTO <table> command
 
User Stages
 
By default, each user has a Snowflake stage allocated for storing files. It is a convenient option if your files will only be accessed by a single user but need to be copied into multiple tables.
 
Characteristics and Limitations:
  • It is referenced using @~; e.g. use LIST @~ to list the files in a user stage.
  • It cannot be altered or dropped.
  • It does not support setting file format options. Instead, you must specify the file format and copy options as part of the COPY INTO <table> command.
Table Stages
 
By default, each table has a Snowflake stage allocated for storing files. It is a convenient option if your files need to be accessible to multiple users and only need to be copied into a single table.
 
Characteristics and Limitations:
  • It has the same name as the table, e.g. a table named mytable has a stage referenced as @%mytable.
  • It cannot be altered or dropped.
  • It does not support setting file format options. Instead, you must specify the file format and copy options as part of the COPY INTO <table> command.
  • It does not support transforming data while loading it (i.e. using a query as the source for the COPY command).
We do not go with this option if we need to copy the data in the files into multiple tables.
 
Internal Named Stages
 
It is named database objects which provides more flexibility for data loading. Because they are database objects, the security/access rules that apply to all objects apply:
  • Users who have appropriate privileges on the stage can load data into any table.
  • Named internal stages are optional but recommended when you plan regular data loads that could involve multiple users and/or tables.

Creating Internal Named Stage

 
To achieve the solution for a given problem, we need to create the Internal named stage, so we can upload the files into that. Apart from creating Stage in Snowflake, we can also create a stage for AWS, Azure, and GCP. We must provide the appropriate keys related to the cloud provider. Just an example, we create a stage with AWS S# bucket key, then we can see all the listed files in the Snowflake stage which you have loaded to the S3 bucket. It is just created the reference, so now we can load the data into the table if the file is listed in our stage. In our scenario, we are creating a named stage in Snowflake, so uploading files into S3, we need to upload the file into the Snowflake stage. You can create stage by GUI or SQL query. Ultimately, whatever we are passing to GUI, it creates SQL for that and runs it behind the scenes. I am creating Stage by GUI (Graphical User Interface).
 
Perform the following steps to create Stage:
 
Step 1
 
Login into the Snowflake account. You will be landing on the worksheet.
 
Step 2
 
Click on the Database from the Header (besides the Share Icon). Refer to the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 3
 
Click on the database where you want to create the stage. In this case. I have selected TEST_DB. Once you click it, you will be getting the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 4
 
Click on the Stages
 
Step 5
 
Click on Create. You will be getting the below screen. Now it depends on you whether you want to create stage for Snowflake or Amazon S3 or Azure or GCP.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 6
 
Select Snowflake Managed in the above screen and click on Next
 
Step 7
 
Enter the stage name and select the schema. Refer to the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 8
 
Click on Finish.
 
Step 9
 
In the below screen, you can see the stage which we have created under MANAGEMENT schema.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Now we have configured the SnowSQL and Created the STAGE as well.  Let me give you brief information on File Format as it is required while loading data from stage to table.
 

File Format

 
It is a pattern of format which we create for the type of File. It can be CSV, Json, XML, Avro, etc. It will help us while loading the different format data into the Snowflake table.
 
Here, I am creating the File Format for CSV and Json file and we will use these formats while loading data from stage to snowflake table.
 
Creating File Format inside the Database
 
Like a stage, we can create File Format inside the database. Perform the following steps:
 
Step 1
 
Click on the File Format from the menu
 
Step 2
 
Click on Create
 
Step 3
 
Specify the format name and schema name with the required settings. Refer to the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 4
 
Click on the Finish button.
 
Perform the above steps once again to create Json file Format. Refer to the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Click on the Finish button.
 
Now we have everything ready to work on our problem statement.
 
Problem Statement 
  • Uploading CSV and JSON file from local system to Snowflake Stage
  • Load data from stage to Snowflake table 
Solution
 
As we have already set up and configured the SnowSQL and Snowflake Stage, now it will be very easy for us to work on this solution part. I have created two files. One is employee_list.json and another one is employeedaa.csv. These files are available in C drive inside the testdata folder.
 
Please have a look at the screen below for JSON file data and CSV file data. I have also attached the testdata.zip here.
 
Json file data
 
 
CSV file Data in below screen:
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Now, we must perform the following steps to achieve the solution of loading file into stage and stage to Snowflake table.
 
Step 1
 
Open the command prompt.
 
Step 2
 
Type “snowsql -v “ and press enter key
 
It will provide you the snowsql version installed in your system.
 
Step 3
 
Type snowsql -c my_example_connection and press enter key
 
This command connects snowsql with Snowflake as per the given configuration. In this case, it will come as Nitesh#COMPUTE_WH@TEST_DB.MANAGEMENT
 
Step 4
 
Type PUT file://c:\testdata\employeedata.csv @MYDATA; and press the enter key. Refer to the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 5
 
Once it is successfully loaded into Stage, you will see the below screen. The status will be uploaded, and it will be created employee.csv.gz compress file on the Snowflake stage.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 6
 
Type “PUT file://c:\testdata\employee_list. json @MYDATA; “ and press enter key. Refer to the below screen.
 
The Json file will be loaded into the Snowflake Stage called “MYDATA”
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 7
 
Open your Snowflake Web Interface. Set the context
 
Use role sysadmin; 
Use warehouse compute_wh;
Use database test_db;
Use schema management;
 
Step 8
 
Open worksheet and type the below query:
 
List @MYDATA;
 
It will show the list of the files available in the stage.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 9
 
To see the CSV file data, run the below query.
 
Select $1,$2,$3,$4,$5 from @MYDATA/employeedata.csv.gz;
 
It will show the CSV data in the result set. Refer to the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
You can use the file format to see the proper data.
 
Select $1,$2,$3,$4,$5 from @MYDATA/employeedata.csv.gz (FILE_FORMAT=>’CSV_FORMAT’);
 
Refer to the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 10
 
Perform the bulk insert operation. We have the user_data table in TEST_DB database under the management schema.
 
copy into user_data (user_id,user_name,address)
from (select $1,$2,$3 from @MYDATA/employeedata.csv.gz)
FILE_FORMAT = 'CSV_FORMAT';
 
Refer to the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 11
 
The data has been loaded to the user_data table. To see the loaded data, run the below query
 
select * from user_data;
 
Refer to the below screen:
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 12
 
We have loaded the CSV file data to the user_data table. Now we have to load the Json file data to the table. Firstly, To view the Json data in Stage, run the below query
 
select parse_json($1) from @mydata/employee_list.json.gz (file_format => 'JSON_FORMAT');
 
Refer to the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 13
 
Click on the result row so you can see the json format data. Refer below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 
Step 14
 
Create a table where we can store the json data as it is. Here, I am creating a table which is having one column whose data type is variant. This is the new data type in Snowflake which can store the semi-structured data. Run the below query to create a table.
 
create or replace table test_table(user_data variant);
 
Step 15
 
Run the below query to load the JSON file data
 
copy into test_table (user_data)
from (select parse_json($1) from @mydata/employee_list.json.gz)
FILE_FORMAT = 'JSON_FORMAT';
 
Step 16
 
To view the loaded data into test_table, run the below query:
 
select * from test_table;
 
Refer to the below screen.
 
Uploading Files To Snowflake Stage Using SnowSQL
 

Conclusion 

 
Uploading files to a Snowflake stage can be done by any Snowflake connector client. We can post the file into the stage from the local system and then the data can be loaded from the stage to the Snowflake table. If you have an S3 bucket where you are posting/uploading the data files or if you have Azure blob where you are posting/uploading data files, then you can simply create a stage referencing the keys of AWS S3 bucket or Azure blob and start loading the data into the table.
 
I answer questions that you may have on this functionality.
 
Happy Learning!