Inserting Documents in Bulk into Elasticsearch Using Logstash

In this article, we are going to learn, step by step, how to bulk insert data from MS SQL Server or MySQL into Elasticsearch.

What is Logstash?

Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite “stash.” (Ours is Elasticsearch, naturally.)

  1. Downloading Logstash
  2. Downloading the JDBC driver for SQL Server
  3. Downloading the JDBC driver for MySQL
  4. Logstash structure
  5. Writing a JDBC plugin configuration to pull data from MS SQL Server into Elasticsearch
  6. Writing a JDBC plugin configuration to pull data from MySQL into Elasticsearch

Downloading Logstash

To download the product, visit the same website from which we have already downloaded Elasticsearch and Kibana.

https://www.elastic.co/products

After downloading Logstash, next we are going to download the JDBC driver for SQL Server.

Downloading the JDBC Driver for SQL Server

The Microsoft JDBC Driver for SQL Server can be downloaded from Microsoft's download page.

Downloading the JDBC Driver for MySQL

The MySQL JDBC driver (Connector/J) can be downloaded from the MySQL website.

With both drivers downloaded, next we are going to look at the folder structure of the Logstash distribution.

Logstash Structure

After unzipping Logstash, you will find folders such as bin, config, and lib.

Having looked at the folder structure of Logstash, next we are going to look at the MS SQL Server table we are going to migrate.

Table Data to Migrate into Elasticsearch

We are going to migrate the entire table into Elasticsearch. Now that we know which table to migrate, we can write the JDBC plugin configuration.

JDBC Plugin for Logstash to Push Data into Elasticsearch

To create the configuration, take a text file and save it as “logstash.conf”, with the “.conf” extension. In it, we are going to write the input and the output.
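Every Logstash pipeline configuration is built from up to three sections — input, filter, and output — so before filling in any details, the skeleton of our file looks like this:

```conf
input {
    # where the data comes from (jdbc, in our case)
}
filter {
    # optional transformations; we leave this empty here
}
output {
    # where the data goes (stdout and elasticsearch, in our case)
}
```

The filter section is optional and we will not need it for a straight table-to-index migration.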

In the input section, we first set “jdbc_driver_library” to the path of the SQL Server JDBC driver we downloaded.

Next, we set “jdbc_driver_class” to "com.microsoft.sqlserver.jdbc.SQLServerDriver". After providing the driver class, we set “jdbc_connection_string” to the connection string for SQL Server, along with “jdbc_user” and “jdbc_password”, which are the username and password of the SQL Server instance. The last part of the input is “statement”, which contains the SQL query to run.

After setting the input, let's look at the output. There we set “stdout”, which prints the JSON documents to the console, and an “elasticsearch” output, in which we set “hosts” to the Elasticsearch host.

The index name is user-defined; it must be unique and it must be lowercase.

Code Snippet of JDBC plugin for SQL Server

input {
    jdbc {
        jdbc_driver_library => "D:\sqljdbc_6.4.0.0_enu\sqljdbc_6.4\enu\mssql-jdbc-6.4.0.jre8.jar"
        jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
        jdbc_connection_string => "jdbc:sqlserver://SAI-PC;user=sa;password=Pass$123;"
        jdbc_user => "sa"
        jdbc_password => "Pass$123"
        statement => "SELECT * FROM [AdventureWorks2008R2].[HumanResources].[Employee]"
    }
}
filter {}
output {
    stdout {
        codec => json_lines
    }
    elasticsearch {
        hosts => "http://localhost:9200"
        index => "humanresources"
    }
}

After writing the JDBC plugin configuration, next we are going to run Logstash with it.

Running the JDBC Plugin with Logstash

To run Logstash, first we access its bin folder.

From that folder, open a command prompt at that path by typing “cmd” in the Explorer address bar.

After entering the command, a command prompt opens at that folder path.

Next, we are going to run Logstash.

Before running it, if you want to see what options the Logstash CLI provides, enter the command “logstash --help”.

It will list the many options Logstash provides.

We are going to run it using the “-f” flag.

What is the “-f” flag?

-f, --path.config CONFIG_PATH Load the Logstash config from a specific file or directory. 

If a directory is given, all files in that directory will be concatenated in lexicographical order and then parsed as a single config file. You can also specify wildcards (globs) and any matched files will be loaded in the order described above.

Now run Logstash with -f followed by the path to logstash.conf.

After entering the command, press Enter to run Logstash.
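On Windows, the invocation looks roughly like the following (the folder names here are assumptions — substitute wherever you unzipped Logstash and saved logstash.conf):

```shell
cd D:\logstash\bin
logstash -f D:\logstash\bin\logstash.conf
```

If the config file sits in the current directory, a plain `logstash -f logstash.conf` works as well.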

Note 
The SQL Server JDBC driver requires the TCP/IP protocol to be enabled on the server. If it is not enabled, the connection will fail with an error.

To enable the TCP/IP protocol, go to “Configuration Tools” and open “SQL Server Configuration Manager”.

In it, expand “SQL Native Client 10.0 Configuration”, choose “Client Protocols”, and in the list of protocols enable TCP/IP.

Once the pipeline starts, you will see every row printed to the console as JSON while it is inserted into Elasticsearch.

Now that we have completed inserting the data into Elasticsearch, next we are going to view it in Kibana.
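Before switching to Kibana, you can quickly confirm the documents landed by querying Elasticsearch's REST API directly (this assumes the cluster is still running on localhost:9200; the document count will depend on your table):

```shell
curl "http://localhost:9200/humanresources/_count?pretty"
curl "http://localhost:9200/humanresources/_search?size=1&pretty"
```

The first call returns the number of documents in the “humanresources” index, and the second returns one sample document so you can eyeball the field mapping.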

Visualizing inserted data by Kibana

To visualize the data in Kibana, we follow the same process as before: first, we add an index pattern in Kibana.

The index name is “humanresources”, which we just created. After entering the index name, you will see the “humanresources” index on the dashboard; choose it to see the complete data.

You can now browse the “humanresources” data in Kibana.

That’s cool -- we have successfully migrated data from SQL Server to Elasticsearch.

In the same way, we can write a JDBC plugin configuration to pull data from a MySQL database.

Code Snippet of JDBC plugin for MySQL

This configuration is similar to the one we wrote for MS SQL Server. We only change the connection string, the credentials, the JDBC driver path, and the JDBC driver class to their MySQL equivalents — that’s it, and we are ready to pull data from MySQL.

input {
    jdbc {
        jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/AdventureWorks2008R2"
        # The user we wish to execute our statement as
        jdbc_user => "demo"
        jdbc_password => "#######"
        # The path to our downloaded JDBC driver
        jdbc_driver_library => "E:/mysql-connector-java-6.0.5.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        # Our query
        statement => "SELECT * FROM AdventureWorks2008R2.HumanResources"
    }
}
output {
    stdout {
        codec => json_lines
    }
    elasticsearch {
        hosts => "localhost:9200"
        index => "humanresources"
    }
}

After creating this configuration, we run it the same way we ran the MS SQL Server one.

Now, run Logstash with -f and the path to the config file.

Finally, we have completed inserting documents in bulk into Elasticsearch using Logstash, step by step.