Import Job Output From HDInsight Into Microsoft Excel

Introduction

This article will help you import your job output from HDInsight into Microsoft Excel. Make sure that you have worked on my previous articles before you step into this one.

Pre-requisites

  1. Microsoft Excel 2010 or 2013.
  2. Microsoft Power Query installed.

Tools needed

Download Microsoft Power Query by clicking here. (Select the appropriate 32 bit or 64 bit version, whichever you need for your Office package).

Follow the below steps now - 

Step 1 - Install Microsoft Power Query if you don’t have it already in your machine.



Meanwhile, open your cluster and run the job with your Hive Query. Click here to get knowledge about it.

Step 2 - Open a new workbook in Microsoft Excel 2013. Here, you can find the Power Query added.



Step 3 - Click on Power Query -> From Azure -> From Microsoft Azure HDInsight,



Step 4 - Enter the Storage account name that we created in our previous article, and click OK.



Step 5 - Enter the Account Key to access the Azure HDInsight and click on Connect. To know about the Account Key details, go to your storage account.



Click on Access Keys.



Copy the Key 1 from here and paste it in your Excel workbook.



Now, click on Connect.



Step 6 - Now, select the Blob name from the right pane of the Navigator window and click on Load.



Step 7 - Locate stdout from the Name column and verify that the GUID in the corresponding folder path column matches the GUID you copied earlier in previous demos. This match suggests that the output data corresponds to the job you have submitted.



Once done, click on Close & Load button in the upper-left corner, to import the Hive job output to your Excel sheet.

X

Build smarter apps with Machine Learning, Bots, Cognitive Services - Start free.

Start Learning Now