Python in Excel Data Analysis using the New PY Function

Fundamentals of Python in Excel

Python is a popular programming language for data analysis and manipulation tasks. Excel, on the other hand, is a widely used spreadsheet software that allows users to organize, manipulate, and analyze data. In recent years, there has been an increasing demand for integrating Python with Excel to leverage the strengths of both tools.

One way to use Python in Excel is by using the xlwings library, which allows users to tie Excel, Python, and pandas together to build a data analysis tool that pulls information from an external such as Power Query, manipulates it, and presents it to the user in a familiar spreadsheet format.

Another way to use Python in Excel is by using the new Python/Excel integration feature introduced by Microsoft. This feature allows users to explore and analyze data using Python operations within the familiar environment of an Excel spreadsheet. The integrated version of Excel is appropriate for all levels of Python developers and Excel users, whether beginner or advanced.

With this integration, users can manipulate and explore data in Excel using Python plots and libraries and then use Excel’s formulas, charts, and PivotTables to further refine insights. In addition, users can also easily bring external data into Python in Excel workflows using Excel’s built-in connectors and Power Query.

Python in Excel leverages Anaconda Distribution for Python running in Azure, which includes popular Python libraries such as pandas for data manipulation, statsmodels for advanced statistical modeling, and Matplotlib and seaborn for data visualization. 

As of the time of writing this article, the PY function, which is used to access the Python Environment is currently available in the Beta Channel for Excel for M365. Python code can be authored in the Python Editor component of the Excel Labs add-ins. To get the Excel Lab add-ins, click here: https://www.exceljetconsult.com.ng/home/blog/how-to-get-excel-labs-addin-to-author-python-code/

Practical Examples

Having discussed the fundamental part of Python in Excel, let's jump into the practical aspect.

In the screenshot below, we have a sales dataset with five columns. We are going to use the data to perform a handful of data analysis using the PY function in Excel

Python in Excel

Read Data Into DataFrame and Descriptive Statistics in Python

The first thing we need to do is to read the data into the dataframe using the PY function and the xl function.

Note: popular python libraries such pandas, numpy, matplotlib, and seaborn are pre-installed in Python in Excel. Hence, it is not required to import them before performing data analysis in Python in Excel as seen in the Python Editor in the screenshot below.

To read data in this example, in cell G1, type in =PY and press the tab key. This opens the Python environment with a green border in the formula bar.

Execute this Python code

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = xl("Sheet1!$A$2:$E$100")
df.describe()

Click CTRL + Enter to commit. Python Object will be returned.

To see the output in Excel, click the dropdown icon left of the formula bar and select Excel Value. This will display the value in Excel as we can see the descriptive statistics of the dataset

Read Data Into DataFrame and Descriptive Statistics in Python

Total Sales by Products Visualization 

Next, we want to author the python code to visualizse Total Sales by Products. This would help us obtain the insight in the top-performing products bringing in highest sales value. To write the python code:

Execute this code

plt.figure(figsize=(12, 6))
sns.barplot(x='Products', y='Sales', data=df)
plt.title('Total Sales by Product')
plt.xlabel('Product')
plt.ylabel('Total Sales')
plt.xticks(rotation=45)
plt.show()

Click CTRL + Enter to commit and again switch from Python Object to Excel Value.

Voila! We have the visualization as seen below:

Read Data Into DataFrame and Descriptive Statistics in Python

From the above visualization, we can easily tell that Laptop is the best selling product hence, we need to focus more on selling more laptops to maximize profit while not neglecting underperforming products such as Tablet.

See you in the next article.


Similar Articles