Data Operations Using Pandas In Python

Bhawna Tuteja

Pandas is a Python library for data manipulation and analysis. It provides data structures for working with data from sources such as CSV files, Excel workbooks, and SQL databases. Whatever the source, it loads the data into a DataFrame, a tabular structure of rows and columns. This post shows a few basic operations that can be performed with Pandas, using a CSV file as the example.

The first and most basic step is to load all the data into a DataFrame.

import pandas as pd

df = pd.read_csv("<Path of CSV File>", sep=",")
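To try this without a file on disk, here is a minimal sketch. The CSV text and the column names `name` and `score` are made up for illustration, and `io.StringIO` stands in for the file path.

```python
import io

import pandas as pd

# Hypothetical CSV content; io.StringIO lets read_csv treat a string like a file.
csv_text = "name,score\nalice,90\nbob,85\n"

df = pd.read_csv(io.StringIO(csv_text), sep=",")

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # column types inferred by pandas
```

Checking `shape` and `dtypes` right after loading is a quick way to confirm the file was parsed the way you expected.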

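Let's say we have a few date-time columns and the user wants to calculate the difference between two of them. The example here does this for the first row; the format string '%M:%S.%f' assumes values written as minutes:seconds.fraction.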

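# Take the first row of each column and parse it as a date-time
startTime = df.head(1)['<Name of the Date Time Column 1>']
startTime = pd.to_datetime(startTime, format='%M:%S.%f')
endTime = df.head(1)['<Name of the Date Time Column 2>']
endTime = pd.to_datetime(endTime, format='%M:%S.%f')
# Subtracting two date-time Series gives a Series of Timedelta values
diff = endTime - startTime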
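The same idea can be applied to every row at once rather than only the first one. This is a sketch under assumptions: the column names `start` and `end` and the sample values are hypothetical, and the values are assumed to match the '%M:%S.%f' format used above.

```python
import pandas as pd

# Hypothetical data in minutes:seconds.fraction form.
df = pd.DataFrame({
    "start": ["01:15.250", "02:00.000"],
    "end": ["01:45.750", "02:30.500"],
})

# Parse whole columns; pandas fills in a default date for the missing parts.
start = pd.to_datetime(df["start"], format="%M:%S.%f")
end = pd.to_datetime(df["end"], format="%M:%S.%f")

# The difference is a Timedelta Series; convert it to plain seconds.
df["elapsed_seconds"] = (end - start).dt.total_seconds()
print(df)
```

Because only minutes and seconds are parsed, this works only when both values fall within the same hour.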

Iterating through all the rows

for i, j in df.iterrows():
    # i is the row's index label, j is the row itself as a Series
    # <All the implementation logic here>
    pass
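Here is a small sketch of what the loop body might look like. The columns `item`, `price`, and `qty` are invented for the example. For simple arithmetic like this, a column-wise expression usually gives the same result with less code, so both are shown.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "item": ["pen", "book"],
    "price": [2.5, 10.0],
    "qty": [4, 2],
})

# Row-by-row: each `row` is a Series, so values are read by column name.
totals = []
for index, row in df.iterrows():
    totals.append(row["price"] * row["qty"])

# Column-wise alternative that computes the same values without a loop.
df["total"] = df["price"] * df["qty"]
print(totals)
```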

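Checking whether two CSV files contain the same data

# df2 is a second DataFrame, loaded from the other file with pd.read_csv
# Returns True only if both have the same shape, values, and column dtypes
df.equals(df2)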
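A short sketch of how this check behaves, using small hand-built DataFrames in place of real files. The names `df3`, `same`, `different`, and `diff_cells` are only for illustration. `DataFrame.compare`, available in pandas 1.1 and later, is included as one way to see which cells differ when the frames have the same labels.

```python
import pandas as pd

# Hypothetical data standing in for three loaded CSV files.
df = pd.DataFrame({"id": [1, 2], "value": [10.0, 20.0]})
df2 = df.copy()
df3 = pd.DataFrame({"id": [1, 2], "value": [10.0, 25.0]})

same = df.equals(df2)       # identical copy
different = df.equals(df3)  # one value changed

# compare() lists only the cells that differ ("self" vs "other").
diff_cells = df.compare(df3)
print(same, different)
print(diff_cells)
```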

Merging two CSV files column-wise (side by side)

# With axis=1, ignore_index=True replaces the column names with 0, 1, 2, ...
df_column = pd.concat([df, df2], axis=1, ignore_index=True)
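To make the effect of `ignore_index=True` on the column names visible, here is a minimal sketch with made-up data. The variable names `relabeled` and `named` are only for illustration.

```python
import pandas as pd

# Hypothetical data standing in for two loaded CSV files.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
df2 = pd.DataFrame({"score": [90, 85]})

# Columns are placed side by side and renamed 0, 1, 2.
relabeled = pd.concat([df, df2], axis=1, ignore_index=True)

# Leaving ignore_index out keeps the original column names.
named = pd.concat([df, df2], axis=1)

print(list(relabeled.columns))
print(list(named.columns))
```

If the column names matter later on, the second form is probably the one you want.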

Merging two CSV files row-wise (one below the other)

# ignore_index=True renumbers the rows of the result from 0
df_row = pd.concat([df, df2], ignore_index=True)
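A small sketch of the row-wise case, again with invented data. It assumes both DataFrames have the same columns.

```python
import pandas as pd

# Hypothetical data standing in for two CSV files with the same columns.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
df2 = pd.DataFrame({"id": [3], "name": ["c"]})

# Rows of df2 are appended below df, and the row index is renumbered.
df_row = pd.concat([df, df2], ignore_index=True)

print(df_row)
```

Without `ignore_index=True`, the result would keep each file's original row labels, so the label 0 would appear twice.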

I hope these basic operations help you learn and explore Pandas in Python.