Data Operations Using Pandas In Python

Pandas is a Python library for data manipulation and analysis. It provides data structures that can load and operate on data from sources such as CSV files, Excel workbooks, and SQL databases. Whatever the source, Pandas loads the data into a DataFrame, a tabular structure of rows and columns. This article shows a few basic operations that can be performed with Pandas, using CSV files as the example.
 
The first and most basic step is to load all the data into a DataFrame.
    import pandas as pd
    df = pd.read_csv("<Path of CSV File>", sep=",")
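To try this without a file on disk, here is a minimal sketch that reads CSV text from memory. The sample data and the column names `name` and `score` are made up for illustration; `read_csv` accepts a file-like object as well as a path.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = "name,score\nalice,90\nbob,85\n"

# read_csv accepts a path or any file-like object; sep="," is the default.
df = pd.read_csv(io.StringIO(csv_text), sep=",")

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names taken from the header row
```

With a real file, the `io.StringIO(...)` argument would simply be replaced by the file path.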
Let's say the file has two date-time columns and we want to calculate the difference between them for the first row.
    # Take the first row of each column and parse it as minutes:seconds.fraction
    startTime = df.head(1)['<Name of the Date Time Column 1>']
    startTime = pd.to_datetime(startTime, format='%M:%S.%f')
    endTime = df.head(1)['<Name of the Date Time Column 2>']
    endTime = pd.to_datetime(endTime, format='%M:%S.%f')
    diff = endTime - startTime  # a Series of Timedelta values
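The same idea can be applied to every row at once instead of only the first one. The sketch below assumes two hypothetical columns named `start` and `end` holding values in the `minutes:seconds.fraction` layout; the sample values are invented.

```python
import io
import pandas as pd

# Hypothetical sample data: two time columns in minutes:seconds.fraction form.
csv_text = "start,end\n01:15.250,03:45.750\n10:00.000,12:30.000\n"
df = pd.read_csv(io.StringIO(csv_text))

# Parse whole columns; the missing date part defaults to 1900-01-01,
# which cancels out when the two columns are subtracted.
start_time = pd.to_datetime(df["start"], format="%M:%S.%f")
end_time = pd.to_datetime(df["end"], format="%M:%S.%f")

# Subtracting two datetime columns gives a column of Timedelta values.
df["diff"] = end_time - start_time

# Convert the Timedelta values to plain seconds for further arithmetic.
df["diff_seconds"] = df["diff"].dt.total_seconds()
print(df[["start", "end", "diff_seconds"]])
```

Working on whole columns avoids `head(1)` and keeps the result aligned with the original rows.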
Iterating through all the rows
    for i, j in df.iterrows():
        # i is the row's index label, j is the row as a Series
        <All the implementation logic here>
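As a small illustration of the loop, the sketch below computes a per-row total. The DataFrame and its column names (`item`, `price`, `qty`) are hypothetical. It also shows the column-wise form, which usually performs better than `iterrows()` on large data.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "item": ["pen", "book"],
    "price": [2.5, 10.0],
    "qty": [4, 1],
})

# iterrows() yields (index label, row as a Series) pairs.
totals = []
for index, row in df.iterrows():
    totals.append(row["price"] * row["qty"])

# The same calculation on whole columns, without an explicit loop.
df["total"] = df["price"] * df["qty"]
print(df)
```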
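To check whether two CSV files hold the same data, load the second file into df2 the same way and compare the two DataFrames
    df.equals(df2)  # True only if shape, values, and column dtypes all match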
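A quick sketch of how `equals` behaves, using made-up in-memory CSV text in place of real files. Note that it also compares column dtypes, so the same numbers stored as integers and as floats are not considered equal.

```python
import io
import pandas as pd

# Hypothetical CSV contents: a and b are identical, c differs in one value.
csv_a = "id,value\n1,10\n2,20\n"
csv_b = "id,value\n1,10\n2,20\n"
csv_c = "id,value\n1,10\n2,99\n"

df = pd.read_csv(io.StringIO(csv_a))
df2 = pd.read_csv(io.StringIO(csv_b))
df3 = pd.read_csv(io.StringIO(csv_c))

same = df.equals(df2)        # identical shape, values, and dtypes
different = df.equals(df3)   # one value differs

# Same numbers, but float columns instead of integer columns.
dtype_mismatch = df.equals(df2.astype(float))

print(same, different, dtype_mismatch)
```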
Merging two CSV files column-wise (side by side)
    # With axis=1, ignore_index=True replaces the column names with 0, 1, 2, ...
    df_column = pd.concat([df, df2], axis=1, ignore_index=True)
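The effect of `ignore_index=True` on the column names is easy to miss, so here is a minimal sketch with two hypothetical DataFrames. Dropping the argument keeps the original column names.

```python
import pandas as pd

# Hypothetical sample data with the same number of rows.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
df2 = pd.DataFrame({"score": [90, 85]})

# Side-by-side concatenation; column labels are reset to 0, 1, 2.
df_column = pd.concat([df, df2], axis=1, ignore_index=True)

# Without ignore_index, the original column names are preserved.
df_named = pd.concat([df, df2], axis=1)

print(list(df_column.columns))
print(list(df_named.columns))
```

Rows are matched by index label, so both DataFrames should share the same index for the result to line up.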
Merging two CSV files row-wise (one below the other)
    # ignore_index=True renumbers the rows 0..n-1 in the combined result
    df_row = pd.concat([df, df2], ignore_index=True)
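A short sketch of row-wise concatenation, assuming two hypothetical DataFrames that share the same columns:

```python
import pandas as pd

# Hypothetical sample data with matching columns.
df = pd.DataFrame({"id": [1, 2], "value": [10, 20]})
df2 = pd.DataFrame({"id": [3], "value": [30]})

# Stack df2 below df; the row index is renumbered from 0.
df_row = pd.concat([df, df2], ignore_index=True)

print(df_row)
```

Without `ignore_index=True`, the rows would keep their original labels, so the label 0 would appear twice in the result.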
I hope these basic operations help you learn and explore Pandas in Python.

