Learn about the Pandas package in Python
Let's start with another scientific computing package: 'Pandas'
So far we have learned to install and use NumPy, but there is more to scientific computing. Don't worry, we are here to get you through it. In this blog we start with another open-source scientific computing package called 'Pandas'. You can check out the official Pandas website, or the Pandas repository on GitHub.
Let's start with an overview of Pandas to understand what it is about. Pandas is a convenient, highly productive tool for data manipulation and data analysis, built around its own data structures. It can be used for tasks like loading, manipulating, modelling and analyzing data. The name Pandas is derived from 'Panel Data', which is quite creative!
The Pandas package provides us with a wide range of features, such as:
- Easy handling of missing data in our data sets.
- Implicit and explicit data alignment: objects can be aligned explicitly to a set of labels, or you can leave the labels out and let Pandas align the data automatically in computations.
- Aggregating and transforming data using group by functionality.
- Easy conversion of differently indexed data held in other Python and NumPy data structures into Pandas Data Frames.
- Smart subsetting, label-based slicing and indexing of enormous data sets.
- Merging and joining of data sets.
- Reading and importing data from files such as CSV, TXT and XLSX.
- Generating date ranges, date shifting, converting frequencies and other time series functionality.
- Easy arranging of data and analysis of time series.
- Manipulating data using integrated indexing for DataFrame objects.
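To make a few of the features above concrete, here is a minimal sketch. The sales records, city names and prices are made up purely for illustration, and the CSV is read from an in-memory string so that the example runs without any file on disk.

```python
import io

import pandas as pd

# Hypothetical sales records; the names and numbers are invented for this sketch.
csv_text = """city,product,units
Pune,pen,10
Pune,book,
Delhi,pen,7
Delhi,book,3
"""

# Reading data: read_csv accepts a file path or any file-like object.
sales = pd.read_csv(io.StringIO(csv_text))

# Missing data: the empty "units" cell is read as NaN; replace it with 0.
sales["units"] = sales["units"].fillna(0)

# Group by: total units sold per city.
units_per_city = sales.groupby("city")["units"].sum()

# Merging: attach a (hypothetical) price to each row via the shared "product" column.
prices = pd.DataFrame({"product": ["pen", "book"], "price": [5.0, 50.0]})
merged = sales.merge(prices, on="product")

# Automatic alignment: values are matched by index label, not by position.
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["z", "y", "x"])
aligned_sum = a + b

# Label-based slicing: .loc includes both end labels.
subset = a.loc["x":"y"]

# Time series: three consecutive daily timestamps, then move the values one step forward.
days = pd.date_range("2024-01-01", periods=3, freq="D")
daily = pd.Series([1.0, 2.0, 3.0], index=days)
shifted = daily.shift(1)

print(units_per_city)
print(merged)
print(aligned_sum)
print(subset)
print(shifted)
```

Notice that adding `a` and `b` pairs up the values that share a label, even though the two Series list their labels in a different order.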
Now, without further ado, let's install Pandas on your system. Please go through the steps below carefully.
Installing Pandas on PyCharm
- We will start by installing Pandas in PyCharm. Simply repeat the steps we previously discussed in the setting up NumPy blog: check out the Prerequisites section there and follow steps 1 and 2. You should now see the package search window.
- Now type pandas in the search bar and hit install. This will install Pandas in PyCharm, and you can then import it in your projects using the line of code: import pandas as pd
Installing Pandas on IDLE
To install Pandas for IDLE, simply open the command prompt (on Windows) and run the following command:
pip install pandas
This will take some time. Once it finishes, Pandas is available in IDLE.
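Whichever route you used, a quick way to check that the installation worked is to import the package and print its version. This is only a small sanity check; the version number you see depends on what was installed on your system.

```python
import pandas as pd

# If this import succeeds, Pandas is installed for the interpreter you are running.
print(pd.__version__)
```

If the import fails with a ModuleNotFoundError, Pandas was most likely installed for a different Python interpreter than the one running this code.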
Working with Pandas
Before starting off with Pandas, let us get familiar with the terminology it uses.
Pandas Data Structures: The Pandas package has offered three types of data structures:
- Series: A one-dimensional, homogeneously typed array with a label for each value.
- Data Frames: A two-dimensional, heterogeneously typed tabular structure. It is size mutable, which means that we can change its size by adding or removing rows and columns.
- Data Panels: A three-dimensional container of data. The term comes from econometrics (the application of statistical methods to economic data). Note that Panel was deprecated in Pandas 0.20 and removed in 0.25, so it is only available in older versions. A Data Panel consists of 3 axes, namely:
  - items: axis 0, each item corresponds to a DataFrame contained in the panel.
  - major_axis: axis 1, the rows of each DataFrame.
  - minor_axis: axis 2, the columns of each DataFrame.
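As a small preview of the first two structures, here is a minimal sketch of a Series and a Data Frame. The student names and marks are invented for illustration. Panel is left out of the sketch because it is not available in recent Pandas releases.

```python
import pandas as pd

# Series: one-dimensional, a single dtype for all values, plus an index of labels.
marks = pd.Series([82, 91, 77], index=["asha", "ravi", "meena"])

# Data Frame: two-dimensional; each column can hold a different type.
students = pd.DataFrame(
    {"name": ["asha", "ravi", "meena"], "marks": [82, 91, 77]}
)

# Size mutable: adding a column changes the shape from (3, 2) to (3, 3).
students["passed"] = students["marks"] >= 80

print(marks.dtype)
print(students.dtypes)
print(students.shape)
```

Here the Series holds only integers, while the Data Frame mixes text, integer and boolean columns in one table.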
We will discuss more on creating Series, Data Frames and Panels in the next blog.