Let's learn some advanced column and row operations in Pandas for working with datasets.
In the previous blog we learned about creating Series, DataFrames and Panels with Pandas. In this blog we will look at some more advanced features and operations that Pandas offers. To do this, we first need to create a DataFrame.
Let's get started...
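A minimal sketch of such a DataFrame, assuming it is built from a dict of two Series whose values match the output that follows (the names `d` and `df` are assumptions):

```python
import pandas as pd

# Column 'one' has no entry for label 'e', so pandas fills that cell with NaN
d = {
    'one': pd.Series([2, 4, 6, 8], index=['a', 'b', 'c', 'd']),
    'two': pd.Series([1, 3, 5, 7, 9], index=['a', 'b', 'c', 'd', 'e']),
}
df = pd.DataFrame(d)
print(df)
```

Because of the missing value, column 'one' is stored as float64 while 'two' stays an integer column.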
Output:
       one  two
    a  2.0    1
    b  4.0    3
    c  6.0    5
    d  8.0    7
    e  NaN    9
Performing Operations on Columns
The line of code below performs a selection operation on the DataFrame. The argument passed is 'one', so it selects the column whose key is 'one' and returns it as a Series: all of its values together with the index labels they belong to.
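A minimal sketch of this selection, assuming a DataFrame named `df` (the name and its construction are assumptions) with the columns 'one' and 'two':

```python
import pandas as pd

# Assumed DataFrame with columns 'one' and 'two'
df = pd.DataFrame({
    'one': pd.Series([2, 4, 6, 8], index=['a', 'b', 'c', 'd']),
    'two': pd.Series([1, 3, 5, 7, 9], index=['a', 'b', 'c', 'd', 'e']),
})

# Indexing with a column label returns that column as a Series
col_one = df['one']
print(col_one)
```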
Output:
    a    2.0
    b    4.0
    c    6.0
    d    8.0
    e    NaN
    Name: one, dtype: float64
We can also perform the same selection on 'two':
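The same kind of sketch for 'two', again assuming a DataFrame named `df` built as in the rest of this post:

```python
import pandas as pd

# Assumed DataFrame with columns 'one' and 'two'
df = pd.DataFrame({
    'one': pd.Series([2, 4, 6, 8], index=['a', 'b', 'c', 'd']),
    'two': pd.Series([1, 3, 5, 7, 9], index=['a', 'b', 'c', 'd', 'e']),
})

# 'two' has a value for every label, so it keeps an integer dtype
col_two = df['two']
print(col_two)
```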
Output:
    a    1
    b    3
    c    5
    d    7
    e    9
    Name: two, dtype: int64
In both cases the output consists of the index labels and the values of the Series that belong to them. You can also see that it prints the column key as 'Name', along with the datatype of the Series.
The code below adds a new column 'three' to the existing DataFrame.
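A minimal sketch of adding the column, assuming the new values are supplied as a Series with the same labels (the values are taken from the output that follows; `df` is an assumed name):

```python
import pandas as pd

# Assumed DataFrame with columns 'one' and 'two'
df = pd.DataFrame({
    'one': pd.Series([2, 4, 6, 8], index=['a', 'b', 'c', 'd']),
    'two': pd.Series([1, 3, 5, 7, 9], index=['a', 'b', 'c', 'd', 'e']),
})

# Assigning to a new key creates the column; values are aligned on the index
print("Adding a new column to the existing DataFrame")
df['three'] = pd.Series([12, 14, 16, 18, 20], index=['a', 'b', 'c', 'd', 'e'])
print(df)
```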
Output:
    Adding a new column to the existing DataFrame
       one  two  three
    a  2.0    1     12
    b  4.0    3     14
    c  6.0    5     16
    d  8.0    7     18
    e  NaN    9     20
The code below adds the columns 'one' and 'two' element by element, stores the result in a new column 'four' and then displays the column 'four'. Row 'e' has no value in 'one', so its sum is also NaN.
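A minimal sketch of this column arithmetic, assuming a DataFrame named `df` with the columns 'one' and 'two' as used throughout this post:

```python
import pandas as pd

# Assumed DataFrame with columns 'one' and 'two'
df = pd.DataFrame({
    'one': pd.Series([2, 4, 6, 8], index=['a', 'b', 'c', 'd']),
    'two': pd.Series([1, 3, 5, 7, 9], index=['a', 'b', 'c', 'd', 'e']),
})

# Element-wise sum, matched by index label; NaN + 9 stays NaN
print("Adding columns 'one' and 'two' and storing the result in 'four'")
df['four'] = df['one'] + df['two']
print(df['four'])
```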
Output:
    Adding columns 'one' and 'two' and storing the result in 'four'
    a     3.0
    b     7.0
    c    11.0
    d    15.0
    e     NaN
    Name: four, dtype: float64
- pop() function
We will use the pop() function to delete a specified column. pop() removes the column from the DataFrame in place and returns it as a Series. The line of code below deletes the column 'two'.
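A minimal sketch of pop(), assuming a DataFrame named `df` that already holds the columns 'one', 'two', 'three' and 'four' (the construction and the name `popped` are assumptions):

```python
import pandas as pd

# Assumed DataFrame rebuilt with all four columns
idx = ['a', 'b', 'c', 'd', 'e']
df = pd.DataFrame({
    'one': pd.Series([2, 4, 6, 8], index=idx[:4]),
    'two': pd.Series([1, 3, 5, 7, 9], index=idx),
    'three': pd.Series([12, 14, 16, 18, 20], index=idx),
})
df['four'] = df['one'] + df['two']

# pop() removes the column in place and hands it back as a Series
popped = df.pop('two')
print(df)
```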
Output:
       one  three  four
    a  2.0     12   3.0
    b  4.0     14   7.0
    c  6.0     16  11.0
    d  8.0     18  15.0
    e  NaN     20   NaN
We can see that the resulting output no longer has the column 'two', because we popped it.
- del keyword
Now we will use the del keyword to delete the column 'four' from the DataFrame.
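A minimal sketch of the del statement, assuming a DataFrame named `df` that holds the columns 'one', 'three' and 'four' at this point:

```python
import pandas as pd

# Assumed DataFrame as it stands after 'two' has been removed
idx = ['a', 'b', 'c', 'd', 'e']
df = pd.DataFrame({
    'one': pd.Series([2, 4, 6, 8], index=idx[:4]),
    'three': pd.Series([12, 14, 16, 18, 20], index=idx),
    'four': pd.Series([3, 7, 11, 15], index=idx[:4]),
})

# del removes the column in place and returns nothing
del df['four']
print(df)
```

Unlike pop(), del does not give the removed column back, so use pop() when the values are still needed.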
Output:
       one  three
    a  2.0     12
    b  4.0     14
    c  6.0     16
    d  8.0     18
    e  NaN     20
We can see that the resulting output does not have the column 'four'.
Performing Operations on Rows
Row Selection by Label
We can select a row by its label by passing the row label to loc[ ].
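A minimal sketch of label-based row selection, assuming a DataFrame named `df` with the columns 'one' and 'three' and the row label 'b' taken from the output that follows:

```python
import pandas as pd

# Assumed DataFrame with the two remaining columns
idx = ['a', 'b', 'c', 'd', 'e']
df = pd.DataFrame({
    'one': pd.Series([2, 4, 6, 8], index=idx[:4]),
    'three': pd.Series([12, 14, 16, 18, 20], index=idx),
})
print(df)

# loc[] looks the row up by its index label and returns it as a Series
row_b = df.loc['b']
print(row_b)
```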
Output:
       one  three
    a  2.0     12
    b  4.0     14
    c  6.0     16
    d  8.0     18
    e  NaN     20
    one       4.0
    three    14.0
    Name: b, dtype: float64
We can see that only the contents of row 'b' are returned, taken from the columns 'one' and 'three'.
Row Selection by Integer Location
We can also select a row by its position by passing an integer to iloc[ ]. Positions start at 0.
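A minimal sketch of position-based row selection, assuming a DataFrame named `df` with the columns 'one' and 'three' and the position 2 taken from the output that follows:

```python
import pandas as pd

# Assumed DataFrame with the two remaining columns
idx = ['a', 'b', 'c', 'd', 'e']
df = pd.DataFrame({
    'one': pd.Series([2, 4, 6, 8], index=idx[:4]),
    'three': pd.Series([12, 14, 16, 18, 20], index=idx),
})
print(df)

# iloc[] counts rows from 0, so position 2 is the third row (label 'c')
row_2 = df.iloc[2]
print(row_2)
```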
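Output:
       one  three
    a  2.0     12
    b  4.0     14
    c  6.0     16
    d  8.0     18
    e  NaN     20
    one       6.0
    three    16.0
    Name: c, dtype: float64
In the above code, the row at integer position 2 (the third row, labelled 'c') is returned, with its values from the columns 'one' and 'three'.
We can use the append() function to add the rows of one DataFrame to the end of another. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0; on current versions pd.concat() gives the same result. The code below appends the rows of the DataFrame d2 to the DataFrame d1.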
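A minimal sketch of this step using pd.concat(), which is swapped in here because DataFrame.append() is not available in pandas 2.0 and later. The names `d1` and `d2` come from the text; the way they are constructed is an assumption based on the output that follows:

```python
import pandas as pd

# Two assumed DataFrames with the same columns and default labels 0 and 1
d1 = pd.DataFrame([[2, 4, 6], [3, 5, 7]], columns=['a', 'b', 'c'])
d2 = pd.DataFrame([[10, 20, 30], [40, 50, 60]], columns=['a', 'b', 'c'])

# Stack d2 under d1; the original row labels are kept, so 0 and 1 repeat
d1 = pd.concat([d1, d2])
print(d1)
```

Passing `ignore_index=True` to pd.concat() would renumber the rows 0 to 3 instead of keeping the duplicate labels.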
Output:
        a   b   c
    0   2   4   6
    1   3   5   7
    0  10  20  30
    1  40  50  60
We can use the drop() function to delete the rows with a specified label.
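A minimal sketch of drop(), assuming a combined DataFrame with the repeated row labels 0 and 1 (the construction and the name `result` are assumptions):

```python
import pandas as pd

# Assumed combined DataFrame whose row labels are 0, 1, 0, 1
d1 = pd.DataFrame([[2, 4, 6], [3, 5, 7]], columns=['a', 'b', 'c'])
d2 = pd.DataFrame([[10, 20, 30], [40, 50, 60]], columns=['a', 'b', 'c'])
combined = pd.concat([d1, d2])

# drop() returns a new DataFrame without any row labelled 0
result = combined.drop(0)
print(result)
```

drop() leaves `combined` unchanged, so the result has to be assigned to a name to keep it.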
Output:
        a   b   c
    1   3   5   7
    1  40  50  60
The above code deletes all the rows that have the label 0. Because both original DataFrames used the labels 0 and 1, two rows are removed. Similarly, we can delete the rows with label 1 by passing 1 as the argument to the drop() function.
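The same sketch with label 1, again assuming a combined DataFrame with the row labels 0, 1, 0, 1:

```python
import pandas as pd

# Assumed combined DataFrame whose row labels are 0, 1, 0, 1
d1 = pd.DataFrame([[2, 4, 6], [3, 5, 7]], columns=['a', 'b', 'c'])
d2 = pd.DataFrame([[10, 20, 30], [40, 50, 60]], columns=['a', 'b', 'c'])
combined = pd.concat([d1, d2])

# Every row labelled 1 is removed, leaving the two rows labelled 0
result = combined.drop(1)
print(result)
```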
Output:
        a   b   c
    0   2   4   6
    0  10  20  30