Column and Row operations in Pandas

Let's learn some advanced column and row operations in Pandas for working with datasets.



In the previous blog we learned about creating Series, DataFrames and Panels with Pandas. In this blog we will look at some more advanced operations that Pandas supports on columns and rows. To do that, we first need to create a DataFrame.

Let's get started...

import pandas as pd
import numpy as np
d = {'one': pd.Series([2, 4, 6, 8], index=['a', 'b', 'c', 'd']),
     'two': pd.Series([1, 3, 5, 7, 9], index=['a', 'b', 'c', 'd', 'e'])}
df = pd.DataFrame(d)
print(df)
Creating DataFrame from Dict of Series
Output:
   one  two
a  2.0    1
b  4.0    3
c  6.0    5
d  8.0    7
e  NaN    9

Performing Operations on Columns

Column Selection

The line of code below selects a column from the DataFrame. The argument 'one' is the column label, so this returns the column 'one' as a Series: all of its values together with their row index.

print(df['one'])
Select Column 'one'
Output:
a    2.0
b    4.0
c    6.0
d    8.0
e    NaN
Name: one, dtype: float64

We can select the column 'two' in the same way:

print(df['two'])
Select Column 'two'
Output:
a    1
b    3
c    5
d    7
e    9
Name: two, dtype: int64

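In both cases the output is a Series: the row index on the left and the column values on the right. The last line shows the column label as 'Name' and the data type (dtype) of the values. Column 'one' is float64 rather than int64 because it contains a missing value (NaN) in row 'e'.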
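As a small aside, passing a list of labels instead of a single label returns a DataFrame rather than a Series. The sketch below rebuilds the same DataFrame so that it runs on its own; the variable names single and subset are chosen here purely for illustration.

```python
import pandas as pd

# Rebuild the DataFrame from the start of the article
d = {'one': pd.Series([2, 4, 6, 8], index=['a', 'b', 'c', 'd']),
     'two': pd.Series([1, 3, 5, 7, 9], index=['a', 'b', 'c', 'd', 'e'])}
df = pd.DataFrame(d)

# A single label returns a Series
single = df['one']
print(type(single).__name__)   # Series

# A list of labels returns a DataFrame, even if the list has one element
subset = df[['one']]
print(type(subset).__name__)   # DataFrame
```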

Column Insertion

The code below adds a new column 'three' to the existing DataFrame by assigning a Series to it.

#Adding a new column to the DataFrame by passing a Series
print("Adding a new column to the existing DataFrame")
df['three'] = pd.Series([12, 14, 16, 18, 20], index=['a', 'b', 'c', 'd', 'e'])
print(df)
Insert column 'three'
Output:
Adding a new column to the existing DataFrame
   one  two  three
a  2.0    1     12
b  4.0    3     14
c  6.0    5     16
d  8.0    7     18
e  NaN    9     20
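When a Series is assigned as a column, its values are matched to the rows by label, not by position. A minimal sketch of that behaviour, assuming a hypothetical column name 'partial' that is not part of the example above:

```python
import pandas as pd

df = pd.DataFrame({'two': [1, 3, 5, 7, 9]}, index=['a', 'b', 'c', 'd', 'e'])

# The Series covers only the labels 'c' and 'a', in that order.
# Values are placed by label; rows without a matching label get NaN.
df['partial'] = pd.Series([100, 300], index=['c', 'a'])
print(df)
```

Here row 'a' receives 300 and row 'c' receives 100, while rows 'b', 'd' and 'e' are left as NaN.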

The code below adds the columns 'one' and 'two', stores the result in a new column 'four', and then displays 'four'. Row 'e' is NaN in the result because 'one' has no value there.

#Performing addition on columns and storing the result in new column
print("Adding columns 'one' and 'two' and storing the result in 'four'")
df['four'] = df['one'] + df['two']
print(df['four'])
Performing addition on columns
Output:
Adding columns 'one' and 'two' and storing the result in 'four'
a     3.0
b     7.0
c    11.0
d    15.0
e     NaN
Name: four, dtype: float64
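Row 'e' comes out as NaN because column 'one' has no value at that label. If a missing value should be treated as 0 instead, one option is Series.add() with its fill_value parameter. This is a sketch of an alternative, not what the example above does; the names plain and filled are illustrative.

```python
import pandas as pd

one = pd.Series([2, 4, 6, 8], index=['a', 'b', 'c', 'd'])
two = pd.Series([1, 3, 5, 7, 9], index=['a', 'b', 'c', 'd', 'e'])

# The + operator yields NaN wherever either operand is missing
plain = one + two

# add() with fill_value=0 substitutes 0 for a value missing on one side
filled = one.add(two, fill_value=0)
print(filled)
```

With fill_value=0, row 'e' becomes 9.0 instead of NaN.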

Column Deletion

  • pop() function

We can use the pop() function to delete a specified column. The code below deletes the column 'two'.

#Deleting column 'two'
df.pop('two')
print(df)
Deleting column with pop() 
Output:
   one  three  four
a  2.0     12   3.0
b  4.0     14   7.0
c  6.0     16  11.0
d  8.0     18  15.0
e  NaN     20   NaN

We can see that the output no longer has the column 'two', because we popped it.

  • del keyword

Now, we will use the del keyword to delete a column from the DataFrame.

#Deleting column 'four'
del df['four']
print(df)
Deleting a column with del keyword
Output:
   one  three
a  2.0     12
b  4.0     14
c  6.0     16
d  8.0     18
e  NaN     20

We can see that the output no longer has the column 'four'.
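Both pop() and del change the DataFrame in place. Two related details are worth a quick sketch: pop() also returns the removed column, and drop(columns=...) is a way to get a copy without the column while leaving the original untouched. The small DataFrame and the names popped and trimmed below are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'one': [2.0, 4.0], 'two': [1, 3], 'three': [12, 14]},
                  index=['a', 'b'])

# pop() removes the column in place and returns it as a Series
popped = df.pop('two')
print(popped.tolist())          # [1, 3]

# drop(columns=...) returns a new DataFrame and leaves df untouched
trimmed = df.drop(columns=['three'])
print(list(df.columns))         # ['one', 'three']
print(list(trimmed.columns))    # ['one']
```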

Performing Operations on Rows

Row Selection by Label

We can select a row by its label by passing the row label to loc[ ].

#Printing the row with 'b'
print(df, "\n")
print(df.loc['b'])
Selecting rows with row label
Output:
   one  three
a  2.0     12
b  4.0     14
c  6.0     16
d  8.0     18
e  NaN     20 

one       4.0
three    14.0
Name: b, dtype: float64

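We can see that only the values in row 'b' are returned, one from each of the columns 'one' and 'three'. The result is a Series, and because it mixes a float column with an integer column, both values are shown as float64.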

Row Selection by Integer Location

We can also select a row by passing its integer position to iloc[ ].

#Printing the row at 2
print(df, "\n")
print(df.iloc[2])
Selecting rows with integer location
Output:
   one  three
a  2.0     12
b  4.0     14
c  6.0     16
d  8.0     18
e  NaN     20 

one       6.0
three    16.0
Name: c, dtype: float64

In the above code, the row at integer position 2 is returned with its values from the columns 'one' and 'three'. Positions start at 0, so position 2 is the third row, labelled 'c'.
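Both loc[ ] and iloc[ ] also accept slices, with one difference that is easy to miss: a label slice includes its end label, while a position slice excludes its end position. A short sketch on the same data (the names by_label and by_position are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'one': [2.0, 4.0, 6.0, 8.0, None],
                   'three': [12, 14, 16, 18, 20]},
                  index=['a', 'b', 'c', 'd', 'e'])

# Label slice: both endpoints are included
by_label = df.loc['b':'d']
print(list(by_label.index))      # ['b', 'c', 'd']

# Position slice: the end position is excluded, as with Python lists
by_position = df.iloc[1:3]
print(list(by_position.index))   # ['b', 'c']
```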

Row Insertion

We can use the pd.concat() function to add the rows of one DataFrame to another. (Older tutorials use DataFrame.append() for this, but that method was deprecated in pandas 1.4 and removed in pandas 2.0.) The code below appends the rows of the DataFrame d2 to the DataFrame d1.

import pandas as pd
#Creating two DataFrames
d1 = pd.DataFrame([[2, 4, 6], [3, 5, 7]], columns=['a', 'b', 'c'])
d2 = pd.DataFrame([[10, 20, 30], [40, 50, 60]], columns=['a', 'b', 'c'])
#Inserting the rows of d2 after the rows of d1
d1 = pd.concat([d1, d2])
print(d1)
Inserting rows with concat()
Output:
    a   b   c
0   2   4   6
1   3   5   7
0  10  20  30
1  40  50  60
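Notice that the output repeats the row labels 0 and 1, because each DataFrame keeps its own index. If unique labels are wanted, one option is pd.concat() with ignore_index=True. A minimal sketch, using the illustrative name combined for the result:

```python
import pandas as pd

d1 = pd.DataFrame([[2, 4, 6], [3, 5, 7]], columns=['a', 'b', 'c'])
d2 = pd.DataFrame([[10, 20, 30], [40, 50, 60]], columns=['a', 'b', 'c'])

# ignore_index=True discards the original labels and renumbers the rows
combined = pd.concat([d1, d2], ignore_index=True)
print(combined)
```

The rows of combined are labelled 0 to 3, so each label identifies exactly one row.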

Row Deletion

We can use the drop() function to delete rows by label. drop() returns a new DataFrame and leaves the original unchanged, so we print the result directly here.

#Deleting the rows with label 0
print(d1.drop(0))
Deleting rows with drop()
Output:
    a   b   c
1   3   5   7
1  40  50  60

The above code deletes all the rows that have the label 0. Because d1 has two rows with that label, both are removed. Similarly, we can delete the rows with label 1 by passing 1 as the argument to the drop() function.

#Deleting the rows with label 1
print(d1.drop(1))
Deleting rows with drop()
Output:
    a   b   c
0   2   4   6
0  10  20  30
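Since drop() removes every row that carries the given label, deleting a single row from a DataFrame with duplicate labels needs unique labels first. The sketch below is one possible approach: it renumbers the rows with reset_index(drop=True) and then passes a list of labels to drop(). The names unique and remaining are illustrative.

```python
import pandas as pd

# Same four rows as above, with the duplicate labels 0, 1, 0, 1
d1 = pd.DataFrame([[2, 4, 6], [3, 5, 7], [10, 20, 30], [40, 50, 60]],
                  columns=['a', 'b', 'c'], index=[0, 1, 0, 1])

# Renumber the rows 0..3 so that every label is unique
unique = d1.reset_index(drop=True)

# drop() also accepts a list of labels; here the first and last rows go
remaining = unique.drop([0, 3])
print(remaining)
```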