In [1]: df1 = pd.DataFrame({'x': [1, 2, 3], 'y': ['a', 'b', 'c']})
In [2]: df2 = pd.DataFrame({'y': ['b', 'c', 'd'], 'z': [4, 5, 6]})
In [3]: df1
Out[3]:
x y
0 1 a
1 2 b
2 3 c
In [4]: df2
Out[4]:
y z
0 b 4
1 c 5
2 d 6
Uses the intersection of keys from two DataFrames.
In [5]: df1.merge(df2) # by default, it does an inner join on the common column(s)
Out[5]:
x y z
0 2 b 4
1 3 c 5
Alternatively specify intersection of keys from two Dataframes.
In [5]: merged_inner = pd.merge(left=df1, right=df2, left_on='y', right_on='y')
Out[5]:
x y z
0 2 b 4
1 3 c 5
Uses the union of the keys from two DataFrames.
In [6]: df1.merge(df2, how='outer')
Out[6]:
x y z
0 1.0 a NaN
1 2.0 b 4.0
2 3.0 c 5.0
3 NaN d 6.0
Uses only keys from left DataFrame.
In [7]: df1.merge(df2, how='left')
Out[7]:
x y z
0 1 a NaN
1 2 b 4.0
2 3 c 5.0
Uses only keys from right DataFrame.
In [8]: df1.merge(df2, how='right')
Out[8]:
x y z
0 2.0 b 4
1 3.0 c 5
2 NaN d 6