Merging datasets is a common task, for example combining index data with stock data, so it is important to understand how merging works. Here, the pandas.merge() function is discussed:
import pandas as pd
import scipy as sp

# Two small DataFrames that share a column named 'key'
x = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                  'A': ['A0', 'A1', 'A2', 'A3'],
                  'B': ['B0', 'B1', 'B2', 'B3']})
y = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K6'],
                  'C': ['C0', 'C1', 'C2', 'C3'],
                  'D': ['D0', 'D1', 'D2', 'D3']})
The sizes of both x and y are 4 by 3, that is, four rows and three columns; see the following code:
print(x.shape)
print(x)
print(y.shape)
print(y)
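Before merging, it can be useful to check which values of key the two frames share, because pandas.merge() performs an inner join by default and keeps only the matching rows. The following is a minimal sketch rather than part of the original example: it rebuilds x and y so that it runs on its own, and the names shared and merged are introduced here only for illustration.

```python
import pandas as pd

# Same two frames as above, repeated so the sketch is self-contained
x = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                  'A': ['A0', 'A1', 'A2', 'A3'],
                  'B': ['B0', 'B1', 'B2', 'B3']})
y = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K6'],
                  'C': ['C0', 'C1', 'C2', 'C3'],
                  'D': ['D0', 'D1', 'D2', 'D3']})

# Key values that appear in both frames (illustrative helper name)
shared = sorted(set(x['key']) & set(y['key']))
print(shared)        # ['K0', 'K1', 'K2']

# Default merge: inner join on the shared column
merged = pd.merge(x, y, on='key')
print(merged.shape)  # (3, 5)
print(merged)
```

Rows with K3 (only in x) and K6 (only in y) have no partner and are dropped by the inner join. Passing how='outer' would keep them, with NaN in the columns that have no matching data.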
Assume that we intend to merge them based on the variable called key, a common variable shared by both datasets. Since the common values of this variable are K0, K1, and K2, the final result should have three rows and five...