pandas Tutorial => Select duplicated

Example

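To set column B to 0 wherever column A contains duplicated values, first create a boolean mask with Series.duplicated, then assign through DataFrame.loc or build a new column with Series.mask. Passing keep=False marks every occurrence of a duplicated value, not only the repeats. (Older versions of this example used DataFrame.ix, which was deprecated and then removed in pandas 1.0.)

In [224]: df = pd.DataFrame({'A':[1,2,3,3,2],
     ...:                    'B':[1,7,3,0,8]})

In [225]: mask = df.A.duplicated(keep=False)

In [226]: mask
Out[226]: 
0    False
1     True
2     True
3     True
4     True
Name: A, dtype: bool

In [227]: df.loc[mask, 'B'] = 0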

In [228]: df['C'] = df.A.mask(mask, 0)

In [229]: df
Out[229]: 
   A  B  C
0  1  1  1
1  2  0  0
2  3  0  0
3  3  0  0
4  2  0  0
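
The session above can also be written as a standalone script. The following is a minimal sketch, assuming a current pandas installation; it relies on `DataFrame.loc` for the assignment. The `first_mask` and `last_mask` variables are illustrative additions that are not part of the original example; they only show how the `keep` argument changes which rows are flagged.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 3, 2],
                   'B': [1, 7, 3, 0, 8]})

# keep=False flags every member of a duplicated group
mask = df['A'].duplicated(keep=False)

# For comparison only (hypothetical extra variables):
# keep='first' (the default) leaves the first occurrence unflagged,
# keep='last' leaves the last occurrence unflagged
first_mask = df['A'].duplicated(keep='first')
last_mask = df['A'].duplicated(keep='last')

# Set B to 0 in the rows where A is duplicated
df.loc[mask, 'B'] = 0

# Build C from A, replacing the duplicated values with 0
df['C'] = df['A'].mask(mask, 0)

print(df)
```

With `keep=False` the value 1 in row 0 is the only one left unflagged, which is why only that row keeps its original B and C values.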

To invert the mask, use the ~ operator:

In [230]: df['C'] = df.A.mask(~mask, 0)

In [231]: df
Out[231]: 
   A  B  C
0  1  1  0
1  2  0  2
2  3  0  3
3  3  0  3
4  2  0  2
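
As a rough sketch of two related idioms, assuming the same sample data: `Series.where` is the complement of `Series.mask`, so `where(mask, 0)` gives the same values as `mask(~mask, 0)`, and plain boolean indexing with the mask selects the duplicated rows themselves. The names `c_mask`, `c_where` and `dupes` are hypothetical and used here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 3, 2],
                   'B': [1, 7, 3, 0, 8]})
mask = df['A'].duplicated(keep=False)

# mask(~mask, 0) replaces the values where A is NOT duplicated
c_mask = df['A'].mask(~mask, 0)

# where(mask, 0) keeps the values where mask is True and replaces the rest
c_where = df['A'].where(mask, 0)

# Boolean indexing returns only the rows whose A value is duplicated
dupes = df[mask]

print(c_mask.tolist(), c_where.tolist())
print(dupes)
```

Which form to use is mostly a matter of readability: `where` states the condition for keeping a value, while `mask` states the condition for replacing it.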
