Tutorial by Examples

To set column B to 0 wherever column A holds a duplicated value, first create a mask with Series.duplicated, then apply it with DataFrame.loc (DataFrame.ix in older pandas; .ix was removed in pandas 1.0) or Series.mask:

In [224]: df = pd.DataFrame({'A':[1,2,3,3,2],
     ...:                    'B':[1,7,3,0,8]})

In [225]: mask = df.A.duplicated(keep=False)

In...
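As a minimal sketch of the masking approach described in this entry, the snippet below builds the same frame and zeroes column B for every row whose A value is duplicated. It uses DataFrame.loc rather than DataFrame.ix, since .ix is not available in current pandas; the names by_loc and by_mask are illustrative and not taken from the original example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 3, 2],
                   'B': [1, 7, 3, 0, 8]})

# keep=False marks every occurrence of a repeated value, not only the later ones
mask = df['A'].duplicated(keep=False)

# Option 1: label-based assignment (done on a copy so df stays unchanged)
by_loc = df.copy()
by_loc.loc[mask, 'B'] = 0

# Option 2: Series.mask replaces values where the condition is True
by_mask = df.copy()
by_mask['B'] = by_mask['B'].mask(mask, 0)

print(by_loc)
print(by_mask)
```

Both options should leave B untouched only in the first row, because 1 is the only value in A that occurs once.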
Use drop_duplicates:

In [216]: df = pd.DataFrame({'A':[1,2,3,3,2],
     ...:                    'B':[1,7,3,0,8]})

In [217]: df
Out[217]:
   A  B
0  1  1
1  2  7
2  3  3
3  3  0
4  2  8

# keep only the last value
In [218]: df.drop_duplicates(subset=['A'], keep='last')
Out[218]:
...
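The following sketch compares the three values that the keep parameter of drop_duplicates accepts, using the same frame as this entry. The variable names last, first and none are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 3, 2],
                   'B': [1, 7, 3, 0, 8]})

# keep='last': for each value of A, retain the final row in which it appears
last = df.drop_duplicates(subset=['A'], keep='last')

# keep='first' (the default): retain the first row for each value of A
first = df.drop_duplicates(subset=['A'], keep='first')

# keep=False: drop every row whose value of A occurs more than once
none = df.drop_duplicates(subset=['A'], keep=False)

print(last)
print(first)
print(none)
```

The original index labels are preserved in each result; calling reset_index(drop=True) afterwards gives a fresh 0..n-1 index if that is preferred.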
Number of unique elements in a series:

In [1]: id_numbers = pd.Series([111, 112, 112, 114, 115, 118, 114, 118, 112])

In [2]: id_numbers.nunique()
Out[2]: 5

Get unique elements in a series:

In [3]: id_numbers.unique()
Out[3]: array([111, 112, 114, 115, 118], dtype=int64)

In [4]: df = pd.Da...
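A short, self-contained sketch of the two Series calls shown in this entry. It also adds value_counts, which is not part of the original excerpt and is included here only as a related way to see how often each value occurs.

```python
import pandas as pd

id_numbers = pd.Series([111, 112, 112, 114, 115, 118, 114, 118, 112])

# Number of distinct values
n_unique = id_numbers.nunique()

# Distinct values as a NumPy array, in order of first appearance
unique_values = id_numbers.unique()

# Addition for illustration: occurrences per value, most frequent first
counts = id_numbers.value_counts()

print(n_unique)
print(unique_values)
print(counts)
```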
In [15]: df = pd.DataFrame({"A":[1,1,2,3,1,1],"B":[5,4,3,4,6,7]})

In [21]: df
Out[21]:
   A  B
0  1  5
1  1  4
2  2  3
3  3  4
4  1  6
5  1  7

To get the unique values in columns A and B:

In [22]: df["A"].unique()
Out[22]: array([1, 2, 3])

In [23]: df["...
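The sketch below reproduces the per-column lookup from this entry on the same frame. The last step, which pools both columns into one set of distinct values with pd.unique, is an assumption about what a reader might want next and is not taken from the truncated original.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 2, 3, 1, 1],
                   "B": [5, 4, 3, 4, 6, 7]})

# Distinct values per column, in order of first appearance
unique_a = df["A"].unique()
unique_b = df["B"].unique()

# Assumed extension: distinct values across both columns together.
# to_numpy().ravel() flattens the two columns into one 1-D array first.
unique_all = pd.unique(df[["A", "B"]].to_numpy().ravel())

print(unique_a)
print(unique_b)
print(unique_all)
```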
