pandas Slicing strings


Example

Strings in a Series can be sliced with the .str.slice() method or, more conveniently, with brackets (.str[]). The examples below assume pandas has been imported as pd.

In [1]: ser = pd.Series(['Lorem ipsum', 'dolor sit amet', 'consectetur adipiscing elit'])
In [2]: ser
Out[2]: 
0                    Lorem ipsum
1                 dolor sit amet
2    consectetur adipiscing elit
dtype: object

Get the first character of each string:

In [3]: ser.str[0]
Out[3]: 
0    L
1    d
2    c
dtype: object

Get the first three characters of each string:

In [4]: ser.str[:3]
Out[4]: 
0    Lor
1    dol
2    con
dtype: object

Get the last character of each string:

In [5]: ser.str[-1]
Out[5]:
0    m
1    t
2    t
dtype: object

Get the last three characters of each string:

In [6]: ser.str[-3:]
Out[6]: 
0    sum
1    met
2    lit
dtype: object

Get every other character of the first 10 characters:

In [7]: ser.str[:10:2]
Out[7]: 
0    Lrmis
1    dlrst
2    cnett
dtype: object
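
The bracket slices above can also be written with the .str.slice() method. The following is a minimal sketch, assuming the same example Series; the variable names bracket_result and method_result are illustrative only. It spells out the start, stop and step of the last example as keyword arguments and checks that both forms agree.

```python
import pandas as pd

ser = pd.Series(['Lorem ipsum', 'dolor sit amet', 'consectetur adipiscing elit'])

# Bracket form: every other character of the first 10 characters
bracket_result = ser.str[:10:2]

# Method form: the same slice with explicit start, stop and step
method_result = ser.str.slice(start=0, stop=10, step=2)

# Compare the two results element by element
print(bracket_result.equals(method_result))
```

The method form can be handy when the slice bounds are held in variables or passed around as arguments, while the bracket form reads more like ordinary Python slicing.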

Pandas handles slices the same way Python does, but it treats out-of-range indices differently. In Python, an index outside the range raises an error:

In [8]: 'Lorem ipsum'[12]
# IndexError: string index out of range

However, if a slice is outside the range, an empty string is returned:

In [9]: 'Lorem ipsum'[12:15]
Out[9]: ''

Pandas, by contrast, returns NaN when an index is out of range:

In [10]: ser.str[12]
Out[10]:
0    NaN
1      e
2      a
dtype: object

Like Python, pandas returns an empty string when a slice is out of range:

In [11]: ser.str[12:15]
Out[11]:
0       
1     et
2    adi
dtype: object
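
Because an out-of-range index yields NaN rather than an error, later code may need to detect or replace those missing values. The sketch below shows one possible way to do that with isna() and fillna(). Replacing NaN with an empty string is an assumption made for illustration, not a recommendation from the text above, and the variable names are hypothetical.

```python
import pandas as pd

ser = pd.Series(['Lorem ipsum', 'dolor sit amet', 'consectetur adipiscing elit'])

# Character at position 12; missing where the string is too short
chars = ser.str[12]

# Boolean mask marking the rows where index 12 was out of range
out_of_range = chars.isna()

# Replace the missing values with an empty string (one possible convention)
filled = chars.fillna('')

print(out_of_range.tolist())
print(filled.tolist())
```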