Iterators are a means of navigating and operating on a sequence of elements and are a generalized extension of pointers. Conceptually it is important to remember that iterators are positions, not elements. For example, take the following sequence:
A B C
The sequence contains three elements and four positions
+---+---+---+---+
| A | B | C | |
+---+---+---+---+
Elements are things within a sequence. Positions are places where meaningful operations can happen to the sequence. For example, one inserts into a position, before or after element A, not into an element. Even deletion of an element (erase(A)
) is done by first finding its position, then deleting it.
To convert from a position to a value, an iterator is dereferenced:
auto my_iterator = my_vector.begin(); // position
auto my_value = *my_iterator; // value
One can think of an iterator as dereferencing to the value it refers to in the sequence. This is especially useful in understanding why you should never dereference the end()
iterator in a sequence:
+---+---+---+---+
| A | B | C | |
+---+---+---+---+
↑ ↑
| +-- An iterator here has no value. Do not dereference it!
+-------------- An iterator here dereferences to the value A.
In all the sequences and containers found in the C++ standard library, begin()
will return an iterator to the first position, and end()
will return an iterator to one past the last position (not the last position!). Consequently, the names of these iterators in algorithms are oftentimes labelled first
and last
:
+---+---+---+---+
| A | B | C | |
+---+---+---+---+
↑ ↑
| |
+- first +- last
It is also possible to obtain an iterator to any sequence, because even an empty sequence contains at least one position:
+---+
| |
+---+
In an empty sequence, begin()
and end()
will be the same position, and neither can be dereferenced:
+---+
| |
+---+
↑
|
+- empty_sequence.begin()
|
+- empty_sequence.end()
The alternative visualization of iterators is that they mark the positions between elements:
+---+---+---+
| A | B | C |
+---+---+---+
↑ ^ ^ ↑
| |
+- first +- last
and dereferencing an iterator returns a reference to the element coming after the iterator. Some situations where this view is particularly useful are:
insert
operations will insert elements into the position indicated by the iterator,erase
operations will return an iterator corresponding to the same position as the one passed in,An iterator becomes invalidated if (say, in the course of an operation) its position is no longer a part of a sequence. An invalidated iterator cannot be dereferenced until it has been reassigned to a valid position. For example:
std::vector<int>::iterator first;
{
std::vector<int> foo;
first = foo.begin(); // first is now valid
} // foo falls out of scope and is destroyed
// At this point first is now invalid
The many algorithms and sequence member functions in the C++ standard library have rules governing when iterators are invalidated. Each algorithm is different in the way they treat (and invalidate) iterators.
As we know, iterators are for navigating sequences. In order to do that an iterator must migrate its position throughout the sequence. Iterators can advance forward in the sequence and some can advance backwards:
auto first = my_vector.begin();
++first; // advance the iterator 1 position
std::advance(first, 1); // advance the iterator 1 position
first = std::next(first); // returns iterator to the next element
std::advance(first, -1); // advance the iterator 1 position backwards
first = std::next(first, 20); // returns iterator to the element 20 position forward
first = std::prev(first, 5); // returns iterator to the element 5 position backward
auto dist = std::distance(my_vector.begin(), first); // returns distance between two iterators.
Note, second argument of std::distance should be reachable from the first one(or, in other words first
should be less or equal than second
).
Even though you can perform arithmetic operators with iterators, not all operations are defined for all types of iterators. a = b + 3;
would work for Random Access Iterators, but wouldn't work for Forward or Bidirectional Iterators, which still can be advanced by 3 position with something like b = a; ++b; ++b; ++b;
. So it is recommended to use special functions in case you are not sure what is iterator type (for example, in a template function accepting iterator).
The C++ standard describes several different iterator concepts. These are grouped according to how they behave in the sequences they refer to. If you know the concept an iterator models (behaves like), you can be assured of the behavior of that iterator regardless of the sequence to which it belongs. They are often described in order from the most to least restrictive (because the next iterator concept is a step better than its predecessor):
Algorithms can vary depending on the concept modeled by the iterators they are given. For example, although random_shuffle
can be implemented for forward iterators, a more efficient variant that requires random access iterators could be provided.
Iterator traits provide uniform interface to the properties of iterators. They allow you to retrieve value, difference, pointer, reference types and also category of iterator:
template<class Iter>
Iter find(Iter first, Iter last, typename std::iterator_traits<Iter>::value_type val) {
while (first != last) {
if (*first == val)
return first;
++first;
}
return last;
}
Category of iterator can be used to specialize algorithms:
template<class BidirIt>
void test(BidirIt a, std::bidirectional_iterator_tag) {
std::cout << "Bidirectional iterator is used" << std::endl;
}
template<class ForwIt>
void test(ForwIt a, std::forward_iterator_tag) {
std::cout << "Forward iterator is used" << std::endl;
}
template<class Iter>
void test(Iter a) {
test(a, typename std::iterator_traits<Iter>::iterator_category());
}
Categories of iterators are basically iterators concepts, except Contiguous Iterators don't have their own tag, since it was found to break code.