Generator expressions are very similar to list comprehensions. The main difference is that a generator expression does not create the full set of results at once; it creates a generator object, which can then be iterated over.
For instance, see the difference in the following code:
    # list comprehension
    [x**2 for x in range(10)]
    # Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

    # generator comprehension
    (x**2 for x in xrange(10))
    # Output: <generator object <genexpr> at 0x11b4b7c80>
These are two very different objects: the list comprehension returns a list object, whereas the generator comprehension returns a generator object. Generator objects cannot be indexed; they rely on the next function to return items in order.
Note: We use xrange because it also produces its values lazily. If we used range in Python 2, a full list would be created. Also, xrange exists only in Python 2. In Python 3, range returns a lazy range object rather than a list (strictly speaking a sequence, not a generator). For more information, see the Differences between range and xrange functions example.
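The difference between a lazy range object and a generator can be checked directly. The following is a minimal sketch assuming Python 3; the variable names are illustrative only.

```python
# Python 3: range is a lazy sequence, a generator expression is an iterator.
r = range(10)
g = (x**2 for x in range(10))

range_len = len(r)   # a range knows its length
range_item = r[3]    # a range supports indexing

try:
    g[3]             # a generator does not support indexing
    indexing_failed = False
except TypeError:
    indexing_failed = True

print(range_len, range_item, indexing_failed)  # 10 3 True
```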
    g = (x**2 for x in xrange(10))
    print(g[0])

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'generator' object has no attribute '__getitem__'
    g.next()  # 0
    g.next()  # 1
    g.next()  # 4
    ...
    g.next()  # 81
    g.next()  # Throws StopIteration Exception

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
NOTE: In Python 3, g.next() should be replaced by next(g) and xrange() by range(), since Iterator.next() and xrange() do not exist in Python 3.
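For readers on Python 3, the same session might look like the sketch below. It uses the built-in next() and range(); the variable names are chosen only for illustration.

```python
# Python 3 version: next(g) instead of g.next(), range instead of xrange.
g = (x**2 for x in range(10))

first = next(g)    # 0
second = next(g)   # 1
rest = list(g)     # consumes the remaining values: 4, 9, ..., 81

try:
    next(g)        # the generator is now exhausted
    exhausted = False
except StopIteration:
    exhausted = True

# Passing a default to next() avoids the exception on an exhausted generator.
default = next(g, None)

print(first, second, rest, exhausted, default)
```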
Both of these can nevertheless be iterated over in a similar way:
    for i in [x**2 for x in range(10)]:
        print(i)

    """
    Out:
    0
    1
    4
    ...
    81
    """

    for i in (x**2 for x in xrange(10)):
        print(i)

    """
    Out:
    0
    1
    4
    ...
    81
    """
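One practical difference when iterating is that a list can be traversed repeatedly, while a generator is consumed by its first pass. A small sketch, assuming Python 3 and illustrative variable names:

```python
squares_list = [x**2 for x in range(5)]
squares_gen = (x**2 for x in range(5))

# A list can be iterated any number of times.
list_pass_1 = [i for i in squares_list]
list_pass_2 = [i for i in squares_list]

# A generator yields its values once; a second pass finds nothing left.
gen_pass_1 = [i for i in squares_gen]
gen_pass_2 = [i for i in squares_gen]

print(list_pass_2)  # [0, 1, 4, 9, 16]
print(gen_pass_2)   # []
```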
Generator expressions are lazily evaluated, which means that they generate and return each value only when the generator is iterated. This is often useful when iterating through large datasets, avoiding the need to create a duplicate of the dataset in memory:
    for square in (x**2 for x in range(1000000)):
        pass  # do something with square
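The memory saving can be made visible with sys.getsizeof. This is only a rough sketch: getsizeof reports the size of the container object itself (not the integers it refers to), and the exact numbers vary by Python version and platform.

```python
import sys

n = 1000000
as_list = [x**2 for x in range(n)]   # all values are built up front
as_gen = (x**2 for x in range(n))    # nothing is computed yet

list_size = sys.getsizeof(as_list)   # grows with n
gen_size = sys.getsizeof(as_gen)     # small and independent of n

# The generator still produces every value when it is consumed.
total = sum(as_gen)

print(list_size > gen_size)  # True
```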
Another common use case is to avoid iterating over an entire iterable when doing so is not necessary. In this example, an item is retrieved from a remote API with each iteration of get_objects(). Thousands of objects may exist, each must be retrieved one by one, and we only need to know whether an object matching a pattern exists. By using a generator expression, we can stop retrieving objects as soon as we encounter one that matches the pattern.
    def get_objects():
        """Gets objects from an API one by one"""
        while True:
            yield get_next_item()  # get_next_item() is a placeholder for the API call

    def object_matches_pattern(obj):
        # perform potentially complex calculation
        return matches_pattern

    def right_item_exists():
        # Lazily evaluated: each object is fetched and checked only on demand
        items = (object_matches_pattern(each) for each in get_objects())
        for item in items:
            if item:
                # Stop at the first match; no further objects are retrieved
                return True
        return False
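A self-contained variant of this idea is sketched below. The API is replaced by a hypothetical in-memory stand-in, the pattern check is an arbitrary placeholder, and the built-in any() is used instead of an explicit loop; the fetched list exists only to show how many items were actually retrieved.

```python
fetched = []  # records which items were actually retrieved

def get_objects():
    """Hypothetical stand-in for an API that yields items one by one."""
    for item in range(1, 10001):
        fetched.append(item)
        yield item

def object_matches_pattern(obj):
    """Hypothetical pattern check: matches multiples of 42."""
    return obj % 42 == 0

def right_item_exists():
    # any() stops consuming the generator expression at the first True
    return any(object_matches_pattern(each) for each in get_objects())

found = right_item_exists()
print(found, len(fetched))  # True 42
```

Only 42 of the 10000 available items are retrieved, because the generator is not advanced after the first match.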