# 阶段五:迭代生成
# 前言
在写乘法表的时候,你可能写过类似
```python
for i in [2, 3, 5, 7, 11, 13]: print(i)
```
这样的语句。for in 语句理解起来很直观形象,比起 C++ 和 java 早期的
```c
for (int i = 0; i < n; i ++)
```
这样的语句,不知道简洁清晰到哪里去了。
但是,你想过 Python 在处理 for in 语句的时候,具体发生了什么吗?什么样的对象可以被 for in 来枚举呢?
# 容器迭代
容器这个概念非常好理解。
在 Python 中一切皆对象,对象的抽象就是类,而对象的集合就是容器。
列表(list: [0, 1, 2]),元组(tuple: (0, 1, 2)),字典(dict: {0:0, 1:1, 2:2}),集合(set: set([0, 1, 2]))都是容器。
对于容器,你可以很直观地想象成多个元素在一起的单元;而不同容器的区别,正是在于内部数据结构的实现方法。
然后,你就可以针对不同场景,选择不同时间和空间复杂度的容器。所有的容器都是可迭代的(iterable)。
```python
iterator = iter(iterable)
while True:
elem = next(iterator)
# do something
```
- 首先,在可迭代对象上调用内置 `iter` 函数以创建对应的迭代器。
- 要获取序列中的下一个元素,在此迭代器上调用内置 `next` 函数。
如果没有下一个元素了,怎么办?
我们需要对这种异常进行处理。
思考题:什么是异常处理,为什么要进行异常处理?有什么好处?
多次调用 `iter` 可迭代对象每次都会返回一个具有不同状态的新迭代器
你也可以调用 `iter` 迭代器本身,它只会返回相同的迭代器而不改变它的状态。但是,请注意,您不能直接在可迭代对象上调用 next。
```python
>>> lst = [1, 2, 3, 4]
>>> next(lst) # Calling next on an iterable
TypeError: 'list' object is not an iterator
>>> list_iter = iter(lst) # Creates an iterator for the list
>>> list_iter
>>> next(list_iter) # Calling next on an iterator
1
>>> next(list_iter) # Calling next on the same iterator
2
>>> next(iter(list_iter)) # Calling iter on an iterator returns itself
3
>>> list_iter2 = iter(lst)
>>> next(list_iter2) # Second iterator has new state
1
>>> next(list_iter) # First iterator is unaffected by second iterator
4
>>> next(list_iter) # No elements left!
StopIteration
>>> lst # Original iterable is unaffected
[1, 2, 3, 4]
```
# 英语练习,对迭代器的类比
Analogy: An iterable is like a book (one can flip through the pages) and an iterator for a book would be a bookmark (saves the position and can locate the next page). Calling `iter` on a book gives you a new bookmark independent of other bookmarks, but calling `iter` on a bookmark gives you the bookmark itself, without changing its position at all. Calling `next` on the bookmark moves it to the next page, but does not change the pages in the book. Calling `next` on the book wouldn't make sense semantically. We can also have multiple bookmarks, all independent of each other.
# 生成器:懒人迭代器!
```python
def test_iterator():
show_memory_info('initing iterator')
list_1 = [i for i in range(100000000)]
show_memory_info('after iterator initiated')
print(sum(list_1))
show_memory_info('after sum called')
def test_generator():
show_memory_info('initing generator')
list_2 = (i for i in range(100000000))
show_memory_info('after generator initiated')
print(sum(list_2))
show_memory_info('after sum called')
%time test_iterator()
%time test_generator()
########## 输出 ##########
initing iterator memory used: 48.9765625 MB
after iterator initiated memory used: 3920.30078125 MB
4999999950000000
after sum called memory used: 3920.3046875 MB
Wall time: 17 s
initing generator memory used: 50.359375 MB
after generator initiated memory used: 50.359375 MB
4999999950000000
after sum called memory used: 50.109375 MB
Wall time: 12.5 s
```
声明一个迭代器很简单,[i for i in range(100000000)]就可以生成一个包含一亿元素的列表。每个元素在生成后都会保存到内存中,你通过代码可以看到,它们占用了巨量的内存,内存不够的话就会出现 OOM 错误。
了解下 yield()函数吧,他可以返回一个生成器对象,试试看懂这个
```python
>>> def gen_list(lst):
... yield from lst
...
>>> g = gen_list([1, 2, 3, 4])
>>> next(g)
1
>>> next(g)
2
>>> next(g)
3
>>> next(g)
4
>>> next(g)
StopIteration
```
# 思考题:python 会显示什么?为什么?
```python
>>> s = [1, 2, 3, 4]
>>> t = iter(s)
>>> next(s)
______
>>> next(t)
______
>>> next(t)
______
>>> iter(s)
______
>>> next(iter(s))
______
>>> next(iter(t))
______
>>> next(iter(s))
______
>>> next(iter(t))
______
>>> next(t)
______
```
```python
>>> r = range(6)
>>> r_iter = iter(r)
>>> next(r_iter)
______
>>> [x + 1 for x in r]
______
>>> [x + 1 for x in r_iter]
______
>>> next(r_iter)
______
>>> list(range(-2, 4)) # Converts an iterable into a list
______
```
# 任务
P10:实现 `count`,它接受一个迭代器 `t` 并返回该值 `x` 出现在 的第一个 n 个元素中的次数 `t`
```python
def count(t, n, x):
"""Return the number of times that x appears in the first n elements of iterator t.
>>> s = iter([10, 9, 10, 9, 9, 10, 8, 8, 8, 7])
>>> count(s, 10, 9)
3
>>> s2 = iter([10, 9, 10, 9, 9, 10, 8, 8, 8, 7])
>>> count(s2, 3, 10)
2
>>> s = iter([3, 2, 2, 2, 1, 2, 1, 4, 4, 5, 5, 5])
>>> count(s, 1, 3)
1
>>> count(s, 4, 2)
3
>>> next(s)
2
>>> s2 = iter([4, 1, 6, 6, 7, 7, 8, 8, 2, 2, 2, 5])
>>> count(s2, 6, 6)
2
"""
"*** YOUR CODE HERE ***"
```
P11:实现生成器函数 `scale(it, multiplier)`,它产生给定迭代的元素 `it`,按 `multiplier`.
同时也希望你尝试使用 `yield from` 语句编写这个函数!
```python
def scale(it, multiplier):
"""Yield elements of the iterable it scaled by a number multiplier.
>>> m = scale(iter([1, 5, 2]), 5)
>>> type(m)
>>> list(m)
[5, 25, 10]
>>> # generators allow us to represent infinite sequences!!!
>>> def naturals():
... i = 0
... while True:
... yield i
... i += 1
>>> m = scale(naturals(), 2)
>>> [next(m) for _ in range(5)]
[0, 2, 4, 6, 8]
"""
"*** YOUR CODE HERE ***"
```