241 lines
6.4 KiB
Markdown
241 lines
6.4 KiB
Markdown
# 阶段五:迭代生成
|
||
|
||
# 前言
|
||
|
||
在写乘法表的时候,你可能写过类似
|
||
|
||
```python
|
||
for i in [2, 3, 5, 7, 11, 13]: print(i)
|
||
```
|
||
|
||
这样的语句。for in 语句理解起来很直观形象,比起 C++ 和 java 早期的
|
||
|
||
```c
|
||
for (int i = 0; i < n; i ++)
|
||
```
|
||
|
||
这样的语句,不知道简洁清晰到哪里去了。
|
||
|
||
但是,你想过 Python 在处理 for in 语句的时候,具体发生了什么吗?什么样的对象可以被 for in 来枚举呢?
|
||
|
||
# 容器迭代
|
||
|
||
容器这个概念非常好理解。
|
||
|
||
在 Python 中一切皆对象,对象的抽象就是类,而对象的集合就是容器。
|
||
|
||
列表(list: [0, 1, 2]),元组(tuple: (0, 1, 2)),字典(dict: {0:0, 1:1, 2:2}),集合(set: set([0, 1, 2]))都是容器。
|
||
|
||
对于容器,你可以很直观地想象成多个元素在一起的单元;而不同容器的区别,正是在于内部数据结构的实现方法。
|
||
|
||
然后,你就可以针对不同场景,选择不同时间和空间复杂度的容器。所有的容器都是可迭代的(iterable)。
|
||
|
||
```python
|
||
iterator = iter(iterable)
|
||
while True:
|
||
elem = next(iterator)
|
||
# do something
|
||
```
|
||
|
||
- 首先,在可迭代对象上调用内置 `iter` 函数以创建对应的<em>迭代器</em>。
|
||
- 要获取序列中的下一个元素,在此迭代器上调用内置 `next` 函数。
|
||
|
||
如果没有下一个元素了,怎么办?
|
||
|
||
我们需要对这种异常进行处理。
|
||
|
||
思考题:什么是异常处理,为什么要进行异常处理?有什么好处?
|
||
|
||
多次调用 `iter` 可迭代对象每次都会返回一个具有不同状态的新迭代器
|
||
|
||
你也可以调用 `iter` 迭代器本身,它只会返回相同的迭代器而不改变它的状态。但是,请注意,您不能直接在可迭代对象上调用 next。
|
||
|
||
```python
|
||
>>> lst = [1, 2, 3, 4]
|
||
>>> next(lst) # Calling next on an iterable
|
||
TypeError: 'list' object is not an iterator
|
||
>>> list_iter = iter(lst) # Creates an iterator for the list
|
||
>>> list_iter
|
||
<list_iterator object ...>
|
||
>>> next(list_iter) # Calling next on an iterator
|
||
1
|
||
>>> next(list_iter) # Calling next on the same iterator
|
||
2
|
||
>>> next(iter(list_iter)) # Calling iter on an iterator returns itself
|
||
3
|
||
>>> list_iter2 = iter(lst)
|
||
>>> next(list_iter2) # Second iterator has new state
|
||
1
|
||
>>> next(list_iter) # First iterator is unaffected by second iterator
|
||
4
|
||
>>> next(list_iter) # No elements left!
|
||
StopIteration
|
||
>>> lst # Original iterable is unaffected
|
||
[1, 2, 3, 4]
|
||
```
|
||
|
||
# 英语练习,对迭代器的类比
|
||
|
||
<strong>Analogy</strong>: An iterable is like a book (one can flip through the pages) and an iterator for a book would be a bookmark (saves the position and can locate the next page). Calling `iter` on a book gives you a new bookmark independent of other bookmarks, but calling `iter` on a bookmark gives you the bookmark itself, without changing its position at all. Calling `next` on the bookmark moves it to the next page, but does not change the pages in the book. Calling `next` on the book wouldn't make sense semantically. We can also have multiple bookmarks, all independent of each other.
|
||
|
||
# 生成器:懒人迭代器!
|
||
|
||
```python
|
||
def test_iterator():
|
||
show_memory_info('initing iterator')
|
||
list_1 = [i for i in range(100000000)]
|
||
show_memory_info('after iterator initiated')
|
||
print(sum(list_1))
|
||
show_memory_info('after sum called')
|
||
|
||
def test_generator():
|
||
show_memory_info('initing generator')
|
||
list_2 = (i for i in range(100000000))
|
||
show_memory_info('after generator initiated')
|
||
print(sum(list_2))
|
||
show_memory_info('after sum called')
|
||
|
||
%time test_iterator()
|
||
%time test_generator()
|
||
|
||
########## 输出 ##########
|
||
|
||
initing iterator memory used: 48.9765625 MB
|
||
after iterator initiated memory used: 3920.30078125 MB
|
||
4999999950000000
|
||
after sum called memory used: 3920.3046875 MB
|
||
Wall time: 17 s
|
||
initing generator memory used: 50.359375 MB
|
||
after generator initiated memory used: 50.359375 MB
|
||
4999999950000000
|
||
after sum called memory used: 50.109375 MB
|
||
Wall time: 12.5 s
|
||
```
|
||
|
||
声明一个迭代器很简单,[i for i in range(100000000)]就可以生成一个包含一亿元素的列表。每个元素在生成后都会保存到内存中,你通过代码可以看到,它们占用了巨量的内存,内存不够的话就会出现 OOM 错误。
|
||
|
||
了解下 yield()函数吧,他可以返回一个生成器对象,试试看懂这个
|
||
|
||
```python
|
||
>>> def gen_list(lst):
|
||
... yield from lst
|
||
...
|
||
>>> g = gen_list([1, 2, 3, 4])
|
||
>>> next(g)
|
||
1
|
||
>>> next(g)
|
||
2
|
||
>>> next(g)
|
||
3
|
||
>>> next(g)
|
||
4
|
||
>>> next(g)
|
||
StopIteration
|
||
```
|
||
|
||
# 思考题:python 会显示什么?为什么?
|
||
|
||
```python
|
||
>>> s = [1, 2, 3, 4]
|
||
>>> t = iter(s)
|
||
>>> next(s)
|
||
______
|
||
|
||
>>> next(t)
|
||
______
|
||
|
||
>>> next(t)
|
||
______
|
||
|
||
>>> iter(s)
|
||
______
|
||
|
||
>>> next(iter(s))
|
||
______
|
||
|
||
>>> next(iter(t))
|
||
______
|
||
|
||
>>> next(iter(s))
|
||
______
|
||
|
||
>>> next(iter(t))
|
||
______
|
||
|
||
>>> next(t)
|
||
______
|
||
```
|
||
|
||
```python
|
||
>>> r = range(6)
|
||
>>> r_iter = iter(r)
|
||
>>> next(r_iter)
|
||
______
|
||
|
||
>>> [x + 1 for x in r]
|
||
______
|
||
|
||
>>> [x + 1 for x in r_iter]
|
||
______
|
||
|
||
>>> next(r_iter)
|
||
______
|
||
|
||
>>> list(range(-2, 4)) # Converts an iterable into a list
|
||
______
|
||
```
|
||
|
||
# 任务
|
||
|
||
P10:实现 `count`,它接受一个迭代器 `t` 并返回该值 `x` 出现在 的第一个 n 个元素中的次数 `t`
|
||
|
||
```python
|
||
def count(t, n, x):
|
||
"""Return the number of times that x appears in the first n elements of iterator t.
|
||
|
||
>>> s = iter([10, 9, 10, 9, 9, 10, 8, 8, 8, 7])
|
||
>>> count(s, 10, 9)
|
||
3
|
||
>>> s2 = iter([10, 9, 10, 9, 9, 10, 8, 8, 8, 7])
|
||
>>> count(s2, 3, 10)
|
||
2
|
||
>>> s = iter([3, 2, 2, 2, 1, 2, 1, 4, 4, 5, 5, 5])
|
||
>>> count(s, 1, 3)
|
||
1
|
||
>>> count(s, 4, 2)
|
||
3
|
||
>>> next(s)
|
||
2
|
||
>>> s2 = iter([4, 1, 6, 6, 7, 7, 8, 8, 2, 2, 2, 5])
|
||
>>> count(s2, 6, 6)
|
||
2
|
||
"""
|
||
"*** YOUR CODE HERE ***"
|
||
```
|
||
|
||
P11:实现生成器函数 `scale(it, multiplier)`,它产生给定迭代的元素 `it`,按 `multiplier`.
|
||
|
||
同时也希望你尝试使用 `yield from` 语句编写这个函数!
|
||
|
||
```python
|
||
def scale(it, multiplier):
|
||
"""Yield elements of the iterable it scaled by a number multiplier.
|
||
|
||
>>> m = scale(iter([1, 5, 2]), 5)
|
||
>>> type(m)
|
||
<class 'generator'>
|
||
>>> list(m)
|
||
[5, 25, 10]
|
||
>>> # generators allow us to represent infinite sequences!!!
|
||
>>> def naturals():
|
||
... i = 0
|
||
... while True:
|
||
... yield i
|
||
... i += 1
|
||
>>> m = scale(naturals(), 2)
|
||
>>> [next(m) for _ in range(5)]
|
||
[0, 2, 4, 6, 8]
|
||
"""
|
||
"*** YOUR CODE HERE ***"
|
||
```
|