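When learning Python, you will often hear about iterators and generators. So what is the difference between the two?
The first thing to be clear about is that a generator is a special kind of iterator, and an iterator exists to traverse data one item at a time.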
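The claim that a generator is a special kind of iterator can be checked against the abstract base classes in collections.abc. The following is a minimal sketch; count_up is a hypothetical generator function introduced only for this illustration.

```python
from collections.abc import Generator, Iterator


def count_up(limit):
    """Hypothetical generator function, used only for illustration."""
    for n in range(limit):
        yield n


gen = count_up(3)
list_iter = iter([1, 2, 3])  # a plain iterator that is not a generator

# Every generator is an iterator ...
gen_is_iterator = isinstance(gen, Iterator)
gen_is_generator = isinstance(gen, Generator)
# ... but not every iterator is a generator.
list_iter_is_generator = isinstance(list_iter, Generator)

print(gen_is_iterator, gen_is_generator, list_iter_is_generator)
# True True False
```

Because a generator is already an iterator, calling iter() on it returns the same object.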
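Differences between iterators and generators:
- An iterator is typically consumed by a for loop, which walks through the data the iterator holds.
- A generator is like squeezing a tube of toothpaste: each squeeze gives you one portion of data. In other words, each call to next() returns one value.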
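What an iterator must have:
- The __iter__ special method, which returns the iterator object (usually self).
- The __next__ special method, which returns the next item and raises StopIteration once the data is exhausted.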
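The two required methods can be seen in a small hand-written iterator. This is only a sketch; the Countdown class is a hypothetical example and is not part of the file-reading code later in this post.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Signal exhaustion with StopIteration; a for loop catches it silently.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


result = list(Countdown(3))
print(result)  # [3, 2, 1]
```

A for loop does the same thing behind the scenes: it calls iter() once, then calls next() repeatedly until StopIteration is raised.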
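What a generator must have:
- The yield keyword. Calling a function that contains yield returns a generator object, and Python provides __iter__ and __next__ on it automatically. Each call to next() runs the function up to the next yield and returns that value; the following call resumes right after that yield.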
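To make the "resume after yield" behavior concrete, here is a minimal sketch that records which parts of a generator body have run. The names squeeze_toothpaste and events are hypothetical and exist only for this illustration.

```python
events = []  # records which parts of the generator body have run


def squeeze_toothpaste():
    """Hypothetical generator used only to illustrate pausing at yield."""
    events.append("started")
    yield "first squeeze"
    events.append("resumed after first yield")
    yield "second squeeze"
    events.append("resumed after second yield")


tube = squeeze_toothpaste()      # creates the generator; the body has not run yet
events_before_first_next = list(events)

first = next(tube)               # runs up to the first yield
second = next(tube)              # resumes right after the first yield
leftover = next(tube, "empty")   # runs to the end; the default replaces StopIteration

print(events_before_first_next)  # []
print(first, "|", second, "|", leftover)
# first squeeze | second squeeze | empty
```

Note that nothing in the function body executes until the first next() call, which is what makes generators lazy.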
Let's use an example to explore what matters most when comparing iterators and generators: the iteration logic.
1.1 First, create a text file with 1,000,000 lines
    with open("text.txt", "w") as f:
        for i in range(1000000):
            f.write("我爱你中国" + "\n")
1.2 Read the file with an iterator and measure memory usage
Reading the whole file at once:
    import tracemalloc

    def process_line(line):
        pass

    tracemalloc.start()
    with open("text.txt", "r") as f:
        # readlines() loads every line into a list before the loop starts
        lines = f.readlines()
        for line in lines:
            process_line(line)
    current, peak = tracemalloc.get_traced_memory()
    print(f"current memory is {current / 1024**2} MB")
    print(f"peak memory is {peak / 1024**2} MB")
    tracemalloc.stop()
    # current memory is 89.88475894927979 MB
    # peak memory is 89.90067100524902 MB
Reading the file without loading it all at once:
    import tracemalloc

    def process_line(line):
        pass

    tracemalloc.start()

    class LineIter:
        def __init__(self):
            self.f = open("text.txt", "r")

        def __iter__(self):
            return self

        def __next__(self):
            # Read a single line per call; an empty string means end of file
            line = self.f.readline()
            if line:
                return line
            else:
                self.f.close()
                raise StopIteration

    lines = LineIter()
    for line in lines:
        process_line(line)
    current, peak = tracemalloc.get_traced_memory()
    print(f"current memory is {current / 1024**2} MB")
    print(f"peak memory is {peak / 1024**2} MB")
    tracemalloc.stop()
    # current memory is 0.0029458999633789062 MB
    # peak memory is 0.04485607147216797 MB
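1.3 Read the file with a generator and measure memory usage
Reading the whole file at once: peak memory is driven by holding every line of the file in memory at the same time.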
    import tracemalloc

    def process_line(line):
        pass

    tracemalloc.start()

    def LineIter():
        with open("text.txt", "r") as f:
            # readlines() still builds the full list, even inside a generator
            lines = f.readlines()
            for line in lines:
                yield line

    lines = LineIter()
    for line in lines:
        process_line(line)
    current, peak = tracemalloc.get_traced_memory()
    print(f"current memory is {current / 1024**2} MB")
    print(f"peak memory is {peak / 1024**2} MB")
    tracemalloc.stop()
    # current memory is 0.0005741119384765625 MB
    # peak memory is 89.90090751647949 MB
Reading the file without loading it all at once:
    import tracemalloc

    def process_line(line):
        pass

    tracemalloc.start()

    def LineIter():
        with open("text.txt", "r") as f:
            # Iterating over the file object yields one line at a time
            for line in f:
                yield line

    lines = LineIter()
    for line in lines:
        process_line(line)
    current, peak = tracemalloc.get_traced_memory()
    print(f"current memory is {current / 1024**2} MB")
    print(f"peak memory is {peak / 1024**2} MB")
    tracemalloc.stop()
    # current memory is 0.0005702972412109375 MB
    # peak memory is 0.03522205352783203 MB
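The four measurements above share the same tracemalloc boilerplate. As a rough sketch, that boilerplate can be wrapped in one helper so the eager and lazy reading styles are compared side by side. The names peak_memory_of, read_eagerly and read_lazily are hypothetical, and the sketch assumes a small temporary file written as UTF-8 rather than the text.txt file used above.

```python
import os
import tempfile
import tracemalloc


def peak_memory_of(consume):
    """Run consume() and return the peak traced memory in bytes."""
    tracemalloc.start()
    try:
        consume()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def read_eagerly(path):
    with open(path, "r", encoding="utf-8") as f:
        for line in f.readlines():  # the whole file becomes a list first
            pass


def read_lazily(path):
    with open(path, "r", encoding="utf-8") as f:
        for line in f:              # only one line is held at a time
            pass


# Assumption: a small temporary file stands in for the large text.txt.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    for _ in range(20000):
        tmp.write("我爱你中国\n")
    path = tmp.name

eager_peak = peak_memory_of(lambda: read_eagerly(path))
lazy_peak = peak_memory_of(lambda: read_lazily(path))
os.remove(path)

print(f"eager peak: {eager_peak / 1024**2:.3f} MB")
print(f"lazy peak:  {lazy_peak / 1024**2:.3f} MB")
```

The exact numbers depend on the Python version and platform, but the pattern matches the measurements above: peak memory follows how the data is read, not whether the reader is written as a class-based iterator or as a generator.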