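Introduction to Python Programming
1. Python Overview
1.1 What Is Python
Python is a high-level, general-purpose, interpreted programming language first released by Guido van Rossum in 1991. It is known for its concise syntax, rich feature set, and large ecosystem, and it emphasizes readable, simple code, which makes it well suited to rapid application development. Python supports multiple programming paradigms, including object-oriented, functional, and procedural programming.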
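1.2 Key Features of Python
Concise and readable: code blocks are defined by indentation, and the syntax reads close to natural language
Dynamically typed: variable types do not need to be declared explicitly
Cross-platform: runs on Windows, macOS, Linux, and other operating systems
Interpreted: no separate build step is needed; the interpreter (CPython) compiles source code to bytecode and executes it directly
Extensive libraries: a large standard library plus a vast collection of third-party packages
Extensible: extension modules can be written in C, C++, and other languages
Glue language: integrates easily with programs written in other languages
Multi-paradigm: supports object-oriented, functional, imperative, and other programming styles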
1.3 The Python Ecosystem
Python has a rich ecosystem that covers many domains, from web development to scientific computing.
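1.4 Python Version Differences
Python has two major version lines: Python 2 and Python 3. Python 2 reached end of life on January 1, 2020 and is no longer maintained, so Python 3 is recommended.
Notable features by Python 3 release:
Python 3.0: released in 2008, not backward compatible with Python 2
Python 3.5: introduced type hints (the typing module) and async/await syntax
Python 3.6: introduced f-strings and variable annotations
Python 3.7: type hint enhancements, the dataclass decorator
Python 3.8: the walrus operator (:=), positional-only parameters (/)
Python 3.9: dict merge operators (| and |=), type hint improvements
Python 3.10: structural pattern matching (match-case)
Python 3.11: exception groups, major performance improvements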
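A few of the features listed above can be seen together in a short sketch. It assumes Python 3.9 or newer (needed for the `|` dict operator), and every variable name is illustrative only.

```python
# Assumes Python 3.9+; all names here are illustrative.
name, score = "Alice", 91.5

# f-string (3.6+): expressions and format specs inside the literal
summary = f"{name} scored {score:.1f}"

# Walrus operator (3.8+): assign and test in one expression
values = [3, 8, 15]
if (count := len(values)) > 2:
    size_note = f"{count} values"

# Dict merge operator (3.9+): the right-hand operand wins on duplicate keys
defaults = {"debug": False, "level": 1}
overrides = {"level": 3}
config = defaults | overrides

print(summary)
print(size_note)
print(config)
```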
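2. Setting Up a Python Environment
2.1 Installing Python
Installing on Windows
Download the latest version from the official Python website
Run the installer and check "Add Python to PATH"
Choose "Install Now" or a custom installation
Verify the installation: open Command Prompt and run
python --version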
Installing on macOS
bash
# Install with Homebrew
brew install python
# Verify the installation
python3 --version
Installing on Linux
bash
# Ubuntu/Debian
sudo apt update
sudo apt install python3 python3-pip
# CentOS/RHEL
sudo dnf install python3 python3-pip
# Verify the installation
python3 --version
pip3 --version
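2.2 Configuring Virtual Environments
A virtual environment isolates the dependencies of one project from another:
bash
# Create a virtual environment with venv (use python3 if python is not on your PATH)
python -m venv myenv
# Activate the virtual environment
# Windows:
myenv\Scripts\activate.bat
# macOS/Linux:
source myenv/bin/activate
# Install packages
pip install requests numpy
# Export the installed dependencies
pip freeze > requirements.txt
# Leave the virtual environment
deactivate
# Install dependencies from requirements.txt
pip install -r requirements.txt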
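To confirm from inside Python that a virtual environment is active, one common check compares `sys.prefix` with `sys.base_prefix`. The sketch below assumes an environment created with `venv` as above; the helper name `in_virtual_env` is made up for illustration.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print(f"Interpreter: {sys.executable}")
print(f"Virtual environment active: {in_virtual_env()}")
```

Running this before and after `source myenv/bin/activate` should show the interpreter path change to one inside `myenv`.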
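2.3 Choosing Development Tools
Commonly used Python development tools:
PyCharm: a professional IDE developed by JetBrains
  Community edition (free): suitable for basic development
  Professional edition (paid): adds advanced features such as web development and data science support
Visual Studio Code: a lightweight editor developed by Microsoft
  Install the Python extension
  Supports code completion, debugging, linting, and more
Jupyter Notebook: an interactive notebook
  Well suited to data analysis and teaching
  Supports live code execution and visualization
Sublime Text: a lightweight text editor
  Install Python-related plugins to extend its functionality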
2.4 Your First Python Program
Create and run a Hello World program:
python
# hello_world.py
print("Hello, Python World!")
# Print the Python version
import sys
print(f"Python Version: {sys.version.split()[0]}")
# A simple calculation
a = 10
b = 20
print(f"{a} + {b} = {a + b}")
Run the program:
bash
python hello_world.py
3. Basic Python Syntax
3.1 Variables and Data Types
Python supports several built-in data types.
Examples of defining and initializing variables:
python
# Basic types
age = 25                 # int
height = 1.75            # float
name = "Alice"           # str
is_student = True        # bool
complex_num = 3 + 4j     # complex
nothing = None           # NoneType
# Container types
fruits = ["apple", "banana", "cherry"]  # list
coordinates = (10, 20)                  # tuple
person = {
    "name": "Bob",
    "age": 30,
    "hobbies": ["reading", "sports"]
}  # dict
unique_numbers = {1, 2, 3, 4, 5}        # set
# Type checks
print(type(age))                  # <class 'int'>
print(isinstance(height, float))  # True
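One practical difference between these types is mutability. The sketch below, using made-up variable names, shows that a list can be changed in place (so two names bound to it see the same change), while a tuple cannot.

```python
# Lists are mutable: two names bound to the same list see the same changes.
original = [1, 2, 3]
alias = original
alias.append(4)

# A shallow copy is a separate list object.
independent = original.copy()
independent.append(5)

# Tuples are immutable: "modifying" one builds a new object instead.
point = (10, 20)
moved = point + (30,)

try:
    point[0] = 99  # item assignment is not supported on tuples
except TypeError as exc:
    error_name = type(exc).__name__

print(original)     # [1, 2, 3, 4]
print(independent)  # [1, 2, 3, 4, 5]
print(moved)        # (10, 20, 30)
print(error_name)   # TypeError
```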
3.2 Operators and Expressions
Python supports several kinds of operators:
python
# Arithmetic operators
a = 10
b = 3
print(a + b)   # 13
print(a - b)   # 7
print(a * b)   # 30
print(a / b)   # 3.333...
print(a // b)  # 3 (floor division)
print(a % b)   # 1 (remainder)
print(a ** b)  # 1000 (exponentiation)
# Assignment operators
x = 5
x += 3  # x = x + 3 -> 8
x *= 2  # x = x * 2 -> 16
# Comparison operators
print(5 == 5)  # True
print(5 != 3)  # True
print(5 > 3)   # True
print(5 < 10)  # True
print(5 >= 5)  # True
print(5 <= 3)  # False
# Logical operators
print(True and False)  # False
print(True or False)   # True
print(not True)        # False
# Membership operators
fruits = ["apple", "banana"]
print("apple" in fruits)       # True
print("orange" not in fruits)  # True
# Identity operators
a = [1, 2, 3]
b = a
c = [1, 2, 3]
print(a is b)  # True (same object)
print(a is c)  # False (different objects)
print(a == c)  # True (equal values)
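A few operator behaviors are easy to miss when reading the list above: comparisons can be chained, `and`/`or` short-circuit, and `//` and `%` follow floor semantics for negative numbers. A minimal sketch with illustrative names:

```python
# Chained comparison: equivalent to (0 <= x) and (x < 10)
x = 7
in_range = 0 <= x < 10

# Short-circuit evaluation: the right operand is skipped when the left decides the result
items = []
first_is_positive = bool(items) and items[0] > 0  # no IndexError on the empty list

# `or` returns the first truthy operand
display_name = "" or "anonymous"

# // rounds toward negative infinity; % takes the sign of the divisor
floor_div = -7 // 2
remainder = -7 % 2

print(in_range, first_is_positive, display_name, floor_div, remainder)
```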
3.3 Control Flow Statements
Conditional Statements
python
# if-elif-else statement
score = 85
if score >= 90:
    print("Excellent")
elif score >= 80:
    print("Good")
elif score >= 60:
    print("Pass")
else:
    print("Fail")
# Conditional expression (ternary)
result = "Passed" if score >= 60 else "Not passed"
print(result)
# Combining conditions
age = 20
has_id = True
if age >= 18 and has_id:
    print("Entry allowed")
elif age >= 18 and not has_id:
    print("Please show your ID")
else:
    print("Entry not allowed")
Loop Statements
python
# for loop over a sequence
fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
    print(f"I like {fruit}s")
# Generate a sequence of numbers with range
for i in range(5):
    print(i)  # 0, 1, 2, 3, 4
for i in range(1, 10, 2):
    print(i)  # 1, 3, 5, 7, 9
# while loop
count = 0
while count < 5:
    print(f"Count: {count}")
    count += 1
# Loop control statements
for i in range(10):
    if i % 2 == 0:
        continue  # skip even numbers
    if i > 7:
        break     # leave the loop once i > 7
    print(i)      # 1, 3, 5, 7
# else clause on a loop
for i in range(5):
    if i == 10:
        break
else:
    print("Loop finished normally")  # runs because the loop did not end via break
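When a loop needs an index or has to walk two sequences in step, the built-ins `enumerate` and `zip` are usually clearer than indexing with `range(len(...))`. A small sketch; the `prices` list is invented for the example:

```python
fruits = ["apple", "banana", "cherry"]
prices = [1.2, 0.5, 3.0]  # illustrative data

# enumerate yields (index, item) pairs; start=1 gives human-friendly numbering
numbered = [f"{i}. {fruit}" for i, fruit in enumerate(fruits, start=1)]

# zip pairs up items from several iterables and stops at the shortest one
price_list = {fruit: price for fruit, price in zip(fruits, prices)}

for line in numbered:
    print(line)
print(price_list)
```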
3.4 Functions
A function is an organized, reusable block of code:
python
# Basic function definition
def greet(name):
    """Greet the given person."""
    return f"Hello, {name}!"
# Call the function
message = greet("Alice")
print(message)
# Function with a default argument
def calculate_area(length, width=None):
    """Compute an area; with no width, compute the area of a square."""
    if width is None:
        width = length  # no width given, so assume a square
    return length * width
print(calculate_area(5))      # area of a square: 25
print(calculate_area(5, 10))  # area of a rectangle: 50
# Function with variable positional arguments
def sum_numbers(*args):
    """Return the sum of any number of values."""
    total = 0
    for num in args:
        total += num
    return total
print(sum_numbers(1, 2, 3))         # 6
print(sum_numbers(10, 20, 30, 40))  # 100
# Function with keyword arguments
def print_info(**kwargs):
    """Print each keyword argument."""
    for key, value in kwargs.items():
        print(f"{key}: {value}")
print_info(name="Bob", age=30, city="New York")
# Functions as arguments
def apply_function(func, x, y):
    return func(x, y)
def add(a, b):
    return a + b
def multiply(a, b):
    return a * b
print(apply_function(add, 3, 5))       # 8
print(apply_function(multiply, 3, 5))  # 15
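The `width=None` pattern used in `calculate_area` matters most when the default would otherwise be a mutable object. Default values are evaluated once, when the function is defined, so a list default is shared between calls. The two helper functions below are hypothetical and exist only to contrast the behaviors:

```python
def append_shared(item, bucket=[]):
    """Pitfall: the default list is created once and shared across calls."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Safer: create a new list on each call when none is supplied."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_first = list(append_shared("a"))   # copy, to snapshot the state
shared_second = list(append_shared("b"))  # the earlier "a" is still there
fresh_first = append_fresh("a")
fresh_second = append_fresh("b")

print(shared_first, shared_second)  # ['a'] ['a', 'b']
print(fresh_first, fresh_second)    # ['a'] ['b']
```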
4. Advanced Python Features
4.1 List Comprehensions
List comprehensions offer a concise way to build lists:
python
# Basic list comprehension
squares = [x ** 2 for x in range(10)]
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
# List comprehension with a condition
even_squares = [x ** 2 for x in range(10) if x % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]
# Nested list comprehension
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
flattened = [num for row in matrix for num in row]
print(flattened)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
# Set comprehension
unique_squares = {x ** 2 for x in [-3, -2, -1, 0, 1, 2, 3]}
print(unique_squares)  # {0, 1, 4, 9}
# Dict comprehension
square_map = {x: x ** 2 for x in range(10)}
print(square_map)  # {0: 0, 1: 1, ..., 9: 81}
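It can help to read a comprehension as a compact form of an explicit loop. The sketch below, with an invented `words` list, builds the same result both ways and also shows where a conditional expression goes when every item should be kept.

```python
words = ["apple", "kiwi", "banana", "fig"]  # illustrative data

# Explicit loop
long_words_loop = []
for word in words:
    if len(word) > 4:
        long_words_loop.append(word.upper())

# Equivalent comprehension: output expression, then for, then the filter
long_words_comp = [word.upper() for word in words if len(word) > 4]

# A conditional expression in the output position keeps every item
labels = ["long" if len(word) > 4 else "short" for word in words]

print(long_words_comp)  # ['APPLE', 'BANANA']
print(labels)           # ['long', 'short', 'long', 'short']
```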
4.2 Object-Oriented Programming
Python supports the object-oriented programming paradigm:
python
# Class definition
class Person:
    """A person."""
    # Class attribute
    species = "Homo sapiens"
    # Constructor
    def __init__(self, name, age):
        self.name = name  # instance attribute
        self.age = age
    # Instance method
    def greet(self):
        """Return a greeting."""
        return f"Hello, my name is {self.name} and I'm {self.age} years old."
    # Class method
    @classmethod
    def get_species(cls):
        """Return the species."""
        return cls.species
    # Static method
    @staticmethod
    def is_adult(age):
        """Return True if the age counts as adult."""
        return age >= 18
# Create objects
person1 = Person("Alice", 30)
person2 = Person("Bob", 25)
# Call an instance method
print(person1.greet())  # Hello, my name is Alice and I'm 30 years old.
# Call a class method
print(Person.get_species())  # Homo sapiens
# Call a static method
print(Person.is_adult(20))  # True
# Inheritance
class Student(Person):
    """A student; inherits from Person."""
    def __init__(self, name, age, student_id, major):
        super().__init__(name, age)  # call the parent constructor
        self.student_id = student_id
        self.major = major
    # Method overriding
    def greet(self):
        """Override the greeting."""
        base_greeting = super().greet()
        return f"{base_greeting} I'm studying {self.major}."
    def study(self, subject):
        """Return a description of what the student is studying."""
        return f"{self.name} is studying {subject}."
# Create a subclass object
student = Student("Charlie", 20, "S12345", "Computer Science")
print(student.greet())  # Hello, my name is Charlie and I'm 20 years old. I'm studying Computer Science.
print(student.study("Python Programming"))  # Charlie is studying Python Programming.
# Polymorphism
def introduce(person):
    """Introduce a person."""
    print(person.greet())
introduce(person1)  # called with a Person object
introduce(student)  # called with a Student object
Class inheritance relationship: Student inherits from Person, which in turn inherits from object.
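The inheritance relationship between classes can also be inspected at runtime. The sketch below uses stripped-down stand-ins for `Person` and `Student` (not the full classes above) and reads the method resolution order from `__mro__`.

```python
class Person:
    def greet(self):
        return "Hello from Person"


class Student(Person):
    def greet(self):
        # super() continues the lookup at the next class in the MRO
        return super().greet() + " (and Student)"


student = Student()

# __mro__ lists the classes Python searches, in order, when resolving attributes
mro_names = [cls.__name__ for cls in Student.__mro__]

print(mro_names)                     # ['Student', 'Person', 'object']
print(isinstance(student, Person))   # True
print(issubclass(Student, Person))   # True
print(student.greet())               # Hello from Person (and Student)
```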
4.3 Decorators
A decorator is a callable that modifies the behavior of a function or class:
python
# Basic decorator
def simple_decorator(func):
    def wrapper():
        print("Before function execution")
        func()
        print("After function execution")
    return wrapper
@simple_decorator
def say_hello():
    print("Hello!")
say_hello()
# Output:
# Before function execution
# Hello!
# After function execution
# Decorator that takes arguments
def repeat(n):
    def decorator(func):
        def wrapper(*args, **kwargs):
            results = []
            for _ in range(n):
                result = func(*args, **kwargs)
                results.append(result)
            return results
        return wrapper
    return decorator
@repeat(3)
def greet(name):
    return f"Hello, {name}!"
print(greet("Alice"))  # ['Hello, Alice!', 'Hello, Alice!', 'Hello, Alice!']
# Decorator that preserves function metadata
import functools
def log_decorator(func):
    @functools.wraps(func)  # keep the original function's metadata
    def wrapper(*args, **kwargs):
        print(f"Calling function: {func.__name__}")
        print(f"Arguments: {args}, {kwargs}")
        result = func(*args, **kwargs)
        print(f"Function {func.__name__} returned: {result}")
        return result
    return wrapper
@log_decorator
def add(a, b):
    """Add two numbers"""
    return a + b
print(add(3, 5))
print(add.__name__)  # add (without @functools.wraps this would be wrapper)
print(add.__doc__)   # Add two numbers (without @functools.wraps this would be None)
# Class-based decorator
class CountCalls:
    def __init__(self, func):
        self.func = func
        self.call_count = 0
    def __call__(self, *args, **kwargs):
        self.call_count += 1
        print(f"Function {self.func.__name__} has been called {self.call_count} times")
        return self.func(*args, **kwargs)
@CountCalls
def multiply(a, b):
    return a * b
print(multiply(2, 3))  # 6
print(multiply(4, 5))  # 20
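Decorators can also be stacked, in which case they are applied from the bottom up. The following sketch uses a hypothetical `tag` decorator factory to make the order visible in the return value.

```python
import functools


def tag(name):
    """Decorator factory: wrap the function's string result in <name>...</name>."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return f"<{name}>{func(*args, **kwargs)}</{name}>"
        return wrapper
    return decorator


@tag("b")  # applied second, so it becomes the outermost wrapper
@tag("i")  # applied first, closest to the function
def greet(name):
    """Return a greeting."""
    return f"Hello, {name}!"


print(greet("Alice"))  # <b><i>Hello, Alice!</i></b>
print(greet.__name__)  # greet
```

The stacked form is equivalent to writing `greet = tag("b")(tag("i")(greet))` after the function definition.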
4.4 Generators and Iterators
Generators and iterators produce sequences lazily, one item at a time:
python
# Generator function
def count_up_to(n):
    """Yield the numbers from 1 to n."""
    current = 1
    while current <= n:
        yield current  # yield returns a value and suspends the function
        current += 1
# Use the generator
counter = count_up_to(5)
print(next(counter))  # 1
print(next(counter))  # 2
print(next(counter))  # 3
# Use a generator in a for loop
for num in count_up_to(5):
    print(num)  # 1, 2, 3, 4, 5
# Generator expression
squares = (x ** 2 for x in range(10))
print(list(squares))  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
# Iterator
class FibonacciIterator:
    """Iterator over the Fibonacci sequence."""
    def __init__(self, max_num):
        self.max_num = max_num
        self.a, self.b = 0, 1
    def __iter__(self):
        """Return the iterator object itself."""
        return self
    def __next__(self):
        """Return the next Fibonacci number."""
        if self.a > self.max_num:
            raise StopIteration
        result = self.a
        self.a, self.b = self.b, self.a + self.b
        return result
# Use the iterator
fib = FibonacciIterator(100)
for num in fib:
    print(num, end=" ")  # 0 1 1 2 3 5 8 13 21 34 55 89
# Iterable object
class NumberRange:
    """A range of numbers that implements the iterable protocol."""
    def __init__(self, start, end):
        self.start = start
        self.end = end
    def __iter__(self):
        """Return a generator that serves as the iterator."""
        current = self.start
        while current <= self.end:
            yield current
            current += 1
# Use the iterable object
for num in NumberRange(5, 10):
    print(num, end=" ")  # 5 6 7 8 9 10
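Generator workflow: calling a generator function returns a generator object without running its body; each next() call runs the body up to the next yield, which returns a value and suspends the function; when the body finishes, StopIteration is raised.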
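The suspend-and-resume behavior of a generator can be traced by recording what the function body does between `next()` calls. This is a minimal sketch; `countdown` and the `events` log are made up for the demonstration.

```python
events = []


def countdown(n):
    """Yield n, n-1, ..., 1 and log each step of the body."""
    events.append("started")
    while n > 0:
        events.append(f"yielding {n}")
        yield n  # execution pauses here until the next next() call
        events.append(f"resumed after {n}")
        n -= 1
    events.append("finished")


gen = countdown(2)
events_at_creation = list(events)  # still empty: the body has not started

first = next(gen)   # runs up to the first yield
second = next(gen)  # resumes, loops, and stops at the second yield

try:
    next(gen)       # resumes, leaves the loop, and the body ends
    exhausted = False
except StopIteration:
    exhausted = True

print(events_at_creation)
print(first, second, exhausted)
print(events)
```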
5. Python Standard Library and Third-Party Libraries
5.1 Commonly Used Standard Library Modules
The Python standard library covers a wide range of functionality:
python
# os module - operating system interfaces
import os
# Get the current working directory
print(os.getcwd())
# Create a directory (no error if it already exists)
os.makedirs("test_dir", exist_ok=True)
# List directory contents
print(os.listdir("."))
# Path operations
path = os.path.join("test_dir", "test.txt")
print(os.path.abspath(path))
print(os.path.exists(path))
# sys module - interpreter-related functionality
import sys
# Command-line arguments
print(sys.argv)
# Exit the program
# sys.exit(0)
# Module search path (environment variables are available via os.environ)
print(sys.path)
# datetime module - dates and times
from datetime import datetime, timedelta
# Get the current time
now = datetime.now()
print(now)
# Format a time
print(now.strftime("%Y-%m-%d %H:%M:%S"))
# Date arithmetic
tomorrow = now + timedelta(days=1)
print(tomorrow)
# String handling
import string
# String constants
print(string.ascii_letters)  # abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
print(string.digits)         # 0123456789
# Random number generation
import random
# Generate random numbers
print(random.randint(1, 100))  # random integer from 1 to 100, inclusive
print(random.random())         # random float in [0.0, 1.0)
# Random selection
fruits = ["apple", "banana", "cherry"]
print(random.choice(fruits))  # pick one element at random
random.shuffle(fruits)        # shuffle the list in place
print(fruits)
# Network requests
import urllib.request
# Send an HTTP request
with urllib.request.urlopen("https://www.python.org") as response:
    html = response.read()
    print(f"Status code: {response.status}")
    # print(html.decode("utf-8"))  # print the page content
# Multithreading
import threading
import time
def worker(name, delay):
    """Thread worker function."""
    for i in range(5):
        print(f"Worker {name} - Iteration {i}")
        time.sleep(delay)
# Create threads
thread1 = threading.Thread(target=worker, args=("A", 1))
thread2 = threading.Thread(target=worker, args=("B", 2))
# Start the threads
thread1.start()
thread2.start()
# Wait for the threads to finish
thread1.join()
thread2.join()
print("All workers finished")
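For many I/O-bound jobs, the standard library's `concurrent.futures.ThreadPoolExecutor` is a higher-level alternative to creating and joining `threading.Thread` objects by hand. In the sketch below, `fetch_length` is a hypothetical stand-in for real I/O work such as a network call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_length(word, delay=0.05):
    """Stand-in for an I/O-bound task: sleep briefly, then return a result."""
    time.sleep(delay)
    return len(word)


words = ["alpha", "beta", "gamma", "delta"]

# The with-block waits for all submitted work before exiting
with ThreadPoolExecutor(max_workers=4) as pool:
    lengths = list(pool.map(fetch_length, words))  # results keep the input order

print(dict(zip(words, lengths)))
```

Because the workers spend their time sleeping rather than computing, the four calls overlap and the total run time stays close to a single delay.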
5.2 Commonly Used Third-Party Libraries
Python has a rich collection of third-party libraries:
python
# Install the third-party libraries
# pip install requests pandas numpy matplotlib
# requests - HTTP library
import requests
# Send a GET request
response = requests.get("https://api.github.com")
print(f"Status code: {response.status_code}")
print(response.json())  # parse the JSON response
# Send a POST request
data = {"name": "John", "age": 30}
response = requests.post("https://httpbin.org/post", json=data)
print(response.json())
# pandas - data analysis library
import pandas as pd
# Create a DataFrame
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "London", "Paris"]
}
df = pd.DataFrame(data)
print(df)
# Working with the data
print(df.head(2))          # first two rows
print(df.describe())       # summary statistics
print(df[df["Age"] > 28])  # filter rows
# Save the data
df.to_csv("people.csv", index=False)
# Read the data back
df_read = pd.read_csv("people.csv")
print(df_read)
# NumPy - numerical computing library
import numpy as np
# Create an array
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(arr.shape)  # array shape
print(arr.dtype)  # element data type
# Array arithmetic
print(arr + 2)       # add 2 to every element
print(arr * 3)       # multiply every element by 3
print(np.mean(arr))  # mean
print(np.sum(arr))   # sum
# Matrix operations
matrix = np.array([[1, 2], [3, 4]])
print(matrix)
print(matrix.T)                # transpose
print(np.dot(matrix, matrix))  # matrix multiplication
# Matplotlib - data visualization library
import matplotlib.pyplot as plt
# Simple line plot
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.figure(figsize=(10, 4))
plt.plot(x, y, label="sin(x)")
plt.title("Sine Wave")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.grid(True)
plt.savefig("sine_wave.png")
# plt.show()
6. Practical Python Examples
6.1 Data Processing Example
Analyzing sales data with pandas and Matplotlib:
python
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Generate simulated sales data
def generate_sales_data():
    """Generate simulated sales data."""
    dates = pd.date_range(start="2023-01-01", end="2023-12-31", freq="D")
    products = ["Product A", "Product B", "Product C"]
    data = {
        "Date": np.random.choice(dates, 1000),
        "Product": np.random.choice(products, 1000),
        "Sales": np.random.randint(100, 1000, 1000),
        "Quantity": np.random.randint(1, 20, 1000)
    }
    return pd.DataFrame(data)
# Generate the data
df = generate_sales_data()
# Preprocessing
df["Date"] = pd.to_datetime(df["Date"])
df["Month"] = df["Date"].dt.to_period("M")
df["Revenue"] = df["Sales"] * df["Quantity"]
# Analysis
monthly_revenue = df.groupby("Month")["Revenue"].sum()
product_revenue = df.groupby("Product")["Revenue"].sum()
daily_avg_sales = df.groupby(df["Date"].dt.dayofweek)["Sales"].mean()
# Visualization
plt.figure(figsize=(15, 10))
# 1. Monthly revenue trend
plt.subplot(2, 2, 1)
monthly_revenue.plot(kind="line")
plt.title("Monthly Revenue Trend")
plt.ylabel("Revenue")
plt.xticks(rotation=45)
# 2. Revenue share by product
plt.subplot(2, 2, 2)
product_revenue.plot(kind="pie", autopct="%1.1f%%")
plt.title("Revenue by Product")
plt.ylabel("")
# 3. Sales by day of the week
plt.subplot(2, 2, 3)
daily_avg_sales.plot(kind="bar")
plt.title("Average Sales by Day of Week")
plt.xlabel("Day of Week (0=Monday)")
plt.ylabel("Average Sales")
plt.xticks(range(7), ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"])
# 4. Sales versus quantity
plt.subplot(2, 2, 4)
scatter = plt.scatter(df["Quantity"], df["Sales"], c=df["Revenue"], cmap="viridis", alpha=0.5)
plt.title("Sales vs Quantity")
plt.xlabel("Quantity")
plt.ylabel("Sales")
plt.colorbar(scatter, label="Revenue")
plt.tight_layout()
plt.savefig("sales_analysis.png")
# plt.show()
# Print key metrics
print("Key Performance Indicators:")
print(f"Total Revenue: ${df['Revenue'].sum():,.2f}")
print(f"Average Daily Revenue: ${df.groupby('Date')['Revenue'].sum().mean():,.2f}")
print(f"Top Product: {product_revenue.idxmax()} (${product_revenue.max():,.2f})")
print(f"Best Month: {monthly_revenue.idxmax()} (${monthly_revenue.max():,.2f})")
6.2 Web API Development Example
Building a simple REST API with FastAPI:
python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List, Optional
from datetime import datetime
# Create the FastAPI application
app = FastAPI(title="Task Management API")
# Data models
class TaskBase(BaseModel):
    title: str
    description: Optional[str] = None
    completed: bool = False
class TaskCreate(TaskBase):
    pass
class Task(TaskBase):
    id: int
    created_at: datetime
    # Pydantic v1 style; Pydantic v2 renames orm_mode to from_attributes
    # and deprecates .dict() in favor of .model_dump()
    class Config:
        orm_mode = True
# Simulated database
tasks_db = []
task_id_counter = 1
# API endpoints
@app.get("/tasks/", response_model=List[Task])
def read_tasks(completed: Optional[bool] = None):
    """Return the task list, optionally filtered by completion status."""
    if completed is not None:
        return [task for task in tasks_db if task["completed"] == completed]
    return tasks_db
@app.get("/tasks/{task_id}", response_model=Task)
def read_task(task_id: int):
    """Return a single task."""
    task = next((t for t in tasks_db if t["id"] == task_id), None)
    if task is None:
        raise HTTPException(status_code=404, detail="Task not found")
    return task
@app.post("/tasks/", response_model=Task, status_code=201)
def create_task(task: TaskCreate):
    """Create a new task."""
    global task_id_counter
    task_dict = task.dict()
    task_dict["id"] = task_id_counter
    task_dict["created_at"] = datetime.now()
    tasks_db.append(task_dict)
    task_id_counter += 1
    return task_dict
@app.put("/tasks/{task_id}", response_model=Task)
def update_task(task_id: int, task: TaskCreate):
    """Update a task."""
    task_index = next((i for i, t in enumerate(tasks_db) if t["id"] == task_id), None)
    if task_index is None:
        raise HTTPException(status_code=404, detail="Task not found")
    task_dict = task.dict()
    task_dict["id"] = tasks_db[task_index]["id"]
    task_dict["created_at"] = tasks_db[task_index]["created_at"]
    tasks_db[task_index] = task_dict
    return task_dict
@app.delete("/tasks/{task_id}", status_code=204)
def delete_task(task_id: int):
    """Delete a task."""
    global tasks_db
    task = next((t for t in tasks_db if t["id"] == task_id), None)
    if task is None:
        raise HTTPException(status_code=404, detail="Task not found")
    tasks_db = [t for t in tasks_db if t["id"] != task_id]
    return None
# How to run:
# 1. Install the dependencies: pip install fastapi uvicorn
# 2. Save the file as main.py
# 3. Run: uvicorn main:app --reload
# 4. Open http://127.0.0.1:8000/docs to view the API documentation
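API architecture: an HTTP client calls the FastAPI route handlers, which validate data with the Pydantic models and read or write the in-memory tasks_db list.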
7. Python Performance Optimization
7.1 Code Optimization Techniques
Common techniques for optimizing Python performance:
python
# 1. Prefer built-in functions and libraries
# Built-ins are usually implemented in C and run faster
import time
# Baseline: explicit loop with append
start = time.perf_counter()
result = []
for i in range(1000000):
    result.append(i * 2)
end = time.perf_counter()
print(f"Loop method: {end - start:.4f} seconds")
# Alternative: map (with a lambda this is often no faster than the loop)
start = time.perf_counter()
result = list(map(lambda x: x * 2, range(1000000)))
end = time.perf_counter()
print(f"Map method: {end - start:.4f} seconds")
# Usually fastest of the three: list comprehension
start = time.perf_counter()
result = [x * 2 for x in range(1000000)]
end = time.perf_counter()
print(f"List comprehension: {end - start:.4f} seconds")
# 2. Avoid global variables
# Global lookups are slower than local variable access
def global_variable_test():
    """Read a global variable inside the loop."""
    global x  # not required for reading; kept to make the global access explicit
    result = 0
    for i in range(1000000):
        result += x
    return result
def local_variable_test(x):
    """Read a local variable inside the loop."""
    result = 0
    for i in range(1000000):
        result += x
    return result
x = 10
start = time.perf_counter()
global_variable_test()
end = time.perf_counter()
print(f"Global variable: {end - start:.4f} seconds")
start = time.perf_counter()
local_variable_test(x)
end = time.perf_counter()
print(f"Local variable: {end - start:.4f} seconds")
# 3. Use an appropriate data structure
# The right data structure can improve performance dramatically
import random
# List lookup vs. set lookup
data_list = list(range(1000000))
data_set = set(data_list)
to_find = random.sample(data_list, 1000)
# List lookup: O(n) per membership test
start = time.perf_counter()
for item in to_find:
    if item in data_list:
        pass
end = time.perf_counter()
print(f"List lookup: {end - start:.4f} seconds")
# Set lookup: O(1) on average per membership test
start = time.perf_counter()
for item in to_find:
    if item in data_set:
        pass
end = time.perf_counter()
print(f"Set lookup: {end - start:.4f} seconds")
# 4. Use generators to save memory
# A generator produces one element at a time instead of building the whole sequence
def large_data_generator(n):
    """Generator function."""
    for i in range(n):
        yield i * 2
# Consume the generator
start = time.perf_counter()
total = 0
for num in large_data_generator(10000000):
    total += num
end = time.perf_counter()
print(f"Generator total: {total}, Time: {end - start:.4f} seconds")
# 5. Use __slots__ to reduce memory usage
class WithoutSlots:
    """Class without __slots__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y
class WithSlots:
    """Class with __slots__."""
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x = x
        self.y = y
# Compare memory usage
# sys.getsizeof is shallow, so the per-instance __dict__ must be counted separately
import sys
obj1 = WithoutSlots(1, 2)
obj2 = WithSlots(1, 2)
print(f"Without slots: {sys.getsizeof(obj1) + sys.getsizeof(obj1.__dict__)} bytes (instance + __dict__)")
print(f"With slots: {sys.getsizeof(obj2)} bytes")
# The difference matters more when many objects are created
start = time.perf_counter()
objects = [WithoutSlots(i, i+1) for i in range(100000)]
end = time.perf_counter()
print(f"Without slots creation: {end - start:.4f} seconds")
start = time.perf_counter()
objects = [WithSlots(i, i+1) for i in range(100000)]
end = time.perf_counter()
print(f"With slots creation: {end - start:.4f} seconds")
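Single-shot timings like the ones above are sensitive to noise from other processes. The standard library's `timeit` module repeats a measurement for you; the sketch below reruns the loop-versus-comprehension comparison that way. The function names and the sizes (`n`, `number`, `repeat`) are arbitrary choices for illustration.

```python
import timeit


def with_loop(n=10_000):
    """Build the list with an explicit loop and append."""
    result = []
    for i in range(n):
        result.append(i * 2)
    return result


def with_comprehension(n=10_000):
    """Build the same list with a comprehension."""
    return [i * 2 for i in range(n)]


# repeat() returns one total time per run; the minimum is the least noisy estimate
loop_time = min(timeit.repeat(with_loop, number=50, repeat=3))
comp_time = min(timeit.repeat(with_comprehension, number=50, repeat=3))

print(f"Loop:          {loop_time:.4f} s for 50 calls")
print(f"Comprehension: {comp_time:.4f} s for 50 calls")
```

Absolute numbers depend on the machine and Python version, so it is safer to compare the two figures from the same run than to quote either one alone.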
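7.2 Using C Extensions and JIT Compilation
For performance-critical code, you can use C extensions (for example Cython or ctypes) or a just-in-time compiler such as Numba:
python
# Option: Cython (requires separate installation and a compile step)
# Option: ctypes to call into C libraries
# Shown here: just-in-time compilation with numba
import time
from numba import jit
import numpy as np
# Plain Python function
def python_fib(n):
    """Fibonacci sequence in pure Python."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
# Function compiled with numba
@jit(nopython=True)  # nopython mode compiles to machine code without the Python object layer
def numba_fib(n):
    """Fibonacci sequence compiled by Numba."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
# Performance comparison
# Note: Numba uses fixed-width 64-bit integers, so for large n its result wraps around
# and differs from Python's arbitrary-precision result; the timings compare loop speed only
n = 1000000
start = time.perf_counter()
python_fib(n)
end = time.perf_counter()
print(f"Python implementation: {end - start:.4f} seconds")
# The first call includes compilation time
start = time.perf_counter()
numba_fib(n)
end = time.perf_counter()
print(f"Numba implementation (with compilation): {end - start:.4f} seconds")
# Second call
start = time.perf_counter()
numba_fib(n)
end = time.perf_counter()
print(f"Numba implementation (compiled): {end - start:.4f} seconds")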
7.3 Multithreading and Multiprocessing
The two main approaches to concurrency in Python:
python
# Multithreading vs. multiprocessing
import threading
import multiprocessing
import time
# CPU-bound task
def cpu_intensive_task(n):