Security · #docker #container #kubernetes #security

Container Security Best Practices and Toolchain

2025.07.13 7 min 2.6k

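Preface

Container technology greatly simplifies deploying and managing applications, but it also introduces new security challenges. From image builds to runtime, from network isolation to secret management, every stage can become an entry point for an attacker. This article walks through container security best practices and the supporting toolchain, stage by stage.

The Container Security Landscape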

graph TB
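    subgraph "Build stage"
        B1[Base image security]
        B2[Dockerfile best practices]
        B3[Dependency vulnerability scanning]
        B4[Image signing]
    end

    subgraph "Ship stage"
        S1[Image registry security]
        S2[Admission control]
        S3[SBOM generation]
    end

    subgraph "Run stage"
        R1[Runtime security monitoring]
        R2[Network policies]
        R3[Secret management]
        R4[Pod Security Standards]
    end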

    B1 --> S1
    B2 --> S1
    B3 --> S1
    B4 --> S1
    S1 --> R1
    S2 --> R1
    S3 --> R1

Image Security

A Secure Dockerfile

# Insecure Dockerfile: unpinned base image, unnecessary tools, whole build context copied in, runs as root
FROM ubuntu:latest
RUN apt-get update && apt-get install -y python3 curl wget
COPY . /app
RUN pip install -r requirements.txt
USER root
CMD ["python3", "app.py"]
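# Secure Dockerfile
# 1. Pin a specific version of a minimal base image
FROM python:3.12-slim-bookworm AS builder

# 2. Install dependencies in the build stage
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt

# 3. Multi-stage build: the final image is smaller and carries no build tooling
FROM python:3.12-slim-bookworm

# 4. Create a non-root user; its home must match the path the packages are
#    copied to, otherwise Python will not find the user-level site-packages
RUN groupadd -r appgroup && \
    useradd -r -g appgroup -m -d /home/appuser -s /sbin/nologin appuser

# 5. Copy only the files that are needed
WORKDIR /app
COPY --from=builder --chown=appuser:appgroup /root/.local /home/appuser/.local
COPY --chown=appuser:appgroup . .

# 6. Put the user-level scripts on PATH
ENV PATH="/home/appuser/.local/bin:${PATH}"

# 7. Switch to the non-root user
USER appuser

# 8. Health check
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
    CMD python3 -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')"

# 9. Expose only the required port
EXPOSE 8080

# 10. Use the exec form of CMD
CMD ["python3", "app.py"]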
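The HEALTHCHECK above expects the application to answer on http://localhost:8080/health. The article does not show app.py, so the following is only a minimal sketch of what such an endpoint could look like using the standard library; `HealthHandler` and `make_server` are hypothetical names, and a real service would more likely use its web framework's own route.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with 200 and every other path with 404."""

    def do_GET(self):
        if self.path == "/health":
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Keep the sketch quiet; a real service would log requests.
        pass


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    # Bind to all interfaces so the container's port mapping can reach it.
    return ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)
```

Under these assumptions, app.py would end with `make_server().serve_forever()`. Port 8080 is above 1024, so the non-root user can bind it without extra capabilities.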

Dockerfile Security Checklist

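dockerfile_security_checklist:
  base_image:
    - "Pin a specific version tag; never use latest"
    - "Choose a minimal base image (alpine/slim/distroless)"
    - "Use trusted official images"

  build_process:
    - "Use multi-stage builds to shrink the final image"
    - "Never bake secrets, certificates, or SSH private keys into the image"
    - "Use .dockerignore to exclude unnecessary files"
    - "Combine RUN commands to reduce the number of layers"
    - "Clean package manager caches"

  runtime:
    - "Run as a non-root user"
    - "Expose only the required ports"
    - "Use a read-only root filesystem where feasible"
    - "Define a HEALTHCHECK"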
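A few of these checklist items can be checked mechanically. The sketch below is a deliberately naive illustration, not a replacement for a real linter or for `trivy config`: `lint_dockerfile` is a hypothetical helper that only looks at FROM, USER, and HEALTHCHECK, and it does not understand build-stage references such as `FROM builder` or `FROM scratch`.

```python
import re


def lint_dockerfile(text: str) -> list[str]:
    """Return findings for three checklist items: pinned base, non-root, healthcheck."""
    findings = []
    instructions = [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]

    # Base image: flag a missing tag or the floating :latest tag.
    for line in instructions:
        match = re.match(r"FROM\s+(\S+)", line, re.IGNORECASE)
        if not match:
            continue
        image = match.group(1)
        if "@sha256:" in image:
            continue  # pinned by digest
        name_tag = image.rsplit("/", 1)[-1]
        if ":" not in name_tag or name_tag.endswith(":latest"):
            findings.append(f"unpinned base image: {image}")

    # Runtime user: the last USER instruction decides who runs the process.
    users = [
        line.split(None, 1)[1].strip()
        for line in instructions
        if re.match(r"USER\s+\S", line, re.IGNORECASE)
    ]
    if not users or users[-1].split(":")[0] in ("root", "0"):
        findings.append("container runs as root")

    if not any(line.upper().startswith("HEALTHCHECK") for line in instructions):
        findings.append("no HEALTHCHECK defined")
    return findings
```

Run against a Dockerfile that starts from `ubuntu:latest` and ends with `USER root`, this reports all three findings; a multi-stage file with pinned tags, a non-root USER, and a HEALTHCHECK returns an empty list.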

Image Scanning Tools

Trivy

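# Scan an image for vulnerabilities
trivy image myapp:latest

# Show only high and critical vulnerabilities
trivy image --severity HIGH,CRITICAL myapp:latest

# Scan the Dockerfile for misconfigurations
trivy config Dockerfile

# Scan the filesystem (project dependencies, secrets, misconfigurations)
trivy fs --scanners vuln,secret,misconfig .

# JSON output (for CI/CD integration)
trivy image --format json --output result.json myapp:latest

# In CI/CD: fail the job when high or critical vulnerabilities are found
trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:latest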
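The JSON report written by `--format json` can be post-processed when a pipeline needs more than a pass/fail exit code. The following is a small sketch that assumes the report layout Trivy uses for image scans (a top-level `Results` list whose entries carry a `Vulnerabilities` list with a `Severity` field); the function names are illustrative only.

```python
import json
from collections import Counter


def load_report(path: str) -> dict:
    """Read a Trivy JSON report such as result.json."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize(report: dict) -> Counter:
    """Count findings per severity across all scanned targets."""
    counts = Counter()
    for result in report.get("Results") or []:
        # "Vulnerabilities" is absent or null when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def should_block(counts: Counter, blocking=("HIGH", "CRITICAL")) -> bool:
    """Same gate as --exit-code 1 --severity HIGH,CRITICAL, but in code."""
    return any(counts[severity] > 0 for severity in blocking)
```

A typical use would be `should_block(summarize(load_report("result.json")))`, which leaves room for posting the per-severity counts to a dashboard before failing the build.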

Integrating into CI/CD

# GitHub Actions workflow that runs Trivy
name: Security Scan
on: [push, pull_request]

jobs:
  trivy-scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # required to upload SARIF results
    steps:
      - uses: actions/checkout@v4

      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .

      - name: Run Trivy vulnerability scanner
        # In real pipelines, pin to a release tag or commit SHA instead of master
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
          severity: CRITICAL,HIGH
          exit-code: "1"

      - name: Upload Trivy scan results
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: trivy-results.sarif

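Runtime Security

Falco Rules

Falco is an open-source, cloud-native runtime security tool that monitors system calls and alerts on anomalous behavior.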

# Custom Falco rules
# Macros such as spawned_process, container, and outbound come from Falco's default ruleset
- rule: Detect Shell in Container
  desc: Detect a shell being started inside a container
  condition: >
    spawned_process and
    container and
    proc.name in (bash, sh, zsh, ash) and
    not proc.pname in (cron, supervisord)
  output: >
    Shell started in container
    (user=%user.name container=%container.name
    shell=%proc.name parent=%proc.pname
    image=%container.image.repository)
  priority: WARNING
  tags: [container, shell, runtime]

- rule: Write Below /etc
  desc: Detect writes below the /etc directory
  condition: >
    write_etc_common and
    container and
    not user_known_write_etc_conditions
  output: >
    File written below /etc in container
    (user=%user.name file=%fd.name
    container=%container.name image=%container.image.repository)
  priority: ERROR
  tags: [container, filesystem]

- rule: Outbound Connection to Suspicious IP
  desc: Detect connections to suspicious IP addresses
  # miners_ip_list must be defined as a list of IPs elsewhere in your rules
  condition: >
    outbound and
    container and
    fd.sip in (miners_ip_list)
  output: >
    Suspicious outbound connection
    (container=%container.name ip=%fd.sip command=%proc.cmdline)
  priority: CRITICAL
  tags: [container, network]
# Install Falco on Kubernetes
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco \
  --namespace falco \
  --create-namespace \
  --set falcosidekick.enabled=true \
  --set falcosidekick.config.slack.webhookurl="https://hooks.slack.com/xxx"

Runtime Security Response Flow

flowchart TB
    EVENT[System call events] --> FALCO[Falco Engine]
    FALCO --> |rule matched| ALERT[Alert]
    ALERT --> SLACK[Slack notification]
    ALERT --> SIEM[SIEM system]
    ALERT --> OPA[OPA/Admission Controller]
    OPA --> |automated response| KILL[Terminate Pod]
    OPA --> |automated response| ISOLATE[Network isolation]
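The fan-out in this flow usually depends on alert priority. As a rough sketch, assuming Falco is emitting JSON alerts with a `priority` field, the routing decision could look like the following; the `page`, `notify`, and `log` destinations and the thresholds are hypothetical, and Falcosidekick's per-output minimum priority settings cover similar ground without custom code.

```python
import json

# Falco priorities from most to least severe.
PRIORITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]


def route(alert_line: str, page_at: str = "critical", notify_at: str = "warning") -> str:
    """Return 'page', 'notify', or 'log' for one JSON alert line."""
    alert = json.loads(alert_line)
    level = PRIORITIES.index(alert["priority"].lower())
    if level <= PRIORITIES.index(page_at):
        return "page"      # e.g. trigger the automated response path
    if level <= PRIORITIES.index(notify_at):
        return "notify"    # e.g. Slack or SIEM
    return "log"
```

With the three rules above, the suspicious outbound connection (CRITICAL) would page, while the shell and /etc write rules (WARNING, ERROR) would only notify.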

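Kubernetes Pod Security

Pod Security Standards (PSS)

PodSecurityPolicy was removed in Kubernetes 1.25. Its replacement, Pod Security Admission, became stable in the same release and enforces the Pod Security Standards per namespace.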

# Namespace-level security standard
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Three levels are available: privileged, baseline, restricted
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
# A Pod that complies with the restricted standard
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
  namespace: production
spec:
  # Do not use the host network
  hostNetwork: false
  # Do not share the host PID namespace
  hostPID: false

  securityContext:
    # Run as a non-root user
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
    # Apply a seccomp profile
    seccompProfile:
      type: RuntimeDefault

  containers:
    - name: app
      image: myapp:1.0.0@sha256:abc123...  # pin by digest
      securityContext:
        # Disallow privilege escalation
        allowPrivilegeEscalation: false
        # Do not run privileged
        privileged: false
        # Read-only root filesystem
        readOnlyRootFilesystem: true
        # Drop all Linux capabilities
        capabilities:
          drop: ["ALL"]
      resources:
        limits:
          cpu: "500m"
          memory: "256Mi"
        requests:
          cpu: "100m"
          memory: "128Mi"
      # Use emptyDir volumes where a writable directory is needed
      volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: cache
          mountPath: /app/cache

  volumes:
    - name: tmp
      emptyDir:
        sizeLimit: "100Mi"
    - name: cache
      emptyDir:
        sizeLimit: "200Mi"

  # Use a dedicated ServiceAccount
  serviceAccountName: app-sa
  automountServiceAccountToken: false  # disable when the Pod does not need the Kubernetes API
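To make the restricted requirements concrete, the sketch below checks a Pod manifest (already parsed into a dict) against a subset of them. It is an illustration only: `restricted_violations` is a hypothetical helper, it ignores init containers, volume types, and several other rules, and Pod Security Admission remains the authoritative check.

```python
def restricted_violations(pod: dict) -> list[str]:
    """Check a Pod manifest against a subset of the 'restricted' standard."""
    spec = pod.get("spec", {})
    pod_sc = spec.get("securityContext") or {}
    problems = []

    for flag in ("hostNetwork", "hostPID", "hostIPC"):
        if spec.get(flag):
            problems.append(f"{flag} must not be true")

    for container in spec.get("containers", []):
        sc = container.get("securityContext") or {}
        name = container.get("name", "?")
        if sc.get("privileged"):
            problems.append(f"{name}: privileged must not be true")
        if sc.get("allowPrivilegeEscalation") is not False:
            problems.append(f"{name}: allowPrivilegeEscalation must be false")
        # Container-level settings override the Pod-level securityContext.
        if not sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot")):
            problems.append(f"{name}: runAsNonRoot must be true")
        drops = (sc.get("capabilities") or {}).get("drop") or []
        if "ALL" not in drops:
            problems.append(f"{name}: capabilities must drop ALL")
        profile = sc.get("seccompProfile") or pod_sc.get("seccompProfile") or {}
        if profile.get("type") not in ("RuntimeDefault", "Localhost"):
            problems.append(f"{name}: seccompProfile must be RuntimeDefault or Localhost")
    return problems
```

The secure-app Pod above passes this check with an empty list, whereas a Pod with no securityContext at all fails on four counts per container.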

Network Policies

# Deny all traffic by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

---
# Allow specific traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web-frontend
        - namespaceSelector:
            matchLabels:
              # Label that Kubernetes sets automatically on every namespace
              kubernetes.io/metadata.name: monitoring
      ports:
        - protocol: TCP
          port: 8080
graph LR
    subgraph "production namespace"
        WEB[Web Frontend<br>port 80] -->|allowed| API[API Server<br>port 8080]
        API -->|allowed| DB[(PostgreSQL<br>port 5432)]
        WEB -.->|denied| DB
    end

    subgraph "monitoring namespace"
        PROM[Prometheus] -->|allowed| API
    end

    EXTERNAL[External] -.->|denied| API
    EXTERNAL -.->|denied| DB
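The allow rule's semantics are easy to misread: the entries under `from` are OR-ed, and a bare podSelector only matches Pods in the policy's own namespace. The sketch below models just that slice of the behavior with flattened matchLabels; it is a teaching aid under those simplifications, not how a CNI plugin evaluates policies. It uses `kubernetes.io/metadata.name`, the label Kubernetes sets on every namespace automatically.

```python
def labels_match(selector: dict, labels: dict) -> bool:
    """matchLabels semantics: every selector pair must be present on the object."""
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(peers: list[dict], allowed_ports: set[int], port: int,
                    src_pod_labels: dict, src_ns_labels: dict,
                    same_namespace: bool) -> bool:
    """Return True if any peer in the rule admits the source on this port."""
    if port not in allowed_ports:
        return False
    for peer in peers:
        if "podSelector" in peer and "namespaceSelector" not in peer:
            # A bare podSelector is scoped to the policy's own namespace.
            if same_namespace and labels_match(peer["podSelector"], src_pod_labels):
                return True
        elif "namespaceSelector" in peer and "podSelector" not in peer:
            if labels_match(peer["namespaceSelector"], src_ns_labels):
                return True
    return False


# The peers of allow-web-to-api, with matchLabels flattened for brevity.
PEERS = [
    {"podSelector": {"app": "web-frontend"}},
    {"namespaceSelector": {"kubernetes.io/metadata.name": "monitoring"}},
]
```

With these peers, a web-frontend Pod in production and any Pod in the monitoring namespace reach the API on port 8080, while an unlabeled Pod from another namespace is rejected.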

Secret Management

Kubernetes Secrets Best Practices

# Don't do this: by default, Secrets are stored unencrypted in etcd
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  username: YWRtaW4=  # base64 is encoding, not encryption
  password: cGFzc3dvcmQ=
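To see why base64 offers no protection, anyone who can read the manifest can reverse it in a couple of lines. This short demonstration decodes the two values from the Secret above:

```python
import base64

# The data block of the Secret manifest, copied verbatim.
secret_data = {"username": "YWRtaW4=", "password": "cGFzc3dvcmQ="}

decoded = {key: base64.b64decode(value).decode("utf-8")
           for key, value in secret_data.items()}
print(decoded)  # {'username': 'admin', 'password': 'password'}
```

No key is involved at any point, which is why such manifests should never be committed to Git as-is.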
# Recommended: use External Secrets Operator
# Recent ESO releases serve external-secrets.io/v1; use the version your installed CRDs provide
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
  data:
    - secretKey: username
      remoteRef:
        key: production/database
        property: username
    - secretKey: password
      remoteRef:
        key: production/database
        property: password

Comparing Secret Management Options

graph TB
    subgraph "Secret management options"
        direction LR
        K8S[K8s Secrets<br>Base64-encoded<br>not encrypted] --> ESO[External Secrets<br>Operator<br>recommended]
        ESO --> VAULT[HashiCorp Vault]
        ESO --> AWS_SM[AWS Secrets Manager]
        ESO --> GCP_SM[GCP Secret Manager]
        ESO --> AZ_KV[Azure Key Vault]
    end

    subgraph "Encryption hardening"
        ETCD_ENC[etcd encryption<br>EncryptionConfiguration]
        SEALED[Sealed Secrets<br>safe to commit to Git once encrypted]
        SOPS[Mozilla SOPS<br>file-level encryption]
    end
# Sealed Secrets: safe to commit to Git
# Encrypt with kubeseal:
# kubeseal --format=yaml < secret.yaml > sealed-secret.yaml

apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  encryptedData:
    username: AgBy3i4OJSWK+...  # only the controller running in the cluster can decrypt this
    password: AgCtr8p2E1xyL+...

Supply Chain Security

Image Signing (Cosign)

# Generate a key pair
cosign generate-key-pair

# Sign the image
cosign sign --key cosign.key myregistry.io/myapp:v1.0.0

# Verify the signature
cosign verify --key cosign.pub myregistry.io/myapp:v1.0.0

# Keyless signing (using an OIDC identity)
cosign sign myregistry.io/myapp:v1.0.0
# Cosign obtains a short-lived certificate from Sigstore's Fulcio automatically

SBOM (Software Bill of Materials)

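# Generate an SBOM with Syft
syft myapp:latest -o spdx-json > sbom.json

# Scan the SBOM for vulnerabilities with Trivy
trivy sbom sbom.json

# Attach the SBOM to the image with Cosign
# (recent Cosign releases deprecate `attach sbom` in favor of signed attestations via `cosign attest`)
cosign attach sbom --sbom sbom.json myregistry.io/myapp:v1.0.0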
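Once an SBOM exists, a common audit question is whether a given package, and which version of it, ships in the image. The sketch below reads the SPDX JSON produced by the Syft command above, relying only on the standard SPDX `packages`, `name`, and `versionInfo` fields; the helper names are illustrative.

```python
import json


def load_sbom(path: str) -> dict:
    """Read an SPDX JSON document such as sbom.json."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def list_packages(sbom: dict) -> list[tuple[str, str]]:
    """Return sorted, de-duplicated (name, version) pairs."""
    pairs = set()
    for package in sbom.get("packages", []):
        pairs.add((package.get("name", ""), package.get("versionInfo", "")))
    return sorted(pairs)


def find_package(sbom: dict, name: str) -> list[str]:
    """Return every version of the named package found in the SBOM."""
    return [version for pkg_name, version in list_packages(sbom) if pkg_name == name]
```

For example, `find_package(load_sbom("sbom.json"), "openssl")` answers whether the image is affected when a new OpenSSL advisory lands, without rescanning the image itself.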

Admission Control

# Kyverno policy: only allow signed images
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signature
spec:
  validationFailureAction: Enforce
  background: false
  rules:
    - name: verify-cosign-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "myregistry.io/*"
          attestors:
            - count: 1
              entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE...
                      -----END PUBLIC KEY-----
sequenceDiagram
    participant Dev as Developer
    participant CI as CI/CD
    participant REG as Registry
    participant K8S as Kubernetes
    participant ADM as Admission Controller

    Dev->>CI: Push code
    CI->>CI: Build image
    CI->>CI: Scan vulnerabilities (Trivy)
    CI->>CI: Generate SBOM (Syft)
    CI->>REG: Push image
    CI->>REG: Sign image (Cosign)
    CI->>REG: Attach SBOM

    K8S->>ADM: Pod creation request
    ADM->>REG: Verify signature
    ADM->>REG: Check vulnerability scan
    alt Verified & Clean
        ADM->>K8S: Allow
    else Not verified or Vulnerable
        ADM->>K8S: Deny
    end
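The alt/else branch at the end of the sequence boils down to a small decision function. The following sketch mirrors it for a single image; `admit` and its parameters are hypothetical, the `signed` and `critical_vulns` inputs stand in for results that a real admission controller would fetch from the registry, and images that do not match the protected pattern are treated as out of scope for the policy.

```python
from fnmatch import fnmatch


def admit(image: str, signed: bool, critical_vulns: int,
          protected_patterns=("myregistry.io/*",)) -> tuple[bool, str]:
    """Return (allowed, reason) for one container image."""
    if not any(fnmatch(image, pattern) for pattern in protected_patterns):
        return True, "image outside the policy's scope"
    if not signed:
        return False, "signature not verified"
    if critical_vulns > 0:
        return False, "vulnerability scan failed"
    return True, "verified and clean"
```

Calling `admit("myregistry.io/myapp:v1.0.0", signed=False, critical_vulns=0)` yields a denial with the reason "signature not verified". If unsigned images from other registries should also be rejected, widen the patterns rather than relying on this default.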

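Security Tool Comparison

| Tool | Type | Strengths | Typical use |
| Trivy | Vulnerability scanning | Comprehensive, fast, easy to use | CI/CD integration |
| Snyk | Vulnerability scanning | Commercial support, remediation advice | Enterprise use |
| Falco | Runtime security | System call monitoring | Kubernetes runtime |
| Kyverno | Admission control | Kubernetes-native policies | Policy as code |
| Cosign | Image signing | Part of the Sigstore ecosystem | Supply chain security |
| Syft | SBOM generation | Multiple output formats | Compliance audits |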

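Summary

Container security has to span the entire lifecycle:

  1. Build stage: use minimal base images, non-root users, and multi-stage builds, and integrate vulnerability scanning
  2. Ship stage: sign images (Cosign), generate SBOMs, and enforce admission control
  3. Run stage: apply the Pod Security Standards (restricted), network policies (default deny), and runtime monitoring (Falco)
  4. Secret management: use External Secrets Operator with Vault, and never hard-code secrets in source code
  5. Supply chain security: signature verification, SBOMs, and admission policies together form a complete chain of trust

No single tool solves container security. It takes a toolchain and a set of processes that cover the whole lifecycle.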

Author · zt
Published · 2025-07-13
Length · 2.6k characters · 7 min
License · CC BY-SA 4.0