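Preface
A URL shortener is a classic system design interview question and a common requirement in real-world engineering. The task looks simple: map a long URL to a short one. Designing a service that handles high concurrency, stays highly available, and scales horizontally nevertheless touches a wide range of techniques. This article starts from requirements analysis and works step by step toward a production-grade URL shortener.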
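Requirements Analysis
Functional Requirements
Generate a short URL for a given long URL
Redirect users who visit a short URL to the original long URL
Allow an expiration time to be set on a short link
Support custom short-link aliases (optional)
Provide access statistics and analytics
Non-Functional Requirements
High availability: 99.99% service availability
Low latency: redirect response time < 50 ms
High concurrency: 100,000 reads per second
Scalability: the service scales horizontally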
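Capacity Estimation
```
Assumptions:
- New short links per month: 100 million
- Read/write ratio: 100:1
- Average storage per short link: 500 bytes
- Retention period: 5 years

Write QPS: 100 million / 30 days / 24 hours / 3600 seconds ≈ 40 QPS
Read QPS:  40 × 100 = 4,000 QPS (peak × 10 = 40,000 QPS)
Storage:   100 million/month × 12 months × 5 years × 500 B = 3 TB
Cache:     hot data, 20% ≈ 600 GB (can be sharded)
```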
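The arithmetic above can be reproduced with a few lines of Java. This is only a sketch: the class name `CapacityEstimate` and its constants are made up for illustration and simply restate the assumptions listed in the estimate. The exact write rate comes out at about 38.6 QPS, which the estimate rounds up to 40.

```java
/** Back-of-envelope capacity numbers derived from the stated assumptions. */
final class CapacityEstimate {
    static final long NEW_LINKS_PER_MONTH = 100_000_000L; // 100 million
    static final int READ_WRITE_RATIO = 100;              // 100 reads per write
    static final int BYTES_PER_LINK = 500;
    static final int RETENTION_YEARS = 5;
    static final int PEAK_FACTOR = 10;                    // peak traffic = 10x average

    /** Average writes per second, assuming a 30-day month. */
    static double writeQps() {
        return NEW_LINKS_PER_MONTH / (30.0 * 24 * 3600);
    }

    /** Average reads per second. */
    static double readQps() {
        return writeQps() * READ_WRITE_RATIO;
    }

    /** Peak reads per second. */
    static double peakReadQps() {
        return readQps() * PEAK_FACTOR;
    }

    /** Total bytes stored over the retention period. */
    static long storageBytes() {
        return NEW_LINKS_PER_MONTH * 12 * RETENTION_YEARS * BYTES_PER_LINK;
    }

    /** Bytes needed to cache the hottest 20% of the data. */
    static long hotCacheBytes() {
        return storageBytes() / 5;
    }
}
```

Changing any constant (for example the peak factor) makes it easy to see how sensitive the storage and cache sizes are to the assumptions.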
System Architecture Overview
graph TB
Client[Client] --> CDN[CDN]
CDN --> LB[Load Balancer]
LB --> API1[API Server 1]
LB --> API2[API Server 2]
LB --> APIN[API Server N]
API1 --> Cache[Redis Cluster<br/>Cache Layer]
API2 --> Cache
APIN --> Cache
Cache --> DB[(MySQL Cluster<br/>Primary-Replica)]
API1 --> IDGen[ID Generation Service]
API2 --> IDGen
APIN --> IDGen
API1 --> Analytics[Analytics Service]
Analytics --> Kafka[Kafka]
Kafka --> Flink[Flink Stream Processing]
Flink --> ClickHouse[(ClickHouse<br/>Analytics Database)]
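URL Encoding Schemes
Choosing a Scheme
The core of a short link is mapping a long URL to a short string. Common schemes:

| Scheme | Advantage | Drawback |
|---|---|---|
| Hash (MD5/SHA) | Simple to implement | Collision handling is complex |
| Auto-increment ID + Base62 | No collisions | Requires a distributed ID generator |
| Random string | Simple | Collision-detection overhead |
| Pre-generated keys | No computation at request time | A key pool must be maintained |

Auto-increment ID + Base62 encoding is the recommended choice: it guarantees uniqueness and keeps the code short.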
Base62 Encoding
Base62 uses the 62 characters [0-9a-zA-Z]. A 7-character Base62 string can represent 62^7 ≈ 3.5 trillion distinct values, which is enough for this workload.
```java
public class Base62Encoder {

    // Digit order: 0-9, then a-z, then A-Z.
    private static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final int BASE = ALPHABET.length();

    /** Encodes a non-negative ID as a Base62 string, most significant digit first. */
    public static String encode(long num) {
        if (num < 0) {
            throw new IllegalArgumentException("num must be non-negative: " + num);
        }
        if (num == 0) {
            return String.valueOf(ALPHABET.charAt(0));
        }
        StringBuilder sb = new StringBuilder();
        while (num > 0) {
            sb.append(ALPHABET.charAt((int) (num % BASE)));
            num /= BASE;
        }
        return sb.reverse().toString();
    }

    /** Decodes a Base62 string back to the numeric ID. */
    public static long decode(String str) {
        long num = 0;
        for (char c : str.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("Invalid Base62 character: " + c);
            }
            num = num * BASE + digit;
        }
        return num;
    }
}
```
```
Examples (digit order 0-9, a-z, A-Z):
ID = 12345678      → Base62 = "PNFQ"
ID = 1000000000    → Base62 = "15FTGg"
ID = 3521614606207 → Base62 = "ZZZZZZZ"   (62^7 - 1, the largest 7-character code)
```
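To check that 7 characters really are enough, the sketch below computes how many values a Base62 code of a given length can represent and the shortest length that covers a given number of IDs. The class `Base62Capacity` is a hypothetical helper, not part of the service. With the assumptions from the capacity estimate (100 million links per month for 5 years, about 6 billion IDs), 6 characters would already suffice, so 7 leaves generous headroom.

```java
/** Capacity calculations for fixed-length Base62 codes. */
final class Base62Capacity {
    private static final long BASE = 62;

    /** Number of distinct codes with exactly {@code length} characters' worth of digits: 62^length. */
    static long capacity(int length) {
        long result = 1;
        for (int i = 0; i < length; i++) {
            result = Math.multiplyExact(result, BASE); // throws on long overflow
        }
        return result;
    }

    /** Smallest code length whose capacity covers {@code idCount} distinct IDs. */
    static int minLength(long idCount) {
        int length = 1;
        long cap = BASE;
        while (cap < idCount) {
            cap = Math.multiplyExact(cap, BASE);
            length++;
        }
        return length;
    }
}
```

For example, `Base62Capacity.capacity(7)` yields 3,521,614,606,208 and `Base62Capacity.minLength(6_000_000_000L)` yields 6.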
Distributed ID Generation
graph LR
subgraph Option 1: Snowflake
S[Snowflake ID<br/>64 bits] --> T[Timestamp 41 bits]
S --> M[Machine ID 10 bits]
S --> SEQ[Sequence 12 bits]
end
subgraph Option 2: Segment Mode
DB[(Database)] -->|1000 IDs per fetch| Service1[Service 1<br/>1-1000]
DB -->|1000 IDs per fetch| Service2[Service 2<br/>1001-2000]
end
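```java
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.support.TransactionTemplate;

@Service
public class SegmentIdGenerator {

    private static final int SEGMENT_SIZE = 1000;

    private final JdbcTemplate jdbc;
    private final TransactionTemplate tx;

    // Both fields are guarded by the lock on "this".
    private long nextValue = 1;  // next ID to hand out
    private long maxId = 0;      // last ID of the current segment; 0 means none loaded yet

    public SegmentIdGenerator(JdbcTemplate jdbc, TransactionTemplate tx) {
        this.jdbc = jdbc;
        this.tx = tx;
    }

    /** Returns the next unique ID, fetching a new segment once the current one is used up. */
    public synchronized long nextId() {
        if (nextValue > maxId) {
            loadNextSegment();
        }
        return nextValue++;
    }

    private void loadNextSegment() {
        // UPDATE and SELECT run in one transaction: the row lock taken by the UPDATE is
        // held until commit, so two service instances can never read the same max_id.
        Long newMaxId = tx.execute(status -> {
            jdbc.update(
                    "UPDATE id_generator SET max_id = max_id + ? WHERE biz_type = 'short_url'",
                    SEGMENT_SIZE);
            return jdbc.queryForObject(
                    "SELECT max_id FROM id_generator WHERE biz_type = 'short_url'",
                    Long.class);
        });
        if (newMaxId == null) {
            throw new IllegalStateException("id_generator row for 'short_url' is missing");
        }
        maxId = newMaxId;
        nextValue = newMaxId - SEGMENT_SIZE + 1;
    }
}
```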
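For comparison, the Snowflake layout from the diagram (41-bit timestamp, 10-bit machine ID, 12-bit sequence) can be sketched as follows. This is a minimal illustration rather than a hardened implementation: the class name, the custom epoch, and the choice to throw when the clock moves backwards are all assumptions made for the example.

```java
/** Minimal Snowflake-style generator: 41-bit timestamp | 10-bit machine ID | 12-bit sequence. */
final class SnowflakeIdGenerator {
    // Assumed custom epoch: 2024-01-01T00:00:00Z in milliseconds.
    private static final long EPOCH_MS = 1_704_067_200_000L;
    private static final int MACHINE_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_MACHINE_ID = (1L << MACHINE_BITS) - 1;  // 1023
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;   // 4095

    private final long machineId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeIdGenerator(long machineId) {
        if (machineId < 0 || machineId > MAX_MACHINE_ID) {
            throw new IllegalArgumentException("machineId must be in [0, " + MAX_MACHINE_ID + "]");
        }
        this.machineId = machineId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards; refusing to generate IDs");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // All 4096 sequence numbers of this millisecond are used: wait for the next one.
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    /** Extracts the machine ID field from a generated ID. */
    static long machineIdOf(long id) {
        return (id >> SEQUENCE_BITS) & MAX_MACHINE_ID;
    }
}
```

One trade-off worth noting: a Snowflake value uses up to 63 bits, so its Base62 form can take up to 11 characters, whereas segment-mode IDs start small and give shorter codes.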
Database Design
```sql
-- Main table: one row per short link.
CREATE TABLE short_url (
    id          BIGINT PRIMARY KEY AUTO_INCREMENT,
    -- Binary collation: Base62 codes are case-sensitive ("abc" and "ABC" are different links),
    -- while the default utf8mb4 collations compare case-insensitively.
    short_code  VARCHAR(10) COLLATE utf8mb4_bin NOT NULL,
    long_url    VARCHAR(2048) NOT NULL,
    user_id     BIGINT,
    created_at  TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at  TIMESTAMP NULL,
    click_count BIGINT NOT NULL DEFAULT 0,
    UNIQUE KEY uk_short_code (short_code),
    INDEX idx_user_id (user_id),
    INDEX idx_expires_at (expires_at)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Deduplication table: SHA-256 hash of the long URL -> existing short code.
CREATE TABLE url_mapping (
    url_hash    CHAR(64) NOT NULL,
    short_code  VARCHAR(10) COLLATE utf8mb4_bin NOT NULL,
    long_url    VARCHAR(2048) NOT NULL,
    PRIMARY KEY (url_hash),
    INDEX idx_short_code (short_code)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Click log, partitioned by month.
CREATE TABLE click_log (
    id          BIGINT AUTO_INCREMENT,
    short_code  VARCHAR(10) COLLATE utf8mb4_bin NOT NULL,
    client_ip   VARCHAR(45),
    user_agent  VARCHAR(512),
    referer     VARCHAR(2048),
    country     VARCHAR(2),
    created_at  TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id, created_at),
    INDEX idx_short_code (short_code)
) PARTITION BY RANGE (UNIX_TIMESTAMP(created_at)) (
    PARTITION p202501 VALUES LESS THAN (UNIX_TIMESTAMP('2025-02-01')),
    PARTITION p202502 VALUES LESS THAN (UNIX_TIMESTAMP('2025-03-01'))
);
```
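The `url_hash CHAR(64)` column holds a hex-encoded SHA-256 digest of the long URL: 32 bytes, hence 64 hex characters. A minimal sketch using only the JDK is shown below; `UrlHash` is a hypothetical helper name, and hashing the URL exactly as submitted (without normalization) is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Computes the deduplication key stored in url_mapping.url_hash. */
final class UrlHash {
    /** Returns the lowercase hex SHA-256 digest of the URL (always 64 characters). */
    static String sha256Hex(String url) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(url.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }
}
```

Because the hash is computed over the raw string, `https://example.com` and `https://example.com/` would count as different URLs unless the service normalizes them first.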
Core API Implementation
Creating a Short Link
sequenceDiagram
participant Client
participant API as API Server
participant IDGen as ID Generator
participant Cache as Redis
participant DB as MySQL
Client->>API: POST /api/v1/shorten<br/>{long_url, expires_at}
API->>API: Validate URL format
API->>DB: Query url_mapping<br/>(SHA-256 dedup)
alt URL already exists
DB-->>API: Return existing short_code
else URL does not exist
API->>IDGen: Request unique ID
IDGen-->>API: id=12345678
API->>API: Base62 encode<br/>id→"PNFQ"
API->>DB: INSERT short_url + url_mapping
API->>Cache: SET short:PNFQ → long_url
end
API-->>Client: {short_url: "https://s.io/PNFQ"}
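```java
import java.net.URI;
import java.time.Duration;
import java.time.LocalDateTime;
import javax.validation.Valid; // jakarta.validation.Valid on Spring Boot 3+
import org.apache.commons.codec.digest.DigestUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// ShortenRequest, ShortenResponse, ShortUrl, ShortUrlMapper and AliasAlreadyExistsException
// are application classes not shown here. Each public class below lives in its own file.

@RestController
@RequestMapping("/api/v1")
public class ShortUrlController {

    @Autowired
    private ShortUrlService shortUrlService;

    @PostMapping("/shorten")
    public ResponseEntity<ShortenResponse> shorten(@Valid @RequestBody ShortenRequest request) {
        String shortCode = shortUrlService.createShortUrl(
                request.getLongUrl(),
                request.getExpiresAt(),
                request.getCustomAlias());
        return ResponseEntity.ok(new ShortenResponse("https://s.io/" + shortCode, shortCode));
    }
}

@Service
public class ShortUrlService {

    private static final Duration CACHE_TTL = Duration.ofHours(24);

    @Autowired
    private SegmentIdGenerator idGenerator;
    @Autowired
    private ShortUrlMapper urlMapper;
    @Autowired
    private RedisTemplate<String, String> redis;

    @Transactional
    public String createShortUrl(String longUrl, LocalDateTime expiresAt, String customAlias) {
        validateUrl(longUrl);
        if (expiresAt != null && !expiresAt.isAfter(LocalDateTime.now())) {
            throw new IllegalArgumentException("expiresAt must be in the future");
        }

        // Custom alias: use it directly as the short code.
        if (customAlias != null && !customAlias.isEmpty()) {
            if (urlMapper.existsByShortCode(customAlias)) {
                throw new AliasAlreadyExistsException(customAlias);
            }
            saveShortUrl(customAlias, longUrl, expiresAt);
            return customAlias;
        }

        // Deduplicate: reuse the existing code if this URL was shortened before.
        // Note that a reused code keeps its original expiry.
        String urlHash = DigestUtils.sha256Hex(longUrl);
        String existingCode = urlMapper.findShortCodeByUrlHash(urlHash);
        if (existingCode != null) {
            return existingCode;
        }

        // Generate a new code from a unique ID.
        long id = idGenerator.nextId();
        String shortCode = Base62Encoder.encode(id);

        saveShortUrl(shortCode, longUrl, expiresAt);
        urlMapper.insertUrlMapping(urlHash, shortCode, longUrl);

        // Warm the cache, but never keep an entry longer than the link is valid.
        Duration ttl = cacheTtl(expiresAt);
        if (ttl.getSeconds() > 0) {
            redis.opsForValue().set("short:" + shortCode, longUrl, ttl);
        }
        return shortCode;
    }

    /** Returns the smaller of the default TTL and the link's remaining lifetime. */
    private static Duration cacheTtl(LocalDateTime expiresAt) {
        if (expiresAt == null) {
            return CACHE_TTL;
        }
        Duration remaining = Duration.between(LocalDateTime.now(), expiresAt);
        return remaining.compareTo(CACHE_TTL) < 0 ? remaining : CACHE_TTL;
    }

    // Minimal format check (assumption: only absolute http/https URLs are accepted).
    private void validateUrl(String longUrl) {
        URI uri = URI.create(longUrl); // throws IllegalArgumentException on malformed input
        String scheme = uri.getScheme();
        if (uri.getHost() == null || scheme == null
                || !(scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"))) {
            throw new IllegalArgumentException("Unsupported URL: " + longUrl);
        }
    }

    private void saveShortUrl(String shortCode, String longUrl, LocalDateTime expiresAt) {
        ShortUrl entity = new ShortUrl();
        entity.setShortCode(shortCode);
        entity.setLongUrl(longUrl);
        entity.setExpiresAt(expiresAt);
        urlMapper.insert(entity);
    }
}
```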
Redirect Handling
sequenceDiagram
participant Client
participant API as API Server
participant Cache as Redis
participant DB as MySQL
participant Kafka as Kafka
Client->>API: GET /PNFQ
API->>Cache: GET short:PNFQ
alt Cache hit
Cache-->>API: long_url
else Cache miss
API->>DB: SELECT long_url WHERE short_code='PNFQ'
DB-->>API: long_url
API->>Cache: SET short:PNFQ → long_url
end
API->>API: Check expiration
API-->>Client: 301/302 Redirect → long_url
API->>Kafka: Publish click event asynchronously
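```java
import java.net.URI;
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import javax.servlet.http.HttpServletRequest; // jakarta.servlet.http on Spring Boot 3+
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// ShortUrl, ShortUrlMapper, ClickEvent and the two exception types are application classes
// not shown here.
@RestController
public class RedirectController {

    private static final Duration CACHE_TTL = Duration.ofHours(24);

    @Autowired
    private RedisTemplate<String, String> redis;
    @Autowired
    private ShortUrlMapper urlMapper;
    @Autowired
    private KafkaTemplate<String, ClickEvent> kafkaTemplate;

    // {1,10}: generated IDs start small, so codes can be shorter than four characters.
    @GetMapping("/{shortCode:[a-zA-Z0-9]{1,10}}")
    public ResponseEntity<Void> redirect(
            @PathVariable String shortCode,
            HttpServletRequest request) {

        // 1. Try the cache first.
        String longUrl = redis.opsForValue().get("short:" + shortCode);

        if (longUrl == null) {
            // 2. Cache miss: fall back to the database.
            ShortUrl shortUrl = urlMapper.findByShortCode(shortCode);
            if (shortUrl == null) {
                throw new ShortUrlNotFoundException(shortCode);
            }

            // 3. Reject expired links.
            LocalDateTime now = LocalDateTime.now();
            LocalDateTime expiresAt = shortUrl.getExpiresAt();
            if (expiresAt != null && !expiresAt.isAfter(now)) {
                throw new ShortUrlExpiredException(shortCode);
            }
            longUrl = shortUrl.getLongUrl();

            // 4. Backfill the cache. The TTL is capped at the link's remaining lifetime,
            //    so a later cache hit can never serve an expired link.
            Duration ttl = CACHE_TTL;
            if (expiresAt != null) {
                Duration remaining = Duration.between(now, expiresAt);
                if (remaining.compareTo(ttl) < 0) {
                    ttl = remaining;
                }
            }
            if (ttl.getSeconds() > 0) {
                redis.opsForValue().set("short:" + shortCode, longUrl, ttl);
            }
        }

        // 5. Record the click asynchronously, then redirect with 302.
        publishClickEvent(shortCode, request);
        return ResponseEntity.status(HttpStatus.FOUND)
                .location(URI.create(longUrl))
                .build();
    }

    private void publishClickEvent(String shortCode, HttpServletRequest request) {
        ClickEvent event = new ClickEvent();
        event.setShortCode(shortCode);
        event.setClientIp(getClientIp(request));
        event.setUserAgent(request.getHeader("User-Agent"));
        event.setReferer(request.getHeader("Referer"));
        event.setTimestamp(Instant.now());
        kafkaTemplate.send("click-events", shortCode, event);
    }

    // Prefers the first X-Forwarded-For entry, assuming a trusted proxy or load balancer.
    private String getClientIp(HttpServletRequest request) {
        String forwarded = request.getHeader("X-Forwarded-For");
        if (forwarded != null && !forwarded.isEmpty()) {
            return forwarded.split(",")[0].trim();
        }
        return request.getRemoteAddr();
    }
}
```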
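301 vs 302: a 301 is a permanent redirect that browsers cache, so later visits go straight to the target without contacting the server; a 302 is a temporary redirect, so every visit passes through the server. Use 302 when click counts are needed.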
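The choice between the two status codes can be reduced to a single switch. The sketch below is illustrative only: `RedirectPolicy` is a made-up class, and adding `Cache-Control: no-store` to tracked redirects is a defensive assumption rather than something the design above requires.

```java
/** Chooses the redirect status code depending on whether clicks must be counted. */
final class RedirectPolicy {
    static final int MOVED_PERMANENTLY = 301;
    static final int FOUND = 302;

    /** 302 keeps every visit flowing through the server; 301 lets browsers skip it. */
    static int statusFor(boolean trackClicks) {
        return trackClicks ? FOUND : MOVED_PERMANENTLY;
    }

    /** Renders the HTTP response head the server would send for a redirect. */
    static String responseHead(String longUrl, boolean trackClicks) {
        int status = statusFor(trackClicks);
        String reason = status == FOUND ? "Found" : "Moved Permanently";
        StringBuilder head = new StringBuilder()
                .append("HTTP/1.1 ").append(status).append(' ').append(reason).append("\r\n")
                .append("Location: ").append(longUrl).append("\r\n");
        if (trackClicks) {
            // Assumption: explicitly forbid caching so intermediaries do not swallow clicks.
            head.append("Cache-Control: no-store\r\n");
        }
        return head.append("\r\n").toString();
    }
}
```

A service that does not need analytics could pass `trackClicks = false` and let browsers cache the 301, trading statistics for lower server load.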
Access Analytics
graph LR
API[API Server] -->|Click events| Kafka[Kafka]
Kafka --> Flink[Flink]
Flink -->|Real-time aggregation| Redis[Redis<br/>Real-time counters]
Flink -->|Batch writes| CH[(ClickHouse<br/>Analytical queries)]
Dashboard[Analytics Dashboard] --> Redis
Dashboard --> CH
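```java
import java.util.Properties;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.util.Collector;

// ClickEvent, ClickEventSchema, ClickCountAggregator, ClickHouseSink and redisClient are
// application code not shown here.
public class ClickAnalyticsJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties kafkaProps = new Properties();
        kafkaProps.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        kafkaProps.setProperty("group.id", "click-analytics");         // placeholder group

        // Consume click events from Kafka.
        // FlinkKafkaConsumer is deprecated in newer Flink releases in favor of KafkaSource.
        DataStream<ClickEvent> clicks = env.addSource(
                new FlinkKafkaConsumer<>("click-events", new ClickEventSchema(), kafkaProps));

        // Per-minute click counts per short code, written to ClickHouse.
        clicks
                .keyBy(ClickEvent::getShortCode)
                .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
                .aggregate(new ClickCountAggregator())
                .addSink(new ClickHouseSink());

        // Real-time counters in Redis: total clicks, plus unique visitors by IP (HyperLogLog).
        clicks
                .keyBy(ClickEvent::getShortCode)
                .process(new ProcessFunction<ClickEvent, Void>() {
                    @Override
                    public void processElement(ClickEvent event, Context ctx,
                                               Collector<Void> out) {
                        redisClient.incr("clicks:" + event.getShortCode());
                        redisClient.pfAdd("uv:" + event.getShortCode(), event.getClientIp());
                    }
                });

        env.execute("Click Analytics");
    }
}
```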
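To make the one-minute tumbling window concrete, here is a plain-Java stand-in that buckets clicks into fixed, non-overlapping windows and counts them per short code. It is not Flink code: `TumblingWindowCounter` and its `Click` record are hypothetical, and the sketch buckets by the timestamp carried in each event, whereas the job above uses processing time.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Counts clicks per short code in fixed one-minute windows. */
final class TumblingWindowCounter {
    static final long WINDOW_MS = 60_000L;

    /** Minimal click record (a stand-in for the ClickEvent class). */
    static final class Click {
        final String shortCode;
        final long timestampMs;

        Click(String shortCode, long timestampMs) {
            this.shortCode = shortCode;
            this.timestampMs = timestampMs;
        }
    }

    /** Start of the window that contains the given timestamp. */
    static long windowStart(long timestampMs) {
        return timestampMs - Math.floorMod(timestampMs, WINDOW_MS);
    }

    /** Returns shortCode -> (windowStart -> click count). */
    static Map<String, Map<Long, Long>> count(List<Click> clicks) {
        Map<String, Map<Long, Long>> result = new TreeMap<>();
        for (Click click : clicks) {
            result.computeIfAbsent(click.shortCode, k -> new TreeMap<>())
                  .merge(windowStart(click.timestampMs), 1L, Long::sum);
        }
        return result;
    }
}
```

Clicks at 0 ms and 59,999 ms land in the same window, while a click at 60,000 ms opens the next one; each event belongs to exactly one window, which is what distinguishes tumbling from sliding windows.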
High Availability and Scaling
Caching Strategy
graph TB
subgraph Cache Tier Design
Hot[Hot short links<br/>Caffeine local cache] --> Warm[Active short links<br/>Redis Cluster]
Warm --> Cold[Cold data<br/>MySQL]
end
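```java
import java.util.concurrent.TimeUnit;
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Component;

@Component
public class ShortUrlCache {

    // L1: in-process cache for the hottest links.
    // expireAfterWrite (rather than expireAfterAccess) lets even constantly-hit entries
    // age out, so an expired or deleted link cannot be served locally forever.
    private final Cache<String, String> localCache = Caffeine.newBuilder()
            .maximumSize(100_000)
            .expireAfterWrite(10, TimeUnit.MINUTES)
            .build();

    // L2: shared Redis cache.
    @Autowired
    private RedisTemplate<String, String> redis;

    /** Returns the cached long URL, or null if neither cache level has it. */
    public String getLongUrl(String shortCode) {
        // L1: local cache
        String url = localCache.getIfPresent(shortCode);
        if (url != null) {
            return url;
        }

        // L2: Redis; backfill L1 on a hit
        url = redis.opsForValue().get("short:" + shortCode);
        if (url != null) {
            localCache.put(shortCode, url);
            return url;
        }

        return null;
    }
}
```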
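The cache class above stops at Redis and returns null on a miss. The full read path through all three tiers, including backfilling the faster tiers on the way back, can be sketched with plain maps standing in for Caffeine, Redis and MySQL. Everything here (`TieredLookup`, the map-based tiers, the absence of TTLs and eviction) is a simplification for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Read-through lookup across local cache, shared cache and database. */
final class TieredLookup {
    private final Map<String, String> local = new ConcurrentHashMap<>(); // stand-in for Caffeine
    private final Map<String, String> shared;                            // stand-in for Redis
    private final Function<String, String> database;                     // stand-in for a MySQL query; returns null if absent

    TieredLookup(Map<String, String> shared, Function<String, String> database) {
        this.shared = shared;
        this.database = database;
    }

    /** Returns the long URL for a short code, or null if the code does not exist. */
    String getLongUrl(String shortCode) {
        // Tier 1: in-process cache.
        String url = local.get(shortCode);
        if (url != null) {
            return url;
        }
        // Tier 2: shared cache.
        url = shared.get(shortCode);
        if (url == null) {
            // Tier 3: database, then backfill the shared cache.
            url = database.apply(shortCode);
            if (url == null) {
                return null;
            }
            shared.put(shortCode, url);
        }
        // Backfill the local cache so the next lookup stays in-process.
        local.put(shortCode, url);
        return url;
    }
}
```

After the first lookup of a code, repeated lookups on the same server are answered from the local map and never reach the shared cache or the database.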
Database Sharding
graph TB
Router[Shard Router] --> S0[Shard 0<br/>short_code hash % 4 = 0]
Router --> S1[Shard 1<br/>short_code hash % 4 = 1]
Router --> S2[Shard 2<br/>short_code hash % 4 = 2]
Router --> S3[Shard 3<br/>short_code hash % 4 = 3]
S0 --> S0M[Master]
S0 --> S0S[Slave]
S1 --> S1M[Master]
S1 --> S1S[Slave]
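The routing rule in the diagram, hash of the short code modulo the shard count, can be sketched as follows. `ShardRouter` is a hypothetical class, and CRC32 is an assumed choice of hash: any hash works as long as every API server computes the same value for the same code.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Maps a short code to a shard index using hash(short_code) % shardCount. */
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** Returns a shard index in [0, shardCount). */
    int shardFor(String shortCode) {
        CRC32 crc = new CRC32();
        crc.update(shortCode.getBytes(StandardCharsets.UTF_8));
        // getValue() is an unsigned 32-bit value held in a long, so the remainder is non-negative.
        return (int) (crc.getValue() % shardCount);
    }
}
```

With `new ShardRouter(4)`, every lookup for the same short code lands on the same one of the four shards shown above.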
Complete Architecture Diagram
graph TB
DNS[DNS] --> CDN[CDN / Edge Nodes]
CDN --> LB[L7 Load Balancer]
subgraph API Layer
LB --> API1[API Server]
LB --> API2[API Server]
LB --> APIN[API Server]
end
subgraph Cache Layer
API1 --> LocalCache1[Caffeine L1]
LocalCache1 --> RedisCluster[Redis Cluster L2]
end
subgraph Storage Layer
RedisCluster --> Shard1[MySQL Shard 1<br/>Master + Slave]
RedisCluster --> Shard2[MySQL Shard 2<br/>Master + Slave]
end
subgraph ID Generation
API1 --> Leaf[Leaf ID Generator<br/>Segment Mode]
end
subgraph Analytics Pipeline
API1 -->|Async| Kafka[Kafka]
Kafka --> Flink[Flink]
Flink --> ClickHouse[(ClickHouse)]
Flink --> RedisCounter[Redis Counters]
end
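Security Considerations
Malicious URL protection: when a short link is created, check whether the long URL is on a blacklist (malware or phishing sites)
Rate limiting: apply per-user rate limits to the creation endpoint
Abuse prevention: cap how often and how many short links a single user can create
Privacy: IP addresses in access logs must be anonymized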
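```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;
import java.util.Locale;
import java.util.Set;

public class UrlSafetyChecker {

    private final Set<String> blacklistedDomains;

    public UrlSafetyChecker(Set<String> blacklistedDomains) {
        this.blacklistedDomains = blacklistedDomains;
    }

    public boolean isSafe(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // malformed URL
        }
        if (host == null) {
            return false; // relative or opaque URI without a host
        }
        host = host.toLowerCase(Locale.ROOT);

        // Reject blacklisted domains.
        if (blacklistedDomains.contains(host)) {
            return false;
        }

        // Reject hosts that resolve to loopback or private-network addresses.
        try {
            InetAddress addr = InetAddress.getByName(host);
            return !(addr.isLoopbackAddress() || addr.isSiteLocalAddress()
                    || addr.isLinkLocalAddress() || addr.isAnyLocalAddress());
        } catch (UnknownHostException e) {
            return false; // conservative choice: treat unresolvable hosts as unsafe
        }
    }
}
```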
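Per-user rate limiting on the creation endpoint could look like the fixed-window counter below. This is a single-process sketch under stated assumptions: `FixedWindowRateLimiter` is a made-up class, the clock is injected to keep it testable, and a real deployment with several API servers would need a shared counter (for example in Redis) instead of an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Allows each user at most {@code limit} requests per fixed time window. */
final class FixedWindowRateLimiter {
    private final int limit;
    private final long windowMs;
    private final LongSupplier clock; // current time in milliseconds

    // userId -> {start of the user's current window, requests counted in that window}
    private final Map<Long, long[]> state = new HashMap<>();

    FixedWindowRateLimiter(int limit, long windowMs, LongSupplier clock) {
        this.limit = limit;
        this.windowMs = windowMs;
        this.clock = clock;
    }

    /** Returns true if the request is allowed, false if the user is over the limit. */
    synchronized boolean tryAcquire(long userId) {
        long now = clock.getAsLong();
        long windowStart = now - (now % windowMs);
        long[] entry = state.computeIfAbsent(userId, k -> new long[] {windowStart, 0});
        if (entry[0] != windowStart) {
            // A new window has begun: reset the counter.
            entry[0] = windowStart;
            entry[1] = 0;
        }
        if (entry[1] >= limit) {
            return false;
        }
        entry[1]++;
        return true;
    }
}
```

In the controller, a rejected `tryAcquire` would typically translate into an HTTP 429 response. Fixed windows allow short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of more state.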
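Summary
A URL shortener looks simple, but the techniques involved span much of system design: distributed ID generation guarantees uniqueness, Base62 encoding keeps codes short, multi-level caching keeps latency low, database sharding scales storage, and a message queue enables asynchronous statistics. The key design decisions are: using 302 rather than 301 so that statistics stay accurate, generating distributed IDs in segment mode, and using a Caffeine + Redis multi-level cache to bring lookups for hot keys down to the microsecond level.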