Istio: 监控应用的对外请求
前阵子有同学在阿里云上的集群通过公网地址访问 OSS, 消耗了大量带宽和流量, 促使我将之前拖了很久的一件事落地: 劫持并监控集群内访问外部网络的流量. 这件事在 mesh 的基础上会非常的简单, 基本也不需要入侵业务. 依赖的核心是:
- 通过 ServiceEntry 将外部网址包装成 Service
- 通过 DestinationRule 允许业务方通过 HTTP 访问外部网址, 并由 sidecar 负责 TLS, 进而可以解析请求具体内容
- 在 Istio 劫持 DNS 的基础上, 通过 ISTIO_META_DNS_AUTO_ALLOCATE 为 ServiceEntry 分配虚拟的 IP
实际效果
以 oss-cn-shanghai.aliyuncs.com 为例.
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
name: demo
namespace: external
spec:
hosts:
- oss-cn-shanghai.aliyuncs.com
location: MESH_EXTERNAL
ports:
- name: http
number: 80
protocol: HTTP
targetPort: 443
- name: https
number: 443
protocol: TLS
resolution: DNS
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
name: com.aliyuncs.oss-cn-shanghai
namespace: external
spec:
host: oss-cn-shanghai.aliyuncs.com
trafficPolicy:
portLevelSettings:
- port:
number: 80
tls:
mode: SIMPLE
此时, 访问 https://baidu.com 可以在 L4 按 TCP 被 Envoy 解析, 访问 http://baidu.com 会被转发到 https, 且可以按 HTTP 被解析.
# k get pods -l app=ping -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ping-ff5ff68d6-xfwhc 2/2 Running 0 3m9s 172.20.175.81 cn-shanghai.10.1.79.74 <none> <none>
# curl http://172.20.175.81:15020/stats/prometheus 2>/dev/null | grep baidu.com | grep -v sum | grep -v bucket
# k exec -ti ping-ff5ff68d6-xfwhc -- curl https://baidu.com -i
HTTP/1.1 302 Moved Temporarily
Server: bfe/1.0.8.18
Date: Wed, 10 Jan 2024 14:50:51 GMT
Content-Type: text/html
Content-Length: 161
Connection: keep-alive
Location: http://www.baidu.com/
<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>bfe/1.0.8.18</center>
</body>
</html>
# curl http://172.20.175.81:15020/stats/prometheus 2>/dev/null | grep baidu.com | grep -v sum | grep -v bucket
istio_tcp_connections_closed_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 1
istio_tcp_connections_opened_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 1
istio_tcp_received_bytes_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 775
istio_tcp_sent_bytes_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 4077
# k exec -ti ping-ff5ff68d6-xfwhc -- curl http://baidu.com -i
HTTP/1.1 302 Found
server: envoy
date: Wed, 10 Jan 2024 14:51:08 GMT
content-type: text/html
content-length: 161
location: http://www.baidu.com/
x-envoy-upstream-service-time: 128
<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>bfe/1.0.8.18</center>
</body>
</html>
# curl http://172.20.175.81:15020/stats/prometheus 2>/dev/null | grep baidu.com | grep -v sum | grep -v bucket
istio_tcp_connections_closed_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 1
istio_tcp_connections_opened_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 1
istio_tcp_received_bytes_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 775
istio_tcp_sent_bytes_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 4077
istio_request_duration_milliseconds_count{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="http",response_code="302",grpc_response_status="",response_flags="-",method="unknown"} 1
实现思路
我们偷懒直接看 Istio 转化后的 Envoy, 而不是去翻代码.
- 因为我们开启了 ISTIO_META_DNS_AUTO_ALLOCATE, 所以 Istio 分配了一个 IP 给 baidu.com, mesh 的应用看到的都是这个结果.
# k exec -ti ping-ff5ff68d6-xfwhc -- ping baidu.com PING baidu.com (240.240.182.196): 56 data bytes
- 对于 HTTPS, Envoy 监听了对应 IP:PORT, 只解析到 L4, 因为 TLS 是业务自己做的, Envoy 并无法解析到 L7.
# istioctl proxy-config listener ping-ff5ff68d6-xfwhc --port 443 --address 240.240.182.196 ADDRESSES PORT MATCH DESTINATION 240.240.182.196 443 ALL Cluster: outbound|443||baidu.com # istioctl proxy-config listener ping-ff5ff68d6-xfwhc --port 443 --address 240.240.182.196 -o json | jq '.[].filterChains[].filters[].name' "istio.stats" "envoy.filters.network.tcp_proxy"
- 对于 HTTP, Envoy 会将其转发到 443. 因为 TLS 是 Envoy 处理的, 所以其可以解析按 HTTP 解析请求, 并获取到对应数据.
cluster 中的 transport_socket 指定了 TLS, 转发的 port 是 443.
# istioctl proxy-config route ping-ff5ff68d6-xfwhc --name 80 -o json | jq '.[].virtualHosts[] | select(.name | test("baidu.com:80"))' { "name": "baidu.com:80", "domains": [ "baidu.com", "240.240.182.196" ], "routes": [ { "name": "default", "match": { "prefix": "/" }, "route": { "cluster": "outbound|80||baidu.com", "timeout": "0s", "retryPolicy": { "retryOn": "connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes", "numRetries": 2, "retryHostPredicate": [ { "name": "envoy.retry_host_predicates.previous_hosts", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.retry.host.previous_hosts.v3.PreviousHostsPredicate" } } ], "hostSelectionRetryMaxAttempts": "5", "retriableStatusCodes": [ 503 ] }, "maxGrpcTimeout": "0s" }, "decorator": { "operation": "baidu.com:80/*" } } ], "includeRequestAttemptCount": true } # istioctl proxy-config cluster ping-ff5ff68d6-xfwhc --fqdn "outbound|80||baidu.com" -o json | jq [ { "name": "outbound|80||baidu.com", "type": "STRICT_DNS", "connectTimeout": "10s", "lbPolicy": "LEAST_REQUEST", "loadAssignment": { "clusterName": "outbound|80||baidu.com", "endpoints": [ { "locality": {}, "lbEndpoints": [ { "endpoint": { "address": { "socketAddress": { "address": "baidu.com", "portValue": 443 } } }, "metadata": { "filterMetadata": { "istio": { "workload": ";;;;" } } }, "loadBalancingWeight": 1 } ], "loadBalancingWeight": 1 } ] }, "circuitBreakers": { "thresholds": [ { "maxConnections": 4294967295, "maxPendingRequests": 4294967295, "maxRequests": 4294967295, "maxRetries": 4294967295, "trackRemaining": true } ] }, "dnsRefreshRate": "60s", "respectDnsTtl": true, "dnsLookupFamily": "V4_ONLY", "commonLbConfig": { "localityWeightedLbConfig": {} }, "transportSocket": { "name": "envoy.transport_sockets.tls", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext", "commonTlsContext": { "tlsParams": { "tlsMinimumProtocolVersion": "TLSv1_2", "tlsMaximumProtocolVersion": "TLSv1_3" }, "validationContext": {} } } }, "metadata": { "filterMetadata": { "istio": { "alpn_override": "false", "config": "/apis/networking.istio.io/v1alpha3/namespaces/external/destination-rule/com.baidu", "external": true, "services": [ { "host": "baidu.com", "name": "baidu.com", "namespace": "external" } ] } } }, "filters": [ { "name": "istio.metadata_exchange", "typedConfig": { "@type": "type.googleapis.com/udpa.type.v1.TypedStruct", "typeUrl": "type.googleapis.com/envoy.tcp.metadataexchange.config.MetadataExchange", "value": { "enable_discovery": true, "protocol": "istio-peer-exchange" } } } ] } ]
匹配一批域名
Istio 的 ServiceEntry 支持 prefix wildcard, 案例如下:
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
name: wildcard
namespace: external
spec:
hosts:
- "*.oss-cn-shanghai.aliyuncs.com"
- "*.oss-cn-shanghai-internal.aliyuncs.com"
location: MESH_EXTERNAL
ports:
- name: https
number: 443
protocol: TLS
resolution: NONE
但当我使用 prefix wildcard 时:
- resolution 应该从 DNS 改成 NONE
- 无法通过 DestinationRule 让 Envoy 将 80 转发到 443, 所以只能在 L4 做解析.
此时 Envoy 是通过 SNI 来匹配流量的:
# istioctl proxy-config listener ping-ff5ff68d6-xfwhc --port 443 --address 0.0.0.0 -o json | jq '.[].filterChains[].filterChainMatch'
{
"transportProtocol": "raw_buffer",
"applicationProtocols": [
"http/1.1",
"h2c"
]
}
{
"serverNames": [
"*.oss-cn-shanghai-internal.aliyuncs.com"
]
}
{
"serverNames": [
"*.oss-cn-shanghai.aliyuncs.com"
]
}
如果你有需求的话, 可以控制日志来统计 SNI 并做后续处理.