前阵子有同学在阿里云上的集群通过公网地址访问 OSS, 消耗了大量带宽和流量, 促使我将之前拖了很久的一件事落地: 劫持并监控集群内访问外部网络的流量. 这件事在 mesh 的基础上会非常的简单, 基本也不需要入侵业务. 依赖的核心是:

  • 通过 ServiceEntry 将外部网址包装成 Service
  • 通过 DestinationRule 允许业务方通过 HTTP 访问外部网址, 并由 sidecar 负责 TLS, 进而可以解析请求具体内容
  • 在 Istio 劫持 DNS 的基础上, 通过 ISTIO_META_DNS_AUTO_ALLOCATE 为 ServiceEntry 分配虚拟的 IP

实际效果

以 oss-cn-shanghai.aliyuncs.com 为例.

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: demo
  namespace: external
spec:
  hosts:
  - oss-cn-shanghai.aliyuncs.com
  location: MESH_EXTERNAL
  ports:
  - name: http
    number: 80
    protocol: HTTP
    targetPort: 443
  - name: https
    number: 443
    protocol: TLS
  resolution: DNS
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: com.aliyuncs.oss-cn-shanghai
  namespace: external
spec:
  host: oss-cn-shanghai.aliyuncs.com
  trafficPolicy:
    portLevelSettings:
    - port:
        number: 80
      tls:
        mode: SIMPLE

此时, 访问 https://baidu.com 可以在 L4 按 TCP 被 Envoy 解析, 访问 http://baidu.com 会被转发到 https, 且可以按 HTTP 被解析.

# k get pods -l app=ping -o wide
NAME                   READY   STATUS    RESTARTS   AGE    IP              NODE                     NOMINATED NODE   READINESS GATES
ping-ff5ff68d6-xfwhc   2/2     Running   0          3m9s   172.20.175.81   cn-shanghai.10.1.79.74   <none>           <none>
# curl http://172.20.175.81:15020/stats/prometheus 2>/dev/null | grep baidu.com | grep -v sum | grep -v bucket
# k exec -ti ping-ff5ff68d6-xfwhc -- curl https://baidu.com -i
HTTP/1.1 302 Moved Temporarily
Server: bfe/1.0.8.18
Date: Wed, 10 Jan 2024 14:50:51 GMT
Content-Type: text/html
Content-Length: 161
Connection: keep-alive
Location: http://www.baidu.com/

<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>bfe/1.0.8.18</center>
</body>
</html>
# curl http://172.20.175.81:15020/stats/prometheus 2>/dev/null | grep baidu.com | grep -v sum | grep -v bucket
istio_tcp_connections_closed_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 1
istio_tcp_connections_opened_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 1
istio_tcp_received_bytes_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 775
istio_tcp_sent_bytes_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 4077
# k exec -ti ping-ff5ff68d6-xfwhc -- curl http://baidu.com -i
HTTP/1.1 302 Found
server: envoy
date: Wed, 10 Jan 2024 14:51:08 GMT
content-type: text/html
content-length: 161
location: http://www.baidu.com/
x-envoy-upstream-service-time: 128

<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>bfe/1.0.8.18</center>
</body>
</html>
# curl http://172.20.175.81:15020/stats/prometheus 2>/dev/null | grep baidu.com | grep -v sum | grep -v bucket
istio_tcp_connections_closed_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 1
istio_tcp_connections_opened_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 1
istio_tcp_received_bytes_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 775
istio_tcp_sent_bytes_total{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="tcp",response_flags="-"} 4077
istio_request_duration_milliseconds_count{reporter="source",source_app="ping",destination_app="unknown",destination_service="baidu.com",destination_service_name="baidu.com",destination_service_namespace="unknown",destination_cluster="unknown",request_protocol="http",response_code="302",grpc_response_status="",response_flags="-",method="unknown"} 1

实现思路

我们偷懒直接看 Istio 转化后的 Envoy, 而不是去翻代码.

  1. 因为我们开启了 ISTIO_META_DNS_AUTO_ALLOCATE, 所以 Istio 分配了一个 IP 给 baidu.com, mesh 的应用看到的都是这个结果.
    # k exec -ti ping-ff5ff68d6-xfwhc -- ping baidu.com
    PING baidu.com (240.240.182.196): 56 data bytes
    
  2. 对于 HTTPS, Envoy 监听了对应 IP:PORT, 只解析到 L4, 因为 TLS 是业务自己做的, Envoy 并无法解析到 L7.
    # istioctl proxy-config listener ping-ff5ff68d6-xfwhc --port 443 --address 240.240.182.196
    ADDRESSES       PORT MATCH DESTINATION
    240.240.182.196 443  ALL   Cluster: outbound|443||baidu.com
    # istioctl proxy-config listener ping-ff5ff68d6-xfwhc --port 443 --address 240.240.182.196 -o json | jq '.[].filterChains[].filters[].name'
    "istio.stats"
    "envoy.filters.network.tcp_proxy"
    
  3. 对于 HTTP, Envoy 会将其转发到 443. 因为 TLS 是 Envoy 处理的, 所以其可以解析按 HTTP 解析请求, 并获取到对应数据. cluster 中的 transport_socket 指定了 TLS, 转发的 port 是 443.
    # istioctl proxy-config route ping-ff5ff68d6-xfwhc --name 80 -o json | jq '.[].virtualHosts[] | select(.name | test("baidu.com:80"))'
    {
      "name": "baidu.com:80",
      "domains": [
     "baidu.com",
     "240.240.182.196"
      ],
      "routes": [
     {
       "name": "default",
       "match": {
         "prefix": "/"
       },
       "route": {
         "cluster": "outbound|80||baidu.com",
         "timeout": "0s",
         "retryPolicy": {
           "retryOn": "connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes",
           "numRetries": 2,
           "retryHostPredicate": [
             {
               "name": "envoy.retry_host_predicates.previous_hosts",
               "typedConfig": {
                 "@type": "type.googleapis.com/envoy.extensions.retry.host.previous_hosts.v3.PreviousHostsPredicate"
               }
             }
           ],
           "hostSelectionRetryMaxAttempts": "5",
           "retriableStatusCodes": [
             503
           ]
         },
         "maxGrpcTimeout": "0s"
       },
       "decorator": {
         "operation": "baidu.com:80/*"
       }
     }
      ],
      "includeRequestAttemptCount": true
    }
    # istioctl proxy-config cluster ping-ff5ff68d6-xfwhc --fqdn "outbound|80||baidu.com" -o json | jq
    [
      {
     "name": "outbound|80||baidu.com",
     "type": "STRICT_DNS",
     "connectTimeout": "10s",
     "lbPolicy": "LEAST_REQUEST",
     "loadAssignment": {
       "clusterName": "outbound|80||baidu.com",
       "endpoints": [
         {
           "locality": {},
           "lbEndpoints": [
             {
               "endpoint": {
                 "address": {
                   "socketAddress": {
                     "address": "baidu.com",
                     "portValue": 443
                   }
                 }
               },
               "metadata": {
                 "filterMetadata": {
                   "istio": {
                     "workload": ";;;;"
                   }
                 }
               },
               "loadBalancingWeight": 1
             }
           ],
           "loadBalancingWeight": 1
         }
       ]
     },
     "circuitBreakers": {
       "thresholds": [
         {
           "maxConnections": 4294967295,
           "maxPendingRequests": 4294967295,
           "maxRequests": 4294967295,
           "maxRetries": 4294967295,
           "trackRemaining": true
         }
       ]
     },
     "dnsRefreshRate": "60s",
     "respectDnsTtl": true,
     "dnsLookupFamily": "V4_ONLY",
     "commonLbConfig": {
       "localityWeightedLbConfig": {}
     },
     "transportSocket": {
       "name": "envoy.transport_sockets.tls",
       "typedConfig": {
         "@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext",
         "commonTlsContext": {
           "tlsParams": {
             "tlsMinimumProtocolVersion": "TLSv1_2",
             "tlsMaximumProtocolVersion": "TLSv1_3"
           },
           "validationContext": {}
         }
       }
     },
     "metadata": {
       "filterMetadata": {
         "istio": {
           "alpn_override": "false",
           "config": "/apis/networking.istio.io/v1alpha3/namespaces/external/destination-rule/com.baidu",
           "external": true,
           "services": [
             {
               "host": "baidu.com",
               "name": "baidu.com",
               "namespace": "external"
             }
           ]
         }
       }
     },
     "filters": [
       {
         "name": "istio.metadata_exchange",
         "typedConfig": {
           "@type": "type.googleapis.com/udpa.type.v1.TypedStruct",
           "typeUrl": "type.googleapis.com/envoy.tcp.metadataexchange.config.MetadataExchange",
           "value": {
             "enable_discovery": true,
             "protocol": "istio-peer-exchange"
           }
         }
       }
     ]
      }
    ]
    

匹配一批域名

Istio 的 ServiceEntry 支持 prefix wildcard, 案例如下:

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: wildcard
  namespace: external
spec:
  hosts:
  - "*.oss-cn-shanghai.aliyuncs.com"
  - "*.oss-cn-shanghai-internal.aliyuncs.com"
  location: MESH_EXTERNAL
  ports:
  - name: https
    number: 443
    protocol: TLS
  resolution: NONE

但当我使用 prefix wildcard 时:

  • resolution 应该从 DNS 改成 NONE
  • 无法通过 DestinationRule 让 Envoy 将 80 转发到 443, 所以只能在 L4 做解析.

此时 Envoy 是通过 SNI 来匹配流量的:

# istioctl proxy-config listener ping-ff5ff68d6-xfwhc --port 443 --address 0.0.0.0 -o json | jq '.[].filterChains[].filterChainMatch'
{
  "transportProtocol": "raw_buffer",
  "applicationProtocols": [
    "http/1.1",
    "h2c"
  ]
}
{
  "serverNames": [
    "*.oss-cn-shanghai-internal.aliyuncs.com"
  ]
}
{
  "serverNames": [
    "*.oss-cn-shanghai.aliyuncs.com"
  ]
}

如果你有需求的话, 可以控制日志来统计 SNI 并做后续处理.