golang TimeoutHandler explained, and its Kubernetes variant

2019-09-19 22:14

HTTP request timeouts in Go look simple, but they are easy to get wrong. A recent problem in a production Kubernetes environment gave me the chance to work through everything related to timeouts in Go.

Basic

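There is already plenty of material on Go's timeout settings. The must-read is "The complete guide to Go net/http timeouts", which covers each timeout field in net/http and its effect in detail, so this post will not reinvent that wheel. The practical upshot is that production code must never naively call http.Get("http://www.baidu.com"): the default HTTP client has a Timeout of 0, meaning no timeout at all, so a stalled server can hang the client indefinitely. For a painful real-world lesson, see "Don't use Go's default HTTP client (in production)". It is best to review every default in the http package carefully before relying on it.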
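
As a minimal sketch of the point above, the snippet below builds a client with an explicit Timeout instead of using the default one. The helper names (fetch, isTimeout, slowServerTimesOut) and the 300ms stall are assumptions made for illustration; the local test server only stands in for a slow remote host.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetch performs a GET with a client whose Timeout bounds the whole
// exchange: connecting, sending the request, and reading the body.
func fetch(url string, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

// isTimeout reports whether err was caused by a timeout.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// slowServerTimesOut (hypothetical helper) starts a local server that
// stalls for 300ms and fetches from it with the given client timeout.
func slowServerTimesOut(timeout time.Duration) bool {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(300 * time.Millisecond)
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()
	return isTimeout(fetch(srv.URL, timeout))
}

func main() {
	fmt.Println("50ms client timeout hit:", slowServerTimesOut(50*time.Millisecond))
	fmt.Println("2s client timeout hit:", slowServerTimesOut(2*time.Second))
}
```

With http.DefaultClient the first call would simply wait for the server; with an explicit Timeout the caller gets an error it can classify and handle.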

Advanced

golang http.TimeoutHandler

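With the basics covered, let's dig into http.TimeoutHandler. As the name suggests, TimeoutHandler is a handler wrapper that limits how long ServeHTTP may run, that is, the time spent executing server logic, excluding reading the request and writing the response. (If you look closely, none of the timeout fields on the various structs can bound exactly this portion.) If the handler runs longer than the configured limit, the wrapper responds with "503 Service Unavailable" and a caller-supplied message.
Let's walk through its implementation, starting with the function definition: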

// TimeoutHandler returns a Handler that runs h with the given time limit.
//
// The new Handler calls h.ServeHTTP to handle each request, but if a
// call runs for longer than its time limit, the handler responds with
// a 503 Service Unavailable error and the given message in its body.
// (If msg is empty, a suitable default message will be sent.)
// After such a timeout, writes by h to its ResponseWriter will return
// ErrHandlerTimeout.
//
// TimeoutHandler buffers all Handler writes to memory and does not
// support the Hijacker or Flusher interfaces.
func TimeoutHandler(h Handler, dt time.Duration, msg string) Handler {
    return &timeoutHandler{
        handler: h,
        body:    msg,
        dt:      dt,
    }
}

This is the typical handler-wrapper signature: it takes a handler and returns a handler. The ServeHTTP method of the returned timeout handler is:

func (h *timeoutHandler) ServeHTTP(w ResponseWriter, r *Request) {
    ctx := h.testContext
    if ctx == nil {
        var cancelCtx context.CancelFunc
        ctx, cancelCtx = context.WithTimeout(r.Context(), h.dt)
        defer cancelCtx()
    }
    r = r.WithContext(ctx)
    done := make(chan struct{})
    tw := &timeoutWriter{
        w: w,
        h: make(Header),
    }
    panicChan := make(chan interface{}, 1)
    go func() {
        defer func() {
            if p := recover(); p != nil {
                panicChan <- p
            }
        }()
        h.handler.ServeHTTP(tw, r)
        close(done)
    }()
    select {
    case p := <-panicChan:
        panic(p)
    case <-done:
        tw.mu.Lock()
        defer tw.mu.Unlock()
        dst := w.Header()
        for k, vv := range tw.h {
            dst[k] = vv
        }
        if !tw.wroteHeader {
            tw.code = StatusOK
        }
        w.WriteHeader(tw.code)
        w.Write(tw.wbuf.Bytes())
    case <-ctx.Done():
        tw.mu.Lock()
        defer tw.mu.Unlock()
        w.WriteHeader(StatusServiceUnavailable)
        io.WriteString(w, h.errorBody())
        tw.timedOut = true
    }
}

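The overall flow is:

First, set up a context that carries the timeout.
Create a timeoutWriter. It implements the http.ResponseWriter interface and holds a bytes.Buffer; every Write goes into that buffer.
Call the inner ServeHTTP in a separate goroutine, passing the timeoutWriter as its ResponseWriter. At this point nothing has been sent to the client; the data is only cached in the timeoutWriter's buffer.
Finally, select on the channels:
1. If the child goroutine panics, the panic is recovered and re-raised in the parent goroutine to propagate it.
2. If the request completes normally, write the headers and then copy the buffer to the real http.ResponseWriter.
3. If the request times out, return 503 to the client.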
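
The two main outcomes of that select can be observed from the outside with a small experiment. This is only a sketch: the serve helper, the "done" and "took too long" strings, and the durations are made up for illustration, and httptest.NewRecorder stands in for a real connection.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// serve runs a handler that takes workTime behind http.TimeoutHandler
// with the given limit, and returns the status and body the client sees.
func serve(workTime, limit time.Duration) (int, string) {
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(workTime)
		io.WriteString(w, "done") // lands in the wrapper's buffer first
	})
	h := http.TimeoutHandler(inner, limit, "took too long")

	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	// Handler finishes in time: the buffered response is flushed.
	fmt.Println(serve(0, time.Second))
	// Handler overruns: the wrapper answers with 503 and the message.
	fmt.Println(serve(200*time.Millisecond, 20*time.Millisecond))
}
```

In the overrun case the client never sees "done", because that write only ever reached the wrapper's internal buffer.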

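Why write to a buffer first and only then to the real writer? Because a request cannot be cancelled in the strict sense. Suppose we have already written part of the response to the real writer (for example, the headers) and then some slow logic pushes us past the timeout threshold. It is too late to truly cancel, because the peer may already have read part of the data. A typical case is chunked transfer encoding in HTTP/1.1: we write the headers, then each chunk in turn; if the timeout fires before the later chunks are written, we are stuck in a dilemma.
This is where Go's built-in TimeoutHandler helps. It offers two advantages: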

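First, it provides a buffer: data is sent to the peer in one go only after the handler has finished writing and only if the timeout has not fired. In addition, timeoutWriter checks on every Write whether the timeout has already fired and, if so, returns an error immediately.
Second, it returns a friendly 503 message to the client.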
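
The first point can be checked from the handler's side. In the sketch below (lateWriteError and lateWriteRejected are hypothetical helper names, and the durations are arbitrary), the inner handler deliberately writes after the limit has passed and reports the error it gets back.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// lateWriteError runs a handler that outlives its limit and returns
// the error from the Write it attempts after the deadline has passed.
func lateWriteError() error {
	errCh := make(chan error, 1)
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(150 * time.Millisecond) // well past the 20ms limit
		_, err := w.Write([]byte("too late"))
		errCh <- err
	})
	h := http.TimeoutHandler(inner, 20*time.Millisecond, "timeout")
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return <-errCh
}

// lateWriteRejected reports whether that error is http.ErrHandlerTimeout.
func lateWriteRejected() bool {
	return errors.Is(lateWriteError(), http.ErrHandlerTimeout)
}

func main() {
	fmt.Println("late write rejected:", lateWriteRejected())
}
```

A handler that checks the error from Write can use it as a signal to stop doing work nobody will receive.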

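The price of these two properties is a buffer that holds the entire response. In some situations that buffer is a problem in itself. For a high-throughput server, one buffer per request may be unacceptable. Take Kubernetes: a single list-pods call can return several megabytes, and buffering every such response would use too much memory. So how does Kubernetes implement its timeout?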

kubernetes timeout Handler

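To keep a hung request from occupying a connection forever, Kubernetes applies a timeout to every request. This logic lives in the WithTimeoutForNonLongRunningRequests handler of the handler chain. The WithTimeout it returns is implemented as follows: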

// WithTimeout returns an http.Handler that runs h with a timeout
// determined by timeoutFunc. The new http.Handler calls h.ServeHTTP to handle
// each request, but if a call runs for longer than its time limit, the
// handler responds with a 504 Gateway Timeout error and the message
// provided. (If msg is empty, a suitable default message will be sent.) After
// the handler times out, writes by h to its http.ResponseWriter will return
// http.ErrHandlerTimeout. If timeoutFunc returns a nil timeout channel, no
// timeout will be enforced. recordFn is a function that will be invoked whenever
// a timeout happens.
func WithTimeout(h http.Handler, timeoutFunc func(*http.Request) (timeout <-chan time.Time, recordFn func(), err *apierrors.StatusError)) http.Handler {
    return &timeoutHandler{h, timeoutFunc}
}

The core is timeoutHandler, implemented as follows:

type timeoutHandler struct {
    handler http.Handler
    timeout func(*http.Request) (<-chan time.Time, func(), *apierrors.StatusError)
}

func (t *timeoutHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    after, recordFn, err := t.timeout(r)
    if after == nil {
        t.handler.ServeHTTP(w, r)
        return
    }

    result := make(chan interface{})
    tw := newTimeoutWriter(w)
    go func() {
        defer func() {
            result <- recover()
        }()
        t.handler.ServeHTTP(tw, r)
    }()
    select {
    case err := <-result:
        if err != nil {
            panic(err)
        }
        return
    case <-after:
        recordFn()
        tw.timeout(err)
    }
}

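As shown above, ServeHTTP does the following:

Call timeoutHandler.timeout to set up a timer. When the timeout expires, it is signalled on the after channel, which is selected on later.
Create a timeoutWriter. It has a timeout method that is called once the timeout fires.
Call the inner ServeHTTP in a separate goroutine, passing in the timeoutWriter. If that goroutine panics, the panic is recovered and passed over a channel to the calling goroutine: one panicking goroutine must not take down the whole process, and the calling goroutine needs the panic information.
Select on the timer channel.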

When the timer channel fires, the timeoutWriter's timeout method is called:

func (tw *baseTimeoutWriter) timeout(err *apierrors.StatusError) {
    tw.mu.Lock()
    defer tw.mu.Unlock()

    tw.timedOut = true

    // The timeout writer has not been used by the inner handler.
    // We can safely timeout the HTTP request by sending by a timeout
    // handler
    if !tw.wroteHeader && !tw.hijacked {
        tw.w.WriteHeader(http.StatusGatewayTimeout)
        enc := json.NewEncoder(tw.w)
        enc.Encode(&err.ErrStatus)
    } else {
        // The timeout writer has been used by the inner handler. There is
        // no way to timeout the HTTP request at the point. We have to shutdown
        // the connection for HTTP1 or reset stream for HTTP2.
        //
        // Note from: Brad Fitzpatrick
        // if the ServeHTTP goroutine panics, that will do the best possible thing for both
        // HTTP/1 and HTTP/2. In HTTP/1, assuming you're replying with at least HTTP/1.1 and
        // you've already flushed the headers so it's using HTTP chunking, it'll kill the TCP
        // connection immediately without a proper 0-byte EOF chunk, so the peer will recognize
        // the response as bogus. In HTTP/2 the server will just RST_STREAM the stream, leaving
        // the TCP connection open, but resetting the stream to the peer so it'll have an error,
        // like the HTTP/1 case.
        panic(errConnKilled)
    }
}

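As you can see, if nothing has been written yet, the method simply returns a 504 status code; otherwise it panics. The long comment explains why. It comes from the Kubernetes issue "API server panics when writing response #29001" and quotes Brad Fitzpatrick, the author of Go's http package. The gist: once part of the response has been written, there is no way to time the request out, and letting the ServeHTTP goroutine panic is the best available option for both HTTP/1.1 and HTTP/2. With HTTP/1.1 the server closes the TCP connection immediately without sending the terminating 0-byte chunk, so the peer can tell the response is bogus. With HTTP/2 the server sends RST_STREAM for that stream only, leaving the connection itself unaffected, and the peer sees an error it can handle. This code is quite interesting: it is hard to imagine Kubernetes handling a request timeout by panicking a goroutine.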
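
To make the "504 if untouched, otherwise abort" decision concrete, here is a stripped-down sketch of a non-buffering timeout wrapper. It is not the Kubernetes implementation: passThroughWriter, withTimeout, and simulate are hypothetical names, hijacking and the JSON status body are left out, the result channel is buffered so the worker goroutine can always finish, and http.ErrAbortHandler is used as a stand-in for errConnKilled.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// passThroughWriter forwards writes straight to the real writer (no
// buffering) and remembers whether anything has been sent yet.
type passThroughWriter struct {
	w           http.ResponseWriter
	mu          sync.Mutex
	wroteHeader bool
	timedOut    bool
}

func (tw *passThroughWriter) Header() http.Header { return tw.w.Header() }

func (tw *passThroughWriter) WriteHeader(code int) {
	tw.mu.Lock()
	defer tw.mu.Unlock()
	if tw.timedOut || tw.wroteHeader {
		return
	}
	tw.wroteHeader = true
	tw.w.WriteHeader(code)
}

func (tw *passThroughWriter) Write(p []byte) (int, error) {
	tw.mu.Lock()
	defer tw.mu.Unlock()
	if tw.timedOut {
		return 0, http.ErrHandlerTimeout
	}
	tw.wroteHeader = true
	return tw.w.Write(p)
}

// onTimeout answers 504 if the response is untouched; otherwise it
// aborts the request by panicking, as the Kubernetes code does.
func (tw *passThroughWriter) onTimeout() {
	tw.mu.Lock()
	defer tw.mu.Unlock()
	tw.timedOut = true
	if !tw.wroteHeader {
		tw.w.WriteHeader(http.StatusGatewayTimeout)
		return
	}
	panic(http.ErrAbortHandler) // stand-in for errConnKilled
}

func withTimeout(h http.Handler, limit time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		tw := &passThroughWriter{w: w}
		result := make(chan interface{}, 1) // buffered: worker never blocks
		go func() {
			defer func() { result <- recover() }()
			h.ServeHTTP(tw, r)
		}()
		select {
		case p := <-result:
			if p != nil {
				panic(p)
			}
		case <-time.After(limit):
			tw.onTimeout()
		}
	})
}

// simulate stalls the inner handler past the limit, optionally after a
// partial write, and reports the recorded status and whether it aborted.
func simulate(writeFirst bool) (status int, aborted bool) {
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if writeFirst {
			w.Write([]byte("partial"))
		}
		time.Sleep(200 * time.Millisecond)
	})
	rec := httptest.NewRecorder()
	defer func() {
		aborted = recover() != nil
		status = rec.Code
	}()
	withTimeout(inner, 20*time.Millisecond).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return
}

func main() {
	fmt.Println(simulate(false)) // nothing written before the timeout
	fmt.Println(simulate(true))  // partial response already sent
}
```

When nothing was written, the recorder should end up with a 504; when a partial write already went out, the wrapper panics instead, because the status line can no longer be changed.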

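If a goroutine panics and nothing above it recovers, the whole program exits, which is clearly unacceptable for the Kubernetes apiserver. So the handler chain of every request includes genericfilters.WithPanicRecovery, which catches such panics and keeps the process from crashing.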
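
The general shape of such a recovery filter can be sketched as follows. This is an illustration only, not the body of genericfilters.WithPanicRecovery: the withPanicRecovery and statusAfterPanic names, the log line, and the plain 500 response are assumptions.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// withPanicRecovery turns a panic in h into a 500 response instead of
// letting it travel further up the stack.
func withPanicRecovery(h http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if p := recover(); p != nil {
				log.Printf("recovered from panic: %v", p)
				http.Error(w, "internal error", http.StatusInternalServerError)
			}
		}()
		h.ServeHTTP(w, r)
	})
}

// statusAfterPanic serves one request through a handler that always panics.
func statusAfterPanic() int {
	boom := http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		panic("boom")
	})
	rec := httptest.NewRecorder()
	withPanicRecovery(boom).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	fmt.Println("status:", statusAfterPanic())
}
```

Note that recover only sees panics raised in the same goroutine, which is why the timeout handlers above forward a worker goroutine's panic over a channel and re-raise it in the request goroutine.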

Other

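With TimeoutHandler covered, back to timeouts in Go more generally. Even when a timeout fires and we return normally, that does not mean the whole goroutine has returned. Only the upper layer has returned; the asynchronously invoked lower-level logic cannot be recalled. We cannot cancel another goroutine from outside; it has to exit on its own. The usual approach is to pass it a context or a channel that gets closed, and have it stop when it sees the signal, but many calls today do not accept a context or a close channel as a parameter.
Take the code below: the 4-second sleep in the main logic cannot be interrupted, so even after the request has returned, the server-side goroutine is still not released. Go timeouts therefore make it very easy to leak goroutines, and need to be used with care.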

package main

import (
    "fmt"
    "net/http"
    "runtime"
    "time"
)

func main() {
    // Print the goroutine count once per second to make leaks visible.
    go func() {
        for {
            time.Sleep(time.Second)
            fmt.Printf("goroutine num: %d\n", runtime.NumGoroutine())
        }
    }()

    handleFunc := func(w http.ResponseWriter, r *http.Request) {
        fmt.Printf("request %v\n", r.URL)
        // This sleep ignores the request context, so it outlives the 2s limit.
        time.Sleep(4 * time.Second)
        _, err := fmt.Fprintln(w, "ok")
        if err != nil {
            fmt.Printf("write err: %v\n", err)
        }
    }
    err := http.ListenAndServe("localhost:9999", http.TimeoutHandler(http.HandlerFunc(handleFunc), 2*time.Second, "err: timeout"))
    if err != nil {
        fmt.Printf("%v", err)
    }
}
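
For contrast, here is a sketch of a handler that can be interrupted because its wait watches the request context. The sleepCtx and stoppedEarly helpers and the shortened 50ms limit are assumptions for illustration; the idea only applies where the slow operation can observe a context at all.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// sleepCtx waits for d but gives up as soon as ctx is done.
func sleepCtx(ctx context.Context, d time.Duration) error {
	t := time.NewTimer(d)
	defer t.Stop()
	select {
	case <-t.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// stoppedEarly runs a 4s handler behind a 50ms TimeoutHandler and reports
// whether the handler goroutine exited because its context expired.
func stoppedEarly() bool {
	finished := make(chan error, 1)
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// TimeoutHandler attaches its deadline to r.Context().
		finished <- sleepCtx(r.Context(), 4*time.Second)
	})
	h := http.TimeoutHandler(inner, 50*time.Millisecond, "err: timeout")
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))

	select {
	case err := <-finished:
		return errors.Is(err, context.DeadlineExceeded)
	case <-time.After(time.Second):
		return false // handler is still running: it would have leaked
	}
}

func main() {
	fmt.Println("handler stopped early:", stoppedEarly())
}
```

Because TimeoutHandler derives the handler's request context with context.WithTimeout, any context-aware call inside the handler gets the cancellation for free; a plain time.Sleep does not.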

Closing thoughts

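Timeouts in Go are simple in principle but fiddly in practice; only by understanding how they work can you head off problems before they happen.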
