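Python has many web frameworks, each with its own strengths. As the saying goes, the glory belongs to Greece and the grandeur to Rome. Python's elegance, combined with the design of WSGI, gives web frameworks one unified interface. WSGI connects the application (Application) and the server (Server). Both Django and Flask can be deployed behind gunicorn.
Unlike Django and Flask, Tornado can act as either a WSGI application or a WSGI server. That said, the main reason to choose Tornado is its network model: a single process and a single thread with asynchronous I/O. High performance is always attractive, yet many people who have tried it ask the same question: Tornado claims to be high-performance, so why is that hard to see in practice?
Using Tornado asynchronously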
    import os

    import tornado.web


    class SyncHandler(tornado.web.RequestHandler):
        def get(self, *args, **kwargs):
            # Time-consuming code: blocks until ping has finished
            os.system("ping -c 2 www.google.com")
            self.finish('It works')
ab -c 5 -n 5
    Server Software:        TornadoServer/4.3
    Server Hostname:
    Server Port:            5000

    Document Path:          /sync
    Document Length:        5 bytes

    Concurrency Level:      5
    Time taken for tests:   5.076 seconds
    Complete requests:      5
    Failed requests:        0
    Total transferred:      985 bytes
    HTML transferred:       25 bytes
    Requests per second:    0.99 [#/sec] (mean)
    Time per request:       5076.015 [ms] (mean)
    Time per request:       1015.203 [ms] (mean, across all concurrent requests)
    Transfer rate:          0.19 [Kbytes/sec] received
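The QPS is a pitiful 0.99; call it one request per second. Each blocking ping holds the only thread for about a second, so the five concurrent requests are served one after another.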
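The serialization seen in this benchmark can be reproduced without a web server. The following is a minimal sketch, not the article's code: it assumes Python 3 and uses the standard-library asyncio event loop (which Tornado 5 and later run on) in place of a Tornado application, and `time.sleep` stands in for the `ping` call. The names `blocking_handler`, `cooperative_handler`, and `timed` are made up for illustration.

```python
import asyncio
import time


async def blocking_handler():
    # time.sleep stands in for os.system("ping ..."): it holds the event
    # loop thread, so no other handler can run until it returns.
    time.sleep(0.1)
    return 'It works'


async def cooperative_handler():
    # asyncio.sleep hands control back to the loop while it waits.
    await asyncio.sleep(0.1)
    return 'It works'


async def timed(handler, concurrency=5):
    """Run `concurrency` handlers at once; return (elapsed seconds, results)."""
    start = time.perf_counter()
    results = await asyncio.gather(*(handler() for _ in range(concurrency)))
    return time.perf_counter() - start, results


blocking_elapsed, blocking_results = asyncio.run(timed(blocking_handler))
cooperative_elapsed, cooperative_results = asyncio.run(timed(cooperative_handler))
print('blocking:    {:.2f}s'.format(blocking_elapsed))
print('cooperative: {:.2f}s'.format(cooperative_elapsed))
```

Five blocking handlers take roughly five times as long as one, while the cooperative ones overlap. That is the same shape as the `ab` numbers above: the event loop only helps when the slow step gives control back.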
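    import datetime
    import functools

    import tornado.gen
    import tornado.ioloop


    class AsyncHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        @tornado.gen.coroutine
        def get(self, *args, **kwargs):
            # Schedule ping to run 1 second from now. A bare number is read as
            # an absolute timestamp, so a timedelta is passed instead.
            tornado.ioloop.IOLoop.instance().add_timeout(
                datetime.timedelta(seconds=1),
                callback=functools.partial(self.ping, 'www.google.com'))
            # do something else
            self.finish('It works')

        @tornado.gen.coroutine
        def ping(self, url):
            os.system("ping -c 2 {}".format(url))
            return 'after'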
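Even though the asynchronous task is scheduled with a 1-second timeout, the main thread returns very quickly. Note that ping still runs on the IOLoop thread when its turn comes; the handler is fast because it replies before that work starts. The ab benchmark result: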
    Document Path:          /async
    Document Length:        5 bytes

    Concurrency Level:      5
    Time taken for tests:   0.009 seconds
    Complete requests:      5
    Failed requests:        0
    Total transferred:      985 bytes
    HTML transferred:       25 bytes
    Requests per second:    556.92 [#/sec] (mean)
    Time per request:       8.978 [ms] (mean)
    Time per request:       1.796 [ms] (mean, across all concurrent requests)
    Transfer rate:          107.14 [Kbytes/sec] received
    class AsyncTaskHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        @tornado.gen.coroutine
        def get(self, *args, **kwargs):
            # yield the result of the task
            response = yield tornado.gen.Task(self.ping, 'www.google.com')
            print('response: {}'.format(response))
            self.finish('hello')

        @tornado.gen.coroutine
        def ping(self, url):
            os.system("ping -c 2 {}".format(url))
            return 'after'
    Server Software:        TornadoServer/4.3
    Server Hostname:
    Server Port:            5000

    Document Path:          /async/task
    Document Length:        5 bytes

    Concurrency Level:      5
    Time taken for tests:   0.049 seconds
    Complete requests:      5
    Failed requests:        0
    Total transferred:      985 bytes
    HTML transferred:       25 bytes
    Requests per second:    101.39 [#/sec] (mean)
    Time per request:       49.314 [ms] (mean)
    Time per request:       9.863 [ms] (mean, across all concurrent requests)
    Transfer rate:          19.51 [Kbytes/sec] received
    from concurrent.futures import ThreadPoolExecutor

    import tornado.concurrent


    class FutureHandler(tornado.web.RequestHandler):
        # run_on_executor looks up this attribute on the handler
        executor = ThreadPoolExecutor(10)

        @tornado.web.asynchronous
        @tornado.gen.coroutine
        def get(self, *args, **kwargs):
            url = 'www.google.com'
            tornado.ioloop.IOLoop.instance().add_callback(
                functools.partial(self.ping, url))
            self.finish('It works')

        @tornado.concurrent.run_on_executor
        def ping(self, url):
            # Runs on a worker thread, not on the IOLoop thread
            os.system("ping -c 2 {}".format(url))
    Document Path:          /future
    Document Length:        5 bytes

    Concurrency Level:      5
    Time taken for tests:   0.003 seconds
    Complete requests:      5
    Failed requests:        0
    Total transferred:      995 bytes
    HTML transferred:       25 bytes
    Requests per second:    1912.78 [#/sec] (mean)
    Time per request:       2.614 [ms] (mean)
    Time per request:       0.523 [ms] (mean, across all concurrent requests)
    Transfer rate:          371.72 [Kbytes/sec] received
    class Executor(ThreadPoolExecutor):
        """Share one thread pool across all handlers."""
        _instance = None

        def __new__(cls, *args, **kwargs):
            if not getattr(cls, '_instance', None):
                cls._instance = ThreadPoolExecutor(max_workers=10)
            return cls._instance


    class FutureResponseHandler(tornado.web.RequestHandler):
        executor = Executor()

        @tornado.web.asynchronous
        @tornado.gen.coroutine
        def get(self, *args, **kwargs):
            # run_on_executor already submits ping to self.executor and returns
            # a future, so call it directly instead of submitting it again.
            future = self.ping('www.google.com')
            # Wait at most 10 seconds; timedelta(10) would mean 10 days.
            response = yield tornado.gen.with_timeout(
                datetime.timedelta(seconds=10), future,
                quiet_exceptions=tornado.gen.TimeoutError)
            if response:
                print('response: {}'.format(response))

        @tornado.concurrent.run_on_executor
        def ping(self, url):
            os.system("ping -c 1 {}".format(url))
            return 'after'
    Concurrency Level:      5
    Time taken for tests:   0.043 seconds
    Complete requests:      5
    Failed requests:        0
    Total transferred:      960 bytes
    HTML transferred:       0 bytes
    Requests per second:    116.38 [#/sec] (mean)
    Time per request:       42.961 [ms] (mean)
    Time per request:       8.592 [ms] (mean, across all concurrent requests)
    Transfer rate:          21.82 [Kbytes/sec] received
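The thread-pool-plus-timeout pattern above can also be sketched with the standard library alone. This is an illustration under stated assumptions rather than Tornado's own API: it targets Python 3, uses `loop.run_in_executor` and `asyncio.wait_for` where the article uses `run_on_executor` and `tornado.gen.with_timeout`, and `slow_ping` is a hypothetical stand-in that only sleeps.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=10)


def slow_ping(url, seconds):
    # Hypothetical stand-in for os.system("ping ..."); it only sleeps.
    time.sleep(seconds)
    return 'after {}'.format(url)


async def handler(url, seconds, timeout):
    loop = asyncio.get_running_loop()
    # The blocking call runs on a worker thread; the loop stays free.
    future = loop.run_in_executor(executor, slow_ping, url, seconds)
    try:
        return await asyncio.wait_for(future, timeout=timeout)
    except asyncio.TimeoutError:
        # The worker thread is not interrupted; only the wait is abandoned.
        return None


fast = asyncio.run(handler('www.google.com', 0.05, timeout=1))
slow = asyncio.run(handler('www.google.com', 0.3, timeout=0.05))
print(fast, slow)
executor.shutdown(wait=True)
```

A timeout here only stops the handler from waiting. The work already handed to the pool keeps running, so the pool size still bounds how many slow calls can be in flight at once.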
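Tornado also offers asynchronous functionality on the client side, mainly through AsyncHTTPClient. The typical scenario is a Tornado service that needs to issue and process further I/O requests of its own. Incidentally, the ping call in the examples above is also a kind of I/O performed inside the service. Next we explore AsyncHTTPClient, in particular using it to upload files and to forward requests.
So far we have looked at the common ways to run asynchronous tasks in Tornado; call that the asynchronous server side. A service usually also needs to call third-party services asynchronously. For HTTP requests, Requests is the most pleasant Python library to use, bar none; its website advertises "HTTP for Humans". Using requests directly inside Tornado, however, is a nightmare: a requests call blocks the entire service process.
When God closes a door, he often opens a window. Tornado ships an asynchronous HTTP client built on the framework itself (there is a synchronous client as well): AsyncHTTPClient.
AsyncHTTPClient basic usage
AsyncHTTPClient is an asynchronous HTTP client provided by tornado.httpclient, and it is fairly simple to use. Like the server-side handlers, it supports two styles: callback and yield. With a callback, nothing is returned to the caller (the response is handed to the callback); with yield, the response is returned to the caller.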
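The difference between the two styles can be shown without any network access. The sketch below is an assumption-laden illustration, not Tornado code: it uses Python 3 and asyncio, `fake_fetch` is a hypothetical stand-in for an HTTP request, and a done-callback on a task plays the role of the fetch callback.

```python
import asyncio


async def fake_fetch(url):
    # Hypothetical stand-in for an HTTP request; no network is used.
    await asyncio.sleep(0.01)
    return {'url': url, 'code': 200}


seen_in_callback = []


def on_response(task):
    # Callback style: the response arrives here, not in the caller.
    seen_in_callback.append(task.result()['code'])


async def callback_style(url):
    task = asyncio.ensure_future(fake_fetch(url))
    task.add_done_callback(on_response)
    # The caller carries on (and could already reply) without the response.
    return task


async def await_style(url):
    # Await style: the caller is suspended until the response is available.
    response = await fake_fetch(url)
    return response['code']


async def main():
    task = await callback_style('https://api.github.com/')
    pending_when_returned = not task.done()
    code = await await_style('https://api.github.com/')
    await task              # wait for the background request to finish
    await asyncio.sleep(0)  # give any pending done-callback a turn
    return pending_when_returned, code


pending_when_returned, awaited_code = asyncio.run(main())
print(pending_when_returned, awaited_code, seen_in_callback)
```

In the callback style the caller returns while the request is still pending, which is why a handler written that way can reply early but cannot include the response in its reply.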
    import requests


    class SyncHandler(tornado.web.RequestHandler):
        def get(self, *args, **kwargs):
            url = 'https://api.github.com/'
            # Blocks the whole process until the response arrives
            resp = requests.get(url)
            print(resp.status_code)
            self.finish('It works')
    Document Path:          /sync
    Document Length:        5 bytes

    Concurrency Level:      5
    Time taken for tests:   10.255 seconds
    Complete requests:      5
    Failed requests:        0
    Total transferred:      985 bytes
    HTML transferred:       25 bytes
    Requests per second:    0.49 [#/sec] (mean)
    Time per request:       10255.051 [ms] (mean)
    Time per request:       2051.010 [ms] (mean, across all concurrent requests)
    Transfer rate:          0.09 [Kbytes/sec] received
    import tornado.httpclient


    class AsyncHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        def get(self, *args, **kwargs):
            url = 'https://api.github.com/'
            http_client = tornado.httpclient.AsyncHTTPClient()
            # The response is delivered to on_response later
            http_client.fetch(url, self.on_response)
            self.finish('It works')

        @tornado.gen.coroutine
        def on_response(self, response):
            print(response.code)
The QPS improves considerably:
    Document Path:          /async
    Document Length:        5 bytes

    Concurrency Level:      5
    Time taken for tests:   0.162 seconds
    Complete requests:      5
    Failed requests:        0
    Total transferred:      985 bytes
    HTML transferred:       25 bytes
    Requests per second:    30.92 [#/sec] (mean)
    Time per request:       161.714 [ms] (mean)
    Time per request:       32.343 [ms] (mean, across all concurrent requests)
    Transfer rate:          5.95 [Kbytes/sec] received
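    class AsyncResponseHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        @tornado.gen.coroutine
        def get(self, *args, **kwargs):
            url = 'https://api.github.com/'
            http_client = tornado.httpclient.AsyncHTTPClient()
            # yield suspends the handler until the response is available
            response = yield tornado.gen.Task(http_client.fetch, url)
            print(response.code)
            print(response.body)
Forwarding with AsyncHTTPClient
Tornado is often used to build forwarding services, and that relies on AsyncHTTPClient. Forwarding cannot be limited to GET; POST, PUT, DELETE and other methods appear too. That brings headers and a body into play, and even HTTPS certificate warnings.
Below is a POST example that yields the result. When yield is used, the handler generally needs to be decorated with tornado.gen.coroutine.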
    import json

    # Inside a coroutine handler method; url is the address to forward to
    headers = self.request.headers
    body = json.dumps({'name': 'rsj217'})
    http_client = tornado.httpclient.AsyncHTTPClient()
    resp = yield tornado.gen.Task(
        http_client.fetch, url, method="POST", headers=headers, body=body,
        validate_cert=False)
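One detail worth a sketch when forwarding: the incoming headers describe the original request, so copying them unchanged onto a request with a new body can carry over a stale `Content-Length` or the wrong `Host`. The helper below is a hypothetical illustration in Python 3, not part of Tornado; the set of skipped headers is an assumed minimal choice, and a plain dict stands in for `self.request.headers`.

```python
# Headers that describe the incoming connection or the old body and should
# not be copied onto the forwarded request (an assumed, minimal set).
SKIPPED_HEADERS = {'host', 'content-length', 'transfer-encoding', 'connection'}


def forwardable_headers(headers):
    """Return a copy of `headers` without the entries in SKIPPED_HEADERS."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in SKIPPED_HEADERS
    }


incoming = {
    'Host': 'my-service.example',
    'Content-Length': '3',
    'Content-Type': 'application/json',
    'Authorization': 'Bearer abc',
}
outgoing = forwardable_headers(incoming)
print(outgoing)
```

The filtered mapping can then be passed as `headers=` to the fetch call, leaving the HTTP client to fill in the length of the new body.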
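Building a request with AsyncHTTPClient
    import urllib

    # Inside a handler method; params is a dict of form fields.
    # urllib.urlencode is Python 2 (urllib.parse.urlencode in Python 3).
    body = urllib.urlencode(params)
    req = tornado.httpclient.HTTPRequest(
        url=url, method='POST', body=body, validate_cert=False)
    http_client.fetch(req, self.handler_response)

    def handler_response(self, response):
        print(response.code)
Uploading an image with AsyncHTTPClient
A more advanced use of AsyncHTTPClient is uploading images. Suppose the service has a feature that calls a third-party image OCR service: the image uploaded by the user has to be forwarded to that third-party service.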
    import logging


    @router.Route('/api/v2/account/upload')
    class ApiAccountUploadHandler(helper.BaseHandler):
        @tornado.gen.coroutine
        @helper.token_require
        def post(self, *args, **kwargs):
            upload_type = self.get_argument('type', None)
            files_body = self.request.files['file']
            file_ = files_body[0]  # first file uploaded under the 'file' field
            new_file = 'upload/new_pic.jpg'
            new_file_name = 'new_pic.jpg'
            # Write the upload to disk (binary mode)
            with open(new_file, 'wb') as w:
                w.write(file_['body'])
            # user_id is not defined in this excerpt
            logging.info('user {} upload {}'.format(user_id, new_file_name))
            # Asynchronous request: upload the image
            with open(new_file, 'rb') as f:
                files = [('image', new_file_name, f.read())]
            fields = (('api_key', KEY), ('api_secret', SECRET))
            content_type, body = encode_multipart_formdata(fields, files)
            headers = {"Content-Type": content_type,
                       'content-length': str(len(body))}
            request = tornado.httpclient.HTTPRequest(
                config.OCR_HOST, method="POST", headers=headers, body=body,
                validate_cert=False)
            response = yield tornado.httpclient.AsyncHTTPClient().fetch(request)


    def encode_multipart_formdata(fields, files):
        """
        fields is a sequence of (name, value) elements for regular form fields.
        files is a sequence of (name, filename, value) elements for data to be
        uploaded as files.
        Return (content_type, body) ready for httplib.HTTP instance
        """
        boundary = '----------ThIs_Is_tHe_bouNdaRY_$'
        crlf = '\r\n'
        l = []
        for (key, value) in fields:
            l.append('--' + boundary)
            l.append('Content-Disposition: form-data; name="%s"' % key)
            l.append('')
            l.append(value)
        for (key, filename, value) in files:
            filename = filename.encode("utf8")
            l.append('--' + boundary)
            l.append(
                'Content-Disposition: form-data; name="%s"; filename="%s"' % (
                    key, filename
                )
            )
            l.append('Content-Type: %s' % get_content_type(filename))
            l.append('')
            l.append(value)
        l.append('--' + boundary + '--')
        l.append('')
        body = crlf.join(l)
        content_type = 'multipart/form-data; boundary=%s' % boundary
        return content_type, body


    def get_content_type(filename):
        import mimetypes
        return mimetypes.guess_type(filename)[0] or 'application/octet-stream'
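Compared with the earlier usage, uploading an image only adds one step: encoding the image. The binary image data is encoded as multipart form data, and the accompanying form fields have to be encoded along with it. By comparison, doing the same with requests is very simple: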
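    import requests

    # url is the third-party endpoint
    data = dict(api_key='KEY', api_secret='SECRET')
    # Open the image in binary mode; the with block closes it afterwards
    with open('/Users/ghost/Desktop/id.jpg', 'rb') as f:
        resp = requests.post(url, data=data, files={'image': f})
    print(resp.status_code)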
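Returning to the hand-written multipart encoding used with AsyncHTTPClient: the version in the handler example joins text and binary data as Python 2 strings. On Python 3 the parts have to be bytes. The following is a minimal sketch of the same idea under that assumption; the random boundary from `uuid` and the sample field values are choices made for this illustration, not part of the original code.

```python
import mimetypes
import uuid


def encode_multipart_formdata(fields, files):
    """Return (content_type, body), with body as bytes.

    fields: iterable of (name, value) text pairs.
    files: iterable of (name, filename, data) where data is bytes.
    """
    boundary = uuid.uuid4().hex  # random boundary, unlikely to occur in data
    delimiter = ('--' + boundary).encode('ascii')
    lines = []
    for name, value in fields:
        lines.append(delimiter)
        header = 'Content-Disposition: form-data; name="{}"'.format(name)
        lines.append(header.encode('utf-8'))
        lines.append(b'')
        lines.append(str(value).encode('utf-8'))
    for name, filename, data in files:
        mime = mimetypes.guess_type(filename)[0] or 'application/octet-stream'
        lines.append(delimiter)
        header = 'Content-Disposition: form-data; name="{}"; filename="{}"'.format(
            name, filename)
        lines.append(header.encode('utf-8'))
        lines.append('Content-Type: {}'.format(mime).encode('ascii'))
        lines.append(b'')
        lines.append(data)  # raw bytes, inserted unchanged
    lines.append(delimiter + b'--')
    lines.append(b'')
    body = b'\r\n'.join(lines)
    return 'multipart/form-data; boundary=' + boundary, body


content_type, body = encode_multipart_formdata(
    [('api_key', 'KEY'), ('api_secret', 'SECRET')],
    [('image', 'new_pic.jpg', b'\xff\xd8\xff')],
)
print(content_type)
print(len(body))
```

The returned `content_type` goes into the `Content-Type` header and `body` is passed as the request body, in the same way as in the handler example.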
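With AsyncHTTPClient, a handler can easily make requests to third-party services. Combined with the asynchronous patterns covered earlier, it still comes down to two options: whether the result is needed decides between the callback style and the yield style. Of course, if several functions all yield, the yield can be passed along from one function to the next. The OAuth authentication in tornado.auth makes use of this.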
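The idea of passing a yield along a chain of functions can be sketched as follows. This is an illustration only, assuming Python 3 with native `async`/`await` in place of `yield` and `tornado.gen.coroutine`; `fetch_token`, `fetch_user`, and `handler` are hypothetical names loosely modelled on a login flow and are not taken from tornado.auth.

```python
import asyncio


async def fetch_token(code):
    # Hypothetical innermost step, standing in for one HTTP round trip.
    await asyncio.sleep(0.01)
    return 'token-for-' + code


async def fetch_user(code):
    # The middle layer waits on the inner coroutine and builds on its result.
    token = await fetch_token(code)
    return {'token': token, 'name': 'rsj217'}


async def handler(code):
    # The handler is the outermost coroutine in the chain.
    user = await fetch_user(code)
    return 'hello ' + user['name']


result = asyncio.run(handler('abc'))
print(result)
```

Each layer suspends at its own wait point and resumes with the value from the layer below, so only the outermost coroutine needs to be driven by the event loop.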
