对比 HandlerSocket 和 Python 的 MySQL 客户端,性能相差近一倍

虫虫 发布于 2011/12/22 11:41
阅读 2K+
收藏 8
我最近做了一些性能测试,对比了两个Python的MySQL库的数据读取效率:MySQLdb和pyhs。MySQLdb使用了MySQL的客户端库,而pyth使用了HandlerSocket,跳过了MySQL的客户端这一层,直接和Innodb存储引擎交互。在我的测试中, HandlerSocket表现了82%的性能提升,相比mysql客户端。这些测试是在不同的环境下进行的,冷启动(无cache),运行 Select * from customer后有cache,交替运行这两个python文件。不论是哪种环境,测试的结果都差不多。以下是样例输出:
root@ubuntu:~# python hanSolo.py
Using HandlerSocket, below is a report of how many customer's name and address can be retrieved based on customer key:
Seconds elapsed:  61.0000810623
Rows retrieved:  509863
root@ubuntu:~# python mclient.py
Using mysql client libraries, below is a report of how many customer's name and address can be retrieved based on customer key:
Seconds elapsed:  61.0001530647
Rows retrieved:  280120
测试环境如下:
硬件、软件:
1. Rackspace Cloud server Ubuntu 10.04 Lucid Lynx, 1 gig memory, 40 gig hard disk space, 64-bit
Linux ubuntu 2.6.35.4-rscloud #8 SMP Mon Sep 20 15:54:33 UTC 2010 x86_64 GNU/Linux
2. Following instruction here to get Percona’s APT repository;
3. apt-get install percona-server-client-5.5
4. apt-get install percona-server-server-5.5

启用HandlerSocket插件。HandlerSocket是同Percona Server 5.5一同发行的,所以你不用自己下载源码再编译了:
1. mysql> install plugin handlersocket soname 'handlersocket.so'
2. cp /usr/share/mysql/my-large.cnf /etc/mysql/my.cnf
3. vim /etc/mysql/my.cnf with the following under mysqld section
loose_handlersocket_port = 9998
loose_handlersocket_port_wr = 9999
loose_handlersocket_threads = 16
loose_handlersocket_threads_wr = 1
open_files_limit = 65535
4. service mysql restart

获取Python的MySQLdb和pyhs模块:
1. apt-get install libmysqlclient-dev
用来创建 Python’s MySQLdb

2. apt-get install python-dev
这些是必要的python的头文件,用来创建Python模块

2. wget the appropriate egg from this page. Get the one for your version of Python.

3. sh eggFileDownloadedFromTheStepAbove

4. easy_install MySQL-python
MySQLdb 模块,通过mysql客户端库来访问mysql

5. easy_install python-handler-socket

准备测试数据
1. Follow instructions here to get dbgen compiled;
2. While at the proper directory, run
./dbgen -v -T c
这将创建 150000 条数据
3. 创建数据表:
CREATE TABLE customer ( C_CUSTKEY     INTEGER NOT NULL,
C_NAME        VARCHAR(25) NOT NULL,
C_ADDRESS     VARCHAR(40) NOT NULL,
C_NATIONKEY   INTEGER NOT NULL,
C_PHONE       CHAR(15) NOT NULL,
C_ACCTBAL     DECIMAL(15,2)   NOT NULL,
C_MKTSEGMENT  CHAR(10) NOT NULL,
C_COMMENT     VARCHAR(117) NOT NULL,
primary key (C_CUSTKEY));
4. 导入 ‘/root/dbgen/customer.tbl’ 到 customer 数据表。

下面两段python代码就是分别通过pyhs和MySQLdb来运行测试:
hanSolo.ph
import time
from pyhs import Manager
hs = Manager()
start = time.time()
i = 1
j= 0
while i < 150000:
    data = hs.get('test', 'customer', ['C_CUSTKEY', 'C_NAME', 'C_ADDRESS'], '%s' % i)
    i=i+1
    if i == 150000:
        i = 1
    end = time.time()
    j = j + 1
    if int(end - start) > 60:
        break
print "Using HandlerSocket, below is a report of how many customer's name and address can be retrieved based on customer key:"
print "Seconds elapsed: ", str(end - start)
print "Rows retrieved: ", str(j)
mclient.py
import sys, MySQLdb, time
my_host = "localhost"
my_user = "root"
my_pass = ""
my_db = "test"
try:
    db = MySQLdb.connect(host=my_host, user=my_user, passwd=my_pass, db=my_db)
except MySQLdb.Error, e:
     print "Error %d: %s" % (e.args[0], e.args[1])
     sys.exit (1)
cursor = db.cursor (MySQLdb.cursors.DictCursor)
i=1
j=0
start = time.time()
while i < 150000:
    sql = "select c_custkey, c_name, c_address from customer where c_custkey=%s" % i;
    cursor.execute(sql)
    results = cursor.fetchall()
    i=i+1
    if i==150000:
        i=1
    end = time.time()
    j=j+1
    if int(end - start) > 60:
        break
print "Using mysql client libraries, below is a report of how many customer's name and address can be retrieved based on customer key:"
print "Seconds elapsed: ", str(end - start)
print "Rows retrieved: ", str(j)
db.close()

原文链接 

 

加载中
0
该用户已被和谐
该用户已被和谐
木有图表,其实看这些数据头挺疼的!
0
shilei
shilei
Mysql Percona应该就使用了handlerSocket,不知道这个跟直接采用的handlerSocket性能差别有多大。
0
deleted
deleted
这东西谨慎点用好,推出这么久,Percona到现在都还称HandlerSocket是experimental feature
0
h
hustegg
直接和存储引擎交互?这个库可以自己生成解析树进行预处理?
返回顶部
顶部