博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
mongoDB VS PostgreSQL dml performance use python (pymongo & py-postgresql)
阅读量:7117 次
发布时间:2019-06-28

本文共 15208 字,大约阅读时间需要 50 分钟。

前面测试了mongodb和postgresql的插入性能对比, 参考如下 : 
1. 
2. 
3. 
本文将测试对比一下select, update, 以及select, insert, update混合场景的性能.
同样使用并行8个线程测试.
(因为使用的驱动问题, python测试结果性能较差, mongo改用motor驱动的异步调用后, 性能提升明显, 测试结果仅供参考, 如果是PG的话, 建议使用pgbench测试)
mongoDB : 
更新测试 :
# vi test.pyimport threadingimport timeimport pymongoimport randomc=pymongo.MongoClient('/tmp/mongodb-5281.sock')db = c.test_databasedb.drop_collection('test_collection')collection = db.test_collectionprint(collection.count())for i in range(0,1000000):  collection.insert({'id':i, 'username': 'digoal.zhou', 'age':32, 'email':'digoal@126.com', 'qq':'276732431'})collection.create_index("id")class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread  def __init__(self, num):    threading.Thread.__init__(self)    self.thread_num = num  def run(self): #Overwrite run() method, put what you want the thread do here    c=pymongo.MongoClient('/tmp/mongodb-5281.sock')    db = c.test_database    collection = db.test_collection    start_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(start_t))    for i in range(0,125000):      collection.update({'id':random.randrange(0,1000000)}, {'$set': {'age': random.randrange(0,1000000)}})    stop_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(stop_t))    print(stop_t-start_t)def test():  t_names = dict()  for i in range(0,8):    t_names[i] = n_t(i)     t_names[i].start()  returnif __name__ == '__main__':  test()
测试结果379秒 : 
[root@localhost ~]# python test.py0TID:0 1423070907.7699816TID:1 1423070907.770696TID:2 1423070907.7744105TID:3 1423070907.7760801TID:4 1423070907.779555TID:7 1423070907.78037TID:6 1423070907.7860947TID:5 1423070907.78793TID:6 1423071285.5971715377.81107687950134TID:7 1423071286.6500263378.8696563243866TID:0 1423071286.9464445379.17646288871765TID:1 1423071287.1227949379.352098941803TID:3 1423071287.1230247379.3469445705414TID:5 1423071287.2262568379.4383268356323TID:4 1423071287.2609653379.4814102649689TID:2 1423071287.4058232379.6314127445221
查询测试 : 
import threadingimport timeimport pymongoimport randomc=pymongo.MongoClient('/tmp/mongodb-5281.sock')db = c.test_database# db.drop_collection('test_collection')collection = db.test_collectionprint(collection.count())# for i in range(0,1000000):#   collection.insert({'id':i, 'username': 'digoal.zhou', 'age':32, 'email':'digoal@126.com', 'qq':'276732431'})# collection.create_index("id")class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread  def __init__(self, num):    threading.Thread.__init__(self)    self.thread_num = num  def run(self): #Overwrite run() method, put what you want the thread do here    c=pymongo.MongoClient('/tmp/mongodb-5281.sock')    db = c.test_database    collection = db.test_collection    start_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(start_t))    for i in range(0,125000):      collection.find_one({'id': random.randrange(0,1000000)})    stop_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(stop_t))    print(stop_t-start_t)def test():  t_names = dict()  for i in range(0,8):    t_names[i] = n_t(i)     t_names[i].start()  returnif __name__ == '__main__':  test()
测试结果361秒 : 
[root@localhost ~]# python test.py2000000TID:2 1423079875.4572093TID:3 1423079875.4576375TID:1 1423079875.4596934TID:0 1423079875.4600854TID:4 1423079875.4589622TID:5 1423079875.4653761TID:6 1423079875.463017TID:7 1423079875.4694664TID:7 1423080235.7239776360.2545111179352TID:3 1423080236.109339360.6517014503479TID:4 1423080236.1194305360.66046833992004TID:1 1423080236.260948360.8012545108795TID:2 1423080236.5218844361.06467509269714TID:5 1423080236.6404896361.17511343955994TID:0 1423080236.6446981361.1846127510071TID:6 1423080236.6607506361.1977336406708
更新, 插入, 查询综合测试 : 
import threadingimport timeimport pymongoimport randomc=pymongo.MongoClient('/tmp/mongodb-5281.sock')db = c.test_database# db.drop_collection('test_collection')collection = db.test_collectionprint(collection.count())# for i in range(0,1000000):#   collection.insert({'id':i, 'username': 'digoal.zhou', 'age':32, 'email':'digoal@126.com', 'qq':'276732431'})# collection.create_index("id")class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread  def __init__(self, num):    threading.Thread.__init__(self)    self.thread_num = num  def run(self): #Overwrite run() method, put what you want the thread do here    c=pymongo.MongoClient('/tmp/mongodb-5281.sock')    db = c.test_database    collection = db.test_collection    start_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(start_t))    for i in range(0,125000):      collection.insert({'id':random.randrange(1000001,2000000), 'username': 'digoal.zhou', 'age':32, 'email':'digoal@126.com', 'qq':'276732431'})      collection.update({'id':random.randrange(0,1000000)}, {'$set': {'age': random.randrange(0,1000000)}})      collection.find_one({'id': random.randrange(0,1000000)})    stop_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(stop_t))    print(stop_t-start_t)def test():  t_names = dict()  for i in range(0,8):    t_names[i] = n_t(i)     t_names[i].start()  returnif __name__ == '__main__':  test()
测试结果1150秒 : 
[root@localhost ~]# python test.py2000000TID:0 1423080359.006871TID:1 1423080359.0083587TID:2 1423080359.009925TID:4 1423080359.0124109TID:3 1423080359.015088TID:5 1423080359.0179524TID:6 1423080359.0209677TID:7 1423080359.0235417TID:4 1423081508.115071149.1026592254639TID:7 1423081508.18884521149.1653034687042TID:2 1423081508.26413441149.2542095184326TID:1 1423081508.39732651149.3889677524567TID:0 1423081508.54007031149.5331993103027TID:6 1423081508.5942071149.573239326477TID:5 1423081509.02791261150.0099601745605TID:3 1423081509.09436371150.0792756080627
PostgreSQL : 
更新测试 :
import threadingimport timeimport postgresqlimport randomconn = { "user": "postgres",         "database": "postgres",         "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"       }db = postgresql.open(**conn)db.execute("drop table if exists tt")db.execute("create table tt(id int, username name, age int2, email text, qq text)")ins = db.prepare("insert into tt values($1,$2,$3,$4,$5)")for i in range(0,1000000):  ins(i,'digoal.zhou',32,'digoal@126.com','276732431')db.execute("create index idx_tt_id on tt(id)")print(db.query("select count(1) as a from tt"))class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread  def __init__(self, num):    threading.Thread.__init__(self)    self.thread_num = num  def run(self): #Overwrite run() method, put what you want the thread do here    conn = { "user": "postgres",              "database": "postgres",             "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"           }    db = postgresql.open(**conn)    upd = db.prepare("update tt set age=$1 where id=$2")    start_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(start_t))    for i in range(0,125000):      upd(random.randrange(0,100), random.randrange(0,1000000))    stop_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(stop_t))    print(stop_t-start_t)def test():  t_names = dict()  for i in range(0,8):    t_names[i] = n_t(i)     t_names[i].start()  returnif __name__ == '__main__':  test()
测试结果244秒 : 
TID:0 1423072792.0481002TID:1 1423072792.050467TID:3 1423072792.0514963TID:2 1423072792.051693TID:5 1423072792.059382TID:4 1423072792.0605848TID:7 1423072792.0643597TID:6 1423072792.0657377TID:2 1423073034.8827112242.8310182094574TID:5 1423073035.024978242.96559596061707TID:4 1423073035.2550452243.1944603919983TID:7 1423073035.5245414243.46018171310425TID:1 1423073036.0639975244.0135304927826TID:3 1423073036.3519847244.30048847198486TID:0 1423073036.5292883244.48118805885315TID:6 1423073036.5383787244.47264099121094
查询测试 : 
import threadingimport timeimport postgresqlimport randomconn = { "user": "postgres",         "database": "postgres",         "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"       }db = postgresql.open(**conn)# db.execute("drop table if exists tt")# db.execute("create table tt(id int, username name, age int2, email text, qq text)")# ins = db.prepare("insert into tt values($1,$2,$3,$4,$5)")# for i in range(0,1000000):#   ins(i,'digoal.zhou',32,'digoal@126.com','276732431')# db.execute("create index idx_tt_id on tt(id)")print(db.query("select count(1) as a from tt"))class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread  def __init__(self, num):    threading.Thread.__init__(self)    self.thread_num = num  def run(self): #Overwrite run() method, put what you want the thread do here    conn = { "user": "postgres",              "database": "postgres",             "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"           }    db = postgresql.open(**conn)    sel = db.prepare("select * from tt where id=$1 limit 1")    start_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(start_t))    for i in range(0,125000):      sel(random.randrange(0,1000000))    stop_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(stop_t))    print(stop_t-start_t)def test():  t_names = dict()  for i in range(0,8):    t_names[i] = n_t(i)     t_names[i].start()  returnif __name__ == '__main__':  test()
测试结果438秒 : 
postgres@localhost-> python test.py[(1000000,)]TID:2 1423081634.6041436TID:1 1423081634.6072564TID:5 1423081634.6098883TID:0 1423081634.6110475TID:6 1423081634.611464TID:7 1423081634.6147678TID:3 1423081634.617597TID:4 1423081634.6184704TID:7 1423082070.8112974436.1965296268463TID:4 1423082071.5796437436.96117329597473TID:5 1423082071.6695313437.0596430301666TID:0 1423082071.8521369437.24108934402466TID:1 1423082072.5634701437.95621371269226TID:2 1423082072.678791438.0746474266052TID:3 1423082072.9825838438.3649866580963TID:6 1423082072.9963892438.3849251270294
更新, 插入, 查询综合测试 : 
import threadingimport timeimport postgresqlimport randomconn = { "user": "postgres",         "database": "postgres",         "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"       }db = postgresql.open(**conn)# db.execute("drop table if exists tt")# db.execute("create table tt(id int, username name, age int2, email text, qq text)")# ins = db.prepare("insert into tt values($1,$2,$3,$4,$5)")# for i in range(0,1000000):#   ins(i,'digoal.zhou',32,'digoal@126.com','276732431')# db.execute("create index idx_tt_id on tt(id)")print(db.query("select count(1) as a from tt"))class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread  def __init__(self, num):    threading.Thread.__init__(self)    self.thread_num = num  def run(self): #Overwrite run() method, put what you want the thread do here    conn = { "user": "postgres",              "database": "postgres",             "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"           }    db = postgresql.open(**conn)    ins = db.prepare("insert into tt values($1,$2,$3,$4,$5)")    upd = db.prepare("update tt set age=$1 where id=$2")    sel = db.prepare("select * from tt where id=$1")    start_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(start_t))    for i in range(0,125000):      ins(random.randrange(1000001,2000000),'digoal.zhou',32,'digoal@126.com','276732431')      upd(random.randrange(0,100), random.randrange(0,1000000))      sel(random.randrange(0,1000000))    stop_t = time.time()    print("TID:" + str(self.thread_num) + " " + str(stop_t))    print(stop_t-start_t)def test():  t_names = dict()  for i in range(0,8):    t_names[i] = n_t(i)     t_names[i].start()  returnif __name__ == '__main__':  test()
测试结果938秒 : 
postgres@localhost-> python test.py[(1000000,)]TID:0 1423083626.888068TID:2 1423083626.8912995TID:1 1423083626.8920445TID:3 1423083626.893638TID:4 1423083626.8974612TID:6 1423083626.9039218TID:5 1423083626.9061637TID:7 1423083626.908666TID:5 1423084561.3804135934.4742498397827TID:6 1423084563.4344044936.5304825305939TID:2 1423084564.1677904937.2764909267426TID:0 1423084564.5768228937.6887547969818TID:4 1423084564.839536937.9420747756958TID:1 1423084564.9242597938.0322151184082TID:7 1423084565.0638845938.1552186012268TID:3 1423084565.345857938.4522190093994
PostgreSQL使用pgbench的测试结果 : 
# 初始化数据psqltruncate tt;insert into tt select generate_series(1,1000000), 'digoal.zhou',32,'digoal@126.com','276732431';
更新
postgres@localhost-> vi test.sql\setrandom v_id 0 1000000\setrandom v_age 0 100update tt set age=:v_age where id=:v_id;
测试结果32秒 : 
postgres@localhost-> pgbench -M prepared -n -r -f ./test.sql -c 8 -j 4 -t 125000transaction type: Custom queryscaling factor: 1query mode: preparednumber of clients: 8number of threads: 4number of transactions per client: 125000number of transactions actually processed: 1000000/1000000tps = 31631.177435 (including connections establishing)tps = 31637.366682 (excluding connections establishing)statement latencies in milliseconds:        0.002963        \setrandom v_id 0 1000000        0.000671        \setrandom v_age 0 100        0.232106        update tt set age=:v_age where id=:v_id;
查询
postgres@localhost-> vi test.sql\setrandom v_id 0 1000000select * from tt where id=:v_id;
测试结果15秒 : 
postgres@localhost-> pgbench -M prepared -n -r -f ./test.sql -c 8 -j 4 -t 125000transaction type: Custom queryscaling factor: 1query mode: preparednumber of clients: 8number of threads: 4number of transactions per client: 125000number of transactions actually processed: 1000000/1000000tps = 66487.929382 (including connections establishing)tps = 66514.422913 (excluding connections establishing)statement latencies in milliseconds:        0.002007        \setrandom v_id 0 1000000        0.104391        select * from tt where id=:v_id;
插入, 更新, 查询
postgres@localhost-> vi test.sql\setrandom v_newid 1000001 2000000\setrandom v_id 0 1000000\setrandom v_age 0 100insert into tt values(:v_newid, 'digoal.zhou',32,'digoal@126.com','276732431');update tt set age=:v_age where id=:v_id;select * from tt where id=:v_id;
测试结果耗时69秒 : 
postgres@localhost-> pgbench -M prepared -n -r -f ./test.sql -c 8 -j 4 -t 125000transaction type: Custom queryscaling factor: 1query mode: preparednumber of clients: 8number of threads: 4number of transactions per client: 125000number of transactions actually processed: 1000000/1000000tps = 14429.720005 (including connections establishing)tps = 14431.006796 (excluding connections establishing)statement latencies in milliseconds:        0.003031        \setrandom v_newid 1000001 2000000        0.000816        \setrandom v_id 0 1000000        0.000733        \setrandom v_age 0 100        0.150249        insert into tt values(:v_newid, 'digoal.zhou',32,'digoal@126.com','276732431');        0.180603        update tt set age=:v_age where id=:v_id;        0.190850        select * from tt where id=:v_id;
[小结]
python 测试结果 : 
mongoDB 
插入100W记录耗时 - 364秒
更新100W记录耗时 - 379
100W记录索引检索100W次耗时 - 361
综合测试, 插入
100W记录
, 更新
100W记录
, 查询
100W次耗时 - 1150

PostgreSQL
插入100W记录耗时 - 226秒.
更新100W记录耗时 - 244
秒.
100W记录索引检索100W次耗时 - 438
秒.
综合测试, 插入
100W记录
, 更新
100W记录
, 查询
100W次耗时 - 938
秒.

PostgreSQL 使用pgbench测试结果 : 
插入100W记录耗时 - 16秒
更新100W记录耗时 - 32
秒.
100W记录索引检索100W次耗时 - 15
秒.
综合测试, 插入
100W记录
, 更新
100W记录
, 查询
100W次耗时 - 69
秒.

[参考]
1. 
2. 
3. 
4. 
5. 
6. 

转载地址:http://szdel.baihongyu.com/

你可能感兴趣的文章
一般过去时
查看>>
为了python准备一些软件
查看>>
1180 中位数
查看>>
PHP 真正多线程的使用
查看>>
hibernate-release-5.2.9.Final
查看>>
在几何画板中切割三棱锥的方法
查看>>
AutoResetEvent和ManualResetEvent用法
查看>>
简单排序
查看>>
python SMTP 发送邮件
查看>>
windows 安装zookeeper
查看>>
最大流问题
查看>>
Android 开发知识小集
查看>>
函数调用堆栈 涉及汇编(转)
查看>>
《程序员代码面试指南》第一章 栈和队列 由两个栈组成的队列
查看>>
20-python基础9-生成器
查看>>
比遇见什么样的人更重要的,是先成为什么样的人
查看>>
windows下安装python科学计算环境,numpy scipy scikit ,matplotlib等
查看>>
WPF QuickStart系列之线程模型(Thread Model)
查看>>
LoadRunner ---协议分析
查看>>
远程shell脚本执行工具类
查看>>