加载中

In moving a relatively large table from MySQL to Redis, you may find that extracting, transforming and loading a row at a time can be excruciatingly slow. Here’s a quick trick you can use that pipes the output of themysqlcommand directly toredis-cli, bypassing middleware and allowing both data stores to operate at their peak speed.

Using this technique I was able to reduce my load time of ~8 million rows from 90 minutes down to two. Shabby? I think not.

从mysql搬一个大表到redis中,你会发现在提取、转换或是载入一行数据时,速度慢的让你难以忍受。这里我就要告诉一个让你解脱的小技巧。使用“管道输出”的方式把mysql命令行产生的内容直接传递给redis-cli,以绕过“中间件”的方式使两者在进行数据操作时达到最佳速度。

一个约八百万行数据的mysql表,原本导入到redis中需要90分钟,使用这个方法后,只需要两分钟。不管你信不信,反正我是信了。

Redis Protocol Output from MySQL

redis-clihas a mass-insert mode, specially designed to execute bulk commands as quickly as possible. Input is expected to be formatted in raw redis protocol, which is simple enough to be generated from MySQL SELECT queries. Here we go!

Mysql到Redis的数据协议

redis-cli命令行工具有一个批量插入模式,是专门为批量执行命令设计的。这第一步就是把Mysql查询的内容格式化成redis-cli可用的数据格式。here we go!

My stats table:

CREATE TABLE events_all_time (
  id int(11) unsigned NOT NULL AUTO_INCREMENT,
  action varchar(255) NOT NULL,
  count int(11) NOT NULL DEFAULT 0,
  PRIMARY KEY (id),
  UNIQUE KEY uniq_action (action)
);
Desired Redis command to be run for each row:
HSET events_all_time [action] [count]
SQL statement to produce Redis-protocol-formatted output:
-- events_to_redis.sql

SELECT CONCAT(
  "*4\r\n",
  '$', LENGTH(redis_cmd), '\r\n',
  redis_cmd, '\r\n',
  '$', LENGTH(redis_key), '\r\n',
  redis_key, '\r\n',
  '$', LENGTH(hkey), '\r\n',
  hkey, '\r\n',
  '$', LENGTH(hval), '\r\n',
  hval, '\r'
)
FROM (
  SELECT
  'HSET' as redis_cmd,
  'events_all_time' AS redis_key,
  action AS hkey,
  count AS hval
  FROM events_all_time
) AS t
Tying it all together:
mysql stats_db --skip-column-names --raw < events_to_redis.sql | redis-cli --pipe
It’s important to use the--rawswitch so that MySQL doesn’t rewrite the crucial newline characters, and--skip-column-nameskeeps the formatted output clear of column headers.

我的统计表:

CREATE TABLE events_all_time (
  id int(11) unsigned NOT NULL AUTO_INCREMENT,
  action varchar(255) NOT NULL,
  count int(11) NOT NULL DEFAULT 0,
  PRIMARY KEY (id),
  UNIQUE KEY uniq_action (action)
);
准备在每行数据中执行的redis命令如下:
HSET events_all_time [action] [count]
按照以上redis命令规则,创建一个events_to_redis.sql文件,内容是用来生成redis数据协议格式的SQL:
-- events_to_redis.sql

SELECT CONCAT(
  "*4\r\n",
  '$', LENGTH(redis_cmd), '\r\n',
  redis_cmd, '\r\n',
  '$', LENGTH(redis_key), '\r\n',
  redis_key, '\r\n',
  '$', LENGTH(hkey), '\r\n',
  hkey, '\r\n',
  '$', LENGTH(hval), '\r\n',
  hval, '\r'
)
FROM (
  SELECT
  'HSET' as redis_cmd,
  'events_all_time' AS redis_key,
  action AS hkey,
  count AS hval
  FROM events_all_time
) AS t
ok, 用下面的命令执行:
mysql stats_db --skip-column-names --raw < events_to_redis.sql | redis-cli --pipe
很重要的mysql参数说明:
--raw: 使mysql不转换字段值中的换行符。
--skip-column-names: 使mysql输出的每行中不包含列名。
返回顶部
顶部