First Look at Apache Cassandra: Installation and a Simple Demo

Posted by 小编辑 on 2010/05/25 09:18

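Apache Cassandra is an open-source distributed database management system. It was originally developed at Facebook to store large volumes of data. Facebook later donated it to the open-source community and now runs a non-open-source branch internally.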

Key features:

  • Distributed
  • Column-oriented structured storage
  • Highly scalable

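Cassandra is often described as an open-source implementation of Google's BigTable. More precisely, it borrows BigTable's data model and combines it with a distributed design modeled on Amazon's Dynamo. Its defining characteristic is that it is not a single database server but a distributed network service made up of many database nodes. A write to Cassandra is replicated to other nodes, and a read is routed to one of the nodes that hold the data. Scaling a Cassandra cluster is comparatively simple: you just add nodes to the cluster.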
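
To make the "writes are replicated, reads are routed" idea concrete, here is a toy token ring in plain Java. It is only a sketch under stated assumptions, not Cassandra's actual code: the class name ToyRing, the node names, and the choice of an MD5 hash for tokens are all made up for illustration, and real clusters add failure detection, gossip, and configurable replica placement on top.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy token ring: illustrative only, not Cassandra's implementation.
class ToyRing {
    // token -> node name, kept sorted so the ring can be walked in token order
    private final TreeMap<BigInteger, String> ring = new TreeMap<>();

    // Hash a string to a non-negative token with MD5 (an assumption of this sketch).
    static BigInteger token(String s) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md5.digest(s.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is required on every Java platform
        }
    }

    void addNode(String name) {
        ring.put(token(name), name);
    }

    // Return up to `replicas` distinct nodes that follow the key's token on the ring.
    List<String> nodesFor(String key, int replicas) {
        List<String> result = new ArrayList<>();
        if (ring.isEmpty()) {
            return result;
        }
        int wanted = Math.min(replicas, ring.size());
        // Start at the first node whose token is >= the key's token.
        Map.Entry<BigInteger, String> e = ring.ceilingEntry(token(key));
        while (result.size() < wanted) {
            if (e == null) {
                e = ring.firstEntry(); // wrap around the end of the ring
            }
            result.add(e.getValue());
            e = ring.higherEntry(e.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        ToyRing r = new ToyRing();
        r.addNode("node-a");
        r.addNode("node-b");
        r.addNode("node-c");
        // A write for key "42" would go to these nodes; a read can be served by any of them.
        System.out.println("key 42 -> " + r.nodesFor("42", 2));
    }
}
```

Adding a node in this model only means inserting one more token into the ring, which is roughly why growing a cluster is cheap: only the keys next to the new token change owner.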

Download and Installation

First, download the latest package from the official site: http://cassandra.apache.org/

wget 'http://labs.renren.com/apache-mirror/cassandra/0.5.1/apache-cassandra-0.5.1-bin.tar.gz'
tar -xzvf apache-cassandra-0.5.1-bin.tar.gz

Then edit the configuration files log4j.properties and storage-conf.xml.

log4j.properties:

log4j.rootLogger=INFO,stdout,R

# stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.SimpleLayout

# rolling log file
log4j.appender.R=org.apache.log4j.RollingFileAppender
# These must use the appender name "R" declared above, otherwise they are ignored
log4j.appender.R.maxFileSize=20MB
log4j.appender.R.maxBackupIndex=50
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line %L) %m%n
# Edit the next line to point to your logs directory
log4j.appender.R.File={a directory you can write to}/var/log/cassandra/system.log

# Application logging options
#log4j.logger.com.facebook=DEBUG
#log4j.logger.com.facebook.infrastructure.gms=DEBUG
#log4j.logger.com.facebook.infrastructure.db=DEBUG

storage-conf.xml:

<CommitLogDirectory>{a directory you can write to}/var/lib/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
  <DataFileDirectory>{a directory you can write to}/var/lib/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>{a directory you can write to}/var/lib/cassandra/callouts</CalloutLocation>
<StagingFileDirectory>{a directory you can write to}/var/lib/cassandra/staging</StagingFileDirectory>
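
If you are not sure the directories exist or are writable, a small check like the following can save a failed startup. This is just a sketch: the class name PrepareDirs is hypothetical, and the base directory passed on the command line stands in for whatever you substituted into the placeholders above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: creates the log and storage directories and checks write access.
class PrepareDirs {
    // Create each directory (with parents) under base; return true only if all are writable.
    static boolean prepare(Path base, String... relativeDirs) throws IOException {
        boolean allWritable = true;
        for (String rel : relativeDirs) {
            Path dir = base.resolve(rel);
            Files.createDirectories(dir); // no-op if the directory already exists
            if (!Files.isWritable(dir)) {
                System.err.println("not writable: " + dir);
                allWritable = false;
            }
        }
        return allWritable;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PrepareDirs <base-directory>");
            return;
        }
        // The relative paths mirror the ones used in the two config files.
        boolean ok = prepare(Paths.get(args[0]),
                "var/log/cassandra",
                "var/lib/cassandra/commitlog",
                "var/lib/cassandra/data",
                "var/lib/cassandra/callouts",
                "var/lib/cassandra/staging");
        System.out.println(ok ? "all directories are writable" : "fix permissions before starting");
    }
}
```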

If you want remote machines to reach this host on port 9160, make the following changes as well:

<!-- 0.0.0.0 means listen on all network interfaces; otherwise remote clients cannot connect. -->
<ThriftAddress>0.0.0.0</ThriftAddress>
<!-- Thrift RPC port (the port clients connect to). -->
<ThriftPort>9160</ThriftPort>
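
Once the server is running, you can confirm from another machine that the Thrift port is reachable before debugging any client code. The snippet below is a minimal sketch using only the JDK; the class name PortCheck and the default host are placeholders, and a successful TCP connect only shows that something is listening on the port, not that it speaks Thrift.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
class PortCheck {
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: treat all as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are placeholders; pass the server's address and port as arguments.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9160;
        System.out.println(host + ":" + port
                + (canConnect(host, port, 2000) ? " is reachable" : " is NOT reachable"));
    }
}
```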

Then run bin/cassandra to start the service:

[nosql@localhost cassandra]$ bin/cassandra
[nosql@localhost cassandra]$ Listening for transport dt_socket at address: 8888
INFO - Sampling index for /home/nosql/var/lib/cassandra/data/Keyspace1/Standard1-1-Data.db
INFO - Sampling index for /home/nosql/var/lib/cassandra/data/Keyspace1/Standard1-2-Data.db
INFO - Sampling index for /home/nosql/var/lib/cassandra/data/system/LocationInfo-5-Data.db
INFO - Replaying /home/nosql/var/lib/cassandra/commitlog/CommitLog-1270795886891.log
INFO - Standard1 has reached its threshold; switching in a fresh Memtable
INFO - Enqueuing flush of Memtable(Standard1)@16761835
INFO - Sorting Memtable(Standard1)@16761835
INFO - Writing Memtable(Standard1)@16761835
INFO - Completed flushing /home/nosql/var/lib/cassandra/data/Keyspace1/Standard1-3-Data.db
INFO - Log replay complete
INFO - Saved Token found: 31733744575597623818945310333122705815
INFO - Starting up server gossip

The server is now up. Next, let's write a simple client.

package org.dueam.oss.cassandra;

import java.io.UnsupportedEncodingException;
import java.util.List;

import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
import org.apache.cassandra.service.*;

public class CClient
{
    public static void main(String[] args)
            throws TException, InvalidRequestException, UnavailableException,
                   UnsupportedEncodingException, NotFoundException, TimedOutException
    {
        // Connect to the Thrift address and port configured in storage-conf.xml.
        TTransport tr = new TSocket("10.249.134.1", 9160);
        TProtocol proto = new TBinaryProtocol(tr);
        Cassandra.Client client = new Cassandra.Client(proto);
        tr.open();
        try {
            String key_user_id = "1";
            for (int i = 0; i < 10000; i++) {
                key_user_id = String.valueOf(i);

                // insert data: two columns ("name" and "age") per row
                long timestamp = System.currentTimeMillis();
                client.insert("Keyspace1",
                        key_user_id,
                        new ColumnPath("Standard1", null, "name".getBytes("UTF-8")),
                        ("Chris Goffinet" + i).getBytes("UTF-8"),
                        timestamp,
                        ConsistencyLevel.ONE);
                client.insert("Keyspace1",
                        key_user_id,
                        new ColumnPath("Standard1", null, "age".getBytes("UTF-8")),
                        String.valueOf(24 * i).getBytes("UTF-8"),
                        timestamp,
                        ConsistencyLevel.ONE);
            }

            // read single column (key_user_id now holds the last key written, "9999")
            ColumnPath path = new ColumnPath("Standard1", null, "name".getBytes("UTF-8"));
            System.out.println(client.get("Keyspace1", key_user_id, path, ConsistencyLevel.ONE));

            // read entire row: an empty start and end means "all columns", capped at 10
            SlicePredicate predicate = new SlicePredicate(null,
                    new SliceRange(new byte[0], new byte[0], false, 10));
            ColumnParent parent = new ColumnParent("Standard1", null);
            List<ColumnOrSuperColumn> results = client.get_slice("Keyspace1", key_user_id,
                    parent, predicate, ConsistencyLevel.ONE);
            for (ColumnOrSuperColumn result : results)
            {
                Column column = result.column;
                System.out.println(new String(column.name, "UTF-8") + " -> "
                        + new String(column.value, "UTF-8"));
            }
        } finally {
            // Close the connection even if a request fails.
            tr.close();
        }
    }
}
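
It may help to picture what the client is writing into. The sketch below models one column family as a map from row key to a sorted set of columns, where each column carries a value and the timestamp passed to insert, and a write with an older timestamp than the stored column is ignored. It is a simplified in-memory illustration: the class ToyColumnFamily and its methods are invented for this example and are not part of the Cassandra API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of a single column family (illustrative, not Cassandra code).
class ToyColumnFamily {
    static final class Column {
        final String value;
        final long timestamp;

        Column(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // row key -> (column name -> column); columns inside a row stay sorted by name
    private final Map<String, TreeMap<String, Column>> rows = new TreeMap<>();

    // Store the column unless a newer version (higher timestamp) is already present.
    void insert(String rowKey, String columnName, String value, long timestamp) {
        TreeMap<String, Column> row = rows.computeIfAbsent(rowKey, k -> new TreeMap<>());
        Column existing = row.get(columnName);
        if (existing == null || timestamp >= existing.timestamp) {
            row.put(columnName, new Column(value, timestamp));
        }
    }

    // Value of one column, or null if the row or column does not exist.
    String get(String rowKey, String columnName) {
        TreeMap<String, Column> row = rows.get(rowKey);
        Column c = (row == null) ? null : row.get(columnName);
        return (c == null) ? null : c.value;
    }

    // Column names of a row in sorted order, similar in spirit to reading a whole row.
    List<String> columnNames(String rowKey) {
        TreeMap<String, Column> row = rows.get(rowKey);
        return (row == null) ? new ArrayList<String>() : new ArrayList<>(row.keySet());
    }

    public static void main(String[] args) {
        ToyColumnFamily standard1 = new ToyColumnFamily();
        standard1.insert("1", "name", "Chris Goffinet1", 200L);
        standard1.insert("1", "age", "24", 200L);
        standard1.insert("1", "name", "stale value", 100L); // older timestamp, ignored
        for (String name : standard1.columnNames("1")) {
            System.out.println(name + " -> " + standard1.get("1", name));
        }
    }
}
```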

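Preliminary load-test results: as long as the data set still fits in the JVM's memory, both read and write performance are good. Once memory is exhausted (the data volume exceeds the heap size configured for the JVM), write performance stays the same but read performance drops sharply. For now Cassandra looks best suited to write-heavy, read-light workloads such as messaging in SNS systems.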
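
The numbers behind that observation are not reproduced here, but a comparison of this kind can be taken with a very small timing harness. The following is a rough sketch, not the harness used for the test above: the class name Throughput is hypothetical, and the HashMap workload in main is only a stand-in for the client.insert and client.get calls you would actually time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical micro-harness: times a repeated operation and reports operations per second.
class Throughput {
    static double opsPerSecond(int count, Runnable op) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            op.run();
        }
        // Guard against a zero elapsed time on very coarse clocks.
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        return count * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        // Stand-in workload; replace the lambdas with real write and read requests.
        final Map<Integer, String> store = new HashMap<>();
        final int[] next = {0};
        double writes = opsPerSecond(100_000, () -> store.put(next[0]++, "value"));
        double reads = opsPerSecond(100_000, () -> store.get(42));
        System.out.printf("writes/s: %.0f, reads/s: %.0f%n", writes, reads);
    }
}
```

Running the write and read measurements separately, at several data sizes on either side of the JVM heap limit, is what would expose the read slowdown described above.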

Reposted from http://dueam.org/
