Apache Lucene 4.10.3 Released, a Text Search Engine Library

oschina
Published on April 9, 2015

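Apache Lucene 4.10.3 has been released. Lucene is a high-performance text search engine library written entirely in Java, suitable for nearly any application that needs full-text search. This release mainly fixes 12 bugs.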

Bug fixes:

  1. LUCENE-6019, LUCENE-6117: Remove -Dtests.assert to make IndexWriter infoStream sane.
    (Robert Muir, Mike McCandless)

  2. LUCENE-6161: Resolving deletes was failing to reuse DocsEnum likely causing substantial performance cost for use cases that frequently delete old documents
    (Mike McCandless)

  3. LUCENE-6192: Fix int overflow corruption case in skip data for high frequency terms in extremely large indices
    (Robert Muir, Mike McCandless)

  4. LUCENE-6207: Fixed consumption of several terms enums on the same sorted (set) doc values instance at the same time.
    (Tom Shally, Robert Muir, Adrien Grand)

  5. LUCENE-6093: Don't throw NullPointerException from BlendedInfixSuggester for lookups that do not end in a prefix token.
    (jane chang via Mike McCandless)

  6. LUCENE-6279: Don't let an abusive leftover _N_upgraded.si in the index directory cause index corruption on upgrade
    (Robert Muir, Mike McCandless)

  7. LUCENE-6287: Fix concurrency bug in IndexWriter that could cause index corruption (missing _N.si files) the first time 4.x kisses a 3.x index if merges are also running.
    (Simon Willnauer, Mike McCandless)

  8. LUCENE-6205: Fixed intermittent concurrency issue that could cause FileNotFoundException when writing doc values updates at the same time that a merge kicks off.
    (Mike McCandless)

  9. LUCENE-6214: Fixed IndexWriter deadlock when one thread is committing while another opens a near-real-time reader and an unrecoverable (tragic) exception is hit.
    (Simon Willnauer, Mike McCandless)

  10. LUCENE-6105: Don't cache FST root arcs if the number of root arcs is small, or if the cache would be > 20% of the size of the FST.
    (Robert Muir, Mike McCandless)

  11. LUCENE-6001: DrillSideways hits NullPointerException for certain BooleanQuery searches.
    (Dragan Jotannovic, jane chang via Mike McCandless)

  12. LUCENE-6306: Merging of doc values and norms now checks whether the merge was aborted so IndexWriter.rollback can more promptly abort a running merge.
    (Robert Muir, Mike McCandless)
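
Item 10 (LUCENE-6105) describes a simple decision rule. The following is a minimal Python sketch of that rule as the changelog entry words it, not Lucene's actual code. The function name, its parameters, and the `min_root_arcs` default are hypothetical, because the entry does not say what counts as "small"; only the 20% limit comes from the text.

```python
def should_cache_root_arcs(num_root_arcs, cache_bytes, fst_bytes,
                           min_root_arcs=5, max_cache_fraction=0.20):
    """Decide whether caching the FST root arcs looks worthwhile.

    min_root_arcs is a hypothetical threshold for "small";
    max_cache_fraction mirrors the 20% limit named in the changelog.
    """
    # Too few root arcs: a cache would save very little work.
    if num_root_arcs < min_root_arcs:
        return False
    # The cache would exceed 20% of the FST's own size: too costly.
    if cache_bytes > max_cache_fraction * fst_bytes:
        return False
    return True


# Usage: a 10,000-byte FST with 50 root arcs.
print(should_cache_root_arcs(50, 1_000, 10_000))  # cache is 10% of the FST
print(should_cache_root_arcs(50, 3_000, 10_000))  # cache is 30% of the FST
```

Keeping both checks in one function makes the trade-off explicit: the cache is skipped when it would either help too little or cost too much memory.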

For more details, see the release page.

This release is available for download:

http://lucene.apache.org/core/mirrors-core-latest-redir.html


Latest comments (6)

eechen
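You can also use SCWS, the PHP Chinese word segmentation extension provided by XunSearch, and its dictionaries on their own, combined with MySQL InnoDB/MyISAM FULLTEXT indexes or SQLite FTS3/FTS4 virtual tables, to implement simple full-text search.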

http://www.xunsearch.com/scws/docs.php#instscws
Install scws:
wget http://www.xunsearch.com/scws/down/scws-1.2.2.tar.bz2
tar xjf scws-1.2.2.tar.bz2
cd scws-1.2.2
./configure --prefix=/png/php/scws/1.2.2
make
make install
Download the dictionaries:
cd /png/php/scws/1.2.2/etc
wget http://www.xunsearch.com/scws/down/scws-dict-chs-gbk.tar.bz2
wget http://www.xunsearch.com/scws/down/scws-dict-chs-utf8.tar.bz2
tar xjf scws-dict-chs-gbk.tar.bz2
tar xjf scws-dict-chs-utf8.tar.bz2
Install the scws PECL extension:
cd scws-1.2.2/phpext
/png/php/5.4.39NTS/bin/phpize
./configure \
--with-php-config=/png/php/5.4.39NTS/bin/php-config \
--with-scws=/png/php/scws/1.2.2
make
make install

PHP bundles SQLite; here is the documentation for its full-text search:
http://www.sqlite.org/fts3.html
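
The approach described above (segment the text first, then let an SQLite FTS virtual table index the resulting words) can be sketched roughly as follows. This is an illustration in Python's built-in `sqlite3` module rather than PHP, and it rests on several assumptions: the SQLite build has FTS4 enabled, the sample text has already been split into space-separated words (as a segmenter such as SCWS might produce), and the table name, column names, and sample data are made up for the example.

```python
import sqlite3

# Made-up sample documents, assumed to be pre-segmented: words are
# separated by spaces, standing in for the output of a segmenter like SCWS.
DOCS = [
    ("lucene", "Lucene 是 一个 全文 搜索 引擎 库"),
    ("sqlite", "SQLite 提供 FTS3 FTS4 虚拟表"),
    ("xunsearch", "XunSearch 是 开源 中文 搜索 引擎"),
]


def build_index(docs):
    """Create an in-memory FTS4 virtual table and load the documents."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; requires an SQLite build with FTS4 enabled.
    conn.execute("CREATE VIRTUAL TABLE docs USING fts4(title, body)")
    conn.executemany("INSERT INTO docs(title, body) VALUES (?, ?)", docs)
    return conn


def search(conn, term):
    """Return titles of documents whose body contains the given word."""
    rows = conn.execute(
        "SELECT title FROM docs WHERE body MATCH ? ORDER BY docid", (term,)
    )
    return [title for (title,) in rows]


conn = build_index(DOCS)
print(search(conn, "搜索"))
```

The default FTS tokenizer splits on spaces and ASCII punctuation and treats non-ASCII bytes as word characters, which is why segmenting Chinese text beforehand matters: without the inserted spaces, a whole sentence would be indexed as a single token.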
愤怒的小兔

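Quoting the comment from “_K_”:

...I always get the feeling China is alien territory
IE6, XP, the bizarre "simultaneous worldwide release (except mainland China)", some schools still stuck on Visual Basic... welcome to Mars~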
eechen
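XunSearch, an open-source Chinese search engine:
[Blazing performance] A single XunSearch database supports up to 4 billion records; searching 500 million web pages (about 1.5 TB of data) takes no more than 1 second (uncached).
[Easy to use] The front end is a development kit written in the PHP scripting language. The API is simple and clear, development is very easy, and the sample code, documentation, and helper scripts are all provided in Chinese.
[Feature-rich] Beyond basic custom word segmentation, field search, and Boolean search, it directly supports much-requested advanced features such as related searches, pinyin search, and search suggestions.
The author of XunSearch, hightman, is also the author of the Chinese word segmenter SCWS.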

Compile and install the open-source Chinese full-text search engine XunSearch:
http://www.xunsearch.com/doc/php/guide/start.installation
wget http://www.xunsearch.com/download/xunsearch-full-latest.tar.bz2
tar xjf xunsearch-full-latest.tar.bz2
cd xunsearch-full-1.4.9
sh setup.sh # the install path I entered was /png/xunsearch/1.4.9
Service management script:
/png/xunsearch/1.4.9/bin/xs-ctl.sh restart
Index data directory:
/png/xunsearch/1.4.9/data
PHP SDK:
/png/xunsearch/1.4.9/sdk/php/README
贾珣
@红薯 Shouldn't the software link at the bottom of this page point to Lucene? Why does it point to Apache >,<
java9

Quoting the comment from “依米艳”:

Isn't 5.0 already out?
Probably a different branch.
chloe900
Isn't 5.0 already out?