Two exceptions when building an index with Lucene in Java

剑指天涯 posted on 2013/09/09 16:42

Version information:

jdk: 1.6

lucene: 3.2.0

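Here is the situation: the project uses Lucene for search, and I run a background thread on the server to build the index. It runs every 10 minutes and pulls at most 20,000 records from the database per run. Every so often it throws exceptions I cannot make sense of, and the index cannot be repaired afterwards, so I have to delete every file in the index directory and rebuild from scratch. With a large data set that is fatal: there are currently close to 1.1 million records, and at 20,000 records every 10 minutes a full rebuild takes about 9 hours.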
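For reference, the rebuild estimate can be reproduced with a few lines of arithmetic. This is only a sketch using the figures quoted above; the class and method names are made up for illustration, and it assumes exactly one batch is processed per 10-minute run.

```java
// Rough estimate of a full rebuild, using the figures from the post.
// Written without Java 7+ syntax so it also compiles on JDK 1.6.
class RebuildEstimate {
    /** Number of runs needed, rounding up so a partial last batch still counts. */
    static long batches(long totalDocs, long docsPerRun) {
        return (totalDocs + docsPerRun - 1) / docsPerRun;
    }

    /** Total minutes when one batch is processed per interval. */
    static long totalMinutes(long totalDocs, long docsPerRun, long intervalMinutes) {
        return batches(totalDocs, docsPerRun) * intervalMinutes;
    }

    public static void main(String[] args) {
        // 1.1 million records, 20,000 per run, one run every 10 minutes
        long minutes = totalMinutes(1100000L, 20000L, 10L);
        System.out.println(minutes + " minutes, about " + (minutes / 60.0) + " hours");
    }
}
```

With these inputs the result is 55 runs and 550 minutes, a little over 9 hours.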

If anyone has run into these exceptions, could you point me in the right direction? I would be very grateful.

My indexing code:

// Indexes one batch of records: documents are buffered in a RAMDirectory and
// merged into the on-disk index every commit_size documents.
private synchronized void createIndexMutil(List contentList) {
	FSDirectory fsDir = null;
	RAMDirectory ramDir = null;
	IndexWriter indexWriter = null;   // writer for the in-memory batch
	IndexWriter fsInedxWriter = null; // writer for the on-disk index
	int commit_size = Config.INDEX_COMMIT_SIZE;
	String indexPath = Config.INDEXPATH;
	int count = 0; // documents in the current in-memory batch
	int sum = 0;   // documents added during this call
	try{
		// Create the index directory if it does not exist yet
		File indexDir  = new File(indexPath);
		if((!indexDir.exists()) || (!indexDir.isDirectory())){
			indexDir.mkdirs();
		}
		
		fsDir = FSDirectory.open(indexDir);
		LockFactory lfactory = new SimpleFSLockFactory();
		fsDir.setLockFactory(lfactory);
		Analyzer luceneAnalyzer = new PaodingAnalyzer();
		fsInedxWriter = new IndexWriter(fsDir, luceneAnalyzer, IndexWriter.MaxFieldLength.LIMITED);
		for(int i = 0; i < contentList.size(); i++){
			ContentVo contentVo = (ContentVo)contentList.get(i);
			if(!contentVo.getContentContent().equals("")){
				// Start a new in-memory batch
				if(count == 0){
					ramDir = new RAMDirectory();
					ramDir.setLockFactory(lfactory);
					indexWriter = new IndexWriter(ramDir, luceneAnalyzer, IndexWriter.MaxFieldLength.LIMITED);
				}
				Document document = new Document();
				// Strip HTML tags, &nbsp; entities, runs of whitespace, then any remaining entities
				String s = contentVo.getContentContent().replaceAll("<[^>]+>","").replaceAll("&nbsp;+"," ").replaceAll("\\s+"," ").replaceAll("&(\\w)+;", "");
				document.add(new Field("taskid", contentVo.getTaskID(), Field.Store.YES, Field.Index.NOT_ANALYZED));
				document.add(new Field("contentid", contentVo.getContentID(), Field.Store.YES, Field.Index.NOT_ANALYZED));
				document.add(new Field("content",s,Field.Store.YES,Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));
				
				indexWriter.addDocument(document);
				log.info("[LUCENE THREAD]index created!taskid:"+ contentVo.getTaskID() +",contentid:" + contentVo.getContentID());
				// Record status 1 for this row (the empty-content branch below records 2)
				updateIsIndex(contentVo.getTaskID(), contentVo.getContentID(), 1, contentVo.getIndexType());
				
				count ++;
				sum ++;
				// Batch is full, or this is the last record: merge the batch into the on-disk index
				if(count >= commit_size || i == contentList.size() - 1){
					indexWriter.close();
					fsInedxWriter.addIndexes(ramDir);
					count = 0;
				}
			}else{
				log.info("[LUCENE THREAD]content is empty!taskid:"+ contentVo.getTaskID() +",contentid:" + contentVo.getContentID());
				updateIsIndex(contentVo.getTaskID(), contentVo.getContentID(), 2, contentVo.getIndexType());
			}
		}
		// Exception 1 is thrown here
		fsInedxWriter.optimize();
		log.info("[LUCENE THREAD]append " + sum + ":" + " document!");
	}catch(Exception ex){
		log.error("operation failed", ex);
	}finally{
		// Always close both writers, logging any failure
		if(fsInedxWriter != null){
			try {
				fsInedxWriter.close();
			} catch (Exception e) {
				log.error("operation failed", e);
			}
			fsInedxWriter = null;
		}
		if(indexWriter != null){
			try {
				indexWriter.close();
			} catch (Exception e) {
				log.error("operation failed", e);
			}
			indexWriter = null;
		}
	}
}
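
The batching logic in the method above can be tried without Lucene. The sketch below is an assumption-laden stand-in: plain lists replace the RAMDirectory batch and the on-disk index, and the class and method names are hypothetical. It mirrors the posted loop, where the flush check sits inside the non-empty branch, and compares it with a variant that runs the check for every record. The difference only shows up when the last record in the list has empty content; it is an observation about the loop, not a claimed cause of the two exceptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BatchFlushSketch {
    /** Mirrors the posted loop: the flush check is only reached for non-empty records. */
    static int flushedAsPosted(List<String> contents, int commitSize) {
        List<String> batch = new ArrayList<String>(); // stands in for the RAMDirectory batch
        int flushed = 0;                              // documents merged into the "disk" index
        for (int i = 0; i < contents.size(); i++) {
            if (!contents.get(i).equals("")) {
                batch.add(contents.get(i));
                if (batch.size() >= commitSize || i == contents.size() - 1) {
                    flushed += batch.size();
                    batch.clear();
                }
            }
        }
        return flushed;
    }

    /** Variant: the flush check runs for every record, including a trailing empty one. */
    static int flushedWithTrailingCheck(List<String> contents, int commitSize) {
        List<String> batch = new ArrayList<String>();
        int flushed = 0;
        for (int i = 0; i < contents.size(); i++) {
            if (!contents.get(i).equals("")) {
                batch.add(contents.get(i));
            }
            boolean last = (i == contents.size() - 1);
            if (!batch.isEmpty() && (batch.size() >= commitSize || last)) {
                flushed += batch.size();
                batch.clear();
            }
        }
        return flushed;
    }

    public static void main(String[] args) {
        // Three non-empty records followed by one empty record, batch size 2
        List<String> contents = Arrays.asList("a", "b", "c", "");
        System.out.println("as posted: " + flushedAsPosted(contents, 2));
        System.out.println("trailing check: " + flushedWithTrailingCheck(contents, 2));
    }
}
```

For this input the posted control flow merges 2 of the 3 non-empty records, while the variant merges all 3. In the real method the unmerged record would already have been passed to updateIsIndex with status 1.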

Exception stack traces:

Exception 1:
java.io.IOException: background merge hit exception: _4(3.2):C19999 _5(3.2):Cv5000 _6(3.2):Cv5000 _7(3.2):Cv5000 _8(3.2):Cv5000 into _9 [optimize]
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2536)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2474)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2444)
        at com.paic.rsms_case.biz.LuceneService.createIndexMutil(LuceneService.java:138)
        at com.paic.rsms_case.biz.LuceneService.createNoIndexed(LuceneService.java:84)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:296)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:177)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:144)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:107)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:166)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at $Proxy0.createNoIndexed(Unknown Source)
        at com.paic.rsms_case.biz.ThreadService.run(ThreadService.java:59)
Caused by: java.io.IOException: Stale NFS file handle
        at java.io.RandomAccessFile.close0(Native Method)
        at java.io.RandomAccessFile.close(RandomAccessFile.java:543)
        at org.apache.lucene.store.FSDirectory$FSIndexOutput.close(FSDirectory.java:493)
        at org.apache.lucene.util.IOUtils.closeSafely(IOUtils.java:80)
        at org.apache.lucene.index.FieldsWriter.close(FieldsWriter.java:127)
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:250)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:106)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4194)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3837)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:388)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:456)


Exception 2:
java.io.FileNotFoundException: /nfsc/ifbsm_rsms_220017_vol1/casefile/indexfile/_8e.cfs (No such file or directory)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:69)
        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:90)
        at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:91)
        at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:78)
        at org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:66)
        at org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:55)
        at org.apache.lucene.index.IndexWriter.getFieldInfos(IndexWriter.java:1210)
        at org.apache.lucene.index.IndexWriter.getCurrentFieldInfos(IndexWriter.java:1230)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1166)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:958)
        at com.paic.rsms_case.biz.LuceneService.createIndexMutil(LuceneService.java:106)
        at com.paic.rsms_case.biz.LuceneService.createNoIndexed(LuceneService.java:84)
        at sun.reflect.GeneratedMethodAccessor398.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:296)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:177)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:144)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:107)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:166)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at $Proxy0.createNoIndexed(Unknown Source)
        at com.paic.rsms_case.biz.ThreadService.run(ThreadService.java:59)



The line that throws exception 1 is: fsInedxWriter.optimize();
The line that throws exception 2 is: fsInedxWriter = new IndexWriter(fsDir, luceneAnalyzer, IndexWriter.MaxFieldLength.LIMITED);
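
Both traces involve files that appear to live on a network mount: exception 1 is caused by "Stale NFS file handle", and the path in exception 2 starts with /nfsc/. One way to confirm what kind of file system the index directory sits on is to ask the JVM. The sketch below is only a diagnostic idea, not part of the original code: it assumes Java 7 or later (the post uses JDK 1.6, which lacks java.nio.file), the class and method names are hypothetical, and the type string returned by FileStore.type() is platform-specific, so the "nfs" prefix check is a heuristic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;

class IndexStoreCheck {
    /** True when a file-store type name such as "nfs" or "nfs4" suggests NFS (heuristic). */
    static boolean looksLikeNfs(String storeType) {
        return storeType != null
                && storeType.toLowerCase(Locale.ROOT).startsWith("nfs");
    }

    /** File-store type reported by the JVM for the given path; the string is platform-specific. */
    static String storeTypeOf(String path) throws IOException {
        return Files.getFileStore(Paths.get(path)).type();
    }

    public static void main(String[] args) throws IOException {
        // Pass the index directory as the first argument; defaults to the current directory
        String indexPath = args.length > 0 ? args[0] : ".";
        String type = storeTypeOf(indexPath);
        System.out.println(indexPath + " -> " + type
                + (looksLikeNfs(type) ? " (looks like NFS)" : ""));
    }
}
```

Run it with the value of Config.INDEXPATH as the argument on the server that builds the index.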



sunyh

1: There is no error information.

2: Use several threads to process it. For Lucene this is a small data set.

sunyh
Please paste the error stack trace.
剑指天涯

Quoting the answer from "sunyh":

Please paste the error stack trace.

Hello, sunyh:

Sorry, I forgot to paste the exception stack traces.
