Apache UIMA Ruta (rule-based text annotation) 2.1.0 has been released. Improvements include:
UIMA Ruta Language and Analysis Engine:
- Combinations of rule elements that specify no sequential constraint
- Rule elements with inlined rules, which are interpreted as "actions" or "conditions"
- Dot notation for accessing features, also available as matching conditions
- Implicit actions and conditions
- New analysis engines for CAS manipulation
- Direct import of uimaFIT analysis engines without descriptor
- RutaEngine is now a uimaFIT component
- Manual specification of the start anchor for a rule match
- Many bug fixes
UIMA Ruta Workbench:
- Compatible with uimaj 2.4.2 and Eclipse Kepler
- New framework for constraint-driven evaluation
- Two new rule learning algorithms for the TextRuler framework
- New views, e.g., Annotation Check and Html viewer
- Combinations with Java projects
- An open CAS Editor at Eclipse startup causes fewer problems
- Many bug fixes
For the full list, see: http://uima.apache.org/d/ruta-2.1.0/issuesFixed/jira-report.html
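UIMA (Unstructured Information Management Architecture) is a software system for analyzing large volumes of unstructured information in order to discover knowledge that is useful to end users. A typical UIM application extracts useful information from text documents, such as people, addresses, and organizations.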
The UIMA architecture diagram is shown below: