Alluxio 1.1.0 Released: A Distributed File System

Source: Reader submission
Author: 愚_者
2016-06-12 00:00:00

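Alluxio 1.1.0 has been released. Alluxio is a highly fault-tolerant distributed file system that lets files be shared reliably, at memory speed, across cluster frameworks such as Spark and MapReduce. By leveraging lineage information and using memory aggressively, Alluxio is reported to achieve throughput more than 300 times higher than HDFS. Alluxio serves cached files from memory, so different jobs, queries, and frameworks can all access them at memory speed.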


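  • Java-like file API

  • Compatibility: implements the Hadoop file system interface

  • Pluggable under storage file system

  • Built-in support for raw tables

  • Web-based UI

  • Command-line interface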



  • Improved performance of metadata operations: Alluxio 1.1 master utilizes multiple CPUs more effectively, achieving up to 20x higher throughput and up to 5x lower 99th percentile latency for metadata-heavy workloads.

  • Improved performance of reading Parquet files: Workloads with heavy random I/O, such as reading Parquet files in Spark, are better supported. In particular, Alluxio now allows users to cache partially read blocks with higher read concurrency.

  • Improved performance of writing small files: By optimizing storage structures, Alluxio 1.1 achieves up to 1000x higher throughput when writing small files.


  • Simplified under storage metadata-loading process: Alluxio 1.1 greatly simplifies the work for users to surface under storage information in Alluxio. Loading information about files and directories from under storage now happens automatically the first time a file or directory is accessed.

  • Easy configuration: Alluxio configurations, such as conf/, have been restructured to be simpler to reason about and maintain. Alluxio also provides easier ways to customize Alluxio properties for external jobs (e.g., Spark or MapReduce) which interact with Alluxio. These changes help admins who launch and maintain Alluxio as well as users who run applications on Alluxio.

  • Deployment without sudo access: To help users try out Alluxio, from the 1.1 release onward, Alluxio can be deployed without sudo access. This feature is targeted toward users who want to try out the system’s features but do not require performance guarantees.

  • Under storage I/O delegation (Alpha): To simplify application development and maintenance, Alluxio 1.1 introduces an experimental option to read and write data from the under storage system through Alluxio workers. For example, applications will no longer require libraries to interact with under storage systems, greatly reducing the complexity of developing, configuring, and maintaining applications. This feature is disabled by default and can be enabled by setting the flag alluxio.user.ufs.delegation.enabled=true.

  • Improved user and file system permission checking: Alluxio provides permission semantics similar to the standard UNIX permission semantics, with command-line interfaces including chmod, chown, and chgrp. This feature is disabled by default and can be enabled through a configuration property.
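
As a rough illustration of the configuration points above, the following Python sketch renders Alluxio properties in two forms: `key=value` lines for a properties file, and `-D` JVM options that could be handed to an external job such as Spark or MapReduce. Only the property name alluxio.user.ufs.delegation.enabled comes from the release notes. The helper names are hypothetical, and passing properties as JVM system properties is an assumption about how a given deployment is set up, not something stated in the notes.

```python
def to_properties_lines(props):
    """Render properties as sorted `key=value` lines for a properties file."""
    return [f"{key}={value}" for key, value in sorted(props.items())]


def to_java_opts(props):
    """Render properties as one string of -Dkey=value JVM options."""
    return " ".join(f"-D{key}={value}" for key, value in sorted(props.items()))


# Property name taken from the release notes; the feature is off by default.
delegation = {"alluxio.user.ufs.delegation.enabled": "true"}

if __name__ == "__main__":
    print("\n".join(to_properties_lines(delegation)))
    print(to_java_opts(delegation))
```

Keeping the properties in one dictionary and rendering them on demand means the same settings can be reused for the cluster configuration and for client jobs without retyping them.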


  • Integration with GCS and GCE: Alluxio supports using Google Cloud Storage (GCS) as an under storage system. Users can use the one-click Alluxio deployment to launch a cluster on Google Compute Engine (GCE) which is backed by GCS as the under storage.
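
The automatic metadata loading mentioned in the notes (information about a file in under storage, such as GCS, is surfaced the first time the path is accessed) can be pictured with a toy model. This is a minimal sketch of the load-on-first-access idea only, not Alluxio's implementation. The class name ToyNamespace, its methods, and the dictionary standing in for under storage are all hypothetical.

```python
class ToyNamespace:
    """Toy model: metadata is pulled from under storage on first access."""

    def __init__(self, under_storage):
        # under_storage: dict of path -> size in bytes, a stand-in for an
        # under storage system such as GCS (hypothetical, for illustration).
        self._under_storage = under_storage
        self._metadata = {}  # paths whose metadata has already been loaded
        self.loads = 0       # how many times under storage was consulted

    def get_status(self, path):
        """Return metadata for path, loading it on the first access only."""
        if path not in self._metadata:
            if path not in self._under_storage:
                raise FileNotFoundError(path)
            self.loads += 1
            self._metadata[path] = {
                "path": path,
                "length": self._under_storage[path],
            }
        return self._metadata[path]


if __name__ == "__main__":
    ns = ToyNamespace({"/data/a.parquet": 1024})
    print(ns.get_status("/data/a.parquet"), ns.loads)  # first access loads
    print(ns.get_status("/data/a.parquet"), ns.loads)  # second access does not
```

In this model the user never issues an explicit load step: the first lookup of a path consults under storage, and later lookups are answered from the metadata already held.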

