
It happens with the best of intentions: your design team adds their large graphic files to your project repository - and you see it grow and grow until it's a multi-gigabyte clump…

Working with large binary files in Git can indeed be tricky. Every time a tiny change in a 100 MB Photoshop file is committed, your repository grows by another 100 MB. This quickly adds up and makes your repository almost unusable due to its enormous size.

But of course, not using version control for your design / concept / movie / audio / executables / <other-large-file-use-case> work cannot be the solution. The general benefits of version control still apply and should be reaped in all kinds of projects.

Luckily, there's a Git extension that makes working with large files a lot more efficient: say hello to "Large File Storage" (or simply "LFS" if you prefer nicknames).

Without LFS: Bloated Repositories

Before we look at how exactly LFS works its wonders, we'll take a closer look at the actual problem. Let's consider a simple website project as an example:

A simple project setup

Nothing special: some HTML, CSS, and JS files and a couple of small image assets. However, until now, we haven't included our design assets (Photoshop, Sketch, etc.). It makes a lot of sense to put your design assets under version control, too.

Big binary files in a project

However, here's the catch: each time our designer makes a change (no matter how small) to this new Photoshop file, she will commit another 100 MB to the repository. Very quickly, the repository will weigh tons of megabytes and soon gigabytes - which makes cloning and managing it very tedious.

Although I only talked about "design" files, this is really a problem with all "large" files: movies, audio recordings, datasets, etc.
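
To see the effect on a small scale, here is a minimal sketch that uses plain Git without LFS. It assumes a POSIX shell with git, head and du available. The 1 MB file of random bytes is only a stand-in for a binary design file whose contents change wholesale on every save, and the file name and identity settings are made up for the demo.

```shell
set -eu

# Throwaway repository for the experiment (hypothetical names throughout).
demo=$(mktemp -d)
git -C "$demo" init -q
git -C "$demo" config user.email "demo@example.com"
git -C "$demo" config user.name "Demo User"

for i in 1 2 3; do
  # Overwrite the "design file" with 1 MiB of fresh, incompressible data.
  head -c 1048576 /dev/urandom > "$demo/design.psd"
  git -C "$demo" add design.psd
  git -C "$demo" commit -q -m "Design revision $i"

  # Disk usage of the repository data after this commit, in kilobytes.
  size_kb=$(du -sk "$demo/.git" | cut -f 1)
  printf 'after revision %s: %s KB in .git\n' "$i" "$size_kb"
done

commit_count=$(git -C "$demo" rev-list --count HEAD)
```

Each revision should add roughly another megabyte to .git, even though the working copy only ever contains a single 1 MB file.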

With LFS: Efficient Large File Handling

Of course, LFS cannot simply "magic away" all that large data: it accrues with every change and has to be saved. However, it shifts that burden to the remote server - allowing the local repository to stay relatively lean!

To make this possible, LFS uses a simple trick: it does not keep all of a file's versions in the local repository. Instead, it provides only the files that are necessary in the checked out revision, on demand.

But this poses an interesting question: if those huge files themselves are not present in your local repository… what is present instead? LFS saves lightweight pointers in place of real file data. When you check out a revision with such a pointer, LFS simply looks up the original file (possibly on the server if it's not in its own, special cache) and downloads it for you.

Thereby, you end up with only the files you really want - not a whole bunch of superfluous data that you might never need.
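
As a rough illustration of what such a pointer contains: in the Git LFS pointer format, it is a tiny text file with a spec version line, the SHA-256 hash of the real content, and the content's size in bytes. The sketch below builds an equivalent pointer by hand for a stand-in file. It assumes sha256sum is available (on macOS, shasum -a 256 may be needed instead), and the file name is hypothetical. In real use, LFS writes the pointer for you.

```shell
set -eu

workdir=$(mktemp -d)
# A small stand-in for what would normally be a large binary file.
printf 'pretend this is a large binary file\n' > "$workdir/design.psd"

# The two facts a pointer records about the real content.
oid=$(sha256sum "$workdir/design.psd" | cut -d ' ' -f 1)
size=$(wc -c < "$workdir/design.psd" | tr -d ' ')

# Assemble the three-line pointer text.
pointer=$(printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s' "$oid" "$size")
printf '%s\n' "$pointer"

first_line=$(printf '%s\n' "$pointer" | head -n 1)
```

No matter how big the real file is, the pointer stays at three short lines, which is why committing pointers keeps the local repository lean.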

Installing LFS

LFS is not (yet) part of the core Git binary, but it's available as an extension. This means that, before we can work with LFS, we need to make sure it's installed.

Server

Not all code hosting services support LFS already. As a GitLab user, however, there's not much to worry about: if you're using GitLab.com or a halfway recent version of GitLab CE or EE, support for LFS is already baked in! Your administrator only needs to enable the LFS option.

Local Machine

Your local Git installation also needs to support LFS. If you're using Tower, a Git desktop client, you don't have to install anything: Tower supports Git Large File Storage out of the box.

If you're using Git on the command line, there are different installation options available to you:

  • Binary Packages: Up-to-date binary packages are available for Windows, Mac, Linux, and FreeBSD.

  • Linux: Packages for Debian and RPM are available from PackageCloud.

  • macOS: You can use Homebrew via "brew install git-lfs" or MacPorts via "port install git-lfs".

  • Windows: You can use the Chocolatey package manager via "choco install git-lfs".

After your package manager has finished its work, you need to complete the installation with the "lfs install" command:

git lfs install
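
If you'd like to double-check the result, a small sketch like the following can help. It only uses "git lfs version" and "git config", and it is written so that it also runs (and says so) on a machine where LFS is missing. The variable names are made up for the example.

```shell
# Check whether the git-lfs extension is reachable from Git.
if git lfs version >/dev/null 2>&1; then
  lfs_status="installed"
  git lfs version
else
  lfs_status="missing"
fi
echo "git-lfs: $lfs_status"

# "git lfs install" registers the LFS filters in your Git configuration.
# If that has not happened yet, the fallback text is printed instead.
clean_filter=$(git config --get filter.lfs.clean || true)
echo "clean filter: ${clean_filter:-not configured}"
```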

Tracking Files with LFS

Without further instructions, LFS won't take care of your large file problems. We'll have to tell LFS explicitly which files it should handle!

So let's return to our "big Photoshop file" example. We can instruct LFS to take care of the "design.psd" file using the "lfs track" command:

git lfs track "design-resources/design.psd"

At first glance, the command didn't seem to have much effect. However, you'll notice that a new file in the project's root folder has been created (or changed, if it already existed): .gitattributes collects all file patterns that we choose to track via LFS. Let's take a look at its contents:

cat .gitattributes 
design-resources/design.psd filter=lfs diff=lfs merge=lfs -text

Perfect! From now on, LFS will handle this file. We can now go ahead and add it to the repository in the way we're used to. Notice that any changes to .gitattributes also have to be committed to the repository, just like other modifications:

git add .gitattributes
git add design-resources/design.psd
git commit -m "Add design file"
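
If you're curious what that line in .gitattributes actually tells Git, you can ask Git itself. The sketch below is only an illustration under a few assumptions: it writes the same rule by hand into a throwaway repository (instead of running "lfs track") and uses the standard "git check-attr" command, so it works even without LFS installed.

```shell
set -eu

repo=$(mktemp -d)
git -C "$repo" init -q

# The same rule that "git lfs track" recorded in the example above.
printf 'design-resources/design.psd filter=lfs diff=lfs merge=lfs -text\n' > "$repo/.gitattributes"

# List every attribute Git assigns to the file (the file need not exist).
attrs=$(git -C "$repo" check-attr -a -- design-resources/design.psd)
printf '%s\n' "$attrs"
```

The output should show that filter, diff and merge are all handed over to "lfs", and that "text" is unset, which keeps Git from applying line-ending conversion to the file.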

Tracking File Patterns

Adding a specific, single file like this is all well and good… but what if you want to track, for example, every .indd file in our project? Please relax: you don't have to add each file manually! LFS allows you to define file patterns, much like when ignoring files. The following command, for example, will instruct LFS to track all InDesign files - existing ones and future ones:

git lfs track "*.indd"

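You could also tell LFS to track the contents of a whole directory. Following Git's attribute pattern rules, the pattern below covers the files directly inside that folder; use "design-assets/**" if files in nested subfolders should be included as well: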

git lfs track "design-assets/*"
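
To get a feeling for which paths such patterns cover, here is a small sketch, assuming the two patterns end up in .gitattributes in the same form as the single-file rule shown earlier. It writes them by hand into a throwaway repository and asks Git about three hypothetical paths using "git check-attr", so LFS itself is not required.

```shell
set -eu

repo=$(mktemp -d)
git -C "$repo" init -q

# The two rules that the "git lfs track" commands above would record.
cat > "$repo/.gitattributes" <<'EOF'
*.indd filter=lfs diff=lfs merge=lfs -text
design-assets/* filter=lfs diff=lfs merge=lfs -text
EOF

# Ask Git which filter applies to a few made-up paths.
nested_indd=$(git -C "$repo" check-attr filter -- brochure/cover.indd)
direct_child=$(git -C "$repo" check-attr filter -- design-assets/logo.png)
nested_child=$(git -C "$repo" check-attr filter -- design-assets/icons/home.png)
printf '%s\n' "$nested_indd" "$direct_child" "$nested_child"
```

The first two paths should report "filter: lfs". The third one, sitting in a subfolder of design-assets, should report "filter: unspecified", because "*" does not reach into nested folders.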

Getting an Overview of Tracked Files

At some point, you might want to know which files exactly are tracked by LFS at the moment. You could simply take a look at the .gitattributes file. However, these are not actual files, but only rules and therefore highly "theoretical": individual files might have slipped through, e.g. due to typos or overly restrictive rules.

To see a list of the actual files that you're currently tracking, simply use the git lfs ls-files command:

git lfs ls-files
194dcdb603 * design-resources/design.psd
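
As a complementary check from the rules' side, the following sketch shows how a typo lets a file slip through. It is only an illustration: the repository, the misspelled "*.pds" rule and the file names are invented, and it relies on plain "git ls-files" and "git check-attr" rather than on LFS.

```shell
set -eu

repo=$(mktemp -d)
git -C "$repo" init -q

# A rule with a typo: "*.pds" instead of "*.psd".
printf '*.pds filter=lfs diff=lfs merge=lfs -text\n' > "$repo/.gitattributes"
printf 'stand-in for a large design file\n' > "$repo/design.psd"
git -C "$repo" add .gitattributes design.psd

# List every staged file together with its filter attribute.
report=$(git -C "$repo" ls-files | git -C "$repo" check-attr --stdin filter)
printf '%s\n' "$report"

# Count the files that an LFS rule actually matches.
lfs_matches=$(printf '%s\n' "$report" | grep -c 'filter: lfs$' || true)
echo "files matched by an LFS rule: $lfs_matches"
```

Here the count is zero: the design file was added as a regular Git file, which is exactly the kind of slip that the rules alone won't reveal.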

Track as Early as Possible

Remember that LFS does not change the laws of nature: things that were committed to the repository are there to stay. It's very hard (and dangerous) to change a project's commit history.

This means that you should tell LFS to track a file before it's committed to the repository.

Otherwise, it has become part of your project's history - including all of its megabytes and gigabytes…

The ideal moment to configure which file patterns you want to track is right when initializing a repository (just like with ignoring files).
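
To illustrate why tracking late doesn't help, here is a minimal sketch with plain Git. It assumes a POSIX shell with git, head and du; the file name, identity settings and the 1 MB of random data are made up. A file is committed without LFS, then deleted again in a second commit.

```shell
set -eu

repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.email "demo@example.com"
git -C "$repo" config user.name "Demo User"

# Commit a 1 MiB stand-in for a design file, without LFS.
head -c 1048576 /dev/urandom > "$repo/design.psd"
git -C "$repo" add design.psd
git -C "$repo" commit -q -m "Add design file without LFS"

# Remove the file again in a second commit.
git -C "$repo" rm -q design.psd
git -C "$repo" commit -q -m "Remove design file"

# The file is gone from the working copy, but still reachable in history.
still_in_history=$(git -C "$repo" rev-list --objects --all | grep -c ' design.psd$' || true)
size_kb=$(du -sk "$repo/.git" | cut -f 1)
echo "copies of design.psd in history: $still_in_history"
echo "repository data: $size_kb KB"
```

Even after the deletion, the repository still carries the full megabyte, because the earlier commit keeps referencing the file.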

Using LFS in a GUI

Although LFS is not difficult to use, there are still commands to remember and things to mess up. If you want to be more productive with Git (and LFS), have a look at Tower, a Git desktop client for Mac and Windows. Since Tower comes with built-in support for Git LFS, there is nothing to install. The app has been around for several years and is trusted by over 80,000 users all over the world.

Using Tower to be more productive with Git and Git LFS

Additionally, Tower provides a direct integration with GitLab! After connecting your GitLab account in Tower, you can clone and create repositories with just a single click.

Working with Git

A great aspect of LFS is that you can maintain your normal Git workflow: staging, committing, pushing, pulling and everything else works just like before. Apart from the commands we've discussed, there's nothing to watch out for.

LFS will provide the files you need, when you need them.

In case you're looking for more information about LFS, have a look at this free online book. For general insights about Git, take a look at the Git Tips & Tricks blog post and Tower's video series.

About Guest Author

This is a guest post written by Tobias Günther, who is part of the team behind the Tower Git client.
