-repository. There's one more thing we have to deal with in bup: filesystem
-metadata. git repositories are really only intended to store file contents
-with a small bit of extra information, like symlink support and
-differentiating between executable and non-executable files. For the rest,
-we'll have to store it some other way.
-
-As of this writing, bup's support for metadata is... pretty much
-nonexistent. People are working on it. But the plan goes like this:
-
- - Each git tree will contain a file called .bupmeta.
-
- - .bupmeta contains an entry for every entry in the tree object, sorted in
- the same order as in the tree.
-
- - the .bupmeta entry lists information like modification times, attributes,
- file ownership, and so on for each file in the tree.
-
- - for backward compatibility with pre-metadata versions of bup (and git,
- for that matter) the .bupmeta file for each tree is optional, and if it's
- missing, files will be assumed to have default permissions.
-
- The nice thing about this design is that you can walk through each file in
- a tree just by opening the tree and the .bupmeta contents, and iterating
- through both at the same time.
-
- Trust us, it'll be awesome.
+repository. There's just one more thing we have to deal with:
+filesystem metadata. Git repositories are really only intended to
+store file contents with a small bit of extra information, like
+symlink targets and executable bits, so we have to store the rest
+some other way.
+
+Bup stores more complete metadata in the VFS in a file named .bupm in
+each tree. This file contains one entry for each file in the tree
+object, sorted in the same order as the tree. The first .bupm entry
+is for the directory itself, i.e. ".", and its name is the empty
+string, "".
+
+Each .bupm entry contains a variable-length sequence of records
+holding the metadata for the corresponding path. Each record
+describes one type of metadata. Current types include a common record
+type (containing the normal stat information), a symlink target type,
+a hardlink target type, a POSIX.1e ACL type, etc. See metadata.py for
+the complete list.
+
+The .bupm file is optional, and when it's missing, bup will behave as
+it did before the addition of metadata, and restore files using the
+tree information.
+
+The nice thing about this design is that you can walk through each
+file in a tree just by opening the tree and the .bupm contents, and
+iterating through both at the same time.
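
A minimal sketch of that parallel walk, under stated assumptions: the
tree is modeled as a plain list of names (already in tree order, with
.bupm itself left out), and the decoded .bupm entries as a list whose
first element describes the directory itself. The function name and
the dict-shaped metadata are hypothetical and are not bup's API.

```python
def walk_tree_with_metadata(tree_names, bupm_entries=None):
    """Yield (name, metadata) pairs for a directory and its entries.

    tree_names:   entry names in tree order (hypothetical input shape).
    bupm_entries: decoded .bupm entries, directory first, or None when
                  the tree has no .bupm file.
    """
    if bupm_entries is None:
        # No .bupm: only the tree information is available.
        yield ".", None
        for name in tree_names:
            yield name, None
        return

    entries = iter(bupm_entries)
    # The first .bupm entry belongs to the directory itself.
    yield ".", next(entries)
    # The remaining entries line up one-to-one with the tree order.
    for name, meta in zip(tree_names, entries):
        yield name, meta


if __name__ == "__main__":
    names = ["a.txt", "b.txt"]
    metas = [{"mode": 0o755}, {"mode": 0o644}, {"mode": 0o600}]
    for name, meta in walk_tree_with_metadata(names, metas):
        print(name, meta)
```

Because both sequences share one ordering, no lookup by name is
needed; a single pass over each is enough.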
+
+Since the contents of any .bupm file should match the state of the
+filesystem when it was *indexed*, bup must record the detailed
+metadata in the index. To do this, bup records four values in the
+index: the atime, mtime, and ctime (as timespecs), and an integer
+offset into a secondary "metadata store", which has the same name as
+the index, but with ".meta" appended. This secondary store contains
+the encoded Metadata object corresponding to each path in the index.
+
+Currently, in order to decrease the storage required for the metadata
+store, bup only writes unique values there, reusing offsets when
+appropriate across the index. The effectiveness of this approach
+relies on the expectation that there will be many duplicate metadata
+records. Storing the full timestamps in the index is intended to make
+that more likely, because it makes it unnecessary to record those
+values in the secondary store. Bup therefore clears the timestamps
+before encoding the Metadata objects destined for the index, so
+timestamp differences don't contribute to the uniqueness of the
+metadata.
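
The offset reuse described above can be sketched roughly as follows.
This is an illustration only: the MetadataStore class, the
index_entry function, the dict-shaped metadata, and the JSON encoding
are all assumptions made for the example and do not reflect bup's
on-disk format.

```python
import json

TIME_FIELDS = ("atime", "mtime", "ctime")


class MetadataStore:
    """Append-only store that writes each distinct encoding once."""

    def __init__(self):
        self._data = bytearray()
        self._offsets = {}  # encoded bytes -> offset of first copy

    def store(self, encoded):
        offset = self._offsets.get(encoded)
        if offset is None:
            # First time this encoding is seen: append it.
            offset = len(self._data)
            self._data += encoded
            self._offsets[encoded] = offset
        return offset

    def size(self):
        return len(self._data)


def index_entry(store, meta):
    """Return (atime, mtime, ctime, offset) for one path."""
    times = tuple(meta[field] for field in TIME_FIELDS)
    # Timestamps live in the index itself, so drop them before
    # encoding; otherwise nearly every record would be unique.
    stripped = {k: v for k, v in meta.items() if k not in TIME_FIELDS}
    encoded = json.dumps(stripped, sort_keys=True).encode() + b"\n"
    return times + (store.store(encoded),)


if __name__ == "__main__":
    store = MetadataStore()
    first = {"mode": 0o644, "uid": 1000, "atime": 1, "mtime": 2, "ctime": 3}
    second = {"mode": 0o644, "uid": 1000, "atime": 4, "mtime": 5, "ctime": 6}
    print(index_entry(store, first))
    print(index_entry(store, second))
```

In this sketch, two files that differ only in their timestamps share a
single stored record, which is the effect the design relies on.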
+
+Bup supports recording and restoring hardlinks, and it does so by
+tracking sets of paths that correspond to the same dev/inode pair when
+indexing. This information is stored in an optional file with the
+same name as the index, but ending with ".hlink".
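
As a rough illustration of grouping paths by dev/inode pair, the
sketch below takes (path, dev, ino, nlink) tuples and returns the sets
of paths that refer to the same file. The function name and the input
shape are hypothetical, and the .hlink file format is not modeled
here.

```python
from collections import defaultdict


def group_hardlinks(entries):
    """Map (dev, ino) -> sorted paths for files seen more than once.

    entries: iterable of (path, dev, ino, nlink) tuples
             (a hypothetical input shape for this example).
    """
    groups = defaultdict(list)
    for path, dev, ino, nlink in entries:
        # A link count of 1 means no other name refers to this file.
        if nlink > 1:
            groups[(dev, ino)].append(path)
    # Keep only groups with at least two paths inside the walked tree.
    return {key: sorted(paths)
            for key, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    seen = [
        ("a", 1, 10, 2),
        ("b", 1, 10, 2),
        ("c", 1, 11, 1),
        ("d", 2, 10, 2),  # same inode number, different device
    ]
    print(group_hardlinks(seen))
```

On a real filesystem the tuples could be built from os.lstat(), whose
result exposes st_dev, st_ino, and st_nlink. Keying on the device as
well as the inode matters because inode numbers are only unique within
one filesystem.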
+
+If there are multiple index runs, and the hardlinks change, bup will
+notice this (within whatever subtree it is asked to reindex) and
+update the .hlink information accordingly.
+
+The current hardlink implementation will refuse to link to any file
+that resides outside the restore tree. In addition, if the restore
+tree spans a different set of filesystems than the save tree,
+complete sets of hardlinks may not be restored.