Store metadata in the index, in bupindex.meta; only store unique values.

diff --git a/DESIGN b/DESIGN

index bd1576e353a4fa62718cbc3d48231f2cc64231be..7d2b64d6307f2fbc509c116ccf160696f056cc61 100644
--- a/DESIGN
+++ b/DESIGN
@@ -55,15 +55,9 @@ Essentially, copying data from the filesystem to your repository is called
  a backup using the 'bup save' command, but that's getting ahead of
  ourselves.
  
-As most backup experts know, backing stuff up is normally about 100x more
-common than restoring stuff, ie.  copying from the repository to your
-filesystem.  For that reason, and also because bup is so new, there is no
-actual 'bup restore' command that does the obvious inverse operation to 'bup
-save'.  There are 'bup ftp' and 'bup fuse', which let you access your
-backed-up data, but they aren't as efficient as a fully optimized restore
-tool intended for high-volume restores.  There's nothing stopping us from
-writing one; we just haven't written it yet.  Feel free to pester us about
-it on the bup mailing list (see the README to find out about the list).
+For the inverse operation, i.e. copying from the repository to your
+filesystem, you have several choices; the main ones are 'bup restore', 'bup
+ftp', 'bup fuse', and 'bup web'.
  
  Now, those are the basics of backups.  In other words, we just spent about
  half a page telling you that bup backs up and restores data.  Are we having
@@ -287,7 +281,7 @@ they're written.
  But that leads us to our next problem.
  
  
-Huge numbers of huge packfiles (git.PackMidx, cmd/midx)
+Huge numbers of huge packfiles (midx.py, bloom.py, cmd/midx, cmd/bloom)
  ------------------------------
  
  Git isn't actually designed to handle super-huge repositories.  Most git
@@ -360,9 +354,13 @@ You generate midx files with 'bup midx'.  The downside of midx files is that
  generating one takes a while, and you have to regenerate it every time you
  add a few packs.
  
-(Computer Sciency observers will note that there are some interesting data
-structures out there that could help make things even better.  A very
-promising sounding one is called a "bloom filter." Look it up in Wikipedia.)
+UPDATE: Brandon Low contributed an implementation of "bloom filters", which
+have even better characteristics than midx for certain uses.  Look it up in
+Wikipedia.  He also massively sped up both midx and bloom by rewriting the
+key parts in C.  The nicest thing about bloom filters is we can update them
+incrementally every time we get a new idx, without regenerating from
+scratch.  That makes the update phase much faster, and means we can also get
+away with generating midxes less often.
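
To make the "update them incrementally" point concrete, here is a minimal sketch of a bloom filter that folds in the object ids from each new idx without rebuilding anything. It is not bup's bloom.py: the class name, the add_idx() method, the table size, and the use of SHA-256 to pick bit positions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter over byte-string keys (hypothetical, not bup's format)."""

    def __init__(self, nbits=1 << 16, nhashes=4):
        # nbits should be a multiple of 8; nhashes must be <= 8 because
        # we slice 4-byte chunks out of a 32-byte SHA-256 digest.
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = bytearray(nbits // 8)

    def _positions(self, key):
        # Derive nhashes bit positions from a single digest of the key.
        digest = hashlib.sha256(key).digest()
        for i in range(self.nhashes):
            chunk = digest[4 * i:4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.nbits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def add_idx(self, object_ids):
        # Incremental update: only the new idx's ids are touched.
        for oid in object_ids:
            self.add(oid)

    def __contains__(self, key):
        # True means "maybe present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


bloom = BloomFilter()
bloom.add_idx([b"object-1", b"object-2"])  # ids from the first idx
bloom.add_idx([b"object-3"])               # a later idx, no rebuild needed
```

Because adding a key only ever sets bits, the cost of an update depends on the size of the new idx, not on how many packs are already covered.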
  
  midx files are a bup-specific optimization and git doesn't know what to do
  with them.  However, since they're stored as separate files, they don't
@@ -373,32 +371,64 @@ Detailed Metadata
  -----------------
  
  So that's the basic structure of a bup repository, which is also a git
-repository.  There's one more thing we have to deal with in bup: filesystem
-metadata.  git repositories are really only intended to store file contents
-with a small bit of extra information, like symlink support and
-differentiating between executable and non-executable files.  For the rest,
-we'll have to store it some other way.
-
-As of this writing, bup's support for metadata is... pretty much
-nonexistent.  People are working on it.  But the plan goes like this:
-
- - Each git tree will contain a file called .bupmeta.
- 
- - .bupmeta contains an entry for every entry in the tree object, sorted in
-   the same order as in the tree.
- 
- - the .bupmeta entry lists information like modification times, attributes,
-   file ownership, and so on for each file in the tree.
-   
- - for backward compatibility with pre-metadata versions of bup (and git,
-   for that matter) the .bupmeta file for each tree is optional, and if it's
-   missing, files will be assumed to have default permissions.
-   
- The nice thing about this design is that you can walk through each file in
- a tree just by opening the tree and the .bupmeta contents, and iterating
- through both at the same time.
- 
- Trust us, it'll be awesome.  
+repository.  There's just one more thing we have to deal with:
+filesystem metadata.  Git repositories are really only intended to
+store file contents with a small bit of extra information, like
+symlink targets and executable bits, so we have to store the rest
+some other way.
+
+Bup stores more complete metadata in the VFS in a file named .bupm in
+each tree.  This file contains one entry for each file in the tree
+object, sorted in the same order as the tree.  The first .bupm entry
+is for the directory itself, i.e. ".", and its name is the empty
+string, "".
+
+Each .bupm entry contains a variable length sequence of records
+containing the metadata for the corresponding path.  Each record
+holds one type of metadata.  Current types include a common record
+type (containing the normal stat information), a symlink target type,
+a hardlink target type, a POSIX1e ACL type, etc.  See metadata.py for
+the complete list.
+
+The .bupm file is optional, and when it's missing, bup will behave as
+it did before the addition of metadata, and restore files using the
+tree information.
+
+The nice thing about this design is that you can walk through each
+file in a tree just by opening the tree and the .bupm contents, and
+iterating through both at the same time.
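
The parallel walk could look roughly like the sketch below. It is a simplification under stated assumptions: tree entries are given as a list of names, .bupm entries as an already-decoded list, and every tree entry is assumed to have exactly one .bupm entry after the leading one for the directory itself. The function name and the dict-shaped metadata are hypothetical, not taken from metadata.py.

```python
def walk_with_metadata(tree_names, bupm_entries):
    """Yield (name, metadata) pairs for a directory and its entries.

    Assumes bupm_entries[0] describes the directory itself (name "")
    and the remaining entries follow the tree's sort order.
    """
    meta_iter = iter(bupm_entries)
    # The first .bupm entry belongs to the directory, i.e. ".".
    yield "", next(meta_iter)
    # The rest line up one-to-one with the tree entries.
    for name, meta in zip(tree_names, meta_iter):
        yield name, meta


pairs = list(walk_with_metadata(
    ["a.txt", "b.txt"],
    [{"mode": 0o755}, {"mode": 0o644}, {"mode": 0o600}],
))
```

No lookups by name are needed, since both sequences are sorted the same way.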
+
+Since the contents of any .bupm file should match the state of the
+filesystem when it was *indexed*, bup must record the detailed
+metadata in the index.  To do this, bup records four values in the
+index: the atime, mtime, and ctime (as timespecs), and an integer
+offset into a secondary "metadata store" which has the same name as
+the index, but with ".meta" appended.  This secondary store contains
+the encoded Metadata object corresponding to each path in the index.
+
+Currently, in order to decrease the storage required for the metadata
+store, bup only writes unique values there, reusing offsets when
+appropriate across the index.  The effectiveness of this approach
+relies on the expectation that there will be many duplicate metadata
+records.  Storing the full timestamps in the index is intended to make
+that more likely, because it makes it unnecessary to record those
+values in the secondary store.  So bup clears them before encoding the
+Metadata objects destined for the index, and timestamp differences
+don't contribute to the uniqueness of the metadata.
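
As a rough sketch of the unique-values idea: strip the timestamps, encode what is left, and hand back the offset of an existing copy when the same bytes have been stored before. The MetaStore class, the length-prefixed layout, and encode_without_times() are assumptions made for this example; they are not bup's actual on-disk format or its Metadata encoding.

```python
import io


class MetaStore:
    """Append-only store that writes each distinct encoded record once."""

    def __init__(self):
        self._buf = io.BytesIO()
        self._offsets = {}  # encoded record -> offset of its first copy

    def store(self, encoded):
        offset = self._offsets.get(encoded)
        if offset is None:
            # New value: append it and remember where it went.
            offset = self._buf.seek(0, io.SEEK_END)
            # A 4-byte length prefix tells a reader where the record ends.
            self._buf.write(len(encoded).to_bytes(4, "big") + encoded)
            self._offsets[encoded] = offset
        return offset

    def load(self, offset):
        self._buf.seek(offset)
        size = int.from_bytes(self._buf.read(4), "big")
        return self._buf.read(size)


def encode_without_times(meta):
    # Timestamps live in the index entry itself, so drop them here;
    # otherwise nearly every record would be unique.
    stripped = {k: v for k, v in meta.items()
                if k not in ("atime", "mtime", "ctime")}
    return repr(sorted(stripped.items())).encode()


store = MetaStore()
a = {"mode": 0o644, "uid": 1000, "mtime": 1}
b = {"mode": 0o644, "uid": 1000, "mtime": 2}  # differs only in mtime
c = {"mode": 0o755, "uid": 1000, "mtime": 2}
off_a = store.store(encode_without_times(a))
off_b = store.store(encode_without_times(b))  # reuses off_a
off_c = store.store(encode_without_times(c))  # new record, new offset
```

Here a and b end up sharing one offset because they differ only in mtime, which is exactly the case the index timestamps are meant to absorb.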
+
+Bup supports recording and restoring hardlinks, and it does so by
+tracking sets of paths that correspond to the same dev/inode pair when
+indexing.  This information is stored in an optional file with the
+same name as the index, but ending with ".hlink".
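
The dev/inode grouping can be sketched as below. This only illustrates the idea; the function name and the (path, dev, inode) tuple input are hypothetical, and nothing here reflects the actual .hlink file format. In practice the dev and inode values would come from something like os.lstat() on each path.

```python
from collections import defaultdict


def group_hardlinks(entries):
    """Group paths that share a (dev, inode) pair.

    entries: iterable of (path, dev, inode) tuples.
    Returns a list of sorted path lists, one per set of hardlinked paths;
    paths with a unique dev/inode pair are left out.
    """
    by_inode = defaultdict(list)
    for path, dev, ino in entries:
        by_inode[(dev, ino)].append(path)
    return [sorted(paths) for paths in by_inode.values() if len(paths) > 1]


links = group_hardlinks([
    ("/home/a", 2049, 100),
    ("/home/b", 2049, 100),  # same dev and inode as /home/a
    ("/home/c", 2049, 101),
    ("/mnt/d", 2050, 100),   # same inode number, different device
])
```

Keying on the pair rather than the inode alone matters, because inode numbers are only unique within one filesystem.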
+
+If there are multiple index runs, and the hardlinks change, bup will
+notice this (within whatever subtree it is asked to reindex) and
+update the .hlink information accordingly.
+
+The current hardlink implementation will refuse to link to any file
+that resides outside the restore tree, and if the restore tree spans a
+different set of filesystems than the save tree, complete sets of
+hardlinks may not be restored.
  
  
  Filesystem Interaction
@@ -568,8 +598,9 @@ compare the files in the index against the ones in the backup set, and
  update only the ones that have changed.  (Even more interesting things
  happen if people are using the files on the restored system and you haven't
  updated the index yet; the net result would be an automated merge of all
-non-conflicting files.)  This would be a poor man's distributed filesystem. 
-The only catch is that nobody has written 'bup restore' yet.  Someday!
+non-conflicting files.) This would be a poor man's distributed filesystem. 
+The only catch is that nobody has written this feature for 'bup restore'
+yet.  Someday!
  
  
  How 'bup save' works (cmd/save)
@@ -606,3 +637,7 @@ things to note:
  We hope you'll enjoy bup.  Looking forward to your patches!
  
  -- apenwarr and the rest of the bup team
+
+Local Variables:
+mode: text
+End: