Wrote man pages for bup(1) and all the subcommands.

author Avery Pennarun <apenwarr@gmail.com>

Sat, 6 Feb 2010 22:18:33 +0000 (17:18 -0500)

committer Avery Pennarun <apenwarr@gmail.com>

Sat, 6 Feb 2010 22:18:33 +0000 (17:18 -0500)
diff --git a/Documentation/bup-damage.1.md b/Documentation/bup-damage.1.md

new file mode 100644 (file)

index 0000000..7d604d8
--- /dev/null
+++ b/Documentation/bup-damage.1.md
@@ -0,0 +1,94 @@
+% bup-damage(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup damage - randomly destroy blocks of a file
+
+# SYNOPSIS
+
+bup damage [-n count] [-s maxsize] [--percent pct] [-S seed] 
+[--equal] <filenames...>
+
+# DESCRIPTION
+
+Use `bup damage` to deliberately destroy blocks in a
+`.pack` or `.idx` file (from `.bup/objects/pack`) to test
+the recovery features of `bup-fsck`(1) or other programs.
+
+*THIS PROGRAM IS EXTREMELY DANGEROUS AND WILL DESTROY YOUR
+DATA*
+
+`bup damage` is primarily useful for automated or manual tests
+of data recovery tools, to reassure yourself that the tools
+actually work.
+
+# OPTIONS
+
+-n, --num=*numblocks*
+:   the number of separate blocks to damage in each file
+    (default 10).
+    Note that it's possible for more than one damaged
+    segment to fall in the same `bup-fsck`(1) recovery block,
+    so you might not damage as many recovery blocks as you
+    expect.  If this is a problem, use `--equal`.
+
+-s, --size=*maxblocksize*
+:   the maximum size, in bytes, of each damaged block
+    (default 1 unless `--percent` is specified).  Note that
+    because of the way `bup-fsck`(1) works, a multi-byte
+    block could fall on the boundary between two recovery
+    blocks, thus damaging two separate recovery blocks.
+    In small files, it's also possible for a damaged block
+    to be larger than a recovery block.  If these issues
+    might be a problem, you should use the default damage
+    size of one byte.
+    
+--percent=*maxblockpercent*
+:   the maximum size, in percent of the original file, of
+    each damaged block.  If both `--size` and `--percent`
+    are given, the maximum block size is the minimum of the
+    two restrictions.  You can use this to ensure that a
+    given block will never damage more than one or two
+    `bup-fsck`(1) recovery blocks.
+    
+-S, --seed=*randomseed*
+:   seed the random number generator with the given value. 
+    If you use this option, your tests will be repeatable,
+    since the damaged block offsets, sizes, and contents
+    will be the same every time.  By default, the random
+    numbers are different every time (so you can run tests
+    in a loop and repeatedly test with different
+    damage each time).
+    
+--equal
+:   instead of choosing random offsets for each damaged
+    block, space the blocks equally throughout the file,
+    starting at offset 0.  If you also choose a correct
+    maximum block size, this can guarantee that any given
+    damage block never damages more than one `bup-fsck`(1)
+    recovery block.  (This is also guaranteed if you use
+    `-s 1`.)
+    
+# EXAMPLE
+
+    # make a backup in case things go horribly wrong
+    cp -a ~/.bup/objects/pack ~/bup-packs.bak
+    
+    # generate recovery blocks for all packs
+    bup fsck -g
+    
+    # deliberately damage the packs
+    bup damage -n 10 -s 1 -S 0 ~/.bup/objects/pack/*.{pack,idx}
+    
+    # recover from the damage
+    bup fsck -r
+
+# SEE ALSO
+
+`bup-fsck`(1), `par2`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
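Note on `--size`, `--percent`, and `--equal` in bup-damage.1.md: the following is a minimal Python sketch of the arithmetic the man page describes, not the code of `bup damage` itself. The function names and the one-byte floor on the block size are assumptions made for illustration.

```python
def max_block_size(file_size, size=None, percent=None):
    """Largest damaged-block size allowed by --size and --percent."""
    limits = []
    if size is not None:
        limits.append(size)
    if percent is not None:
        limits.append(file_size * percent // 100)
    if not limits:
        # Man page: the default is 1 byte unless --percent is given.
        return 1
    # Man page: with both options, the smaller limit wins.
    # Assumption: never go below one byte.
    return max(1, min(limits))


def equal_offsets(file_size, count):
    """Offsets of `count` blocks spaced evenly through the file, starting at 0."""
    stride = file_size // count
    return [i * stride for i in range(count)]
```

With a 1000-byte file and `-n 10`, the blocks start 100 bytes apart, so any maximum block size of 100 bytes or less keeps the damaged blocks from overlapping.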
diff --git a/Documentation/bup-drecurse.1.md b/Documentation/bup-drecurse.1.md

new file mode 100644 (file)

index 0000000..a1f6f1b
--- /dev/null
+++ b/Documentation/bup-drecurse.1.md
@@ -0,0 +1,52 @@
+% bup-drecurse(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup drecurse - recursively list files in the filesystem
+
+# SYNOPSIS
+
+bup drecurse [-x] [-q] [--profile] <path>
+
+# DESCRIPTION
+
+`bup drecurse` traverses files in the filesystem in a way
+similar to `find`(1).  In most cases, you should use
+`find`(1) instead.
+
+This program is useful mainly for testing the file
+traversal algorithm used in `bup-index`(1).
+
+Note that filenames are returned in reverse alphabetical
+order, as in `bup-index`(1).  This is important because you
+can't generate the hash of a parent directory until you
+have generated the hashes of all its children.  When
+listing files in reverse order, the parent directory will
+come after its children, making this easy.
+
+# OPTIONS
+
+-x, --xdev, --one-file-system
+:   don't cross filesystem boundaries.
+
+-q, --quiet
+:   don't print filenames as they are encountered.  Useful
+    when testing performance of the traversal algorithms.
+    
+--profile
+:   print profiling information upon completion.  Useful
+    when testing performance of the traversal algorithms.
+    
+# EXAMPLE
+
+    bup drecurse -x /
+
+# SEE ALSO
+
+`bup-index`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
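Note on the traversal order in bup-drecurse.1.md: a rough Python sketch of a reverse-alphabetical walk in which every directory is reported after its children. This only illustrates the ordering argument from the man page; `walk_reverse` is a hypothetical helper, not the traversal code used by `bup index`.

```python
import os


def walk_reverse(path):
    """Yield paths under `path` in reverse alphabetical order.

    Children are yielded before their parent directory, so a parent's
    hash could be computed as soon as the parent itself is reached.
    """
    for name in sorted(os.listdir(path), reverse=True):
        full = os.path.join(path, name)
        if os.path.isdir(full) and not os.path.islink(full):
            yield from walk_reverse(full)
        else:
            yield full
    # The directory comes last, after everything inside it.
    yield path
```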
diff --git a/Documentation/bup-fsck.1.md b/Documentation/bup-fsck.1.md

new file mode 100644 (file)

index 0000000..90d837e
--- /dev/null
+++ b/Documentation/bup-fsck.1.md
@@ -0,0 +1,117 @@
+% bup-fsck(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup fsck - verify or repair a bup repository
+
+# SYNOPSIS
+
+bup fsck [-r] [-g] [-v] [--quick] [-j *jobs*] [--par2-ok]
+[--disable-par2] [filenames...]
+
+# DESCRIPTION
+
+`bup fsck` is a tool for validating bup repositories in the
+same way that `git fsck` validates git repositories.
+
+It can also generate and/or use "recovery blocks" using the
+`par2`(1) tool (if you have it installed).  This allows you
+to recover from damaged blocks covering up to 5% of your
+`.pack` files.
+
+In a normal backup system, damaged blocks are less
+important, because there tends to be enough data duplicated
+between backup sets that a single damaged backup set is
+non-critical.  In a deduplicating backup system like bup,
+however, no block is ever stored more than once, even if it
+is used in every single backup.  If that block were to be
+unrecoverable, *all* your backup sets would be
+damaged at once.  Thus, it's important to be able to verify
+the integrity of your backups and recover from disk errors
+if they occur.
+
+*WARNING*: bup fsck's recovery features are not available
+unless you have the free `par2`(1) package installed on
+your bup server.
+
+*WARNING*: bup fsck obviously cannot recover from a
+complete disk failure.  If your backups are important, you
+need to carefully consider redundancy (such as using RAID
+for multi-disk redundancy, or making off-site backups for
+site redundancy).
+
+# OPTIONS
+
+-r, --repair
+:   attempt to repair any damaged packs using
+    existing recovery blocks.  (Requires `par2`(1).)
+    
+-g, --generate
+:   generate recovery blocks for any packs that don't
+    already have them.  (Requires `par2`(1).)
+
+-v, --verbose
+:   increase verbosity (can be used more than once).
+
+--quick
+:   don't run a full `git verify-pack` on each pack file;
+    instead just check the final checksum.  This can cause
+    a significant speedup with no obvious decrease in
+    reliability.  However, you may want to avoid this
+    option if you're paranoid.  Has no effect on packs that
+    already have recovery information.
+    
+-j, --jobs=*numjobs*
+:   maximum number of pack verifications to run at a time. 
+    The optimal value for this option depends on how fast your
+    CPU can verify packs vs. your disk throughput.  If you
+    run too many jobs at once, your disk will get saturated
+    by seeking back and forth between files and performance
+    will actually decrease, even if *numjobs* is less than
+    the number of CPU cores on your system.  You can
+    experiment with this option to find the optimal value.
+    
+--par2-ok
+:   immediately return 0 if `par2`(1) is installed and
+    working, or 1 otherwise.  Do not actually check
+    anything.
+    
+--disable-par2
+:   pretend that `par2`(1) is not installed, and ignore all
+    recovery blocks.
+
+
+# EXAMPLE
+
+    # generate recovery blocks for all packs that don't
+    # have them
+    bup fsck -g
+    
+    # generate recovery blocks for a particular pack
+    bup fsck -g ~/.bup/objects/pack/153a1420cb1c8*.pack
+    
+    # check all packs for correctness (can be very slow!)
+    bup fsck
+    
+    # check all packs for correctness and recover any
+    # damaged ones
+    bup fsck -r
+    
+    # check a particular pack for correctness and recover
+    # it if damaged
+    bup fsck -r ~/.bup/objects/pack/153a1420cb1c8*.pack
+    
+    # check if recovery blocks are available on this system
+    if bup fsck --par2-ok; then
+       echo "par2 is ok"
+    fi
+
+# SEE ALSO
+
+`bup-damage`(1), `fsck`(1), `git-fsck`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-fuse.1.md b/Documentation/bup-fuse.1.md

new file mode 100644 (file)

index 0000000..bb29364
--- /dev/null
+++ b/Documentation/bup-fuse.1.md
@@ -0,0 +1,49 @@
+% bup-fuse(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup fuse - mount a bup repository as a filesystem
+
+# SYNOPSIS
+
+bup fuse [-d] [-f] <mountpoint>
+
+# DESCRIPTION
+
+`bup fuse` opens a bup repository and exports it as a
+`fuse`(7) userspace filesystem.
+
+This feature is only available on systems (such as Linux)
+which support FUSE.
+
+**WARNING**: bup fuse is still experimental and does not
+enforce any file permissions!  All files will be readable
+by all users.
+
+
+# OPTIONS
+
+-d, --debug
+:   run in the foreground and print FUSE debug information
+    for each request.
+
+-f, --foreground
+:   run in the foreground and exit only when the filesystem
+    is unmounted.
+
+
+# EXAMPLE
+
+    rm -rf /tmp/buptest
+    mkdir /tmp/buptest
+    sudo bup fuse -d /tmp/buptest
+
+# SEE ALSO
+
+`fuse`(7), `fusermount`(1), `bup-ls`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-index.1.md b/Documentation/bup-index.1.md

new file mode 100644 (file)

index 0000000..cb8b95e
--- /dev/null
+++ b/Documentation/bup-index.1.md
@@ -0,0 +1,117 @@
+% bup-index(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup index - print and/or update the bup filesystem index
+
+# SYNOPSIS
+
+bup index <-p|-m|-u> [-s] [-H] [-l] [-x] [--fake-valid]
+[--check] [-f *indexfile*] [-v] <filenames...>
+
+# DESCRIPTION
+
+`bup index` prints and/or updates the bup filesystem index,
+which is a cache of the filenames, attributes, and sha-1
+hashes of each file and directory in the filesystem.  The
+bup index is similar in function to the `git`(1) index, and
+can be found in `~/.bup/bupindex`.
+
+Creating a backup in bup consists of two steps: updating
+the index with `bup index`, then actually backing up the
+files (or a subset of the files) with `bup save`.  The
+separation exists for these reasons:
+
+1. There is more than one way to generate a list of files
+that need to be backed up.  For example, you might want to
+use `inotify`(7) or `dnotify`(7).
+
+2. Even if you back up files to multiple destinations (for
+added redundancy), the file names, attributes, and hashes
+will be the same each time.  Thus, you can save the trouble
+of repeatedly re-generating the list of files for each
+backup set.
+
+3. You may want to use the data tracked by bup index for
+other purposes (such as speeding up other programs that
+need the same information).
+
+
+# OPTIONS
+
+-u, --update
+:   (recursively) update the index for the given filenames and
+    their descendants.  One or more filenames must be
+    given.
+
+-p, --print
+:   print the contents of the index.  If filenames are
+    given, shows the given entries and their descendants. 
+    If no filenames are given, shows the entries starting
+    at the current working directory (.).
+    
+-m, --modified
+:   prints only files which are marked as modified (i.e.
+    changed since the most recent backup) in the index. 
+    Implies `-p`.
+
+-s, --status
+:   prepend a status code (A, M, D, or space) before each
+    filename.  Implies `-p`.  The codes mean, respectively,
+    that a file is marked in the index as added, modified,
+    deleted, or unchanged since the last backup.
+    
+-H, --hash
+:   for each file printed, prepend the most recently
+    recorded hash code.  The hash code is normally
+    generated by `bup save`.  For objects which have not yet
+    been backed up, the hash code will be
+    0000000000000000000000000000000000000000.  Note that
+    the hash code is printed even if the file is known to
+    be modified or deleted in the index (i.e. the file on
+    the filesystem no longer matches the recorded hash). 
+    If this is a problem for you, use `--status`.
+    
+-l, --long
+:   print more information about each file, in a similar
+    format to the `-l` option to `ls`(1).  (INCOMPLETE)
+
+-x, --xdev, --one-file-system
+:   don't cross filesystem boundaries when recursing
+    through the filesystem.  Only applicable if you're
+    using `-u`.
+    
+--fake-valid
+:   mark specified filenames as up-to-date even if they
+    aren't.  This can be useful for testing, or to avoid
+    unnecessarily backing up files that you know are
+    boring.
+    
+--check
+:   carefully check index file integrity before and after
+    updating.  Mostly useful for automated tests.
+
+-f, --indexfile=*indexfile*
+:   use a different index filename instead of
+    `~/.bup/bupindex`.
+
+-v, --verbose
+:   increase log output during update (can be used more
+    than once).  With one `-v`, print each directory as it
+    is updated; with two `-v`, print each file too.
+
+
+# EXAMPLE
+
+    bup index -vux /etc /var /usr
+    
+
+# SEE ALSO
+
+`bup-save`(1), `bup-drecurse`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-init.1.md b/Documentation/bup-init.1.md

new file mode 100644 (file)

index 0000000..6a5e205
--- /dev/null
+++ b/Documentation/bup-init.1.md
@@ -0,0 +1,40 @@
+% bup-init(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup init - initialize a bup repository
+
+# SYNOPSIS
+
+[BUP_DIR=*localpath*] bup init [-r *host*:*path*]
+
+# DESCRIPTION
+
+`bup init` initializes your local bup repository.  You
+usually don't need to run it unless you have set BUP_DIR
+explicitly.  By default, BUP_DIR is `~/.bup` and will be
+initialized automatically whenever you run any bup command.
+
+# OPTIONS
+
+-r, --remote=*host*:*path*
+:   Initialize not only the local repository, but also the
+    remote repository given by the *host* and *path*.  This is
+    not necessary if you intend to back up to the default
+    location on the server (i.e. a blank *path*).
+
+
+# EXAMPLE
+
+    bup init
+    
+
+# SEE ALSO
+
+`bup-fsck`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-join.1.md b/Documentation/bup-join.1.md

new file mode 100644 (file)

index 0000000..68239a4
--- /dev/null
+++ b/Documentation/bup-join.1.md
@@ -0,0 +1,53 @@
+% bup-join(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup join - concatenate files from a bup repository
+
+# SYNOPSIS
+
+bup join [-r *host*:*path*] [refs or hashes...]
+
+# DESCRIPTION
+
+`bup join` is roughly the opposite operation to
+`bup-split`(1).  You can use it to retrieve the contents of
+a file from a local or remote bup repository.
+
+The supplied list of refs or hashes can be in any format
+accepted by `git`(1), including branch names, commit ids,
+tree ids, or blob ids.
+
+If no refs or hashes are given on the command line, `bup
+join` reads them from stdin instead.
+
+# OPTIONS
+
+-r, --remote=*host*:*path*
+:   Retrieves objects from the given remote repository
+    instead of the local one. *path* may be blank, in which
+    case the default remote repository is used.
+
+
+# EXAMPLE
+
+    # split and then rejoin a file using its tree id
+    TREE=$(tar -cvf - /etc | bup split -t)
+    bup join $TREE | tar -tf -
+    
+    # make two backups, then get the second-most-recent.
+    # mybackup~1 is git(1) notation for the second most
+    # recent commit on the branch named mybackup.
+    tar -cvf - /etc | bup split -n mybackup
+    tar -cvf - /etc | bup split -n mybackup
+    bup join mybackup~1 | tar -tf -
+
+# SEE ALSO
+
+`bup-split`(1), `bup-save`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-ls.1.md b/Documentation/bup-ls.1.md

new file mode 100644 (file)

index 0000000..9d6a82e
--- /dev/null
+++ b/Documentation/bup-ls.1.md
@@ -0,0 +1,42 @@
+% bup-ls(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup ls - list the contents of a bup repository
+
+# SYNOPSIS
+
+bup ls [-s] <paths...>
+
+# DESCRIPTION
+
+`bup ls` lists files and directories in your bup repository
+in the same layout as they would appear with `bup-fuse`(1).
+
+The top level directory is the branch (corresponding to
+the `-n` option in `bup save`), the next level is the date
+of the backup, and subsequent levels correspond to files in
+the backup.
+
+Once you have identified the file you want using `bup ls`,
+you can view its contents using `bup join` or `git show`.
+
+# OPTIONS
+
+-s, --hash
+:   show hash for each file/directory.
+
+
+# EXAMPLE
+
+    bup ls /myserver/1999-01-01/etc/profile
+
+# SEE ALSO
+
+`bup-join`(1), `bup-fuse`(1), `bup-save`(1), `git-show`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-margin.1.md b/Documentation/bup-margin.1.md

new file mode 100644 (file)

index 0000000..7547aee
--- /dev/null
+++ b/Documentation/bup-margin.1.md
@@ -0,0 +1,53 @@
+% bup-margin(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup margin - figure out your deduplication safety margin
+
+# SYNOPSIS
+
+bup margin
+
+# DESCRIPTION
+
+`bup margin` iterates through all objects in your bup
+repository, calculating the largest number of prefix bits
+shared between any two entries.  This number, `n`,
+identifies the longest SHA-1 prefix you could use and still
+encounter a collision between your object ids.
+
+For example, one system that was tested had a collection of
+11 million objects (70 GB), and `bup margin` returned 45.
+That means a 46-bit hash would be sufficient to avoid all
+collisions among that set of objects; each object in that
+repository could be uniquely identified by its first 46
+bits.
+
+The number of bits needed seems to increase by about 1 or 2
+for every doubling of the number of objects.  Since SHA-1
+hashes have 160 bits, that leaves 115 bits of margin.  Of
+course, because SHA-1 hashes are essentially random, it's
+theoretically possible to use many more bits with far fewer
+objects.
+
+If you're paranoid about the possibility of SHA-1
+collisions, you can monitor your repository by running `bup
+margin` occasionally to see if you're getting dangerously
+close to 160 bits.
+
+# EXAMPLE
+
+    $ bup margin
+    Reading indexes: 100.00% (11188299/11188299), done.
+    45
+    
+
+# SEE ALSO
+
+`bup-midx`(1), `bup-save`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
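Note on the calculation in bup-margin.1.md: the quantity described there (the largest number of prefix bits shared by any two object ids) can be sketched in Python as below. This is an illustration under the assumption that ids are available as equal-length byte strings; it is not taken from the bup source, and both function names are hypothetical.

```python
def shared_prefix_bits(a: bytes, b: bytes) -> int:
    """Number of leading bits that two equal-length ids have in common."""
    bits = 0
    for x, y in zip(a, b):
        diff = x ^ y
        if diff == 0:
            bits += 8
            continue
        # The highest set bit of diff marks the first differing bit,
        # so 8 - bit_length(diff) leading bits of this byte still match.
        return bits + 8 - diff.bit_length()
    return bits


def margin(ids):
    """Largest shared prefix, in bits, between any two ids."""
    ordered = sorted(ids)
    # After sorting, the pair with the longest common prefix is adjacent,
    # so only neighbouring entries need to be compared.
    return max(
        (shared_prefix_bits(a, b) for a, b in zip(ordered, ordered[1:])),
        default=0,
    )
```

Sorting first keeps the work at one comparison per object rather than one per pair, which matters for repositories with millions of objects.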
diff --git a/Documentation/bup-midx.1.md b/Documentation/bup-midx.1.md

new file mode 100644 (file)

index 0000000..6dbc1bc
--- /dev/null
+++ b/Documentation/bup-midx.1.md
@@ -0,0 +1,92 @@
+% bup-midx(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup midx - create a multi-index (.midx) file from several .idx files
+
+# SYNOPSIS
+
+bup midx [-o *outfile*] <-a|-f|*idxnames*...>
+
+# DESCRIPTION
+
+`bup midx` creates a multi-index (.midx) file from one or more
+git pack index (.idx) files.
+
+You should run this command
+occasionally to ensure your backups run quickly and without
+requiring too much RAM.
+
+# OPTIONS
+
+-o, --output
+:   use the given output filename for the .midx file. 
+    Default is auto-generated.
+    
+-a, --auto
+:   automatically generate new .midx files for any .idx
+    files where it would be appropriate.
+    
+-f, --force
+:   force generation of a single new .midx file containing
+    *all* your .idx files, even if other .midx files
+    already exist.  This will result in the fastest backup
+    performance, but may take a long time to run.
+
+
+# EXAMPLE
+
+    $ bup midx -a
+    Merging 21 indexes (2278559 objects).
+    Table size: 524288 (17 bits)
+    Reading indexes: 100.00% (2278559/2278559), done.
+    midx-b66d7c9afc4396187218f2936a87b865cf342672.midx
+    
+# DISCUSSION
+
+By default, bup uses git-formatted pack files, which
+consist of a pack file (containing objects) and an idx
+file (containing a sorted list of object names and their
+offsets in the .pack file).
+
+Normal idx files are convenient because they let you use
+`git`(1) to access your backup datasets.  However, idx
+files can get slow when you have a lot of very large packs
+(which git typically doesn't have, but bup often does).
+
+A bup .midx file consists of a single sorted list of all the objects
+contained in all the .pack files it references.  This list
+can be binary searched in about log2(m) steps, where m is
+the total number of objects.
+
+To further speed up the search, midx files also have a
+variable-sized fanout table that reduces the first n
+steps of the binary search.  With the help of this fanout
+table, bup can narrow down which page of the midx file a
+given object id would be in (if it exists) with a single
+lookup.  Thus, typical searches will only need to swap in
+two pages: one for the fanout table, and one for the object
+id.
+
+midx files are most useful when creating new backups, since
+searching for a nonexistent object in the repository
+necessarily requires searching through *all* the index
+files to ensure that it does not exist.  (Searching for
+objects that *do* exist can be optimized; for example,
+consecutive objects are often stored in the same pack, so
+we can search that one first using an MRU algorithm.)
+
+With large repositories, you should be sure to run
+`bup midx -a` or `bup midx -f` every now and then so that
+creating backups will remain efficient.
+
+
+# SEE ALSO
+
+`bup-save`(1), `bup-margin`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
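Note on the DISCUSSION section of bup-midx.1.md: the idea of a fanout table that skips the first steps of a binary search can be sketched in Python as follows. The in-memory layout, the function names, and the 32-bit limit on the prefix width are assumptions for illustration only; the actual .midx file format is not described in this commit.

```python
import bisect


def prefix(oid, bits):
    """First `bits` bits of an id (this sketch assumes bits <= 32)."""
    return int.from_bytes(oid[:4], "big") >> (32 - bits)


def build_fanout(sorted_ids, bits):
    """fanout[p] = number of ids whose first `bits` bits are <= p."""
    table = [0] * (1 << bits)
    for oid in sorted_ids:
        table[prefix(oid, bits)] += 1
    # Turn per-prefix counts into running totals.
    for p in range(1, len(table)):
        table[p] += table[p - 1]
    return table


def contains(sorted_ids, fanout, bits, oid):
    """Look up `oid`, binary searching only the slice the fanout selects."""
    p = prefix(oid, bits)
    lo = fanout[p - 1] if p else 0
    hi = fanout[p]
    i = bisect.bisect_left(sorted_ids, oid, lo, hi)
    return i < hi and sorted_ids[i] == oid
```

One fanout lookup narrows the search to the entries sharing the id's leading bits, which is why a typical search touches only the fanout table and one small region of the sorted list.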
diff --git a/Documentation/bup-save.1.md b/Documentation/bup-save.1.md

new file mode 100644 (file)

index 0000000..363cef4
--- /dev/null
+++ b/Documentation/bup-save.1.md
@@ -0,0 +1,81 @@
+% bup-save(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup save - create a new bup backup set
+
+# SYNOPSIS
+
+bup save [-r *host*:*path*] <-t|-c|-n *name*> [-v] [-q]
+  [--smaller=*maxsize*] <paths...>
+
+# DESCRIPTION
+
+`bup save` saves the contents of the given files or paths
+into a new backup set and optionally names that backup set.
+
+Before trying to save files using `bup save`, you should
+first update the index using `bup index`.  The reasons
+for separating the two steps are described in the man page
+for `bup-index`(1).
+
+# OPTIONS
+
+-r, --remote=*host*:*path*
+:   save the backup set to the given remote server.  If
+    *path* is omitted, uses the default path on the remote
+    server (you still need to include the ':')
+
+-t, --tree
+:   after creating the backup set, print out the git tree
+    id of the resulting backup.
+    
+-c, --commit
+:   after creating the backup set, print out the git commit
+    id of the resulting backup.
+
+-n, --name=*name*
+:   after creating the backup set, create a git branch
+    named *name* so that the backup can be accessed using
+    that name.  If *name* already exists, the new backup
+    will be considered a descendant of the old *name*. 
+    (Thus, you can continually create new backup sets with
+    the same name, and later view the history of that
+    backup set to see how files have changed over time.)
+    
+-v, --verbose
+:   increase verbosity (can be used more than once).  With
+    one -v, prints every directory name as it gets backed up.  With
+    two -v, also prints every filename.
+
+-q, --quiet
+:   disable progress messages.
+
+--smaller=*maxsize*
+:   don't back up files >= *maxsize* bytes.  You can use
+    this to run frequent incremental backups of your small
+    files, which can usually be backed up quickly, and skip
+    over large ones (like virtual machine images) which
+    take longer.  Then you can back up the large files
+    less frequently.
+    
+
+# EXAMPLE
+    
+    $ bup index -ux /etc
+    Indexing: 1981, done.
+    
+    $ bup save -r myserver: -n my-pc-backup /etc
+    Reading index: 1981, done.
+    Saving: 100.00% (998/998k, 1981/1981 files), done.    
+    
+
+# SEE ALSO
+
+`bup-index`(1), `bup-split`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-server.1.md b/Documentation/bup-server.1.md

new file mode 100644 (file)

index 0000000..63e7c3e
--- /dev/null
+++ b/Documentation/bup-server.1.md
@@ -0,0 +1,28 @@
+% bup-server(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup server - the server side of a remote bup session
+
+# SYNOPSIS
+
+bup server
+
+# DESCRIPTION
+
+`bup server` is the server side of a remote bup session. 
+If you use `bup-split`(1) or `bup-save`(1) with the `-r`
+option, they will ssh to the remote server and run `bup
+server` to receive the transmitted objects.
+
+There is normally no reason to run `bup server` yourself.
+
+# SEE ALSO
+
+`bup-save`(1), `bup-split`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-split.1.md b/Documentation/bup-split.1.md

new file mode 100644 (file)

index 0000000..0196c3c
--- /dev/null
+++ b/Documentation/bup-split.1.md
@@ -0,0 +1,110 @@
+% bup-split(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup split - save individual files to bup backup sets
+
+# SYNOPSIS
+
+bup split [-r *host*:*path*] <-b|-t|-c|-n *name*> [-v] [-q]
+  [--bench] [--max-pack-size=*bytes*]
+  [--max-pack-objects=*n*] [--fanout=*count*] [filenames...]
+
+# DESCRIPTION
+
+`bup split` concatenates the contents of the given files
+(or if no filenames are given, reads from stdin), splits
+the content into chunks of around 8k using a rolling
+checksum algorithm, and saves the chunks into a bup
+repository.  Chunks which have previously been stored are
+not stored again (i.e. they are "deduplicated").
+
+Because of the way the rolling checksum works, chunks
+tend to be very stable across changes to a given file,
+including adding, deleting, and changing bytes.
+
+For example, if you use `bup split` to back up an XML dump
+of a database, and the XML file changes slightly from one
+run to the next, nearly all the data will still be
+deduplicated and the size of each backup after the first
+will typically be quite small.
+
+Another technique is to pipe the output of the `tar`(1) or
+`cpio`(1) programs to `bup split`.  When individual files
+in the tarball change slightly or are added or removed, bup
+still processes the remainder of the tarball efficiently. 
+(Note that `bup save` is usually a more efficient way to
+accomplish this, however.)
+
+To get the data back, use `bup-join`(1).
+
+# OPTIONS
+
+-r, --remote=*host*:*path*
+:   save the backup set to the given remote server.  If
+    *path* is omitted, uses the default path on the remote
+    server (you still need to include the ':')
+    
+-b, --blobs
+:   output a series of git blob ids that correspond to the
+    chunks in the dataset.
+
+-t, --tree
+:   output the git tree id of the resulting dataset.
+    
+-c, --commit
+:   output the git commit id of the resulting dataset.
+
+-n, --name=*name*
+:   after creating the dataset, create a git branch
+    named *name* so that it can be accessed using
+    that name.  If *name* already exists, the new dataset
+    will be considered a descendant of the old *name*. 
+    (Thus, you can continually create new datasets with
+    the same name, and later view the history of that
+    dataset to see how it has changed over time.)
+    
+-v, --verbose
+:   increase verbosity (can be used more than once).
+
+-q, --quiet
+:   disable progress messages.
+
+--bench
+:   print benchmark timings to stderr.
+
+--max-pack-size=*bytes*
+:   never create git packfiles larger than the given number
+    of bytes.  Default is 1 billion bytes.  Usually there
+    is no reason to change this.
+
+--max-pack-objects=*numobjs*
+:   never create git packfiles with more than the given
+    number of objects.  Default is 200 thousand objects. 
+    Usually there is no reason to change this.
+    
+--fanout=*numobjs*
+:   when splitting very large files, never put more than
+    this number of git blobs in a single git tree.  Instead,
+    generate a new tree and link to that.  Default is
+    4096 objects per tree.
+
+# EXAMPLE
+    
+    $ tar -cf - /etc | bup split -r myserver: -n mybackup-tar
+    tar: Removing leading `/' from member names
+    Indexing objects: 100% (196/196), done.
+    
+    $ bup join -r myserver: mybackup-tar | tar -tf - | wc -l
+    1961
+    
+
+# SEE ALSO
+
+`bup-join`(1), `bup-index`(1), `bup-save`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
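Note on the DESCRIPTION section of bup-split.1.md: a toy Python sketch of splitting on a rolling checksum and storing only unseen chunks. The window size, the rsync-style two-part sum, the 13-bit boundary mask (chosen to give chunks of about 8k on average), and the use of a plain SHA-1 of the chunk as its key are all assumptions for illustration; this is not bup's actual checksum or storage format.

```python
import hashlib

WINDOW = 64          # bytes covered by the rolling checksum (assumed)
SPLIT_MASK = 0x1FFF  # 13 low bits: a boundary roughly every 8192 bytes


def split_chunks(data: bytes):
    """Yield content-defined chunks of `data`."""
    start = 0
    s1 = s2 = 0
    for i, byte in enumerate(data):
        # Byte leaving the window (zero while the window is still filling).
        old = data[i - WINDOW] if i >= WINDOW else 0
        s1 = (s1 + byte - old) & 0xFFFF
        s2 = (s2 + s1 - WINDOW * old) & 0xFFFF
        # The checksum depends only on the last WINDOW bytes, so a
        # boundary is decided by local content, not by file position.
        if (s2 & SPLIT_MASK) == SPLIT_MASK:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]


def store(chunks, repo):
    """Add unseen chunks to `repo` (a dict keyed by SHA-1); return how many were new."""
    new = 0
    for chunk in chunks:
        key = hashlib.sha1(chunk).hexdigest()
        if key not in repo:
            repo[key] = chunk
            new += 1
    return new
```

Because each boundary depends only on the preceding window of bytes, inserting or deleting data near the start of the input shifts the later chunks without changing their contents, so a second `store` call finds most of them already present.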
diff --git a/Documentation/bup-tick.1.md b/Documentation/bup-tick.1.md

new file mode 100644 (file)

index 0000000..56b8b7e
--- /dev/null
+++ b/Documentation/bup-tick.1.md
@@ -0,0 +1,32 @@
+% bup-tick(1) Bup %BUP_VERSION%
+% Avery Pennarun <apenwarr@gmail.com>
+% %BUP_DATE%
+
+# NAME
+
+bup tick - wait for up to one second
+
+# SYNOPSIS
+
+bup tick
+
+# DESCRIPTION
+
+`bup tick` waits until `time`(2) returns a different value
+than it originally did.  Since time() has a granularity of
+one second, this can cause a delay of up to one second.
+
+This program is useful for writing tests that need to
+ensure a file date will be seen as modified.  It is
+slightly better than `sleep`(1) since it sometimes waits
+for less than one second.
+
+# EXAMPLE
+    
+    $ date; bup tick; date
+    Sat Feb  6 16:59:58 EST 2010
+    Sat Feb  6 16:59:59 EST 2010
+    
+# BUP
+
+Part of the `bup`(1) suite.
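Note on bup-tick.1.md: the behaviour described there (wait until the one-second clock value changes) can be sketched in Python as below. The polling interval and the function name are assumptions; the real `bup tick` may compute the remaining time and sleep once instead.

```python
import time


def tick(poll=0.01):
    """Block until the whole-second clock value changes."""
    start = int(time.time())
    while int(time.time()) == start:
        time.sleep(poll)
```

A test can call `tick()` between writing two files to make sure their modification times differ at one-second granularity, without always paying for a full `sleep 1`.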
diff --git a/Documentation/bup.1.md b/Documentation/bup.1.md

index 2f09f4660166d31372348b5a2d33b1b5a98bfa33..bd0497fee9dda3d5df13af71bf239d8f577ac4d0 100644 (file)
--- a/Documentation/bup.1.md
+++ b/Documentation/bup.1.md
@@ -8,49 +8,55 @@ bup - Backup program using rolling checksums and git file formats
  
  # SYNOPSIS
  
-bup [*options*] [*input-file*]...
+bup <command> [options...]
  
  # DESCRIPTION
  
-This is the sample description.
-
-    embeddeded code
-    more code
-  
-More stuff.
-
-## Subsection
-
-Yay!
-
-- this is a list.
-
-- another list item.
-
-    list continuation.
-    
-- another item.
-
-        with some code
-        and more code
-
-1. numbered item.
-
-1. another numbered item.
-
-    - with a list
-    - of items
-    - that say stuff
-
-1. yet another number.
-
-# OPTIONS
-
--o, --output=*output*
-:   the stuff about the term
-
---hello
-:   more stuff
+`bup` is a program for making backups of your files using
+the git file format.
+
+Unlike `git`(1) itself, bup is
+optimized for handling huge data sets including individual
+very large files (such as virtual machine images).  However,
+once a backup set is created, it can still be accessed
+using git tools.
+
+The individual bup subcommands appear in their own man
+pages.
+
+# COMMONLY USED SUBCOMMANDS
+
+`bup-index`(1)
+:   Manage the index of files to back up.
+`bup-fsck`(1)
+:   Verify or recover the bup repository.
+`bup-fuse`(1)
+:   Mount the bup repository as a filesystem.
+`bup-save`(1)
+:   Back up the files in the index.
+`bup-split`(1)
+:   Back up an individual file, such as a tarball.
+`bup-join`(1)
+:   Retrieve a file backed up using `bup-split`(1).
+`bup-midx`(1)
+:   Make backups go faster by generating midx files.
+
+# RARELY USED SUBCOMMANDS
+
+`bup-damage`(1)
+:   Deliberately destroy data.
+`bup-drecurse`(1)
+:   Recursively list files in your filesystem.
+`bup-init`(1)
+:   Initialize a bup repository.
+`bup-ls`(1)
+:   List the files in a bup repository.
+`bup-margin`(1)
+:   Determine how close your bup repository is to armageddon.
+`bup-server`(1)
+:   The server side of the bup client-server relationship.
+`bup-tick`(1)
+:   Sleep for up to one second.
  
  # SEE ALSO
  
diff --git a/cmd-fuse.py b/cmd-fuse.py

index b69d5cb9ce79fac3f47a75eefcd06f07b22f40ce..0bf8ad946698cfee0146f7c6f586bee70e93a15f 100755 (executable)
--- a/cmd-fuse.py
+++ b/cmd-fuse.py
@@ -180,7 +180,7 @@ if not hasattr(fuse, '__version__'):
  fuse.fuse_python_api = (0, 2)
  
  optspec = """
-bup fuse [mountpoint]
+bup fuse [-d] [-f] <mountpoint>
  --
  d,debug   increase debug level
  f,foreground  run in foreground
diff --git a/cmd-index.py b/cmd-index.py

index c64f8f20c2a269364be1e3d2cc686fa405def434..fbbc890fe279c37fbb787abd8bd023b708ddab78 100755 (executable)
--- a/cmd-index.py
+++ b/cmd-index.py
@@ -134,7 +134,7 @@ def update_index(top):
  
  
  optspec = """
-bup index <-p|s|m|u> [options...] <filenames...>
+bup index <-p|m|u> [options...] <filenames...>
  --
  p,print    print the index entries for the given names (also works with -u)
  m,modified print only added/deleted/modified files (implies -p)
diff --git a/git.py b/git.py

index d85d7e2c336a56f2afd40b736806372900d18617..9d7bad5e6aee1f01941b64ee5a99351a0edd6f6f 100644 (file)
--- a/git.py
+++ b/git.py
@@ -289,7 +289,7 @@ def idxmerge(idxlist):
      count = 0
      while heap:
          if (count % 10024) == 0:
-            progress('Creating midx: %.2f%% (%d/%d)\r'
+            progress('Reading indexes: %.2f%% (%d/%d)\r'
                       % (count*100.0/total, count, total))
          (e, it) = heap[0]
          yield e
@@ -299,7 +299,7 @@ def idxmerge(idxlist):
              heapq.heapreplace(heap, (e, it))
          else:
              heapq.heappop(heap)
-    log('Creating midx: %.2f%% (%d/%d), done.\n' % (100, total, total))
+    log('Reading indexes: %.2f%% (%d/%d), done.\n' % (100, total, total))
  
      
  class PackWriter: