From: Avery Pennarun Date: Sun, 5 Sep 2010 18:04:42 +0000 (-0700) Subject: Rename Documentation/*.1.md to Documentation/*.md X-Git-Tag: bup-0.18~9 X-Git-Url: https://arthur.barton.de/gitweb/?p=bup.git;a=commitdiff_plain;h=d05d9df50c50ac944c81338a274b775b9972100f Rename Documentation/*.1.md to Documentation/*.md All our man pages end up in section 1 of man anyway, and it looks like that will probably never change. So let's make our filenames simpler and easier to understand. Even if we do end up adding a page in (say) section 5 someday, it's no big deal; we can just add an exception to the Makefile for it or something. Signed-off-by: Avery Pennarun --- diff --git a/Documentation/Makefile b/Documentation/Makefile index 15cb3cd..067f698 100644 --- a/Documentation/Makefile +++ b/Documentation/Makefile @@ -14,14 +14,14 @@ default: all all: man html -man: $(patsubst %.md,%,$(wildcard *.md)) +man: $(patsubst %.md,%.1,$(wildcard *.md)) -html: $(patsubst %.1.md,%.html,$(wildcard *.md)) +html: $(patsubst %.md,%.html,$(wildcard *.md)) -%: %.md.tmp Makefile +%.1: %.md.tmp Makefile $(PANDOC) -s -r markdown -w man -o $@ $< -%.html: %.1.md.tmp Makefile +%.html: %.md.tmp Makefile $(PANDOC) -s -r markdown -w html -o $@ $< .PRECIOUS: %.md.tmp diff --git a/Documentation/bup-damage.1.md b/Documentation/bup-damage.1.md deleted file mode 100644 index 868902d..0000000 --- a/Documentation/bup-damage.1.md +++ /dev/null @@ -1,94 +0,0 @@ -% bup-damage(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-damage - randomly destroy blocks of a file - -# SYNOPSIS - -bup damage [-n count] [-s maxsize] [--percent pct] [-S seed] -[--equal] - -# DESCRIPTION - -Use `bup damage` to deliberately destroy blocks in a -`.pack` or `.idx` file (from `.bup/objects/pack`) to test -the recovery features of `bup-fsck`(1) or other programs. 
- -*THIS PROGRAM IS EXTREMELY DANGEROUS AND WILL DESTROY YOUR -DATA* - -`bup damage` is primarily useful for automated or manual tests -of data recovery tools, to reassure yourself that the tools -actually work. - -# OPTIONS - --n, --num=*numblocks* -: the number of separate blocks to damage in each file - (default 10). - Note that it's possible for more than one damaged - segment to fall in the same `bup-fsck`(1) recovery block, - so you might not damage as many recovery blocks as you - expect. If this is a problem, use `--equal`. - --s, --size=*maxblocksize* -: the maximum size, in bytes, of each damaged block - (default 1 unless `--percent` is specified). Note that - because of the way `bup-fsck`(1) works, a multi-byte - block could fall on the boundary between two recovery - blocks, and thus damaging two separate recovery blocks. - In small files, it's also possible for a damaged block - to be larger than a recovery block. If these issues - might be a problem, you should use the default damage - size of one byte. - ---percent=*maxblockpercent* -: the maximum size, in percent of the original file, of - each damaged block. If both `--size` and `--percent` - are given, the maximum block size is the minimum of the - two restrictions. You can use this to ensure that a - given block will never damage more than one or two - `git-fsck`(1) recovery blocks. - --S, --seed=*randomseed* -: seed the random number generator with the given value. - If you use this option, your tests will be repeatable, - since the damaged block offsets, sizes, and contents - will be the same every time. By default, the random - numbers are different every time (so you can run tests - in a loop and repeatedly test with different - damage each time). - ---equal -: instead of choosing random offsets for each damaged - block, space the blocks equally throughout the file, - starting at offset 0. 
If you also choose a correct - maximum block size, this can guarantee that any given - damage block never damages more than one `git-fsck`(1) - recovery block. (This is also guaranteed if you use - `-s 1`.) - -# EXAMPLE - - # make a backup in case things go horribly wrong - cp -a ~/.bup/objects/pack ~/bup-packs.bak - - # generate recovery blocks for all packs - bup fsck -g - - # deliberately damage the packs - bup damage -n 10 -s 1 -S 0 ~/.bup/objects/pack/*.{pack,idx} - - # recover from the damage - bup fsck -r - -# SEE ALSO - -`bup-fsck`(1), `par2`(1) - -# BUP - -Part of the `bup`(1) suite. diff --git a/Documentation/bup-damage.md b/Documentation/bup-damage.md new file mode 100644 index 0000000..868902d --- /dev/null +++ b/Documentation/bup-damage.md @@ -0,0 +1,94 @@ +% bup-damage(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-damage - randomly destroy blocks of a file + +# SYNOPSIS + +bup damage [-n count] [-s maxsize] [--percent pct] [-S seed] +[--equal] + +# DESCRIPTION + +Use `bup damage` to deliberately destroy blocks in a +`.pack` or `.idx` file (from `.bup/objects/pack`) to test +the recovery features of `bup-fsck`(1) or other programs. + +*THIS PROGRAM IS EXTREMELY DANGEROUS AND WILL DESTROY YOUR +DATA* + +`bup damage` is primarily useful for automated or manual tests +of data recovery tools, to reassure yourself that the tools +actually work. + +# OPTIONS + +-n, --num=*numblocks* +: the number of separate blocks to damage in each file + (default 10). + Note that it's possible for more than one damaged + segment to fall in the same `bup-fsck`(1) recovery block, + so you might not damage as many recovery blocks as you + expect. If this is a problem, use `--equal`. + +-s, --size=*maxblocksize* +: the maximum size, in bytes, of each damaged block + (default 1 unless `--percent` is specified). 
Note that + because of the way `bup-fsck`(1) works, a multi-byte + block could fall on the boundary between two recovery + blocks, and thus damaging two separate recovery blocks. + In small files, it's also possible for a damaged block + to be larger than a recovery block. If these issues + might be a problem, you should use the default damage + size of one byte. + +--percent=*maxblockpercent* +: the maximum size, in percent of the original file, of + each damaged block. If both `--size` and `--percent` + are given, the maximum block size is the minimum of the + two restrictions. You can use this to ensure that a + given block will never damage more than one or two + `git-fsck`(1) recovery blocks. + +-S, --seed=*randomseed* +: seed the random number generator with the given value. + If you use this option, your tests will be repeatable, + since the damaged block offsets, sizes, and contents + will be the same every time. By default, the random + numbers are different every time (so you can run tests + in a loop and repeatedly test with different + damage each time). + +--equal +: instead of choosing random offsets for each damaged + block, space the blocks equally throughout the file, + starting at offset 0. If you also choose a correct + maximum block size, this can guarantee that any given + damage block never damages more than one `git-fsck`(1) + recovery block. (This is also guaranteed if you use + `-s 1`.) + +# EXAMPLE + + # make a backup in case things go horribly wrong + cp -a ~/.bup/objects/pack ~/bup-packs.bak + + # generate recovery blocks for all packs + bup fsck -g + + # deliberately damage the packs + bup damage -n 10 -s 1 -S 0 ~/.bup/objects/pack/*.{pack,idx} + + # recover from the damage + bup fsck -r + +# SEE ALSO + +`bup-fsck`(1), `par2`(1) + +# BUP + +Part of the `bup`(1) suite. 
diff --git a/Documentation/bup-drecurse.1.md b/Documentation/bup-drecurse.1.md deleted file mode 100644 index 13db28d..0000000 --- a/Documentation/bup-drecurse.1.md +++ /dev/null @@ -1,52 +0,0 @@ -% bup-drecurse(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-drecurse - recursively list files in your filesystem - -# SYNOPSIS - -bup drecurse [-x] [-q] [--profile] \<path\> - -# DESCRIPTION - -`bup drecurse` traverses files in the filesystem in a way -similar to `find`(1). In most cases, you should use -`find`(1) instead. - -This program is useful mainly for testing the file -traversal algorithm used in `bup-index`(1). - -Note that filenames are returned in reverse alphabetical -order, as in `bup-index`(1). This is important because you -can't generate the hash of a parent directory until you -have generated the hashes of all its children. When -listing files in reverse order, the parent directory will -come after its children, making this easy. - -# OPTIONS - --x, --xdev, --one-file-system -: don't cross filesystem boundaries. - --q, --quiet -: don't print filenames as they are encountered. Useful - when testing performance of the traversal algorithms. - ---profile -: print profiling information upon completion. Useful - when testing performance of the traversal algorithms. - -# EXAMPLE - - bup drecurse -x / - -# SEE ALSO - -`bup-index`(1) - -# BUP - -Part of the `bup`(1) suite. diff --git a/Documentation/bup-drecurse.md b/Documentation/bup-drecurse.md new file mode 100644 index 0000000..13db28d --- /dev/null +++ b/Documentation/bup-drecurse.md @@ -0,0 +1,52 @@ +% bup-drecurse(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-drecurse - recursively list files in your filesystem + +# SYNOPSIS + +bup drecurse [-x] [-q] [--profile] \<path\> + +# DESCRIPTION + +`bup drecurse` traverses files in the filesystem in a way +similar to `find`(1). In most cases, you should use +`find`(1) instead. 
+ +This program is useful mainly for testing the file +traversal algorithm used in `bup-index`(1). + +Note that filenames are returned in reverse alphabetical +order, as in `bup-index`(1). This is important because you +can't generate the hash of a parent directory until you +have generated the hashes of all its children. When +listing files in reverse order, the parent directory will +come after its children, making this easy. + +# OPTIONS + +-x, --xdev, --one-file-system +: don't cross filesystem boundaries. + +-q, --quiet +: don't print filenames as they are encountered. Useful + when testing performance of the traversal algorithms. + +--profile +: print profiling information upon completion. Useful + when testing performance of the traversal algorithms. + +# EXAMPLE + + bup drecurse -x / + +# SEE ALSO + +`bup-index`(1) + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-fsck.1.md b/Documentation/bup-fsck.1.md deleted file mode 100644 index da3cfc5..0000000 --- a/Documentation/bup-fsck.1.md +++ /dev/null @@ -1,117 +0,0 @@ -% bup-fsck(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-fsck - verify or repair a bup repository - -# SYNOPSIS - -bup fsck [-r] [-g] [-v] [--quick] [-j *jobs*] [--par2-ok] -[--disable-par2] [filenames...] - -# DESCRIPTION - -`bup fsck` is a tool for validating bup repositories in the -same way that `git fsck` validates git repositories. - -It can also generate and/or use "recovery blocks" using the -`par2`(1) tool (if you have it installed). This allows you -to recover from damaged blocks covering up to 5% of your -`.pack` files. - -In a normal backup system, damaged blocks are less -important, because there tends to be enough data duplicated -between backup sets that a single damaged backup set is -non-critical. In a deduplicating backup system like bup, -however, no block is ever stored more than once, even if it -is used in every single backup. 
If that block were to be -unrecoverable, *all* your backup sets would be -damaged at once. Thus, it's important to be able to verify -the integrity of your backups and recover from disk errors -if they occur. - -*WARNING*: bup fsck's recovery features are not available -unless you have the free `par2`(1) package installed on -your bup server. - -*WARNING*: bup fsck obviously cannot recover from a -complete disk failure. If your backups are important, you -need to carefully consider redundancy (such as using RAID -for multi-disk redundancy, or making off-site backups for -site redundancy). - -# OPTIONS - --r, --repair -: attempt to repair any damaged packs using - existing recovery blocks. (Requires `par2`(1).) - --g, --generate -: generate recovery blocks for any packs that don't - already have them. (Requires `par2`(1).) - --v, --verbose -: increase verbosity (can be used more than once). - ---quick -: don't run a full `git verify-pack` on each pack file; - instead just check the final checksum. This can cause - a significant speedup with no obvious decrease in - reliability. However, you may want to avoid this - option if you're paranoid. Has no effect on packs that - already have recovery information. - --j, --jobs=*numjobs* -: maximum number of pack verifications to run at a time. - The optimal value for this option depends how fast your - CPU can verify packs vs. your disk throughput. If you - run too many jobs at once, your disk will get saturated - by seeking back and forth between files and performance - will actually decrease, even if *numjobs* is less than - the number of CPU cores on your system. You can - experiment with this option to find the optimal value. - ---par2-ok -: immediately return 0 if `par2`(1) is installed and - working, or 1 otherwise. Do not actually check - anything. - ---disable-par2 -: pretend that `par2`(1) is not installed, and ignore all - recovery blocks. 
- - -# EXAMPLE - - # generate recovery blocks for all packs that don't - # have them - bup fsck -g - - # generate recovery blocks for a particular pack - bup fsck -g ~/.bup/objects/pack/153a1420cb1c8*.pack - - # check all packs for correctness (can be very slow!) - bup fsck - - # check all packs for correctness and recover any - # damaged ones - bup fsck -r - - # check a particular pack for correctness and recover - # it if damaged - bup fsck -r ~/.bup/objects/pack/153a1420cb1c8*.pack - - # check if recovery blocks are available on this system - if bup fsck --par2-ok; then - echo "par2 is ok" - fi - -# SEE ALSO - -`bup-damage`(1), `fsck`(1), `git-fsck`(1) - -# BUP - -Part of the `bup`(1) suite. diff --git a/Documentation/bup-fsck.md b/Documentation/bup-fsck.md new file mode 100644 index 0000000..da3cfc5 --- /dev/null +++ b/Documentation/bup-fsck.md @@ -0,0 +1,117 @@ +% bup-fsck(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-fsck - verify or repair a bup repository + +# SYNOPSIS + +bup fsck [-r] [-g] [-v] [--quick] [-j *jobs*] [--par2-ok] +[--disable-par2] [filenames...] + +# DESCRIPTION + +`bup fsck` is a tool for validating bup repositories in the +same way that `git fsck` validates git repositories. + +It can also generate and/or use "recovery blocks" using the +`par2`(1) tool (if you have it installed). This allows you +to recover from damaged blocks covering up to 5% of your +`.pack` files. + +In a normal backup system, damaged blocks are less +important, because there tends to be enough data duplicated +between backup sets that a single damaged backup set is +non-critical. In a deduplicating backup system like bup, +however, no block is ever stored more than once, even if it +is used in every single backup. If that block were to be +unrecoverable, *all* your backup sets would be +damaged at once. Thus, it's important to be able to verify +the integrity of your backups and recover from disk errors +if they occur. 
+ +*WARNING*: bup fsck's recovery features are not available +unless you have the free `par2`(1) package installed on +your bup server. + +*WARNING*: bup fsck obviously cannot recover from a +complete disk failure. If your backups are important, you +need to carefully consider redundancy (such as using RAID +for multi-disk redundancy, or making off-site backups for +site redundancy). + +# OPTIONS + +-r, --repair +: attempt to repair any damaged packs using + existing recovery blocks. (Requires `par2`(1).) + +-g, --generate +: generate recovery blocks for any packs that don't + already have them. (Requires `par2`(1).) + +-v, --verbose +: increase verbosity (can be used more than once). + +--quick +: don't run a full `git verify-pack` on each pack file; + instead just check the final checksum. This can cause + a significant speedup with no obvious decrease in + reliability. However, you may want to avoid this + option if you're paranoid. Has no effect on packs that + already have recovery information. + +-j, --jobs=*numjobs* +: maximum number of pack verifications to run at a time. + The optimal value for this option depends how fast your + CPU can verify packs vs. your disk throughput. If you + run too many jobs at once, your disk will get saturated + by seeking back and forth between files and performance + will actually decrease, even if *numjobs* is less than + the number of CPU cores on your system. You can + experiment with this option to find the optimal value. + +--par2-ok +: immediately return 0 if `par2`(1) is installed and + working, or 1 otherwise. Do not actually check + anything. + +--disable-par2 +: pretend that `par2`(1) is not installed, and ignore all + recovery blocks. + + +# EXAMPLE + + # generate recovery blocks for all packs that don't + # have them + bup fsck -g + + # generate recovery blocks for a particular pack + bup fsck -g ~/.bup/objects/pack/153a1420cb1c8*.pack + + # check all packs for correctness (can be very slow!) 
+ bup fsck + + # check all packs for correctness and recover any + # damaged ones + bup fsck -r + + # check a particular pack for correctness and recover + # it if damaged + bup fsck -r ~/.bup/objects/pack/153a1420cb1c8*.pack + + # check if recovery blocks are available on this system + if bup fsck --par2-ok; then + echo "par2 is ok" + fi + +# SEE ALSO + +`bup-damage`(1), `fsck`(1), `git-fsck`(1) + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-ftp.1.md b/Documentation/bup-ftp.1.md deleted file mode 100644 index ba11b48..0000000 --- a/Documentation/bup-ftp.1.md +++ /dev/null @@ -1,88 +0,0 @@ -% bup-ftp(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-ftp - ftp-like client for navigating bup repositories - -# SYNOPSIS - -bup ftp - -# DESCRIPTION - -`bup ftp` is a command-line tool for navigating bup -repositories. It has commands similar to the Unix `ftp`(1) -command. The file hierarchy is the same as that shown by -`bup-fuse`(1) and `bup-ls`(1). - -Note: if your system has the python-readline library -installed, you can use the \<tab\> key to complete filenames -while navigating your backup data. This will save you a -lot of typing. - - -# COMMANDS - -The following commands are available inside `bup ftp`: - -ls -: print the contents of the current working directory - -cd *dirname* -: change to a different working directory - -pwd -: print the path of the current working directory - -cat *filenames...* -: print the contents of one or more files to stdout - -get *filename* *localname* -: download the contents of *filename* and save it to disk - as *localname*. If *localname* is omitted, uses - *filename* as the local name. - -mget *filenames...* -: download the contents of the given *filenames* and - stores them to disk under the same names. The - filenames may contain Unix filename globs (`*`, `?`, - etc.) 
- -help -: print a list of available commands - -quit -: exit the `bup ftp` client - - -# EXAMPLE - - $ bup ftp - bup> ls - mybackup/ - yourbackup/ - bup> cd mybackup/ - bup> ls - .2fe288dedbfab372c84b0502ee2bc1504270f3b3/ - .ae760aa4cfc13b689b46e3d2ce5ae50e92299c72/ - 2010-02-05-185507@ - 2010-02-05-185508@ - latest@ - bup> cd latest/ - bup> ls - (...etc...) - bup> get myfile - Saving 'myfile' - bup> quit - - -# SEE ALSO - -`bup-join`(1), `bup-fuse`(1), `bup-ls`(1), `bup-save`(1), `git-show`(1) - - -# BUP - -Part of the `bup`(1) suite. diff --git a/Documentation/bup-ftp.md b/Documentation/bup-ftp.md new file mode 100644 index 0000000..ba11b48 --- /dev/null +++ b/Documentation/bup-ftp.md @@ -0,0 +1,88 @@ +% bup-ftp(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-ftp - ftp-like client for navigating bup repositories + +# SYNOPSIS + +bup ftp + +# DESCRIPTION + +`bup ftp` is a command-line tool for navigating bup +repositories. It has commands similar to the Unix `ftp`(1) +command. The file hierarchy is the same as that shown by +`bup-fuse`(1) and `bup-ls`(1). + +Note: if your system has the python-readline library +installed, you can use the \<tab\> key to complete filenames +while navigating your backup data. This will save you a +lot of typing. + + +# COMMANDS + +The following commands are available inside `bup ftp`: + +ls +: print the contents of the current working directory + +cd *dirname* +: change to a different working directory + +pwd +: print the path of the current working directory + +cat *filenames...* +: print the contents of one or more files to stdout + +get *filename* *localname* +: download the contents of *filename* and save it to disk + as *localname*. If *localname* is omitted, uses + *filename* as the local name. + +mget *filenames...* +: download the contents of the given *filenames* and + stores them to disk under the same names. The + filenames may contain Unix filename globs (`*`, `?`, + etc.) 
+ +help +: print a list of available commands + +quit +: exit the `bup ftp` client + + +# EXAMPLE + + $ bup ftp + bup> ls + mybackup/ + yourbackup/ + bup> cd mybackup/ + bup> ls + .2fe288dedbfab372c84b0502ee2bc1504270f3b3/ + .ae760aa4cfc13b689b46e3d2ce5ae50e92299c72/ + 2010-02-05-185507@ + 2010-02-05-185508@ + latest@ + bup> cd latest/ + bup> ls + (...etc...) + bup> get myfile + Saving 'myfile' + bup> quit + + +# SEE ALSO + +`bup-join`(1), `bup-fuse`(1), `bup-ls`(1), `bup-save`(1), `git-show`(1) + + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-fuse.1.md b/Documentation/bup-fuse.1.md deleted file mode 100644 index 1a4469d..0000000 --- a/Documentation/bup-fuse.1.md +++ /dev/null @@ -1,57 +0,0 @@ -% bup-fuse(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-fuse - mount a bup repository as a filesystem - -# SYNOPSIS - -bup fuse [-d] [-f] [-o] \<mountpoint\> - -# DESCRIPTION - -`bup fuse` opens a bup repository and exports it as a -`fuse`(7) userspace filesystem. - -This feature is only available on systems (such as Linux) -which support FUSE. - -**WARNING**: bup fuse is still experimental and does not -enforce any file permissions! All files will be readable -by all users. - -When you're done accessing the mounted fuse filesystem, you -should unmount it with `umount`(8). - -# OPTIONS - --d, --debug -: run in the foreground and print FUSE debug information - for each request. - --f, --foreground -: run in the foreground and exit only when the filesystem - is unmounted. - --o, --allow-other -: permit other users to access the filesystem. Necessary for - exporting the filesystem via Samba, for example. - -# EXAMPLE - - rm -rf /tmp/buptest - mkdir /tmp/buptest - sudo bup fuse -d /tmp/buptest - ls /tmp/buptest/*/latest - ... - umount /tmp/buptest - -# SEE ALSO - -`fuse`(7), `fusermount`(1), `bup-ls`(1), `bup-ftp`(1) - -# BUP - -Part of the `bup`(1) suite. 
diff --git a/Documentation/bup-fuse.md b/Documentation/bup-fuse.md new file mode 100644 index 0000000..1a4469d --- /dev/null +++ b/Documentation/bup-fuse.md @@ -0,0 +1,57 @@ +% bup-fuse(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-fuse - mount a bup repository as a filesystem + +# SYNOPSIS + +bup fuse [-d] [-f] [-o] \<mountpoint\> + +# DESCRIPTION + +`bup fuse` opens a bup repository and exports it as a +`fuse`(7) userspace filesystem. + +This feature is only available on systems (such as Linux) +which support FUSE. + +**WARNING**: bup fuse is still experimental and does not +enforce any file permissions! All files will be readable +by all users. + +When you're done accessing the mounted fuse filesystem, you +should unmount it with `umount`(8). + +# OPTIONS + +-d, --debug +: run in the foreground and print FUSE debug information + for each request. + +-f, --foreground +: run in the foreground and exit only when the filesystem + is unmounted. + +-o, --allow-other +: permit other users to access the filesystem. Necessary for + exporting the filesystem via Samba, for example. + +# EXAMPLE + + rm -rf /tmp/buptest + mkdir /tmp/buptest + sudo bup fuse -d /tmp/buptest + ls /tmp/buptest/*/latest + ... + umount /tmp/buptest + +# SEE ALSO + +`fuse`(7), `fusermount`(1), `bup-ls`(1), `bup-ftp`(1) + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-help.1.md b/Documentation/bup-help.1.md deleted file mode 100644 index 9799279..0000000 --- a/Documentation/bup-help.1.md +++ /dev/null @@ -1,28 +0,0 @@ -% bup-help(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-help - open the documentation for a given bup command - -# SYNOPSIS - -bup help \<command\> - -# DESCRIPTION - -`bup help <command>` opens the documentation for the given command. -This is currently equivalent to typing `man bup-<command>`. - - -# EXAMPLE - - $ bup help help - (Imagine that this man page was pasted below, - recursively. 
Because that would cause an endless - we include this silly remark instead. Chicken.) - -# BUP - -Part of the `bup`(1) suite. diff --git a/Documentation/bup-help.md b/Documentation/bup-help.md new file mode 100644 index 0000000..9799279 --- /dev/null +++ b/Documentation/bup-help.md @@ -0,0 +1,28 @@ +% bup-help(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-help - open the documentation for a given bup command + +# SYNOPSIS + +bup help \<command\> + +# DESCRIPTION + +`bup help <command>` opens the documentation for the given command. +This is currently equivalent to typing `man bup-<command>`. + + +# EXAMPLE + + $ bup help help + (Imagine that this man page was pasted below, + recursively. Because that would cause an endless + we include this silly remark instead. Chicken.) + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-index.1.md b/Documentation/bup-index.1.md deleted file mode 100644 index 9b72a00..0000000 --- a/Documentation/bup-index.1.md +++ /dev/null @@ -1,117 +0,0 @@ -% bup-index(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-index - print and/or update the bup filesystem index - -# SYNOPSIS - -bup index <-p|-m|-u> [-s] [-H] [-l] [-x] [--fake-valid] -[--check] [-f *indexfile*] [-v] - -# DESCRIPTION - -`bup index` prints and/or updates the bup filesystem index, -which is a cache of the filenames, attributes, and sha-1 -hashes of each file and directory in the filesystem. The -bup index is similar in function to the `git`(1) index, and -can be found in `~/.bup/bupindex`. - -Creating a backup in bup consists of two steps: updating -the index with `bup index`, then actually backing up the -files (or a subset of the files) with `bup save`. The -separation exists for these reasons: - -1. There is more than one way to generate a list of files -that need to be backed up. For example, you might want to -use `inotify`(7) or `dnotify`(7). - -2. 
Even if you back up files to multiple destinations (for -added redundancy), the file names, attributes, and hashes -will be the same each time. Thus, you can save the trouble -of repeatedly re-generating the list of files for each -backup set. - -3. You may want to use the data tracked by bup index for -other purposes (such as speeding up other programs that -need the same information). - - -# OPTIONS - --u, --update -: (recursively) update the index for the given filenames and - their descendants. One or more filenames must be - given. - --p, --print -: print the contents of the index. If filenames are - given, shows the given entries and their descendants. - If no filenames are given, shows the entries starting - at the current working directory (.). - --m, --modified -: prints only files which are marked as modified (ie. - changed since the most recent backup) in the index. - Implies `-p`. - --s, --status -: prepend a status code (A, M, D, or space) before each - filename. Implies `-p`. The codes mean, respectively, - that a file is marked in the index as added, modified, - deleted, or unchanged since the last backup. - --H, --hash -: for each file printed, prepend the most recently - recorded hash code. The hash code is normally - generated by `bup save`. For objects which have not yet - been backed up, the hash code will be - 0000000000000000000000000000000000000000. Note that - the hash code is printed even if the file is known to - be modified or deleted in the index (ie. the file on - the filesystem no longer matches the recorded hash). - If this is a problem for you, use `--status`. - --l, --long -: print more information about each file, in a similar - format to the `-l` option to `ls`(1). (INCOMPLETE) - --x, --xdev, --one-file-system -: don't cross filesystem boundaries when recursing - through the filesystem. Only applicable if you're - using `-u`. - ---fake-valid -: mark specified filenames as up-to-date even if they - aren't. 
This can be useful for testing, or to avoid - unnecessarily backing up files that you know are - boring. - ---check -: carefully check index file integrity before and after - updating. Mostly useful for automated tests. - --f, --indexfile=*indexfile* -: use a different index filename instead of - `~/.bup/bupindex`. - --v, --verbose -: increase log output during update (can be used more - than once). With one `-v`, print each directory as it - is updated; with two `-v`, print each file too. - - -# EXAMPLE - - bup index -vux /etc /var /usr - - -# SEE ALSO - -`bup-save`(1), `bup-drecurse`(1) - -# BUP - -Part of the `bup`(1) suite. diff --git a/Documentation/bup-index.md b/Documentation/bup-index.md new file mode 100644 index 0000000..9b72a00 --- /dev/null +++ b/Documentation/bup-index.md @@ -0,0 +1,117 @@ +% bup-index(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-index - print and/or update the bup filesystem index + +# SYNOPSIS + +bup index <-p|-m|-u> [-s] [-H] [-l] [-x] [--fake-valid] +[--check] [-f *indexfile*] [-v] + +# DESCRIPTION + +`bup index` prints and/or updates the bup filesystem index, +which is a cache of the filenames, attributes, and sha-1 +hashes of each file and directory in the filesystem. The +bup index is similar in function to the `git`(1) index, and +can be found in `~/.bup/bupindex`. + +Creating a backup in bup consists of two steps: updating +the index with `bup index`, then actually backing up the +files (or a subset of the files) with `bup save`. The +separation exists for these reasons: + +1. There is more than one way to generate a list of files +that need to be backed up. For example, you might want to +use `inotify`(7) or `dnotify`(7). + +2. Even if you back up files to multiple destinations (for +added redundancy), the file names, attributes, and hashes +will be the same each time. Thus, you can save the trouble +of repeatedly re-generating the list of files for each +backup set. + +3. 
You may want to use the data tracked by bup index for +other purposes (such as speeding up other programs that +need the same information). + + +# OPTIONS + +-u, --update +: (recursively) update the index for the given filenames and + their descendants. One or more filenames must be + given. + +-p, --print +: print the contents of the index. If filenames are + given, shows the given entries and their descendants. + If no filenames are given, shows the entries starting + at the current working directory (.). + +-m, --modified +: prints only files which are marked as modified (ie. + changed since the most recent backup) in the index. + Implies `-p`. + +-s, --status +: prepend a status code (A, M, D, or space) before each + filename. Implies `-p`. The codes mean, respectively, + that a file is marked in the index as added, modified, + deleted, or unchanged since the last backup. + +-H, --hash +: for each file printed, prepend the most recently + recorded hash code. The hash code is normally + generated by `bup save`. For objects which have not yet + been backed up, the hash code will be + 0000000000000000000000000000000000000000. Note that + the hash code is printed even if the file is known to + be modified or deleted in the index (ie. the file on + the filesystem no longer matches the recorded hash). + If this is a problem for you, use `--status`. + +-l, --long +: print more information about each file, in a similar + format to the `-l` option to `ls`(1). (INCOMPLETE) + +-x, --xdev, --one-file-system +: don't cross filesystem boundaries when recursing + through the filesystem. Only applicable if you're + using `-u`. + +--fake-valid +: mark specified filenames as up-to-date even if they + aren't. This can be useful for testing, or to avoid + unnecessarily backing up files that you know are + boring. + +--check +: carefully check index file integrity before and after + updating. Mostly useful for automated tests. 
+ +-f, --indexfile=*indexfile* +: use a different index filename instead of + `~/.bup/bupindex`. + +-v, --verbose +: increase log output during update (can be used more + than once). With one `-v`, print each directory as it + is updated; with two `-v`, print each file too. + + +# EXAMPLE + + bup index -vux /etc /var /usr + + +# SEE ALSO + +`bup-save`(1), `bup-drecurse`(1) + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-init.1.md b/Documentation/bup-init.1.md deleted file mode 100644 index d3e645b..0000000 --- a/Documentation/bup-init.1.md +++ /dev/null @@ -1,40 +0,0 @@ -% bup-init(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-init - initialize a bup repository - -# SYNOPSIS - -[BUP_DIR=*localpath*] bup init [-r *host*:*path*] - -# DESCRIPTION - -`bup init` initializes your local bup repository. You -usually don't need to run it unless you have set BUP_DIR -explicitly. By default, BUP_DIR is `~/.bup` and will be -initialized automatically whenever you run any bup command. - -# OPTIONS - --r, --remote=*host*:*path* -: Initialize not only the local repository, but also the - remote repository given by the *host* and *path*. This is - not necessary if you intend to back up to the default - location on the server (ie. a blank *path*). - - -# EXAMPLE - - bup init - - -# SEE ALSO - -`bup-fsck`(1) - -# BUP - -Part of the `bup`(1) suite. diff --git a/Documentation/bup-init.md b/Documentation/bup-init.md new file mode 100644 index 0000000..d3e645b --- /dev/null +++ b/Documentation/bup-init.md @@ -0,0 +1,40 @@ +% bup-init(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-init - initialize a bup repository + +# SYNOPSIS + +[BUP_DIR=*localpath*] bup init [-r *host*:*path*] + +# DESCRIPTION + +`bup init` initializes your local bup repository. You +usually don't need to run it unless you have set BUP_DIR +explicitly. 
By default, BUP_DIR is `~/.bup` and will be +initialized automatically whenever you run any bup command. + +# OPTIONS + +-r, --remote=*host*:*path* +: Initialize not only the local repository, but also the + remote repository given by the *host* and *path*. This is + not necessary if you intend to back up to the default + location on the server (ie. a blank *path*). + + +# EXAMPLE + + bup init + + +# SEE ALSO + +`bup-fsck`(1) + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-join.1.md b/Documentation/bup-join.1.md deleted file mode 100644 index 921b92f..0000000 --- a/Documentation/bup-join.1.md +++ /dev/null @@ -1,53 +0,0 @@ -% bup-join(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-join - concatenate files from a bup repository - -# SYNOPSIS - -bup join [-r *host*:*path*] [refs or hashes...] - -# DESCRIPTION - -`bup join` is roughly the opposite operation to -`bup-split`(1). You can use it to retrieve the contents of -a file from a local or remote bup repository. - -The supplied list of refs or hashes can be in any format -accepted by `git`(1), including branch names, commit ids, -tree ids, or blob ids. - -If no refs or hashes are given on the command line, `bup -join` reads them from stdin instead. - -# OPTIONS - --r, --remote=*host*:*path* -: Retrieves objects from the given remote repository - instead of the local one. *path* may be blank, in which - case the default remote repository is used. - - -# EXAMPLE - - # split and then rejoin a file using its tree id - TREE=$(tar -cvf - /etc | bup split -t) - bup join $TREE | tar -tf - - - # make two backups, then get the second-most-recent. - # mybackup~1 is git(1) notation for the second most - # recent commit on the branch named mybackup. - tar -cvf - /etc | bup split -n mybackup - tar -cvf - /etc | bup split -n mybackup - bup join mybackup~1 | tar -tf - - -# SEE ALSO - -`bup-split`(1), `bup-save`(1) - -# BUP - -Part of the `bup`(1) suite. 
diff --git a/Documentation/bup-join.md b/Documentation/bup-join.md new file mode 100644 index 0000000..921b92f --- /dev/null +++ b/Documentation/bup-join.md @@ -0,0 +1,53 @@ +% bup-join(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-join - concatenate files from a bup repository + +# SYNOPSIS + +bup join [-r *host*:*path*] [refs or hashes...] + +# DESCRIPTION + +`bup join` is roughly the opposite operation to +`bup-split`(1). You can use it to retrieve the contents of +a file from a local or remote bup repository. + +The supplied list of refs or hashes can be in any format +accepted by `git`(1), including branch names, commit ids, +tree ids, or blob ids. + +If no refs or hashes are given on the command line, `bup +join` reads them from stdin instead. + +# OPTIONS + +-r, --remote=*host*:*path* +: Retrieves objects from the given remote repository + instead of the local one. *path* may be blank, in which + case the default remote repository is used. + + +# EXAMPLE + + # split and then rejoin a file using its tree id + TREE=$(tar -cvf - /etc | bup split -t) + bup join $TREE | tar -tf - + + # make two backups, then get the second-most-recent. + # mybackup~1 is git(1) notation for the second most + # recent commit on the branch named mybackup. + tar -cvf - /etc | bup split -n mybackup + tar -cvf - /etc | bup split -n mybackup + bup join mybackup~1 | tar -tf - + +# SEE ALSO + +`bup-split`(1), `bup-save`(1) + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-ls.1.md b/Documentation/bup-ls.1.md deleted file mode 100644 index a838165..0000000 --- a/Documentation/bup-ls.1.md +++ /dev/null @@ -1,43 +0,0 @@ -% bup-ls(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-ls - list the contents of a bup repository - -# SYNOPSIS - -bup ls [-s] - -# DESCRIPTION - -`bup ls` lists files and directories in your bup repository -using the same directory hierarchy as they would have with -`bup-fuse`(1). 
- -The top level directory is the branch (corresponding to -the `-n` option in `bup save`), the next level is the date -of the backup, and subsequent levels correspond to files in -the backup. - -Once you have identified the file you want using `bup ls`, -you can view its contents using `bup join` or `git show`. - -# OPTIONS - --s, --hash -: show hash for each file/directory. - - -# EXAMPLE - - bup ls /myserver/latest/etc/profile - -# SEE ALSO - -`bup-join`(1), `bup-fuse`(1), `bup-ftp`(1), `bup-save`(1), `git-show`(1) - -# BUP - -Part of the `bup`(1) suite. diff --git a/Documentation/bup-ls.md b/Documentation/bup-ls.md new file mode 100644 index 0000000..a838165 --- /dev/null +++ b/Documentation/bup-ls.md @@ -0,0 +1,43 @@ +% bup-ls(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-ls - list the contents of a bup repository + +# SYNOPSIS + +bup ls [-s] + +# DESCRIPTION + +`bup ls` lists files and directories in your bup repository +using the same directory hierarchy as they would have with +`bup-fuse`(1). + +The top level directory is the branch (corresponding to +the `-n` option in `bup save`), the next level is the date +of the backup, and subsequent levels correspond to files in +the backup. + +Once you have identified the file you want using `bup ls`, +you can view its contents using `bup join` or `git show`. + +# OPTIONS + +-s, --hash +: show hash for each file/directory. + + +# EXAMPLE + + bup ls /myserver/latest/etc/profile + +# SEE ALSO + +`bup-join`(1), `bup-fuse`(1), `bup-ftp`(1), `bup-save`(1), `git-show`(1) + +# BUP + +Part of the `bup`(1) suite. 
diff --git a/Documentation/bup-margin.1.md b/Documentation/bup-margin.1.md deleted file mode 100644 index 042c182..0000000 --- a/Documentation/bup-margin.1.md +++ /dev/null @@ -1,53 +0,0 @@ -% bup-margin(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-margin - figure out your deduplication safety margin - -# SYNOPSIS - -bup margin - -# DESCRIPTION - -`bup margin` iterates through all objects in your bup -repository, calculating the largest number of prefix bits -shared between any two entries. This number, `n`, -identifies the longest subset of SHA-1 you could use and still -encounter a collision between your object ids. - -For example, one system that was tested had a collection of -11 million objects (70 GB), and `bup margin` returned 45. -That means a 46-bit hash would be sufficient to avoid all -collisions among that set of objects; each object in that -repository could be uniquely identified by its first 46 -bits. - -The number of bits needed seems to increase by about 1 or 2 -for every doubling of the number of objects. Since SHA-1 -hashes have 160 bits, that leaves 115 bits of margin. Of -course, because SHA-1 hashes are essentially random, it's -theoretically possible to use many more bits with far fewer -objects. - -If you're paranoid about the possibility of SHA-1 -collisions, you can monitor your repository by running `bup -margin` occasionally to see if you're getting dangerously -close to 160 bits. - -# EXAMPLE - - $ bup margin - Reading indexes: 100.00% (11188299/11188299), done. - 45 - - -# SEE ALSO - -`bup-midx`(1), `bup-save`(1) - -# BUP - -Part of the `bup`(1) suite. 
diff --git a/Documentation/bup-margin.md b/Documentation/bup-margin.md new file mode 100644 index 0000000..042c182 --- /dev/null +++ b/Documentation/bup-margin.md @@ -0,0 +1,53 @@ +% bup-margin(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-margin - figure out your deduplication safety margin + +# SYNOPSIS + +bup margin + +# DESCRIPTION + +`bup margin` iterates through all objects in your bup +repository, calculating the largest number of prefix bits +shared between any two entries. This number, `n`, +identifies the longest subset of SHA-1 you could use and still +encounter a collision between your object ids. + +For example, one system that was tested had a collection of +11 million objects (70 GB), and `bup margin` returned 45. +That means a 46-bit hash would be sufficient to avoid all +collisions among that set of objects; each object in that +repository could be uniquely identified by its first 46 +bits. + +The number of bits needed seems to increase by about 1 or 2 +for every doubling of the number of objects. Since SHA-1 +hashes have 160 bits, that leaves 115 bits of margin. Of +course, because SHA-1 hashes are essentially random, it's +theoretically possible to use many more bits with far fewer +objects. + +If you're paranoid about the possibility of SHA-1 +collisions, you can monitor your repository by running `bup +margin` occasionally to see if you're getting dangerously +close to 160 bits. + +# EXAMPLE + + $ bup margin + Reading indexes: 100.00% (11188299/11188299), done. + 45 + + +# SEE ALSO + +`bup-midx`(1), `bup-save`(1) + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-memtest.1.md b/Documentation/bup-memtest.1.md deleted file mode 100644 index 108a5a1..0000000 --- a/Documentation/bup-memtest.1.md +++ /dev/null @@ -1,119 +0,0 @@ -% bup-memtest(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-memtest - test bup memory usage statistics - -# SYNOPSIS - -bup memtest [options...] 
- -# DESCRIPTION - -`bup memtest` opens the list of pack indexes in your bup -repository, then searches the list for a series of -nonexistent objects, printing memory usage statistics after -each cycle. - -Because of the way Unix systems work, the output will -usually show a large (and unchanging) value in the VmSize -column, because mapping the index files in the first place -takes a certain amount of virtual address space. However, this -virtual memory usage is entirely virtual; it doesn't take -any of your RAM. Over time, bup uses *parts* of the -indexes, which need to be loaded from disk, and this is -what causes an increase in the VmRSS column. - -# OPTIONS - --n, --number=*number* -: set the number of objects to search for during each - cycle (ie. before printing a line of output) - --c, --cycles=*cycles* -: set the number of cycles (ie. the number of lines of - output after the first). The first line of output is - always 0 (ie. the baseline before searching for any - objects). - ---ignore-midx -: ignore any `.midx` files created by `bup midx`. This - allows you to compare memory performance with and - without using midx. - - -# EXAMPLE - - $ bup memtest -n300 -c5 - PackIdxList: using 1 index. - VmSize VmRSS VmData VmStk - 0 20824 kB 4528 kB 1980 kB 84 kB - 300 20828 kB 5828 kB 1984 kB 84 kB - 600 20828 kB 6844 kB 1984 kB 84 kB - 900 20828 kB 7836 kB 1984 kB 84 kB - 1200 20828 kB 8736 kB 1984 kB 84 kB - 1500 20828 kB 9452 kB 1984 kB 84 kB - - $ bup memtest -n300 -c5 --ignore-midx - PackIdxList: using 361 indexes. - VmSize VmRSS VmData VmStk - 0 27444 kB 6552 kB 2516 kB 84 kB - 300 27448 kB 15832 kB 2520 kB 84 kB - 600 27448 kB 17220 kB 2520 kB 84 kB - 900 27448 kB 18012 kB 2520 kB 84 kB - 1200 27448 kB 18388 kB 2520 kB 84 kB - 1500 27448 kB 18556 kB 2520 kB 84 kB - - -# DISCUSSION - -When optimizing bup indexing, the first goal is to keep the -VmRSS reasonably low. 
However, it might eventually be -necessary to swap in all the indexes, simply because -you're searching for a lot of objects, and this will cause -your RSS to grow as large as VmSize eventually. - -The key word here is *eventually*. As long as VmRSS grows -reasonably slowly, the amount of disk activity caused by -accessing pack indexes is reasonably small. If it grows -quickly, bup will probably spend most of its time swapping -index data from disk instead of actually running your -backup, so backups will run very slowly. - -The purpose of `bup memtest` is to give you an idea of how -fast your memory usage is growing, and to help in -optimizing bup for better memory use. If you have memory -problems you might be asked to send the output of `bup -memtest` to help diagnose the problems. - -Tip: try using `bup midx -a` or `bup midx -f` to see if it -helps reduce your memory usage. - -Trivia: index memory usage in bup (or git) is only really a -problem when adding a large number of previously unseen -objects. This is because for each object, we need to -absolutely confirm that it isn't already in the database, -which requires us to search through *all* the existing pack -indexes to ensure that none of them contain the object in -question. In the more obvious case of searching for -objects that *do* exist, the objects being searched for are -typically related in some way, which means they probably -all exist in a small number of packfiles, so memory usage -will be constrained to just those packfile indexes. - -Since git users typically don't add a lot of files in a -single run, git doesn't really need a program like `bup -midx`. bup, on the other hand, spends most of its time -backing up files it hasn't seen before, so its memory usage -patterns are different. - - -# SEE ALSO - -`bup-midx`(1) - -# BUP - -Part of the `bup`(1) suite. 
diff --git a/Documentation/bup-memtest.md b/Documentation/bup-memtest.md new file mode 100644 index 0000000..108a5a1 --- /dev/null +++ b/Documentation/bup-memtest.md @@ -0,0 +1,119 @@ +% bup-memtest(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-memtest - test bup memory usage statistics + +# SYNOPSIS + +bup memtest [options...] + +# DESCRIPTION + +`bup memtest` opens the list of pack indexes in your bup +repository, then searches the list for a series of +nonexistent objects, printing memory usage statistics after +each cycle. + +Because of the way Unix systems work, the output will +usually show a large (and unchanging) value in the VmSize +column, because mapping the index files in the first place +takes a certain amount of virtual address space. However, this +virtual memory usage is entirely virtual; it doesn't take +any of your RAM. Over time, bup uses *parts* of the +indexes, which need to be loaded from disk, and this is +what causes an increase in the VmRSS column. + +# OPTIONS + +-n, --number=*number* +: set the number of objects to search for during each + cycle (ie. before printing a line of output) + +-c, --cycles=*cycles* +: set the number of cycles (ie. the number of lines of + output after the first). The first line of output is + always 0 (ie. the baseline before searching for any + objects). + +--ignore-midx +: ignore any `.midx` files created by `bup midx`. This + allows you to compare memory performance with and + without using midx. + + +# EXAMPLE + + $ bup memtest -n300 -c5 + PackIdxList: using 1 index. + VmSize VmRSS VmData VmStk + 0 20824 kB 4528 kB 1980 kB 84 kB + 300 20828 kB 5828 kB 1984 kB 84 kB + 600 20828 kB 6844 kB 1984 kB 84 kB + 900 20828 kB 7836 kB 1984 kB 84 kB + 1200 20828 kB 8736 kB 1984 kB 84 kB + 1500 20828 kB 9452 kB 1984 kB 84 kB + + $ bup memtest -n300 -c5 --ignore-midx + PackIdxList: using 361 indexes. 
+ VmSize VmRSS VmData VmStk + 0 27444 kB 6552 kB 2516 kB 84 kB + 300 27448 kB 15832 kB 2520 kB 84 kB + 600 27448 kB 17220 kB 2520 kB 84 kB + 900 27448 kB 18012 kB 2520 kB 84 kB + 1200 27448 kB 18388 kB 2520 kB 84 kB + 1500 27448 kB 18556 kB 2520 kB 84 kB + + +# DISCUSSION + +When optimizing bup indexing, the first goal is to keep the +VmRSS reasonably low. However, it might eventually be +necessary to swap in all the indexes, simply because +you're searching for a lot of objects, and this will cause +your RSS to grow as large as VmSize eventually. + +The key word here is *eventually*. As long as VmRSS grows +reasonably slowly, the amount of disk activity caused by +accessing pack indexes is reasonably small. If it grows +quickly, bup will probably spend most of its time swapping +index data from disk instead of actually running your +backup, so backups will run very slowly. + +The purpose of `bup memtest` is to give you an idea of how +fast your memory usage is growing, and to help in +optimizing bup for better memory use. If you have memory +problems you might be asked to send the output of `bup +memtest` to help diagnose the problems. + +Tip: try using `bup midx -a` or `bup midx -f` to see if it +helps reduce your memory usage. + +Trivia: index memory usage in bup (or git) is only really a +problem when adding a large number of previously unseen +objects. This is because for each object, we need to +absolutely confirm that it isn't already in the database, +which requires us to search through *all* the existing pack +indexes to ensure that none of them contain the object in +question. In the more obvious case of searching for +objects that *do* exist, the objects being searched for are +typically related in some way, which means they probably +all exist in a small number of packfiles, so memory usage +will be constrained to just those packfile indexes. 
+ +Since git users typically don't add a lot of files in a +single run, git doesn't really need a program like `bup +midx`. bup, on the other hand, spends most of its time +backing up files it hasn't seen before, so its memory usage +patterns are different. + + +# SEE ALSO + +`bup-midx`(1) + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-midx.1.md b/Documentation/bup-midx.1.md deleted file mode 100644 index 829a74a..0000000 --- a/Documentation/bup-midx.1.md +++ /dev/null @@ -1,92 +0,0 @@ -% bup-midx(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-midx - create a multi-index (.midx) file from several .idx files - -# SYNOPSIS - -bup midx [-o *outfile*] <-a|-f|*idxnames*...> - -# DESCRIPTION - -`bup midx` creates a multi-index (.midx) file from one or more -git pack index (.idx) files. - -You should run this command -occasionally to ensure your backups run quickly and without -requiring too much RAM. - -# OPTIONS - --o, --output -: use the given output filename for the .midx file. - Default is auto-generated. - --a, --auto -: automatically generate new .midx files for any .idx - files where it would be appropriate. - --f, --force -: force generation of a single new .midx file containing - *all* your .idx files, even if other .midx files - already exist. This will result in the fastest backup - performance, but may take a long time to run. - - -# EXAMPLE - - $ bup midx -a - Merging 21 indexes (2278559 objects). - Table size: 524288 (17 bits) - Reading indexes: 100.00% (2278559/2278559), done. - midx-b66d7c9afc4396187218f2936a87b865cf342672.midx - -# DISCUSSION - -By default, bup uses git-formatted pack files, which -consist of a pack file (containing objects) and an idx -file (containing a sorted list of object names and their -offsets in the .pack file). - -Normal idx files are convenient because it means you can use -`git`(1) to access your backup datasets. 
However, idx -files can get slow when you have a lot of very large packs -(which git typically doesn't have, but bup often does). - -bup .midx files consist of a single sorted list of all the objects -contained in all the .pack files it references. This list -can be binary searched in about log2(m) steps, where m is -the total number of objects. - -To further speed up the search, midx files also have a -variable-sized fanout table that reduces the first n -steps of the binary search. With the help of this fanout -table, bup can narrow down which page of the midx file a -given object id would be in (if it exists) with a single -lookup. Thus, typical searches will only need to swap in -two pages: one for the fanout table, and one for the object -id. - -midx files are most useful when creating new backups, since -searching for a nonexistent object in the repository -necessarily requires searching through *all* the index -files to ensure that it does not exist. (Searching for -objects that *do* exist can be optimized; for example, -consecutive objects are often stored in the same pack, so -we can search that one first using an MRU algorithm.) - -With large repositories, you should be sure to run -`bup midx -a` or `bup midx -f` every now and then so that -creating backups will remain efficient. - - -# SEE ALSO - -`bup-save`(1), `bup-margin`(1), `bup-memtest`(1) - -# BUP - -Part of the `bup`(1) suite. diff --git a/Documentation/bup-midx.md b/Documentation/bup-midx.md new file mode 100644 index 0000000..829a74a --- /dev/null +++ b/Documentation/bup-midx.md @@ -0,0 +1,92 @@ +% bup-midx(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-midx - create a multi-index (.midx) file from several .idx files + +# SYNOPSIS + +bup midx [-o *outfile*] <-a|-f|*idxnames*...> + +# DESCRIPTION + +`bup midx` creates a multi-index (.midx) file from one or more +git pack index (.idx) files. 
+ +You should run this command +occasionally to ensure your backups run quickly and without +requiring too much RAM. + +# OPTIONS + +-o, --output +: use the given output filename for the .midx file. + Default is auto-generated. + +-a, --auto +: automatically generate new .midx files for any .idx + files where it would be appropriate. + +-f, --force +: force generation of a single new .midx file containing + *all* your .idx files, even if other .midx files + already exist. This will result in the fastest backup + performance, but may take a long time to run. + + +# EXAMPLE + + $ bup midx -a + Merging 21 indexes (2278559 objects). + Table size: 524288 (17 bits) + Reading indexes: 100.00% (2278559/2278559), done. + midx-b66d7c9afc4396187218f2936a87b865cf342672.midx + +# DISCUSSION + +By default, bup uses git-formatted pack files, which +consist of a pack file (containing objects) and an idx +file (containing a sorted list of object names and their +offsets in the .pack file). + +Normal idx files are convenient because it means you can use +`git`(1) to access your backup datasets. However, idx +files can get slow when you have a lot of very large packs +(which git typically doesn't have, but bup often does). + +bup .midx files consist of a single sorted list of all the objects +contained in all the .pack files it references. This list +can be binary searched in about log2(m) steps, where m is +the total number of objects. + +To further speed up the search, midx files also have a +variable-sized fanout table that reduces the first n +steps of the binary search. With the help of this fanout +table, bup can narrow down which page of the midx file a +given object id would be in (if it exists) with a single +lookup. Thus, typical searches will only need to swap in +two pages: one for the fanout table, and one for the object +id. 
+ +midx files are most useful when creating new backups, since +searching for a nonexistent object in the repository +necessarily requires searching through *all* the index +files to ensure that it does not exist. (Searching for +objects that *do* exist can be optimized; for example, +consecutive objects are often stored in the same pack, so +we can search that one first using an MRU algorithm.) + +With large repositories, you should be sure to run +`bup midx -a` or `bup midx -f` every now and then so that +creating backups will remain efficient. + + +# SEE ALSO + +`bup-save`(1), `bup-margin`(1), `bup-memtest`(1) + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-newliner.1.md b/Documentation/bup-newliner.1.md deleted file mode 100644 index 8224311..0000000 --- a/Documentation/bup-newliner.1.md +++ /dev/null @@ -1,42 +0,0 @@ -% bup-newliner(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-newliner - make sure progress messages don't overlap with output - -# SYNOPSIS - -\<any command\> 2>&1 | bup newliner - -# DESCRIPTION - -`bup newliner` is run automatically by bup. You shouldn't -need it unless you're using it in some other program. - -Progress messages emitted by bup (and some other tools) are -of the form "Message ### content\r", that is, a status -message containing a variable-length number, followed by a -carriage return character and no newline. If these -messages are printed more than once, they overwrite each -other, so what the user sees is a single line with a -continually-updating number. - -This works fine until some other message is printed. For -example, progress messages are usually printed to stderr, -but other program messages might be printed to stdout. If -those messages are shorter than the progress message line, -the screen will be left with weird looking artifacts as the -two messages get mixed together. - -`bup newliner` prints extra space characters at the right -time to make sure that doesn't happen.
- -If you're running a program that has problems with these -artifacts, you can usually fix them by piping its stdout -*and* its stderr through bup newliner. - -# BUP - -Part of the `bup`(1) suite. diff --git a/Documentation/bup-newliner.md b/Documentation/bup-newliner.md new file mode 100644 index 0000000..8224311 --- /dev/null +++ b/Documentation/bup-newliner.md @@ -0,0 +1,42 @@ +% bup-newliner(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-newliner - make sure progress messages don't overlap with output + +# SYNOPSIS + +\<any command\> 2>&1 | bup newliner + +# DESCRIPTION + +`bup newliner` is run automatically by bup. You shouldn't +need it unless you're using it in some other program. + +Progress messages emitted by bup (and some other tools) are +of the form "Message ### content\r", that is, a status +message containing a variable-length number, followed by a +carriage return character and no newline. If these +messages are printed more than once, they overwrite each +other, so what the user sees is a single line with a +continually-updating number. + +This works fine until some other message is printed. For +example, progress messages are usually printed to stderr, +but other program messages might be printed to stdout. If +those messages are shorter than the progress message line, +the screen will be left with weird looking artifacts as the +two messages get mixed together. + +`bup newliner` prints extra space characters at the right +time to make sure that doesn't happen. + +If you're running a program that has problems with these +artifacts, you can usually fix them by piping its stdout +*and* its stderr through bup newliner. + +# BUP + +Part of the `bup`(1) suite.
diff --git a/Documentation/bup-random.1.md b/Documentation/bup-random.1.md deleted file mode 100644 index fe710f1..0000000 --- a/Documentation/bup-random.1.md +++ /dev/null @@ -1,76 +0,0 @@ -% bup-random(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-random - generate a stream of random output - -# SYNOPSIS - -bup random [-S seed] [-f] - -# DESCRIPTION - -`bup random` produces a stream of pseudorandom output bytes to -stdout. Note: the bytes are *not* generated using a -cryptographic algorithm and should never be used for -security. - -Note that the stream of random bytes will be identical -every time `bup random` is run, unless you provide a -different `seed` value. This is intentional: the purpose -of this program is to be able to run repeatable tests on -large amounts of data, so we want identical data every -time. - -`bup random` generates about 240 megabytes per second on a -modern test system (Intel Core2), which is faster than you -could achieve by reading data from most disks. Thus, it -can be helpful when running microbenchmarks. - -# OPTIONS - - -: the number of bytes of data to generate. Can be used - with the suffices `k`, `M`, or `G` to indicate - kilobytes, megabytes, or gigabytes, respectively. - --S, --seed=*seed* -: use the given value to seed the pseudorandom number - generator. The generated output stream will be - identical for every stream seeded with the same value. - The default seed is 1. A seed value of 0 is equivalent - to 1. - --f, --force -: generate output even if stdout is a tty. (Generating - random data to a tty is generally considered - ill-advised, but you can do if you really want.) - -# EXAMPLES - - $ bup random 1k | sha1sum - 2108c55d0a2687c8dacf9192677c58437a55db71 - - - $ bup random -S1 1k | sha1sum - 2108c55d0a2687c8dacf9192677c58437a55db71 - - - $ bup random -S2 1k | sha1sum - f71acb90e135d98dad7efc136e8d2cc30573e71a - - - $ time bup random 1G >/dev/null - Random: 1024 Mbytes, done. 
- - real 0m4.261s - user 0m4.048s - sys 0m0.172s - - $ bup random 1G | bup split -t --bench - Random: 1024 Mbytes, done. - bup: 1048576.00kbytes in 18.59 secs = 56417.78 kbytes/sec - 1092599b9c7b2909652ef1e6edac0796bfbfc573 - -# BUP - -Part of the `bup`(1) suite. diff --git a/Documentation/bup-random.md b/Documentation/bup-random.md new file mode 100644 index 0000000..fe710f1 --- /dev/null +++ b/Documentation/bup-random.md @@ -0,0 +1,76 @@ +% bup-random(1) Bup %BUP_VERSION% +% Avery Pennarun +% %BUP_DATE% + +# NAME + +bup-random - generate a stream of random output + +# SYNOPSIS + +bup random [-S seed] [-f] + +# DESCRIPTION + +`bup random` produces a stream of pseudorandom output bytes to +stdout. Note: the bytes are *not* generated using a +cryptographic algorithm and should never be used for +security. + +Note that the stream of random bytes will be identical +every time `bup random` is run, unless you provide a +different `seed` value. This is intentional: the purpose +of this program is to be able to run repeatable tests on +large amounts of data, so we want identical data every +time. + +`bup random` generates about 240 megabytes per second on a +modern test system (Intel Core2), which is faster than you +could achieve by reading data from most disks. Thus, it +can be helpful when running microbenchmarks. + +# OPTIONS + + +: the number of bytes of data to generate. Can be used + with the suffices `k`, `M`, or `G` to indicate + kilobytes, megabytes, or gigabytes, respectively. + +-S, --seed=*seed* +: use the given value to seed the pseudorandom number + generator. The generated output stream will be + identical for every stream seeded with the same value. + The default seed is 1. A seed value of 0 is equivalent + to 1. + +-f, --force +: generate output even if stdout is a tty. (Generating + random data to a tty is generally considered + ill-advised, but you can do if you really want.) 
+ +# EXAMPLES + + $ bup random 1k | sha1sum + 2108c55d0a2687c8dacf9192677c58437a55db71 - + + $ bup random -S1 1k | sha1sum + 2108c55d0a2687c8dacf9192677c58437a55db71 - + + $ bup random -S2 1k | sha1sum + f71acb90e135d98dad7efc136e8d2cc30573e71a - + + $ time bup random 1G >/dev/null + Random: 1024 Mbytes, done. + + real 0m4.261s + user 0m4.048s + sys 0m0.172s + + $ bup random 1G | bup split -t --bench + Random: 1024 Mbytes, done. + bup: 1048576.00kbytes in 18.59 secs = 56417.78 kbytes/sec + 1092599b9c7b2909652ef1e6edac0796bfbfc573 + +# BUP + +Part of the `bup`(1) suite. diff --git a/Documentation/bup-save.1.md b/Documentation/bup-save.1.md deleted file mode 100644 index 394855d..0000000 --- a/Documentation/bup-save.1.md +++ /dev/null @@ -1,81 +0,0 @@ -% bup-save(1) Bup %BUP_VERSION% -% Avery Pennarun -% %BUP_DATE% - -# NAME - -bup-save - create a new bup backup set - -# SYNOPSIS - -bup save [-r *host*:*path*] <-t|-c|-n *name*> [-v] [-q] - [--smaller=*maxsize*] - -# DESCRIPTION - -`bup save` saves the contents of the given files or paths -into a new backup set and optionally names that backup set. - -Before trying to save files using `bup save`, you should -first update the index using `bup index`. The reasons -for separating the two steps are described in the man page -for `bup-index`(1). - -# OPTIONS - --r, --remote=*host*:*path* -: save the backup set to the given remote server. If - *path* is omitted, uses the default path on the remote - server (you still need to include the ':') - --t, --tree -: after creating the backup set, print out the git tree - id of the resulting backup. - --c, --commit -: after creating the backup set, print out the git commit - id of the resulting backup. - --n, --name=*name* -: after creating the backup set, create a git branch - named *name* so that the backup can be accessed using - that name. If *name* already exists, the new backup - will be considered a descendant of the old *name*. 
- (Thus, you can continually create new backup sets with
- the same name, and later view the history of that
- backup set to see how files have changed over time.)
-
--v, --verbose
-: increase verbosity (can be used more than once). With
- one -v, prints every directory name as it gets backed up. With
- two -v, also prints every filename.
-
--q, --quiet
-: disable progress messages.
-
---smaller=*maxsize*
-: don't back up files >= *maxsize* bytes. You can use
- this to run frequent incremental backups of your small
- files, which can usually be backed up quickly, and skip
- over large ones (like virtual machine images) which
- take longer. Then you can back up the large files
- less frequently.
-
-
-# EXAMPLE
-
- $ bup index -ux /etc
- Indexing: 1981, done.
-
- $ bup save -r myserver: -n my-pc-backup /etc
- Reading index: 1981, done.
- Saving: 100.00% (998/998k, 1981/1981 files), done.
-
-
-# SEE ALSO
-
-`bup-index`(1), `bup-split`(1)
-
-# BUP
-
-Part of the `bup`(1) suite.
diff --git a/Documentation/bup-save.md b/Documentation/bup-save.md
new file mode 100644
index 0000000..394855d
--- /dev/null
+++ b/Documentation/bup-save.md
@@ -0,0 +1,81 @@
+% bup-save(1) Bup %BUP_VERSION%
+% Avery Pennarun
+% %BUP_DATE%
+
+# NAME
+
+bup-save - create a new bup backup set
+
+# SYNOPSIS
+
+bup save [-r *host*:*path*] <-t|-c|-n *name*> [-v] [-q]
+ [--smaller=*maxsize*]
+
+# DESCRIPTION
+
+`bup save` saves the contents of the given files or paths
+into a new backup set and optionally names that backup set.
+
+Before trying to save files using `bup save`, you should
+first update the index using `bup index`. The reasons
+for separating the two steps are described in the man page
+for `bup-index`(1).
+
+# OPTIONS
+
+-r, --remote=*host*:*path*
+: save the backup set to the given remote server. If
+ *path* is omitted, uses the default path on the remote
+ server (you still need to include the ':')
+
+-t, --tree
+: after creating the backup set, print out the git tree
+ id of the resulting backup.
+
+-c, --commit
+: after creating the backup set, print out the git commit
+ id of the resulting backup.
+
+-n, --name=*name*
+: after creating the backup set, create a git branch
+ named *name* so that the backup can be accessed using
+ that name. If *name* already exists, the new backup
+ will be considered a descendant of the old *name*.
+ (Thus, you can continually create new backup sets with
+ the same name, and later view the history of that
+ backup set to see how files have changed over time.)
+
+-v, --verbose
+: increase verbosity (can be used more than once). With
+ one -v, prints every directory name as it gets backed up. With
+ two -v, also prints every filename.
+
+-q, --quiet
+: disable progress messages.
+
+--smaller=*maxsize*
+: don't back up files >= *maxsize* bytes. You can use
+ this to run frequent incremental backups of your small
+ files, which can usually be backed up quickly, and skip
+ over large ones (like virtual machine images) which
+ take longer. Then you can back up the large files
+ less frequently.
+
+
+# EXAMPLE
+
+ $ bup index -ux /etc
+ Indexing: 1981, done.
+
+ $ bup save -r myserver: -n my-pc-backup /etc
+ Reading index: 1981, done.
+ Saving: 100.00% (998/998k, 1981/1981 files), done.
+
+
+# SEE ALSO
+
+`bup-index`(1), `bup-split`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-server.1.md b/Documentation/bup-server.1.md
deleted file mode 100644
index a8c8a4c..0000000
--- a/Documentation/bup-server.1.md
+++ /dev/null
@@ -1,28 +0,0 @@
-% bup-server(1) Bup %BUP_VERSION%
-% Avery Pennarun
-% %BUP_DATE%
-
-# NAME
-
-bup-server - the server side of the bup client-server relationship
-
-# SYNOPSIS
-
-bup server
-
-# DESCRIPTION
-
-`bup server` is the server side of a remote bup session.
-If you use `bup-split`(1) or `bup-save`(1) with the `-r`
-option, they will ssh to the remote server and run `bup
-server` to receive the transmitted objects.
-
-There is normally no reason to run `bup server` yourself.
-
-# SEE ALSO
-
-`bup-save`(1), `bup-split`(1)
-
-# BUP
-
-Part of the `bup`(1) suite.
diff --git a/Documentation/bup-server.md b/Documentation/bup-server.md
new file mode 100644
index 0000000..a8c8a4c
--- /dev/null
+++ b/Documentation/bup-server.md
@@ -0,0 +1,28 @@
+% bup-server(1) Bup %BUP_VERSION%
+% Avery Pennarun
+% %BUP_DATE%
+
+# NAME
+
+bup-server - the server side of the bup client-server relationship
+
+# SYNOPSIS
+
+bup server
+
+# DESCRIPTION
+
+`bup server` is the server side of a remote bup session.
+If you use `bup-split`(1) or `bup-save`(1) with the `-r`
+option, they will ssh to the remote server and run `bup
+server` to receive the transmitted objects.
+
+There is normally no reason to run `bup server` yourself.
+
+# SEE ALSO
+
+`bup-save`(1), `bup-split`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-split.1.md b/Documentation/bup-split.1.md
deleted file mode 100644
index 41a5731..0000000
--- a/Documentation/bup-split.1.md
+++ /dev/null
@@ -1,110 +0,0 @@
-% bup-split(1) Bup %BUP_VERSION%
-% Avery Pennarun
-% %BUP_DATE%
-
-# NAME
-
-bup-split - save individual files to bup backup sets
-
-# SYNOPSIS
-
-bup split [-r *host*:*path*] <-b|-t|-c|-n *name*> [-v] [-q]
- [--bench] [--max-pack-size=*bytes*]
- [--max-pack-objects=*n*] [--fanout=*count] [filenames...]
-
-# DESCRIPTION
-
-`bup split` concatenates the contents of the given files
-(or if no filenames are given, reads from stdin), splits
-the content into chunks of around 8k using a rolling
-checksum algorithm, and saves the chunks into a bup
-repository. Chunks which have previously been stored are
-not stored again (ie. they are "deduplicated").
-
-Because of the way the rolling checksum works, chunks
-tend to be very stable across changes to a given file,
-including adding, deleting, and changing bytes.
-
-For example, if you use `bup split` to back up an XML dump
-of a database, and the XML file changes slightly from one
-run to the next, nearly all the data will still be
-deduplicated and the size of each backup after the first
-will typically be quite small.
-
-Another technique is to pipe the output of the `tar`(1) or
-`cpio`(1) programs to `bup split`. When individual files
-in the tarball change slightly or are added or removed, bup
-still processes the remainder of the tarball efficiently.
-(Note that `bup save` is usually a more efficient way to
-accomplish this, however.)
-
-To get the data back, use `bup-join`(1).
-
-# OPTIONS
-
--r, --remote=*host*:*path*
-: save the backup set to the given remote server. If
- *path* is omitted, uses the default path on the remote
- server (you still need to include the ':')
-
--b, --blobs
-: output a series of git blob ids that correspond to the
- chunks in the dataset.
-
--t, --tree
-: output the git tree id of the resulting dataset.
-
--c, --commit
-: output the git commit id of the resulting dataset.
-
--n, --name=*name*
-: after creating the dataset, create a git branch
- named *name* so that it can be accessed using
- that name. If *name* already exists, the new dataset
- will be considered a descendant of the old *name*.
- (Thus, you can continually create new datasets with
- the same name, and later view the history of that
- dataset to see how it has changed over time.)
-
--v, --verbose
-: increase verbosity (can be used more than once).
-
--q, --quiet
-: disable progress messages.
-
---bench
-: print benchmark timings to stderr.
-
---max-pack-size=*bytes*
-: never create git packfiles larger than the given number
- of bytes. Default is 1 billion bytes. Usually there
- is no reason to change this.
-
---max-pack-objects=*numobjs*
-: never create git packfiles with more than the given
- number of objects. Default is 200 thousand objects.
- Usually there is no reason to change this.
-
---fanout=*numobjs*
-: when splitting very large files, never put more than
- this number of git blobs in a single git tree. Instead,
- generate a new tree and link to that. Default is
- 4096 objects per tree.
-
-# EXAMPLE
-
- $ tar -cf - /etc | bup split -r myserver: -n mybackup-tar
- tar: Removing leading /' from member names
- Indexing objects: 100% (196/196), done.
-
- $ bup join -r myserver: mybackup-tar | tar -tf - | wc -l
- 1961
-
-
-# SEE ALSO
-
-`bup-join`(1), `bup-index`(1), `bup-save`(1)
-
-# BUP
-
-Part of the `bup`(1) suite.
diff --git a/Documentation/bup-split.md b/Documentation/bup-split.md
new file mode 100644
index 0000000..41a5731
--- /dev/null
+++ b/Documentation/bup-split.md
@@ -0,0 +1,110 @@
+% bup-split(1) Bup %BUP_VERSION%
+% Avery Pennarun
+% %BUP_DATE%
+
+# NAME
+
+bup-split - save individual files to bup backup sets
+
+# SYNOPSIS
+
+bup split [-r *host*:*path*] <-b|-t|-c|-n *name*> [-v] [-q]
+ [--bench] [--max-pack-size=*bytes*]
+ [--max-pack-objects=*n*] [--fanout=*count] [filenames...]
+
+# DESCRIPTION
+
+`bup split` concatenates the contents of the given files
+(or if no filenames are given, reads from stdin), splits
+the content into chunks of around 8k using a rolling
+checksum algorithm, and saves the chunks into a bup
+repository. Chunks which have previously been stored are
+not stored again (ie. they are "deduplicated").
+
+Because of the way the rolling checksum works, chunks
+tend to be very stable across changes to a given file,
+including adding, deleting, and changing bytes.
+
+For example, if you use `bup split` to back up an XML dump
+of a database, and the XML file changes slightly from one
+run to the next, nearly all the data will still be
+deduplicated and the size of each backup after the first
+will typically be quite small.
+
+Another technique is to pipe the output of the `tar`(1) or
+`cpio`(1) programs to `bup split`. When individual files
+in the tarball change slightly or are added or removed, bup
+still processes the remainder of the tarball efficiently.
+(Note that `bup save` is usually a more efficient way to
+accomplish this, however.)
+
+To get the data back, use `bup-join`(1).
+
+# OPTIONS
+
+-r, --remote=*host*:*path*
+: save the backup set to the given remote server. If
+ *path* is omitted, uses the default path on the remote
+ server (you still need to include the ':')
+
+-b, --blobs
+: output a series of git blob ids that correspond to the
+ chunks in the dataset.
+
+-t, --tree
+: output the git tree id of the resulting dataset.
+
+-c, --commit
+: output the git commit id of the resulting dataset.
+
+-n, --name=*name*
+: after creating the dataset, create a git branch
+ named *name* so that it can be accessed using
+ that name. If *name* already exists, the new dataset
+ will be considered a descendant of the old *name*.
+ (Thus, you can continually create new datasets with
+ the same name, and later view the history of that
+ dataset to see how it has changed over time.)
+
+-v, --verbose
+: increase verbosity (can be used more than once).
+
+-q, --quiet
+: disable progress messages.
+
+--bench
+: print benchmark timings to stderr.
+
+--max-pack-size=*bytes*
+: never create git packfiles larger than the given number
+ of bytes. Default is 1 billion bytes. Usually there
+ is no reason to change this.
+
+--max-pack-objects=*numobjs*
+: never create git packfiles with more than the given
+ number of objects. Default is 200 thousand objects.
+ Usually there is no reason to change this.
+
+--fanout=*numobjs*
+: when splitting very large files, never put more than
+ this number of git blobs in a single git tree. Instead,
+ generate a new tree and link to that. Default is
+ 4096 objects per tree.
+
+# EXAMPLE
+
+ $ tar -cf - /etc | bup split -r myserver: -n mybackup-tar
+ tar: Removing leading /' from member names
+ Indexing objects: 100% (196/196), done.
+
+ $ bup join -r myserver: mybackup-tar | tar -tf - | wc -l
+ 1961
+
+
+# SEE ALSO
+
+`bup-join`(1), `bup-index`(1), `bup-save`(1)
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-tick.1.md b/Documentation/bup-tick.1.md
deleted file mode 100644
index a8ed118..0000000
--- a/Documentation/bup-tick.1.md
+++ /dev/null
@@ -1,32 +0,0 @@
-% bup-tick(1) Bup %BUP_VERSION%
-% Avery Pennarun
-% %BUP_DATE%
-
-# NAME
-
-bup-tick - wait for up to one second
-
-# SYNOPSIS
-
-bup tick
-
-# DESCRIPTION
-
-`bup tick` waits until `time`(2) returns a different value
-than it originally did. Since time() has a granularity of
-one second, this can cause a delay of up to one second.
-
-This program is useful for writing tests that need to
-ensure a file date will be seen as modified. It is
-slightly better than `sleep`(1) since it sometimes waits
-for less than one second.
-
-# EXAMPLE
-
- $ date; bup tick; date
- Sat Feb 6 16:59:58 EST 2010
- Sat Feb 6 16:59:59 EST 2010
-
-# BUP
-
-Part of the `bup`(1) suite.
diff --git a/Documentation/bup-tick.md b/Documentation/bup-tick.md
new file mode 100644
index 0000000..a8ed118
--- /dev/null
+++ b/Documentation/bup-tick.md
@@ -0,0 +1,32 @@
+% bup-tick(1) Bup %BUP_VERSION%
+% Avery Pennarun
+% %BUP_DATE%
+
+# NAME
+
+bup-tick - wait for up to one second
+
+# SYNOPSIS
+
+bup tick
+
+# DESCRIPTION
+
+`bup tick` waits until `time`(2) returns a different value
+than it originally did. Since time() has a granularity of
+one second, this can cause a delay of up to one second.
+
+This program is useful for writing tests that need to
+ensure a file date will be seen as modified. It is
+slightly better than `sleep`(1) since it sometimes waits
+for less than one second.
+
+# EXAMPLE
+
+ $ date; bup tick; date
+ Sat Feb 6 16:59:58 EST 2010
+ Sat Feb 6 16:59:59 EST 2010
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup-web.1.md b/Documentation/bup-web.1.md
deleted file mode 100644
index 23e82a8..0000000
--- a/Documentation/bup-web.1.md
+++ /dev/null
@@ -1,42 +0,0 @@
-% bup-ftp(1) Bup %BUP_VERSION%
-% Joe Beda
-% %BUP_DATE%
-
-# NAME
-
-bup-web - Start web server to browse bup repositiory
-
-# SYNOPSIS
-
-bup web [[hostname]:port]
-
-# DESCRIPTION
-
-`bup web` starts a web server that can browse bup repositories. The file
-hierarchy is the same as that shown by `bup-fuse`(1), `bup-ls`(1) and
-`bup-ftp`(1).
-
-`hostname` and `port` default to 127.0.0.1 and 8080, respectively, and hence
-`bup web` will only offer up the web server to locally running clients. If
-you'd like to expose the web server to anyone on your network (dangerous!) you
-can omit the bind address to bind to all available interfaces: `:8080`.
-
-# EXAMPLE
-
- $ bup web
- Serving HTTP on 127.0.0.1:8080...
- ^C
-
- $ bup web :8080
- Serving HTTP on 0.0.0.0:8080...
- ^C
-
-
-# SEE ALSO
-
-`bup-fuse`(1), `bup-ls`(1), `bup-ftp`(1)
-
-
-# BUP
-
-Part of the `bup`(1) suite.
diff --git a/Documentation/bup-web.md b/Documentation/bup-web.md
new file mode 100644
index 0000000..23e82a8
--- /dev/null
+++ b/Documentation/bup-web.md
@@ -0,0 +1,42 @@
+% bup-ftp(1) Bup %BUP_VERSION%
+% Joe Beda
+% %BUP_DATE%
+
+# NAME
+
+bup-web - Start web server to browse bup repositiory
+
+# SYNOPSIS
+
+bup web [[hostname]:port]
+
+# DESCRIPTION
+
+`bup web` starts a web server that can browse bup repositories. The file
+hierarchy is the same as that shown by `bup-fuse`(1), `bup-ls`(1) and
+`bup-ftp`(1).
+
+`hostname` and `port` default to 127.0.0.1 and 8080, respectively, and hence
+`bup web` will only offer up the web server to locally running clients. If
+you'd like to expose the web server to anyone on your network (dangerous!) you
+can omit the bind address to bind to all available interfaces: `:8080`.
+
+# EXAMPLE
+
+ $ bup web
+ Serving HTTP on 127.0.0.1:8080...
+ ^C
+
+ $ bup web :8080
+ Serving HTTP on 0.0.0.0:8080...
+ ^C
+
+
+# SEE ALSO
+
+`bup-fuse`(1), `bup-ls`(1), `bup-ftp`(1)
+
+
+# BUP
+
+Part of the `bup`(1) suite.
diff --git a/Documentation/bup.1.md b/Documentation/bup.1.md
deleted file mode 100644
index 8def75c..0000000
--- a/Documentation/bup.1.md
+++ /dev/null
@@ -1,75 +0,0 @@
-% bup(1) Bup %BUP_VERSION%
-% Avery Pennarun
-% %BUP_DATE%
-
-# NAME
-
-bup - Backup program using rolling checksums and git file formats
-
-# SYNOPSIS
-
-bup \ [options...]
-
-# DESCRIPTION
-
-`bup` is a program for making backups of your files using
-the git file format.
-
-Unlike `git`(1) itself, bup is
-optimized for handling huge data sets including individual
-very large files (such a virtual machine images). However,
-once a backup set is created, it can still be accessed
-using git tools.
-
-The individual bup subcommands appear in their own man
-pages.
-
-# COMMONLY USED SUBCOMMANDS
-
-`bup-fsck`(1)
-: Check backup sets for damage and add redundancy information
-`bup-ftp`(1)
-: Browse backup sets using an ftp-like client
-`bup-fuse`(1)
-: Mount your backup sets as a filesystem
-`bup-help`(1)
-: Print detailed help for the given command
-`bup-index`(1)
-: Create or display the index of files to back up
-`bup-join`(1)
-: Retrieve a file backed up using `bup-split`(1)
-`bup-ls`(1)
-: Browse the files in your backup sets
-`bup-midx`(1)
-: Index objects to speed up future backups
-`bup-save`(1)
-: Save files into a backup set (note: run "bup index" first)
-`bup-split`(1)
-: Split a single file into its own backup set
-
-# RARELY USED SUBCOMMANDS
-
-`bup-damage`(1)
-: Deliberately destroy data
-`bup-drecurse`(1)
-: Recursively list files in your filesystem
-`bup-init`(1)
-: Initialize a bup repository
-`bup-margin`(1)
-: Determine how close your bup repository is to armageddon
-`bup-memtest`(1)
-: Test bup memory usage statistics
-`bup-newliner`(1)
-: Make sure progress messages don't overlap with output
-`bup-random`(1)
-: Generate a stream of random output
-`bup-server`(1)
-: The server side of the bup client-server relationship
-`bup-tick`(1)
-: Wait for up to one second.
-
-# SEE ALSO
-
-`git`(1) and the *README* file from the bup distribution.
-
-The home of bup is at .
diff --git a/Documentation/bup.md b/Documentation/bup.md
new file mode 100644
index 0000000..8def75c
--- /dev/null
+++ b/Documentation/bup.md
@@ -0,0 +1,75 @@
+% bup(1) Bup %BUP_VERSION%
+% Avery Pennarun
+% %BUP_DATE%
+
+# NAME
+
+bup - Backup program using rolling checksums and git file formats
+
+# SYNOPSIS
+
+bup \ [options...]
+
+# DESCRIPTION
+
+`bup` is a program for making backups of your files using
+the git file format.
+
+Unlike `git`(1) itself, bup is
+optimized for handling huge data sets including individual
+very large files (such a virtual machine images). However,
+once a backup set is created, it can still be accessed
+using git tools.
+
+The individual bup subcommands appear in their own man
+pages.
+
+# COMMONLY USED SUBCOMMANDS
+
+`bup-fsck`(1)
+: Check backup sets for damage and add redundancy information
+`bup-ftp`(1)
+: Browse backup sets using an ftp-like client
+`bup-fuse`(1)
+: Mount your backup sets as a filesystem
+`bup-help`(1)
+: Print detailed help for the given command
+`bup-index`(1)
+: Create or display the index of files to back up
+`bup-join`(1)
+: Retrieve a file backed up using `bup-split`(1)
+`bup-ls`(1)
+: Browse the files in your backup sets
+`bup-midx`(1)
+: Index objects to speed up future backups
+`bup-save`(1)
+: Save files into a backup set (note: run "bup index" first)
+`bup-split`(1)
+: Split a single file into its own backup set
+
+# RARELY USED SUBCOMMANDS
+
+`bup-damage`(1)
+: Deliberately destroy data
+`bup-drecurse`(1)
+: Recursively list files in your filesystem
+`bup-init`(1)
+: Initialize a bup repository
+`bup-margin`(1)
+: Determine how close your bup repository is to armageddon
+`bup-memtest`(1)
+: Test bup memory usage statistics
+`bup-newliner`(1)
+: Make sure progress messages don't overlap with output
+`bup-random`(1)
+: Generate a stream of random output
+`bup-server`(1)
+: The server side of the bup client-server relationship
+`bup-tick`(1)
+: Wait for up to one second.
+
+# SEE ALSO
+
+`git`(1) and the *README* file from the bup distribution.
+
+The home of bup is at .