X-Git-Url: https://arthur.barton.de/gitweb/?a=blobdiff_plain;f=README.md;h=e5c0422515966dd7bc9457255f92f2a28a8b628b;hb=34bdf9acc4bd85643159b79c600ed2ef26d5b47b;hp=ef19a84a7a01f0a1c4098065476b4ec9e51567ae;hpb=5f0b1cf36d8001eabaf920aa55da4b5eb1d22f26;p=bup.git
diff --git a/README.md b/README.md
index ef19a84..e5c0422 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,3 @@
-
bup: It backs things up
=======================
@@ -34,7 +33,9 @@ bup has a few advantages over other backup software:
- Unlike git, it writes packfiles *directly* (instead of having a separate
garbage collection / repacking stage) so it's fast even with gratuitously
- huge amounts of data.
+ huge amounts of data. bup's improved index formats also allow you to
+ track far more filenames than git (millions) and keep track of far more
+ objects (hundreds or thousands of gigabytes).
- Data is "automagically" shared between incremental backups without having
to know which backup is based on which other one - even if the backups
@@ -42,11 +43,23 @@ bup has a few advantages over other backup software:
other. You just tell bup to back stuff up, and it saves only the minimum
amount of data needed.
+ - You can back up directly to a remote bup server, without needing tons of
+ temporary disk space on the computer being backed up. And if your backup
+ is interrupted halfway through, the next run will pick up where you left
+ off. And it's easy to set up a bup server: just install bup on any
+ machine where you have ssh access.
+
+ - Bup can use "par2" redundancy to recover corrupted backups even if your
+ disk has undetected bad sectors.
+
- Even when a backup is incremental, you don't have to worry about
restoring the full backup, then each of the incrementals in turn; an
incremental backup *acts* as if it's a full backup, it just takes less
disk space.
+ - You can mount your bup repository as a FUSE filesystem and access the
+ content that way, and even export it over Samba.
+
- It's written in python (with some C parts to make it faster) so it's easy
for you to extend and maintain.
@@ -58,43 +71,217 @@ Reasons you might want to avoid bup
for you, but we don't know why. It is also missing some
probably-critical features.
- - It requires python >= 2.4, a C compiler, and an installed git version >=
- 1.5.3.1.
+ - It requires python >= 2.6, a C compiler, and an installed git
+ version >= 1.5.6. It also requires par2 if you want fsck to be
+ able to generate the information needed to recover from some types
+ of corruption.
- - It currently only works on Linux, MacOS X >= 10.4, or Windows (with
- Cygwin). Patches to support other platforms are welcome.
-
- - It has almost no documentation. Not even a man page! This file is all
- you get for now.
-
-
+ - It currently only works on Linux, FreeBSD, NetBSD, OS X >= 10.4,
+ Solaris, or Windows (with Cygwin, and maybe with WSL). Patches to
+ support other platforms are welcome.
+
+ - Any items in "Things that are stupid" below.
+
+
+Notable changes introduced by a release
+=======================================
+
+ - Changes in 0.29.1 as compared to 0.29
+ - Changes in 0.29 as compared to 0.28.1
+ - Changes in 0.28.1 as compared to 0.28
+ - Changes in 0.28 as compared to 0.27.1
+ - Changes in 0.27.1 as compared to 0.27
+
+
Getting started
----------------
+===============
- - check out the bup source code using git:
-
- git clone git://github.com/apenwarr/bup
+From source
+-----------
- - install the python 2.5 development libraries. On Debian or Ubuntu, this
- is:
- apt-get install python2.5-dev
-
- - build the python module and symlinks:
+ - Check out the bup source code using git:
+ git clone https://github.com/bup/bup
+
+ - This will leave you on the master branch, which is perfect if you
+ would like to help with development, but if you'd just like to use
+ bup, please check out the latest stable release like this:
+
+ git checkout 0.29.1
+
+ You can see the latest stable release here:
+ https://github.com/bup/bup/releases.
+
+ - Install the required python libraries (including the development
+ libraries).
+
+ On very recent Debian/Ubuntu versions, this may be sufficient (run
+ as root):
+
+ apt-get build-dep bup
+
+ Otherwise try this (substitute python2.6-dev if you have an older
+ system):
+
+ apt-get install python2.7-dev python-fuse
+ apt-get install python-pyxattr python-pylibacl
+ apt-get install linux-libc-dev
+ apt-get install acl attr
+ apt-get install python-tornado # optional
+
+ On CentOS (for CentOS 6, at least), this should be sufficient (run
+ as root):
+
+ yum groupinstall "Development Tools"
+ yum install python python-devel
+ yum install fuse-python pyxattr pylibacl
+ yum install perl-Time-HiRes
+
+ In addition to the default CentOS repositories, you may need to add
+ RPMForge (for fuse-python) and EPEL (for pyxattr and pylibacl).
+
+ On Cygwin, install python, make, rsync, and gcc4.
+
+ If you would like to use the optional bup web server on systems
+ without a tornado package, you may want to try this:
+
+ pip install tornado
+
+ - Build the python module and symlinks:
+
make
- - run the tests:
+ - Run the tests:
make test
- (The tests should pass. If they don't pass for you, stop here and send
- me an email.)
-
- - Try making a local backup as a tar file:
+ The tests should pass. If they don't pass for you, stop here and
+ send an email to bup-list@googlegroups.com. Though if there are
+ symbolic links along the current working directory path, the tests
+ may fail. Running something like this before "make test" should
+ sidestep the problem:
+
+ cd "$(pwd -P)"
+
+ - You can install bup via "make install", and override the default
+ destination with DESTDIR and PREFIX.
+
+ Files are normally installed to "$DESTDIR/$PREFIX" where DESTDIR is
+ empty by default, and PREFIX is set to /usr/local. So if you wanted to
+ install bup to /opt/bup, you might do something like this:
+
+ make install DESTDIR=/opt/bup PREFIX=''
+
+ - The Python executable that bup will use is chosen by ./configure,
+ which will search for a reasonable version unless PYTHON is set in
+ the environment, in which case, bup will use that path. You can
+ see which Python executable was chosen by looking at the
+ configure output, or examining cmd/python-cmd.sh, and you can
+ change the selection by re-running ./configure.
+
+From binary packages
+--------------------
+
+Binary packages of bup are known to be built for the following OSes:
+
+ - Debian:
+ http://packages.debian.org/search?searchon=names&keywords=bup
+ - Ubuntu:
+ http://packages.ubuntu.com/search?searchon=names&keywords=bup
+ - pkgsrc (NetBSD, Dragonfly, and others):
+ http://pkgsrc.se/sysutils/bup
+ http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/sysutils/bup/
+ - Arch Linux:
+ https://www.archlinux.org/packages/?sort=&q=bup
+ - Fedora:
+ https://apps.fedoraproject.org/packages/bup
+
+
+Using bup
+---------
+
+ - Get help for any bup command:
+
+ bup help
+ bup help init
+ bup help index
+ bup help save
+ bup help restore
+ ...
+
+ - Initialize the default BUP_DIR (~/.bup):
+
+ bup init
+
+ - Make a local backup (-v or -vv will increase the verbosity):
+
+ bup index /etc
+ bup save -n local-etc /etc
+
+ - Restore a local backup to ./dest:
+
+ bup restore -C ./dest local-etc/latest/etc
+ ls -l dest/etc
+
+ - Look at how much disk space your backup took:
+
+ du -s ~/.bup
+
+ - Make another backup (which should be mostly identical to the last one;
+ notice that you don't have to *specify* that this backup is incremental,
+ it just saves space automatically):
+
+ bup index /etc
+ bup save -n local-etc /etc
+
+ - Look at how little extra space your second backup used (on top of the first):
+
+ du -s ~/.bup
+
+ - Get a list of your previous backups:
+
+ bup ls local-etc
+
+ - Restore your first backup again:
+
+ bup restore -C ./dest-2 local-etc/2013-11-23-11195/etc
+
+ - Make a backup to a remote server which must already have the 'bup' command
+ somewhere in its PATH (see /etc/profile, /etc/environment, ~/.profile, or
+ ~/.bashrc), and be accessible via ssh.
+ Make sure to replace SERVERNAME with the actual hostname of your server:
+
+ bup init -r SERVERNAME:path/to/remote-bup-dir
+ bup index /etc
+ bup save -r SERVERNAME:path/to/remote-bup-dir -n local-etc /etc
+
+ - Make a remote backup to ~/.bup on SERVER:
+
+ bup index /etc
+ bup save -r SERVER: -n local-etc /etc
+
+ - See what saves are available in ~/.bup on SERVER:
+
+ bup ls -r SERVER:
+
+ - Restore the remote backup to ./dest:
+
+ bup restore -r SERVER: -C ./dest local-etc/latest/etc
+ ls -l dest/etc
+
+ - Defend your backups from death rays (OK fine, more likely from the
+ occasional bad disk block). This writes parity information
+ (currently via par2) for all of the existing data so that bup may
+ be able to recover from some amount of repository corruption:
+
+ bup fsck -g
+
+ - Use split/join instead of index/save/restore. Try making a local
+ backup using tar:
tar -cvf - /etc | bup split -n local-etc -vv
- - Try restoring your backup tarball:
+ - Try restoring the tarball:
bup join local-etc | tar -tf -
@@ -102,58 +289,104 @@ Getting started
du -s ~/.bup
- - Make another backup (which should be mostly identical to the last one;
- notice that you don't have to *specify* that this backup is incremental,
- it just saves space automatically):
+ - Make another tar backup:
tar -cvf - /etc | bup split -n local-etc -vv
- - Look how little extra space your second backup used on top of the first:
+ - Look at how little extra space your second backup used on top of
+ the first:
du -s ~/.bup
- - Restore your old backup again (the ~1 is git notation for "one older than
- the most recent"):
+ - Restore the first tar backup again (the ~1 is git notation for "one
+ older than the most recent"):
bup join local-etc~1 | tar -tf -
- - get a list of your previous backups:
+ - Get a list of your previous split-based backups:
GIT_DIR=~/.bup git log local-etc
- - make a backup on a remote server (which must already have the 'bup' command
- somewhere in the PATH, and be accessible via ssh; make sure to replace
- SERVERNAME with the actual hostname of your server):
+ - Save a tar archive to a remote server (without tar -z to facilitate
+ deduplication):
tar -cvf - /etc | bup split -r SERVERNAME: -n local-etc -vv
- - try restoring the remote backup tarball:
+ - Restore the archive:
bup join -r SERVERNAME: local-etc | tar -tf -
- - try using the new (slightly experimental) 'bup index' and 'bup save'
- style backups, which bypass 'tar' but have some missing features (see
- "Things that are stupid" below):
-
- bup index -uv /etc
- bup save -n local-etc /etc
-
- - do it again and see how fast an incremental backup can be:
-
- bup index -uv /etc
- bup save -n local-etc /etc
-
- (You can also use the "-r SERVERNAME:" option to 'bup save', just like
- with 'bup split' and 'bup join'. The index itself is always local,
- so you don't need -r there.)
-
That's all there is to it!
+Notes on FreeBSD
+----------------
+
+- FreeBSD's default 'make' command doesn't like bup's Makefile. In order to
+ compile the code, run tests and install bup, you need to install GNU Make
+ from the port named 'gmake' and use its executable instead in the commands
+ seen above. (i.e. 'gmake test' runs bup's test suite)
+
+- Python's development headers are automatically installed with the 'python'
+ port so there's no need to install them separately.
+
+- To use the 'bup fuse' command, you need to install the fuse kernel module
+ from the 'fusefs-kmod' port in the 'sysutils' section and the libraries from
+ the port named 'py-fusefs' in the 'devel' section.
+
+- The 'par2' command can be found in the port named 'par2cmdline'.
+
+- In order to compile the documentation, you need pandoc which can be found in
+ the port named 'hs-pandoc' in the 'textproc' section.
+
+
+Notes on NetBSD/pkgsrc
+----------------------
+
+ - See pkgsrc/sysutils/bup, which should be the most recent stable
+ release and includes man pages. It also has a reasonable set of
+ dependencies (git, par2, py-fuse-bindings).
+
+ - The "fuse-python" package referred to is hard to locate, and is a
+ separate tarball for the python language binding distributed by the
+ fuse project on sourceforge. It is available as
+ pkgsrc/filesystems/py-fuse-bindings and on NetBSD 5, "bup fuse"
+ works with it.
+
+ - "bup fuse" presents every directory/file as inode 0. The directory
+ traversal code ("fts") in NetBSD's libc will interpret this as a
+ cycle and error out, so "ls -R" and "find" will not work.
+
+ - There is no support for ACLs. If/when some enterprising person
+ fixes this, adjust t/compare-trees.
+
+
+Notes on Cygwin
+---------------
+
+ - There is no support for ACLs. If/when some enterprising person
+ fixes this, adjust t/compare-trees.
+
+ - In t/test.sh, two tests have been disabled. These tests check to
+ see that repeated saves produce identical trees and that an
+ intervening index doesn't change the SHA1. Apparently Cygwin has
+ some unusual behaviors with respect to access times (that probably
+ warrant further investigation). Possibly related:
+ http://cygwin.com/ml/cygwin/2007-06/msg00436.html
+
+
+Notes on OS X
+-------------
+
+ - There is no support for ACLs. If/when some enterprising person
+ fixes this, adjust t/compare-trees.
+
+
How it works
-------------
+============
Basic storage:
+--------------
bup stores its data in a git-formatted repository. Unfortunately, git
itself doesn't actually behave very well for bup's use case (huge numbers of
@@ -164,8 +397,8 @@ python.
Basically, 'bup split' reads the data on stdin (or from files specified on
the command line), breaks it into chunks using a rolling checksum (similar to
-rsync), and saves those chunks into a new git packfile. There is one git
-packfile per backup.
+rsync), and saves those chunks into a new git packfile. There is at least one
+git packfile per backup.
When deciding whether to write a particular chunk into the new packfile, bup
first checks all the other packfiles that exist to see if they already have that
@@ -189,6 +422,7 @@ that tree, respectively, to stdout. You can use this to construct your own
scripts that do something with those values.
The bup index:
+--------------
'bup index' walks through your filesystem and updates a file (whose name is,
by default, ~/.bup/bupindex) to contain the name, attributes, and an
@@ -211,36 +445,48 @@ a lot of files have changed.
Things that are stupid for now but which we'll fix later
---------------------------------------------------------
+========================================================
-Help with any of these problems, or others, is very, very welcome. Join the
+Help with any of these problems, or others, is very welcome. Join the
mailing list (see below) if you'd like to help.
- - 'bup save' doesn't know about file metadata.
+ - 'bup save' and 'bup restore' have immature metadata support.
- That means we aren't saving file attributes, mtimes, ownership, hard
- links, MacOS resource forks, etc. Clearly this needs to be improved.
+ On the plus side, they actually do have support now, but it's new,
+ and not remotely as well tested as tar/rsync/whatever's. However,
+ you have to start somewhere, and as of 0.25, we think it's ready
+ for more general use. Please let us know if you have any trouble.
+
+ Also, if any strip or graft-style options are specified to 'bup
+ save', then no metadata will be written for the root directory.
+ That's obviously less than ideal.
+
+ - bup is overly optimistic about mmap. Right now bup just assumes
+ that it can mmap as large a block as it likes, and that mmap will
+ never fail. Yeah, right... If nothing else, this has failed on
+ 32-bit architectures (and 31-bit is even worse -- looking at you,
+ s390).
+
+ To fix this, we might just implement a FakeMmap[1] class that uses
+ normal file IO and handles all of the mmap methods[2] that bup
+ actually calls. Then we'd swap in one of those whenever mmap
+ fails.
+
+ This would also require implementing some of the methods needed to
+ support "[]" array access, probably at a minimum __getitem__,
+ __setitem__, and __setslice__ [3].
+
+ [1] http://comments.gmane.org/gmane.comp.sysutils.backup.bup/613
+ [2] http://docs.python.org/2/library/mmap.html
+ [3] http://docs.python.org/2/reference/datamodel.html#emulating-container-types
- - There's no 'bup restore' yet.
-
- 'bup save' saves files in the standard git 'tree of blobs' format, so you
- could then "restore" the files using something like 'git checkout'. But
- that's a git command, not a bup command, so it's hard to explain and
- doesn't support retrieving objects from a remote bup server without first
- fetching and packing an entire (possibly huge) pack, which could be very
- slow. Also, like 'bup save', you would need extra features in order to
- properly restore file metadata. And files that bup has split into
- chunks would need to be recombined somehow.
-
- 'bup index' is slower than it should be.
It's still rather fast: it can iterate through all the filenames on my
- 600,000 file filesystem in a few seconds. But sometimes you just want to
- change a filename or two, so this is needlessly slow. There should be
- a way to binary search through the file list rather than always going
- through it sequentially. And if you only add a couple of filenames,
- there's no need to rewrite the entire index; just leave the new files
- in a second "extra index" file or something.
+ 600,000 file filesystem in a few seconds. But it still needs to rewrite
+ the entire index file just to add a single filename, which is pretty
+ nasty; it should just leave the new files in a second "extra index" file
+ or something.
- bup could use inotify for *really* efficient incremental backups.
@@ -249,32 +495,58 @@ mailing list (see below) if you'd like to help.
give the continuous-backup process a really low CPU and I/O priority so
you wouldn't even know it was running.
- - bup currently has no features that prune away *old* backups.
-
- Because of the way the packfile system works, backups become "entangled"
- in weird ways and it's not actually possible to delete one pack
- (corresponding approximately to one backup) without risking screwing up
- other backups.
-
- git itself has lots of ways of optimizing this sort of thing, but its
- methods aren't really applicable here; bup packfiles are just too huge.
- We'll have to do it in a totally different way. There are lots of
- options. For now: make sure you've got lots of disk space :)
+ - bup only has experimental support for pruning old backups.
+
+ While you should now be able to drop old saves and branches with
+ `bup rm`, and reclaim the space occupied by data that's no longer
+ needed by other backups with `bup gc`, these commands are
+ experimental, and should be handled with great care. See the
+ man pages for more information.
- - bup has never been tested on anything but Linux, MacOS, and Linux+Cygwin.
+ Unless you want to help test the new commands, one possible
+ workaround is to just start a new BUP_DIR occasionally,
+ e.g. bup-2013, bup-2014...
+
+ - bup has never been tested on anything but Linux, FreeBSD, NetBSD,
+ OS X, and Windows+Cygwin.
There's nothing that makes it *inherently* non-portable, though, so
that's mostly a matter of someone putting in some effort. (For a
"native" Windows port, the most annoying thing is the absence of ssh in
a default Windows installation.)
+
+ - bup needs better documentation.
+
+ According to a recent article about bup in Linux Weekly News
+ (https://lwn.net/Articles/380983/), "it's a bit short on examples and
+ a user guide would be nice." Documentation is the sort of thing that
+ will never be great unless someone from outside contributes it (since
+ the developers can never remember which parts are hard to understand).
+
+ - bup is "relatively speedy" and has "pretty good" compression.
+
+ ...according to the same LWN article. Clearly neither of those is good
+ enough. We should have awe-inspiring speed and crazy-good compression.
+ Must work on that. Writing more parts in C might help with the speed.
- - bup has no GUI. Actually, that's not stupid, but you might consider it
- a limitation. There are a bunch of Linux GUI backup programs; someday
- I expect someone will adapt one of them to use bup.
+ - bup has no GUI.
+
+ Actually, that's not stupid, but you might consider it a
+ limitation. See the ["Related Projects"](https://bup.github.io/)
+ list for some possible options.
+
+More Documentation
+==================
+
+bup has an extensive set of man pages. Try using 'bup help' to get
+started, or use 'bup help SUBCOMMAND' for any bup subcommand (like split,
+join, index, save, etc.) to get details on that command.
+
+For further technical details, please see ./DESIGN.
How you can help
-----------------
+================
bup is a work in progress and there are many ways it can still be improved.
If you'd like to contribute patches, ideas, or bug reports, please join the
@@ -288,6 +560,11 @@ and you can subscribe by sending a message to:
bup-list+subscribe@googlegroups.com
+Please see ./HACKING for
+additional information, e.g. how to submit patches (hint - no pull
+requests), how we handle branches, etc.
+
+
Have fun,
Avery
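
Note on the "How it works" context touched by this diff: the README says 'bup split' breaks its input into chunks using a rolling checksum (similar to rsync) and skips any chunk that an existing packfile already has. The sketch below illustrates that idea only. It is not bup's code: the byte-sum checksum, the WINDOW and MASK values, and the names split_chunks and store are all hypothetical, and an in-memory dict stands in for the packfiles.

```python
import hashlib

# Hypothetical parameters for illustration; bup's real values may differ.
WINDOW = 16          # number of trailing bytes covered by the rolling sum
MASK = (1 << 6) - 1  # a boundary is declared when the low 6 bits are all 1


def split_chunks(data):
    """Yield content-defined chunks of `data` (a bytes object).

    A simple byte sum over the last WINDOW bytes stands in for a real
    rolling checksum. Because a boundary depends only on nearby bytes,
    an edit early in the data does not shift the boundaries far after it.
    """
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte leaving the window
        if (rolling & MASK) == MASK:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]  # trailing chunk with no boundary after it


def store(data, seen):
    """Store the chunks of `data` in `seen` (SHA-1 hex digest -> chunk).

    Returns how many chunks were new, i.e. would actually be written.
    """
    new = 0
    for chunk in split_chunks(data):
        key = hashlib.sha1(chunk).hexdigest()
        if key not in seen:
            seen[key] = chunk
            new += 1
    return new
```

Calling store() twice with the same bytes writes nothing the second time. Inserting a single byte into the input only produces new chunks near the insertion point, because the boundaries resynchronize once the rolling window has moved past the edit. That is the property that lets incremental backups share data without knowing which backup is based on which.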