X-Git-Url: https://arthur.barton.de/cgi-bin/gitweb.cgi?p=bup.git;a=blobdiff_plain;f=README.md;h=c580ecea5fd165f37a187a748e613ec83fca6956;hp=077645f80c0d34727d359de2c20733e2e8e4377d;hb=ae9cde2e3df85569bf76487a9528080926841db5;hpb=24fb7867ac67fa7ee45fff12621db3be6c627fec

diff --git a/README.md b/README.md
index 077645f..c580ece 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,5 @@
-
-bup 0.04: It backs things up
-============================
+bup: It backs things up
+=======================
 
 bup is a program that backs things up. It's short for "backup." Can you
 believe that nobody else has named an open source program "bup" after all
@@ -34,7 +33,9 @@ bup has a few advantages over other backup software:
 
  - Unlike git, it writes packfiles *directly* (instead of having a separate
    garbage collection / repacking stage) so it's fast even with gratuitously
-   huge amounts of data.
+   huge amounts of data. bup's improved index formats also allow you to
+   track far more filenames than git (millions) and keep track of far more
+   objects (hundreds or thousands of gigabytes).
 
  - Data is "automagically" shared between incremental backups without having
    to know which backup is based on which other one - even if the backups
@@ -42,11 +43,23 @@ bup has a few advantages over other backup software:
    other. You just tell bup to back stuff up, and it saves only the minimum
    amount of data needed.
 
+ - You can back up directly to a remote bup server, without needing tons of
+   temporary disk space on the computer being backed up. And if your backup
+   is interrupted halfway through, the next run will pick up where you left
+   off. And it's easy to set up a bup server: just install bup on any
+   machine where you have ssh access.
+
+ - Bup can use "par2" redundancy to recover corrupted backups even if your
+   disk has undetected bad sectors.
+
  - Even when a backup is incremental, you don't have to worry about
    restoring the full backup, then each of the incrementals in turn; an
    incremental backup *acts* as if it's a full backup, it just takes less
    disk space.
 
+ - You can mount your bup repository as a FUSE filesystem and access the
+   content that way, and even export it over Samba.
+
  - It's written in python (with some C parts to make it faster) so it's easy
    for you to extend and maintain.
 
@@ -54,105 +67,411 @@ bup has a few advantages over other backup software:
 Reasons you might want to avoid bup
 -----------------------------------
 
- - This is a very early version. Therefore it will most probably not work
-   for you, but we don't know why. It is also missing some
-   probably-critical features.
+ - It's not remotely as well tested as something like tar, so it's
+   more likely to eat your data. It's also missing some
+   probably-critical features, though fewer than it used to be.
 
- - It requires python 2.5, a C compiler, and an installed git version >= 1.5.2.
+ - It requires python >= 2.6, a C compiler, and an installed git
+   version >= 1.5.6. It also requires par2 if you want fsck to be
+   able to generate the information needed to recover from some types
+   of corruption.
 
- - It currently only works on Linux, MacOS X 10.5, or Windows (with Cygwin).
-   Patches to support other platforms are welcome.
-
- - It has almost no documentation. Not even a man page! This file is all
-   you get for now.
-
-
+ - It currently only works on Linux, FreeBSD, NetBSD, OS X >= 10.4,
+   Solaris, or Windows (with Cygwin, and maybe with WSL). Patches to
+   support other platforms are welcome.
+
+ - Any items in "Things that are stupid" below.
+
+
+Notable changes introduced by a release
+=======================================
+
+ - Changes in 0.30 as compared to 0.29.3
+ - Changes in 0.29.3 as compared to 0.29.2
+ - Changes in 0.29.2 as compared to 0.29.1
+ - Changes in 0.29.1 as compared to 0.29
+ - Changes in 0.29 as compared to 0.28.1
+ - Changes in 0.28.1 as compared to 0.28
+ - Changes in 0.28 as compared to 0.27.1
+ - Changes in 0.27.1 as compared to 0.27
+
+
+Test status
+===========
+
+| branch | Debian | FreeBSD | macOS |
+|--------|------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|
+| master | [![Debian test status](https://api.cirrus-ci.com/github/bup/bup.svg?branch=master&task=debian)](https://cirrus-ci.com/github/bup/bup) | [![FreeBSD test status](https://api.cirrus-ci.com/github/bup/bup.svg?branch=master&task=freebsd)](https://cirrus-ci.com/github/bup/bup) | [![macOS test status](https://api.cirrus-ci.com/github/bup/bup.svg?branch=master&task=macos)](https://cirrus-ci.com/github/bup/bup) |
+| 0.29.x | [![Debian test status](https://api.cirrus-ci.com/github/bup/bup.svg?branch=0.29.x&task=debian)](https://cirrus-ci.com/github/bup/bup) | [![FreeBSD test status](https://api.cirrus-ci.com/github/bup/bup.svg?branch=0.29.x&task=freebsd)](https://cirrus-ci.com/github/bup/bup) | [![macOS test status](https://api.cirrus-ci.com/github/bup/bup.svg?branch=0.29.x&task=macos)](https://cirrus-ci.com/github/bup/bup) |
+
 Getting started
----------------
+===============
 
- - check out the bup source code using git:
-
-        git clone git://github.com/apenwarr/bup
+From source
+-----------
 
- - install the python 2.5 development libraries. On Debian or Ubuntu, this
-   is:
-        apt-get install python2.5-dev
-
- - build the python module and symlinks:
-
-        make
-
- - run the tests:
-
-        make test
-
-   (The tests should pass. If they don't pass for you, stop here and send
-   me an email.)
-
- - Try making a local backup as a tar file:
-
-        tar -cvf - /etc | bup split -n local-etc -vv
+ - Check out the bup source code using git:
+
+    ```sh
+    git clone https://github.com/bup/bup
+    ```
+
+ - This will leave you on the master branch, which is perfect if you
+   would like to help with development, but if you'd just like to use
+   bup, please check out the latest stable release like this:
+
+    ```sh
+    git checkout 0.29.1
+    ```
+
+   You can see the latest stable release here:
+   https://github.com/bup/bup/releases.
+
+ - Install the required python libraries (including the development
+   libraries).
+
+   On very recent Debian/Ubuntu versions, this may be sufficient (run
+   as root):
+
+    ```sh
+    apt-get build-dep bup
+    ```
+
+   Otherwise try this (substitute python2.6-dev if you have an older
+   system):
+
+    ```sh
+    apt-get install python2.7-dev python-fuse
+    apt-get install python-pyxattr python-pylibacl
+    apt-get install linux-libc-dev
+    apt-get install acl attr
+    apt-get install python-tornado # optional
+    ```
+
+   On CentOS (for CentOS 6, at least), this should be sufficient (run
+   as root):
+
+    ```sh
+    yum groupinstall "Development Tools"
+    yum install python python-devel
+    yum install fuse-python pyxattr pylibacl
+    yum install perl-Time-HiRes
+    ```
+
+   In addition to the default CentOS repositories, you may need to add
+   RPMForge (for fuse-python) and EPEL (for pyxattr and pylibacl).
+
+   On Cygwin, install python, make, rsync, and gcc4.
+
+   If you would like to use the optional bup web server on systems
+   without a tornado package, you may want to try this:
+
+    ```sh
+    pip install tornado
+    ```
+
+ - Build the python module and symlinks:
+
+    ```sh
+    make
+    ```
 
- - Try restoring your backup tarball:
-
-        bup join local-etc | tar -tf -
+ - Run the tests:
+
+    ```sh
+    make long-check
+    ```
+
+   or if you're in a bit more of a hurry:
+
+    ```sh
+    make check
+    ```
+   The tests should pass. If they don't pass for you, stop here and
+   send an email to bup-list@googlegroups.com. Though if there are
+   symbolic links along the current working directory path, the tests
+   may fail. Running something like this before "make test" should
+   sidestep the problem:
+
+    ```sh
+    cd "$(pwd -P)"
+    ```
+
+ - You can install bup via "make install", and override the default
+   destination with DESTDIR and PREFIX.
+
+   Files are normally installed to "$DESTDIR/$PREFIX" where DESTDIR is
+   empty by default, and PREFIX is set to /usr/local. So if you wanted to
+   install bup to /opt/bup, you might do something like this:
+
+    ```sh
+    make install DESTDIR=/opt/bup PREFIX=''
+    ```
+
+ - The Python executable that bup will use is chosen by ./configure,
+   which will search for a reasonable version unless PYTHON is set in
+   the environment, in which case, bup will use that path. You can
+   see which Python executable was chosen by looking at the
+   configure output, or examining cmd/python-cmd.sh, and you can
+   change the selection by re-running ./configure.
+
+From binary packages
+--------------------
+
+Binary packages of bup are known to be built for the following OSes:
+
+ - Debian:
+   http://packages.debian.org/search?searchon=names&keywords=bup
+ - Ubuntu:
+   http://packages.ubuntu.com/search?searchon=names&keywords=bup
+ - pkgsrc (NetBSD, Dragonfly, and others)
+   http://pkgsrc.se/sysutils/bup
+   http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/sysutils/bup/
+ - Arch Linux:
+   https://www.archlinux.org/packages/?sort=&q=bup
+ - Fedora:
+   https://apps.fedoraproject.org/packages/bup
+
+
+Using bup
+---------
+
+ - Get help for any bup command:
+
+    ```sh
+    bup help
+    bup help init
+    bup help index
+    bup help save
+    bup help restore
+    ...
+    ```
+
+ - Initialize the default BUP_DIR (~/.bup -- you can choose another by
+   either specifying `bup -d DIR ...` or setting the `BUP_DIR`
+   environment variable for a command):
+
+    ```sh
+    bup init
+    ```
+
+ - Make a local backup (-v or -vv will increase the verbosity):
+
+    ```sh
+    bup index /etc
+    bup save -n local-etc /etc
+    ```
+
+ - Restore a local backup to ./dest:
+
+    ```sh
+    bup restore -C ./dest local-etc/latest/etc
+    ls -l dest/etc
+    ```
+
  - Look at how much disk space your backup took:
-
-        du -s ~/.bup
-
+
+    ```sh
+    du -s ~/.bup
+    ```
+
  - Make another backup (which should be mostly identical to the last one;
    notice that you don't have to *specify* that this backup is incremental,
    it just saves space automatically):
-
-        tar -cvf - /etc | bup split -n local-etc -vv
+
+    ```sh
+    bup index /etc
+    bup save -n local-etc /etc
+    ```
+
+ - Look how little extra space your second backup used (on top of the first):
+
+    ```sh
+    du -s ~/.bup
+    ```
+
+ - Get a list of your previous backups:
+
+    ```sh
+    bup ls local-etc
+    ```
+
+ - Restore your first backup again:
+
+    ```sh
+    bup restore -C ./dest-2 local-etc/2013-11-23-11195/etc
+    ```
+
+ - Make a backup to a remote server which must already have the 'bup' command
+   somewhere in its PATH (see /etc/profile, etc/environment, ~/.profile, or
+   ~/.bashrc), and be accessible via ssh.
+   Make sure to replace SERVERNAME with the actual hostname of your server:
+
+    ```sh
+    bup init -r SERVERNAME:path/to/remote-bup-dir
+    bup index /etc
+    bup save -r SERVERNAME:path/to/remote-bup-dir -n local-etc /etc
+    ```
+
+ - Make a remote backup to ~/.bup on SERVER:
+
+    ```sh
+    bup index /etc
+    bup save -r SERVER: -n local-etc /etc
+    ```
+
+ - See what saves are available in ~/.bup on SERVER:
+
+    ```sh
+    bup ls -r SERVER:
+    ```
+
+ - Restore the remote backup to ./dest:
+
+    ```sh
+    bup restore -r SERVER: -C ./dest local-etc/latest/etc
+    ls -l dest/etc
+    ```
+
+ - Defend your backups from death rays (OK fine, more likely from the
+   occasional bad disk block). This writes parity information
+   (currently via par2) for all of the existing data so that bup may
+   be able to recover from some amount of repository corruption:
+
+    ```sh
+    bup fsck -g
+    ```
+
+ - Use split/join instead of index/save/restore. Try making a local
+   backup using tar:
+
+    ```sh
+    tar -cvf - /etc | bup split -n local-etc -vv
+    ```
 
- - Look how little extra space your second backup used on top of the first:
-
-        du -s ~/.bup
+ - Try restoring the tarball:
+
+    ```sh
+    bup join local-etc | tar -tf -
+    ```
 
- - Restore your old backup again (the ~1 is git notation for "one older than
-   the most recent"):
-
-        bup join local-etc~1 | tar -tf -
-
- - get a list of your previous backups:
+ - Look at how much disk space your backup took:
+
+    ```sh
+    du -s ~/.bup
+    ```
+
+ - Make another tar backup:
+
+    ```sh
+    tar -cvf - /etc | bup split -n local-etc -vv
+    ```
+
+ - Look at how little extra space your second backup used on top of
+   the first:
+
+    ```sh
+    du -s ~/.bup
+    ```
+
+ - Restore the first tar backup again (the ~1 is git notation for "one
+   older than the most recent"):
+
+    ```sh
+    bup join local-etc~1 | tar -tf -
+    ```
 
-        GIT_DIR=~/.bup git log local-etc
+ - Get a list of your previous split-based backups:
+
+    ```sh
+    GIT_DIR=~/.bup git log local-etc
+    ```
 
- - make a backup on a remote server (which must already have the 'bup' command
-   somewhere in the PATH, and be accessible via ssh; make sure to replace
-   SERVERNAME with the actual hostname of your server):
-
-        tar -cvf - /etc | bup split -r SERVERNAME: -n local-etc -vv
-
- - try restoring the remote backup tarball:
-
-        bup join -r SERVERNAME: local-etc | tar -tf -
-
- - try using the new (slightly experimental) 'bup index' and 'bup save'
-   style backups, which bypass 'tar' but have some missing features (see
-   "Things that are stupid" below):
-
-        bup index -uv /etc
-        bup save -n local-etc /etc
-
- - do it again and see how fast an incremental backup can be:
+ - Save a tar archive to a remote server (without tar -z to facilitate
+   deduplication):
+
+    ```sh
+    tar -cvf - /etc | bup split -r SERVERNAME: -n local-etc -vv
+    ```
 
-        bup index -uv /etc
-        bup save -n local-etc /etc
-
-   (You can also use the "-r SERVERNAME:" option to 'bup save', just like
-   with 'bup split' and 'bup join'. The index itself is always local,
-   so you don't need -r there.)
+ - Restore the archive:
+
+    ```sh
+    bup join -r SERVERNAME: local-etc | tar -tf -
+    ```
 
 That's all there is to it!
 
 
+Notes on FreeBSD
+----------------
+
+- FreeBSD's default 'make' command doesn't like bup's Makefile. In order to
+  compile the code, run tests and install bup, you need to install GNU Make
+  from the port named 'gmake' and use its executable instead in the commands
+  seen above. (i.e. 'gmake test' runs bup's test suite)
+
+- Python's development headers are automatically installed with the 'python'
+  port so there's no need to install them separately.
+
+- To use the 'bup fuse' command, you need to install the fuse kernel module
+  from the 'fusefs-kmod' port in the 'sysutils' section and the libraries from
+  the port named 'py-fusefs' in the 'devel' section.
+
+- The 'par2' command can be found in the port named 'par2cmdline'.
+
+- In order to compile the documentation, you need pandoc which can be found in
+  the port named 'hs-pandoc' in the 'textproc' section.
+
+
+Notes on NetBSD/pkgsrc
+----------------------
+
+ - See pkgsrc/sysutils/bup, which should be the most recent stable
+   release and includes man pages. It also has a reasonable set of
+   dependencies (git, par2, py-fuse-bindings).
+
+ - The "fuse-python" package referred to is hard to locate, and is a
+   separate tarball for the python language binding distributed by the
+   fuse project on sourceforge. It is available as
+   pkgsrc/filesystems/py-fuse-bindings and on NetBSD 5, "bup fuse"
+   works with it.
+
+ - "bup fuse" presents every directory/file as inode 0. The directory
+   traversal code ("fts") in NetBSD's libc will interpret this as a
+   cycle and error out, so "ls -R" and "find" will not work.
+
+ - There is no support for ACLs. If/when some enterprising person
+   fixes this, adjust t/compare-trees.
+
+
+Notes on Cygwin
+---------------
+
+ - There is no support for ACLs. If/when some enterprising person
+   fixes this, adjust t/compare-trees.
+
+ - In t/test.sh, two tests have been disabled. These tests check to
+   see that repeated saves produce identical trees and that an
+   intervening index doesn't change the SHA1. Apparently Cygwin has
+   some unusual behaviors with respect to access times (that probably
+   warrant further investigation). Possibly related:
+   http://cygwin.com/ml/cygwin/2007-06/msg00436.html
+
+
+Notes on OS X
+-------------
+
+ - There is no support for ACLs. If/when some enterprising person
+   fixes this, adjust t/compare-trees.
+
+
 How it works
-------------
+============
 
 Basic storage:
+--------------
 
 bup stores its data in a git-formatted repository. Unfortunately, git
 itself doesn't actually behave very well for bup's use case (huge numbers of
@@ -163,8 +482,8 @@ python.
 
 Basically, 'bup split' reads the data on stdin (or from files specified on
 the command line), breaks it into chunks using a rolling checksum (similar to
-rsync), and saves those chunks into a new git packfile. There is one git
-packfile per backup.
+rsync), and saves those chunks into a new git packfile. There is at least one
+git packfile per backup.
 
 When deciding whether to write a particular chunk into the new packfile, bup
 first checks all the other packfiles that exist to see if they already have that
@@ -188,6 +507,7 @@ that tree, respectively, to stdout. You can use this to construct your own
 scripts that do something with those values.
 
 The bup index:
+--------------
 
 'bup index' walks through your filesystem and updates a file (whose name
 is, by default, ~/.bup/bupindex) to contain the name, attributes, and an
@@ -210,36 +530,48 @@ a lot of files have changed.
 
 
 Things that are stupid for now but which we'll fix later
---------------------------------------------------------
+========================================================
 
-Help with any of these problems, or others, is very, very welcome. Let me
-know if you'd like to help. Maybe we can start a mailing list.
+Help with any of these problems, or others, is very welcome. Join the
+mailing list (see below) if you'd like to help.
 
- - 'bup save' doesn't know about file metadata.
+ - 'bup save' and 'bup restore' have immature metadata support.
 
-   That means we aren't saving file attributes, mtimes, ownership, hard
-   links, MacOS resource forks, etc. Clearly this needs to be improved.
+   On the plus side, they actually do have support now, but it's new,
+   and not remotely as well tested as tar/rsync/whatever's. However,
+   you have to start somewhere, and as of 0.25, we think it's ready
+   for more general use. Please let us know if you have any trouble.
+
+   Also, if any strip or graft-style options are specified to 'bup
+   save', then no metadata will be written for the root directory.
+   That's obviously less than ideal.
+
+ - bup is overly optimistic about mmap. Right now bup just assumes
+   that it can mmap as large a block as it likes, and that mmap will
+   never fail. Yeah, right... If nothing else, this has failed on
+   32-bit architectures (and 31-bit is even worse -- looking at you,
+   s390).
+
+   To fix this, we might just implement a FakeMmap[1] class that uses
+   normal file IO and handles all of the mmap methods[2] that bup
+   actually calls. Then we'd swap in one of those whenever mmap
+   fails.
+
+   This would also require implementing some of the methods needed to
+   support "[]" array access, probably at a minimum __getitem__,
+   __setitem__, and __setslice__ [3].
+
+   [1] http://comments.gmane.org/gmane.comp.sysutils.backup.bup/613
+   [2] http://docs.python.org/2/library/mmap.html
+   [3] http://docs.python.org/2/reference/datamodel.html#emulating-container-types
 
- - There's no 'bup restore' yet.
-
-   'bup save' saves files in the standard git 'tree of blobs' format, so you
-   could then "restore" the files using something like 'git checkout'. But
-   that's a git command, not a bup command, so it's hard to explain and
-   doesn't support retrieving objects from a remote bup server without first
-   fetching and packing an entire (possibly huge) pack, which could be very
-   slow. Also, like 'bup save', you would need extra features in order to
-   properly restore file metadata. And files that bup has split into
-   chunks would need to be recombined somehow.
-
  - 'bup index' is slower than it should be.
 
    It's still rather fast: it can iterate through all the filenames on my
-   600,000 file filesystem in a few seconds. But sometimes you just want to
-   change a filename or two, so this is needlessly slow. There should be
-   a way to binary search through the file list rather than always going
-   through it sequentially. And if you only add a couple of filenames,
-   there's no need to rewrite the entire index; just leave the new files
-   in a second "extra index" file or something.
+   600,000 file filesystem in a few seconds. But it still needs to rewrite
+   the entire index file just to add a single filename, which is pretty
+   nasty; it should just leave the new files in a second "extra index" file
+   or something.
 
  - bup could use inotify for *really* efficient incremental backups.
 
@@ -248,41 +580,58 @@ know if you'd like to help. Maybe we can start a mailing list.
    give the continuous-backup process a really low CPU and I/O priority so
    you wouldn't even know it was running.
 
- - bup currently has no features that prune away *old* backups.
-
-   Because of the way the packfile system works, backups become "entangled"
-   in weird ways and it's not actually possible to delete one pack
-   (corresponding approximately to one backup) without risking screwing up
-   other backups.
-
-   git itself has lots of ways of optimizing this sort of thing, but its
-   methods aren't really applicable here; bup packfiles are just too huge.
-   We'll have to do it in a totally different way. There are lots of
-   options. For now: make sure you've got lots of disk space :)
+ - bup only has experimental support for pruning old backups.
 
- - bup doesn't ever validate existing backups/packs to ensure they're
-   correct.
-
-   This would be easy to implement (given that git uses hashes and CRCs all
-   over the place), but nobody has implemented it. For now, you could try
-   doing a test restore of your tarball; doing so should trigger git's error
-   handling if any of the objects are corrupted. 'git fsck' would
-   theoreticaly work too, but it's too slow for huge backups.
+   While you should now be able to drop old saves and branches with
+   `bup rm`, and reclaim the space occupied by data that's no longer
+   needed by other backups with `bup gc`, these commands are
+   experimental, and should be handled with great care. See the
+   man pages for more information.
 
- - bup has never been tested on anything but Linux, MacOS, and Linux+Cygwin.
+   Unless you want to help test the new commands, one possible
+   workaround is to just start a new BUP_DIR occasionally,
+   i.e. bup-2013, bup-2014...
+
+ - bup has never been tested on anything but Linux, FreeBSD, NetBSD,
+   OS X, and Windows+Cygwin.
 
    There's nothing that makes it *inherently* non-portable, though, so that's
    mostly a matter of someone putting in some effort. (For a "native" Windows
    port, the most annoying thing is the absence of ssh in a default Windows
    installation.)
+
+ - bup needs better documentation.
+
+   According to an article about bup in Linux Weekly News
+   (https://lwn.net/Articles/380983/), "it's a bit short on examples and
+   a user guide would be nice." Documentation is the sort of thing that
+   will never be great unless someone from outside contributes it (since
+   the developers can never remember which parts are hard to understand).
+
+ - bup is "relatively speedy" and has "pretty good" compression.
+
+   ...according to the same LWN article. Clearly neither of those is good
+   enough. We should have awe-inspiring speed and crazy-good compression.
+   Must work on that. Writing more parts in C might help with the speed.
 
- - bup has no GUI. Actually, that's not stupid, but you might consider it
-   a limitation. There are a bunch of Linux GUI backup programs; someday
-   I expect someone will adapt one of them to use bup.
+ - bup has no GUI.
+
+   Actually, that's not stupid, but you might consider it a
+   limitation. See the ["Related Projects"](https://bup.github.io/)
+   list for some possible options.
+
+More Documentation
+==================
+
+bup has an extensive set of man pages. Try using 'bup help' to get
+started, or use 'bup help SUBCOMMAND' for any bup subcommand (like split,
+join, index, save, etc.) to get details on that command.
+
+For further technical details, please see ./DESIGN.
 
 
 How you can help
-----------------
+================
 
 bup is a work in progress and there are many ways it can still be improved.
 If you'd like to contribute patches, ideas, or bug reports, please join the
@@ -296,7 +645,11 @@ and you can subscribe by sending a message to:
 
     bup-list+subscribe@googlegroups.com
 
+Please see ./HACKING for
+additional information, i.e. how to submit patches (hint - no pull
+requests), how we handle branches, etc.
+
+
 Have fun,
 
 Avery
-January 2010