Aaron M. Ucko [Mon, 30 May 2011 23:03:01 +0000 (19:03 -0400)]
index.py: new format (V3), with inodes, link counts, and 64-bit times.
To allow unambiguous preservation of hard-link structure, index device
numbers, inode numbers (new) and link counts (new) at 64 bits apiece
per GNU libc, which uses uint64_t, uint64_t, and unsigned long respectively.
Take the opportunity to use 64 bits for mtime and ctime as well, both
to be ready for Y2038 and to handle NTFS's zero value (Y1600).
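For illustration only, a minimal sketch of what widening these fields to 64 bits buys. The field order and the format string below are assumptions and may not match the real V3 layout in index.py; only the field widths come from the description above.

```python
import struct

# Hypothetical entry layout (an assumption, not the real index.py format):
# dev, ino, nlink as unsigned 64-bit; ctime, mtime as signed 64-bit.
ENTRY_FMT = '!QQQqq'

# 1601-01-01 UTC expressed as seconds relative to the Unix epoch; this is
# far outside the signed 32-bit range.
NTFS_ZERO = -11644473600


def pack_entry(dev, ino, nlink, ctime, mtime):
    """Pack one hypothetical index entry into 40 bytes."""
    return struct.pack(ENTRY_FMT, dev, ino, nlink, ctime, mtime)


def unpack_entry(data):
    """Inverse of pack_entry()."""
    return struct.unpack(ENTRY_FMT, data)


# A large inode number, an NTFS-style zero ctime and a post-2038 mtime
# all survive the round trip.
packed = pack_entry(2049, 2**40, 3, NTFS_ZERO, 2**31 + 5)
```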
Aaron M. Ucko [Mon, 30 May 2011 23:03:00 +0000 (19:03 -0400)]
Cap timestamps in index to avoid needing to worry about fractional parts.
bup's use of whole-second granularity for timestamps in the index could
theoretically let it miss some last-second changes. Avoid this potential
race condition by capping timestamps to at most one second before the
start of indexing, per a newly introduced mandatory parameter to
bup.index.Writer.
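A rough sketch of the capping idea, not the actual Writer code. The names cap_timestamp, tstart and tmax are hypothetical.

```python
import time


def cap_timestamp(ts, tmax):
    """Return ts in whole seconds, limited to at most tmax."""
    return min(int(ts), tmax)


# tmax is fixed once, one second before indexing begins.
tstart = int(time.time())
tmax = tstart - 1

# A file modified in the same second indexing started is recorded with
# tmax instead of its real mtime, so the next run sees a mismatch and
# treats the file as changed rather than silently skipping it.
recorded = cap_timestamp(tstart, tmax)
```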
Aaron M. Ucko [Mon, 30 May 2011 23:02:35 +0000 (19:02 -0400)]
Improve formatting of error and warning messages.
log() trailing newlines as appropriate. Fix a format string typo in
lib/bup/git.py encountered when verifying that exceptions' string
values already end with newlines.
Avery Pennarun [Mon, 30 May 2011 00:50:25 +0000 (20:50 -0400)]
Merge branch 'master' into meta
* master: (27 commits)
t/test.sh: 'ls' on NetBSD sets -A by default as root; work around it.
README: add a list of binary packages
README: rework the title hierarchy
Clarify the message when the BUP_DIR doesn't exist.
Refactor: unify ls/ftp-ls code
ftp/ls: Adjust documentation
ls: include hidden files when explicitly requested
ftp: implement ls -s (show hashes)
ftp/ls: columnate output attached to a tty, else don't
ftp: don't output trailing line for 'ls'
ftp: output a newline on EOF when on a tty
config: more config stuff to config/ subdir, call it from Makefile.
cmd/{split,save}: support any compression level using the new -# feature.
options.py: add support for '-#' style compression options.
Add documentation for compression levels
Add test case for compression level
Add compression level options to bup save and bup split
Make zlib compression level a parameter for Client
Make zlib compression level a parameter of git.PackWriter
Use is_superuser() rather than checking euid directly
...
Gabriel Filion [Mon, 16 May 2011 05:13:28 +0000 (01:13 -0400)]
README: add a list of binary packages
Debian/Ubuntu are known to have bup packages in their archives, thanks
to Jon Dowland.
Also, a NetBSD package is currently being built, as was shared by Thomas
Klausner. However, it is still not found in the official NetBSD packages
search engine.
Gabriel Filion [Mon, 16 May 2011 05:13:27 +0000 (01:13 -0400)]
README: rework the title hierarchy
In Markdown, a line underlining another one with '=' characters
represents a first level title, while a line underlining another one
with '-' characters represents a second level title.
Rework the title levels to make the different sections more visible and
to make it easier to split "Getting started" (see my next commit for
additions to this section).
Gabriel Filion [Mon, 16 May 2011 04:27:24 +0000 (00:27 -0400)]
Refactor: unify ls/ftp-ls code
Both the 'ls' command and the 'ls' subcommand of the 'ftp' command use
some code that is very similar. Modifications must be done in two places
instead of one and this can lead to inconsistencies.
Refactor code so that both paths use the same function with the same opt
spec.
Gabriel Filion [Mon, 16 May 2011 04:27:22 +0000 (00:27 -0400)]
ls: include hidden files when explicitly requested
The current code of 'bup ls' insists on hiding a file from its listing
even if the file was explicitly requested as an argument. This is not
what users would expect. Remove the condition and always list files
(not directories) starting with a dot when they were given in the
argument list.
Gabriel Filion [Mon, 16 May 2011 04:27:21 +0000 (00:27 -0400)]
ftp: implement ls -s (show hashes)
'bup ls' has a -s flag that can be used to show file hashes on the left
of each file name. 'bup ftp ls' doesn't have that feature.
Implement the feature by copying code from 'bup ls'. This is the last
feature difference between 'bup ls' and 'bup ftp ls' and bringing them
to the same level will make it possible to unify the code that is used
by both.
Gabriel Filion [Mon, 16 May 2011 04:27:20 +0000 (00:27 -0400)]
ftp/ls: columnate output attached to a tty, else don't
'bup ftp ls' and 'bup ls' currently behave in a different manner.
'bup ftp ls' always formats its output in columns regardless of whether
the program's stdout is a tty or not.
'bup ls' always prints one name on each line.
Make both of those commands behave the same. By using lib/bup/helpers'
istty1 variable, decide to format in columns when outputting to a tty,
and to output one file name per line when the output is not a tty.
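A minimal sketch of the tty-dependent formatting described above. format_listing() is a hypothetical helper written for this illustration, not the code shared by 'bup ls' and 'bup ftp ls'; the caller would pass in something like the istty1 value.

```python
def format_listing(names, to_tty, width=80):
    """Columnate names for a tty, else emit one name per line."""
    if not to_tty:
        return ''.join(name + '\n' for name in names)
    if not names:
        return ''
    colwidth = max(len(n) for n in names) + 2
    ncols = max(1, width // colwidth)
    lines = []
    # Fill rows left to right; trailing padding is stripped per row.
    for i in range(0, len(names), ncols):
        row = names[i:i + ncols]
        lines.append(''.join(n.ljust(colwidth) for n in row).rstrip())
    return '\n'.join(lines) + '\n'
```

Keeping the decision in a single boolean parameter makes both behaviours easy to exercise from a test without a real terminal.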
Gabriel Filion [Mon, 16 May 2011 04:27:19 +0000 (00:27 -0400)]
ftp: don't output trailing line for 'ls'
'ls' is currently the only 'ftp' subcommand that outputs a trailing
newline before the prompt is re-displayed. This is caused by the use of
"print" to output a string that already contains an ending newline.
For consistency of output, make 'ls' output without that
extra trailing newline.
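The effect can be reproduced in a few lines. This sketch uses Python 3's print() function rather than the "print" statement the code used at the time, and the listing text is made up.

```python
import io

text = 'file1\nfile2\n'  # already ends with a newline

# print() appends its own newline, producing a blank trailing line.
buf_print = io.StringIO()
print(text, file=buf_print)

# write() emits the string exactly as given.
buf_write = io.StringIO()
buf_write.write(text)
```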
Gabriel Filion [Sat, 14 May 2011 23:07:56 +0000 (19:07 -0400)]
ftp: output a newline on EOF when on a tty
Using the 'quit' command with ftp while in interactive mode -- attached
to a tty -- ends up clearing the line for the shell to use a fresh one
for the next prompt.
Using Ctrl-D to send an EOF to the application's input while in
interactive mode currently does not clear the line in the same way.
Let's force a newline when an EOF is received from a tty so that the
program exits in a more aesthetic way.
Avery Pennarun [Sun, 15 May 2011 21:06:51 +0000 (17:06 -0400)]
Merge branch 'master' into config
* master:
cmd/{split,save}: support any compression level using the new -# feature.
options.py: add support for '-#' style compression options.
Add documentation for compression levels
Add test case for compression level
Add compression level options to bup save and bup split
Make zlib compression level a parameter for Client
Make zlib compression level a parameter of git.PackWriter
Use is_superuser() rather than checking euid directly
Add is_superuser() helper function
Makefile: add a PREFIX variable for locations other than /usr.
Avery Pennarun [Sun, 8 May 2011 19:09:04 +0000 (19:09 +0000)]
Earlier "negative timestamp" patch had a 64-bit timestamp in the test.
The date in the comment is correct - for -0x80000000. Sadly, the *code*
actually said -0x90000000. That works on 64-bit systems (and filesystems)
but not 32-bit ones, where python gives an encoding error.
In any case, based on the comment (June 10, 1893) it seems that -0x80000000
must have been the intended value anyway. Now 'make test' passes on 32-bit
Linux again.
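To illustrate the range limit involved (not necessarily the exact code path that failed in the test), a small check using the struct module. fits_signed_32() is a hypothetical helper.

```python
import struct


def fits_signed_32(value):
    """Return True if value can be packed as a signed 32-bit integer."""
    try:
        struct.pack('!i', value)
        return True
    except struct.error:
        return False
```

-0x80000000 is the smallest value that fits; -0x90000000 is below it and is rejected.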
Rob Browning [Sun, 27 Mar 2011 17:01:45 +0000 (12:01 -0500)]
Replace os.*stat() with xstat.*stat(); use integer ns for all fs times.
Replace all calls of the os.*stat() functions with calls to the xstat
equivalents. This should leave bup with the xstat stat representation
(and integer ns timestamps) everywhere.
Remove FSTime, and add a few xstat conversion functions to replace the
bits we still want: timespec_to_nsecs(), nsecs_to_timespec(),
fstime_floor_secs(), fstime_to_timespec().
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
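A sketch of what conversions like these could look like. The function names are taken from the message above, but the bodies are assumptions and may differ from the real xstat code, particularly for negative values.

```python
def timespec_to_nsecs(ts):
    """Convert a (secs, nsecs) pair to integer nanoseconds."""
    secs, nsecs = ts
    return secs * 10**9 + nsecs


def nsecs_to_timespec(ns):
    """Convert integer nanoseconds to a (secs, nsecs) pair."""
    # divmod() floors, so the nanosecond part is always in [0, 999999999],
    # even for times before the epoch (an assumption of this sketch).
    secs, nsecs = divmod(ns, 10**9)
    return (secs, nsecs)


def fstime_floor_secs(ns):
    """Return the whole seconds of an integer ns time, rounding down."""
    return ns // 10**9
```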
Rob Browning [Sun, 27 Mar 2011 17:01:44 +0000 (12:01 -0500)]
Drop xstat floating point timestamp support -- use integer ns.
Drop conditional support for floating point timestamps (the xstat
os.stat() fallback). Switch to integer nanosecond timestamps
everywhere except the metadata record encoding and _helpers.c.
The metadata encoding is still a timespec because separate s and ns
timespec vints compress much better, and timespecs are still returned
by _helpers because the conversion to integer nanoseconds is much more
convenient in Python.
Enforce timespec range expectations -- throw an exception if the
system returns a nanosecond value less than 0 or greater than 999999999.
Remove _have_ns_fs_timestamps.
Depend on bup_stat(), bup_fstat(), and bup_lstat() unconditionally, and
change the timespec return conversion from "(ll)" to "(Ll)", i.e. long
long range for secs.
This commit may break the build on some platforms -- we'll have to add
suitable conditionals once we see what's needed.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
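The range expectation can be stated in a few lines of Python. This is only an illustration: check_timespec() is a hypothetical name, and the exception type raised by the real code is not stated above.

```python
def check_timespec(secs, nsecs):
    """Return (secs, nsecs), or raise if nsecs is out of range."""
    if not 0 <= nsecs <= 999999999:
        raise ValueError('nanoseconds out of range: %r' % (nsecs,))
    return (secs, nsecs)
```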
Rob Browning [Sun, 27 Mar 2011 17:01:43 +0000 (12:01 -0500)]
xstat-cmd.py: test for _have_utimensat rather than _have_ns_fs_timestamps.
Test for _have_utimensat rather than _have_ns_fs_timestamps to decide
whether or not to print the atime and mtime, since the existence of
utimensat() is the real indicator, and since _have_ns_fs_timestamps is
going away.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Avery Pennarun [Sun, 8 May 2011 07:13:48 +0000 (03:13 -0400)]
Merge branch 'master' of git://github.com/thatch/bup
* 'master' of git://github.com/thatch/bup:
Missing space in optspec
Fix a bug where marginally old files couldn't be stored in the index
Show better errors with out-of-range Entry values
Gabriel Filion [Mon, 2 May 2011 23:12:34 +0000 (19:12 -0400)]
Doc: add some precisions for --remote and dumb mode
The -r/--remote argument to some of bup's commands currently doesn't
give enough information about how to customize options to SSH. Let's add
information about this so that users know how to customize options for
SSH connections.
Also, in bup-server's documentation, point out which mode is the default
one for more clarity.
Tim Hatch [Sat, 16 Apr 2011 00:18:51 +0000 (17:18 -0700)]
Fix a bug where marginally old files couldn't be stored in the index
Due to the struct having unsigned timestamps, files with dates between Dec 13,
1901 and Jan 1, 1970 were not representable. This change extends the struct to
be able to pack signed timestamps, which was the spirit of code in _fixup, and
extends the useful range back to 1901. Timestamps prior to 1901 are still
adjusted to zero, as they were before.
There should be no compatibility problems loading packed structures created
before this change, since positive values were truncated at 0x7fffffff.
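A small sketch of why the change is compatible. The format strings and fixup_time() are stand-ins for illustration; they are not the index's real struct or the actual _fixup code.

```python
import struct


def pack_unsigned(ts):
    """Old behaviour (assumed): unsigned 32-bit timestamp."""
    return struct.pack('!I', ts)


def pack_signed(ts):
    """New behaviour (assumed): signed 32-bit timestamp."""
    return struct.pack('!i', ts)


def fixup_time(ts):
    """Clamp a timestamp into the signed 32-bit range."""
    if ts < -0x80000000:
        return 0           # before 1901: still adjusted to zero
    if ts > 0x7fffffff:
        return 0x7fffffff  # positive values are truncated here
    return ts
```

Values in 0..0x7fffffff have the same byte encoding either way, which is why structures packed before the change still load correctly.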
Rob Browning [Mon, 21 Mar 2011 01:35:53 +0000 (20:35 -0500)]
tgit.py: provoke ENOTDIR rather than EACCES in test_check_repo_or_die().
Replace the objects/pack directory with an empty file to provoke an
ENOTDIR error from stat('objects/pack/.').
Previously the code changed the permissions of the test directory to
0000 in order to provoke an error other than ENOENT (i.e. EACCES), but
that doesn't work when the tests are run as root or fakeroot.
(As Gabriel Filion pointed out, the chmod of the testdir is no
longer necessary, so I removed it and squashed that into this patch.
-- apenwarr)
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
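The technique can be sketched outside the test suite as follows. provoke_enotdir() is a hypothetical helper, and the temporary directory merely stands in for a repository.

```python
import errno
import os
import tempfile


def provoke_enotdir():
    """Return the errno from stat('objects/pack/.') when pack is a file."""
    with tempfile.TemporaryDirectory() as tmp:
        objects = os.path.join(tmp, 'objects')
        os.mkdir(objects)
        # 'pack' is an empty regular file where a directory is expected.
        open(os.path.join(objects, 'pack'), 'w').close()
        try:
            os.stat(os.path.join(objects, 'pack', '.'))
        except OSError as e:
            return e.errno
    return None
```

Unlike a permission-based failure, this does not depend on the uid of the caller, so it behaves the same under root or fakeroot.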
Avery Pennarun [Sun, 20 Mar 2011 10:53:42 +0000 (03:53 -0700)]
cmd/xstat: don't report mtime/atime for symlinks if we don't have_ns_timestamps.
We can't set the atime/mtime on a symlink anyway if we don't
have_ns_timestamps, which means the values are meaningless. Report them as
0 in order to avoid triggering a unit test failure.
Avery Pennarun [Sun, 20 Mar 2011 10:46:32 +0000 (03:46 -0700)]
test-meta.sh: remove a bashism, and don't delete dirs on exit.
It's really annoying to have it wiping out directories that you want to
examine after a failed test. And "set -o pipefail" is not available in the
version of bash on MacOS 10.4.
Avery Pennarun [Sun, 20 Mar 2011 10:45:32 +0000 (03:45 -0700)]
metadata.py: be careful with the umask() when restoring symlinks.
On MacOS, the umask affects symlink permissions, although not in any sort of
useful way that I can see. Still, getting the permissions wrong breaks the
unit tests, so let's be careful about it.
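One way to be careful about it is to set the umask only around the symlink() call and always restore it afterwards. A minimal sketch; make_symlink() and its umask parameter are hypothetical, and the value the real code uses is not stated above.

```python
import os
import tempfile


def make_symlink(target, path, umask=0):
    """Create a symlink with a known umask, then restore the old one."""
    old = os.umask(umask)
    try:
        os.symlink(target, path)
    finally:
        os.umask(old)


def demo():
    """Create a link in a scratch directory and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        link = os.path.join(tmp, 'link')
        make_symlink('target', link, 0o077)
        return os.readlink(link)
```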
Avery Pennarun [Sun, 20 Mar 2011 09:19:08 +0000 (02:19 -0700)]
tmetadata: the "non existent group name" test didn't make any sense.
There's certainly no reason to expect the file's uid/gid would have changed
after a call that's supposed to fail. It was passing by pure luck on Linux,
which doesn't have a sticky gid bit causing the newly created file to have
a gid != os.getgid(). But on MacOS, the file was originally created with a
gid != os.getgid(), and so restoring its numeric id restored that, and then
the test failed.
The test is still kind of pointless; it doesn't actually test anything
useful, like (for example) automatic fallback to restoring by numeric gid if
the groupname can't be found. In fact, looking at the code, it doesn't seem
like it *would* fall back, which is a bug. But I'm not going to fix that
right now.
Avery Pennarun [Sun, 20 Mar 2011 09:07:03 +0000 (02:07 -0700)]
A bunch of IOError->OSError conversions.
Some of our replacement functions were throwing IOError when the function
they replaced would throw OSError. This was particularly noticeable with
utime() on MacOS, since it caused a unit test failure.
Avery Pennarun [Sun, 20 Mar 2011 08:38:07 +0000 (01:38 -0700)]
metadata: don't die if Linux attr (not xattr) support is missing.
We don't need an import warning for this one, since linuxattr support is
always available on linux, and never available elsewhere, since it's in
_helpers.c and there are no special python modules to install.
Avery Pennarun [Sun, 20 Mar 2011 08:25:13 +0000 (01:25 -0700)]
metadata: recover politely if xattr/acl support is missing.
...previously we'd just crash, which is definitely not polite.
metadata.py now prints warning on import if these features are missing.
That's probably overly obnoxious, especially on systems that don't support
those types of metadata at all. Is there a way to determine whether a
kernel *should* support that type of metadata, so we can warn only if so?
(Obviously if the kernel doesn't support, say, xattrs, there's no point
warning that bup doesn't support them, because no files will be using them
anyway. Hmm...)
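A sketch of the recover-politely pattern. optional_import() is a hypothetical helper, and the warning wording is made up for this illustration.

```python
import importlib
import sys


def optional_import(name):
    """Import a module by name; warn and return None if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError:
        sys.stderr.write('Warning: %s support is missing.\n' % name)
        return None


# Callers check for None before touching the feature instead of crashing.
xattr = optional_import('xattr')
have_xattr = xattr is not None
```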
Rob Browning [Mon, 14 Mar 2011 01:37:56 +0000 (20:37 -0500)]
Don't accidentally pass atime/ctime/mtime through from_stat_time() twice.
Don't accidentally pass atime/ctime/mtime through
FSTime.from_stat_time() twice in the xstat stat_result.from_stat_rep()
static method when _have_ns_fs_timestamps is false.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Gabriel Filion [Fri, 11 Mar 2011 01:01:30 +0000 (20:01 -0500)]
Makefile: Fix 'clean' rule
In commit 1df0bdd1ad937, I introduced a problem in the make file: the
chmod operation that gives back some permissions on
lib/bup/t/pybuptest.tmp dies if this directory does not exist.
pybuptest.tmp is only created when running the tests.
When the chmod dies, the clean rule stops, thus not completing the
cleanup, so we must make sure this operation is not fatal if the
directory doesn't exist.
Dickon Reed [Fri, 18 Mar 2011 19:25:38 +0000 (12:25 -0700)]
Allow chown to uid:0 to succeed.
The test case assumed that it was not possible to set uid:0 on a file,
which is to say that the current user is not a member of group
0. That's an environmental assumption which is not universal (I am a
counterexample).
Gabriel Filion [Thu, 10 Mar 2011 20:41:54 +0000 (12:41 -0800)]
Verify permissions in check_repo_or_die()
Currently, if one doesn't have read or access permission up to
repo('objects/pack'), bup exits with the following error:
error: repo() is not a bup/git repository
(with repo() replaced with the actual path).
This is misleading, since there is possibly really a repository there
but the user can't access it.
Make git.check_repo_or_die() verify that the current user has the
permission to access repo('objects/pack'), and if not, output a
meaningful error message.
As a bonus, we get an error if the bup_dir path is not a directory.
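A rough sketch of the distinction being drawn, assuming the check is a stat() of objects/pack. check_repo() is a hypothetical helper, and the messages and exit status are illustrative, not the ones git.check_repo_or_die() prints.

```python
import errno
import os
import sys


def check_repo(repo_dir):
    """Return None if the repo looks usable, else an error message."""
    path = os.path.join(repo_dir, 'objects', 'pack', '.')
    try:
        os.stat(path)
    except OSError as e:
        if e.errno == errno.ENOENT:
            return 'error: %r is not a bup/git repository' % repo_dir
        # EACCES, ENOTDIR, ...: report what actually went wrong.
        return 'error: %s' % e
    return None


def check_repo_or_die(repo_dir):
    """Exit with a message if check_repo() reports a problem."""
    msg = check_repo(repo_dir)
    if msg:
        sys.stderr.write(msg + '\n')
        sys.exit(1)  # exit status chosen arbitrarily for this sketch
```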
Gabriel Filion [Mon, 21 Feb 2011 16:14:38 +0000 (11:14 -0500)]
Verify permissions in check_repo_or_die()
Currently, if one doesn't have read or access permission up to
repo('objects/pack'), bup exits with the following error:
error: repo() is not a bup/git repository
(with repo() replaced with the actual path).
This is misleading, since there is possibly really a repository there
but the user can't access it.
Make git.check_repo_or_die() verify that the current user has the
permission to access repo('objects/pack'), and if not, output a
meaningful error message.
As a bonus, we get an error if the bup_dir path is not a directory.
Avery Pennarun [Mon, 28 Feb 2011 09:47:15 +0000 (01:47 -0800)]
Merge branch 'rlb/meta'
* rlb/meta:
t/test-meta.sh: replace 'diff -u5' with 'diff -U5'.
Don't touch correct target xattrs; remove inappropriate target xattrs.
Rename test-fs.img to testfs.img and add it to the clean target.
t/test-meta.sh: detect and handle fakeroot.
Add atime tests and fix atime capture in metadata.from_path().
Improve test-meta.sh status messages.
Handle missing files more gracefully in "bup xstat".
Add initial (trivial) root-only ACL metadata tests and fix exposed bugs.
Add initial (trivial) root-only metadata tests for attr and xattr.
Don't specify 0700 permissions when creating initial directories.
Fix "meta extract -v" directory output.
Fix _apply_common_rec() symlink chown/chmod guard.
Change os.geteuid to os.geteuid() in tmetadata.py.
Remove redundant call to get_linux_file_attr() in _add_linux_attr().
In _add_linux_attr(), catch IOError, not EnvironmentError; handle ENOTTY.
Improve some metadata error messages.
Only print secs for bup xstat times when ns == 0.
Use oct() rather than hex() when printing mode from bup xstat.
Remove bup: prefix from metadata error messages.
Don't "chmod 000" paths during restore.
Remove MetadataError and make apply error handling finer-grained.
Remove MetadataAcquireError and make error handling finer-grained
Accommodate missing owner or group name during metadata save/restore.
Preserve existing stack trace when throwing MetadataErrors.
Add (private for now) "bup xstat" command and use it in the metadata tests.
Also check defined(_ATFILE_SOURCE) in utimensat() guard.
Rename bup-meta.1.md to bup-meta.md.
Simplify FSTime() - always use an integer ns internal representation.
Rename metadata exceptions and add a parent MetadataError class.
Don't use str(e) when instantiating Metadata exceptions.
Fix typos in Metadata._encode_linux_xattr().
Fix handling of conditional definitions in xstat.
Always define _have_ns_fs_timestamps (True or False).
Change "bup meta" to use recursive_dirlist() to add support for --xdev.
Fix minor bug in "bup meta -t" argument handling (if -> elif).
Modify drecurse.py and index.py to use xstat functions.
Move stat-related work to bup.xstat; add xstat.stat.
Add helpers.fstat and _helpers._have_ns_fs_timestamps.
Add a helpers.FSTime class to handle filesystem timestamps and use it.
Attempt to unlink or rmdir existing paths as appropriate during restore.
Conditionalize build/use of get_linux_file_attr and set_linux_file_attr.
Check stat() after attempted restore of nonexistent owner/group in tests.
Don't try to restore owner unless root; handle nonexistent owner/group.
Add metadata test_restore_restricted_user_group().
Add helpers.detect_fakeroot() and use it in relevant metadata tests.
Defer metadata acquisition and application errors during create/extract.
Rename py_* functions to bup_* in lib/bup/_helpers.c.
Don't allow negative ns in metadata timestamps; normalize on read/write.
Add (sec, ns) timestamps and extended stat, lstat, utime, and lutime.
Add vint tests and signed vint support via write_vint and read_vint.
Change user to the more accurate owner in metadata.py.
Correctly respect restore_numeric_ids in Metadata _apply_common_rec().
Send bup meta --list output to stdout, not stderr.
Fix bup-meta.1 start-extract/finish-extract example.
Use Py_RETURN_TRUE in py_lutimes() and py_set_linux_file_attr().
t/test.sh: fix whitespace problems with the 'Inode:' line from 'stat'.
t/test.sh: fix occasional atime-related failure in metadata tests.
t/test.sh: refactoring to reduce duplicated code.
Add initial support for metadata archives.
Avery Pennarun [Mon, 28 Feb 2011 09:33:09 +0000 (01:33 -0800)]
Merge branch 'master' into meta
* master:
midx/bloom: use progress() and debug1() for non-critical messages
helpers: separately determine if stdout and stderr are ttys.
cmd/newliner: restrict progress lines to the screen width.
hashsplit: use shorter offset-filenames inside trees.
Replace 040000 and 0100644 constants with GIT_MODE_{TREE,FILE}
git.py: rename treeparse to tree_decode() and add tree_encode().
hashsplit.py: remove PackWriter-specific knowledge.
cmd/split: fixup progress message, and print -b output incrementally.
hashsplit.py: convert from 'bits' to 'level' earlier in the sequence.
hashsplit.py: okay, *really* fix BLOB_MAX.
hashsplit.py: simplify code and fix BLOB_MAX handling.
options.py: o.fatal(): print error after, not before, usage message.
options.py: make --usage just print the usage message.
When applying xattr metadata, make sure to remove any target xattrs
that aren't in the metadata record, but don't touch target xattrs that
already match the metadata record. Add corresponding tests.
Throw an ApplyError() if xattr set() or remove() return EPERM from
within _apply_linux_xattr_rec().
Remove lib/bup/t/testfs and lib/bup/t/testfs.img in the clean target.
* commit '6f02181':
helpers: separately determine if stdout and stderr are ttys.
cmd/newliner: restrict progress lines to the screen width.
hashsplit: use shorter offset-filenames inside trees.
Replace 040000 and 0100644 constants with GIT_MODE_{TREE,FILE}
git.py: rename treeparse to tree_decode() and add tree_encode().
hashsplit.py: remove PackWriter-specific knowledge.
cmd/split: fixup progress message, and print -b output incrementally.
hashsplit.py: convert from 'bits' to 'level' earlier in the sequence.
hashsplit.py: okay, *really* fix BLOB_MAX.
hashsplit.py: simplify code and fix BLOB_MAX handling.
options.py: o.fatal(): print error after, not before, usage message.
options.py: make --usage just print the usage message.
Gabriel Filion [Fri, 25 Feb 2011 16:16:05 +0000 (11:16 -0500)]
midx/bloom: use progress() and debug1() for non-critical messages
Some messages in these two commands indicate progress but are not
filtered out when the command is not run under a tty. This makes bup
return some unwanted messages when run under cron.
Using progress() and debug1() instead should fix that.
(Changed a few from progress() to debug1() by apenwarr.)
Signed-off-by: Gabriel Filion <lelutin@gmail.com>
Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
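The filtering idea, sketched with hypothetical stand-ins. make_loggers() does not exist in bup; the real progress() and debug1() live in the helpers module and decide for themselves when to print.

```python
import io


def make_loggers(is_tty, debug_level, out):
    """Return (progress, debug1) functions that write to out."""
    def progress(msg):
        # Progress output is only useful to a person watching a terminal.
        if is_tty:
            out.write(msg)

    def debug1(msg):
        # Debug output appears only when explicitly asked for.
        if debug_level >= 1:
            out.write(msg)

    return progress, debug1


# Under cron: no tty and no debugging, so nothing is printed.
cron_out = io.StringIO()
progress, debug1 = make_loggers(False, 0, cron_out)
progress('Reading indexes...\r')
debug1('checked 10 packs\n')
```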
Avery Pennarun [Sun, 20 Feb 2011 05:21:45 +0000 (21:21 -0800)]
helpers: separately determine if stdout and stderr are ttys.
Previously we only cared if stderr was a tty (since we use that to determine
if we should print progress() or not). But we might want to check stdout as
well, for the same reason that gzip does: we should be refusing to write
binary data to a terminal.
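A sketch of checking the two streams separately. The name istty1 appears in lib/bup/helpers; istty2 is assumed here by analogy, and refuse_binary_on_tty() is a hypothetical example of the gzip-style check.

```python
import os

# File descriptors 1 and 2 are stdout and stderr respectively.
istty1 = os.isatty(1)
istty2 = os.isatty(2)


def refuse_binary_on_tty(is_tty):
    """Raise instead of dumping binary data onto a terminal."""
    if is_tty:
        raise RuntimeError('refusing to write binary data to a terminal')
```

A command that emits pack data on stdout would call refuse_binary_on_tty(istty1), while progress output would keep looking at istty2.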
Avery Pennarun [Sun, 20 Feb 2011 02:48:06 +0000 (18:48 -0800)]
hashsplit: use shorter offset-filenames inside trees.
We previously zero-padded all the filenames (which are hexified versions of
the file offsets) to 16 characters, which corresponds to a maximum file size
that fits into a 64-bit integer. I realized that there's no reason to
use a fixed padding length; just pad all the entries in a particular tree to
the length of the longest entry (to ensure that sorting
alphabetically is still equivalent to sorting numerically).
This saves a small amount of space in each tree, which is probably
irrelevant given that gzip compression can quite easily compress extra
zeroes. But it also makes browsing the tree in git look a little prettier.
This is backwards compatible with old versions of vfs.py, since vfs.py has
always just treated the numbers as an ordered set of numbers, and doesn't
care how much zero padding they have.
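The padding rule is small enough to sketch. offset_names() is a hypothetical helper, not the hashsplit.py code.

```python
def offset_names(offsets):
    """Hexify offsets, zero-padded to the longest entry in this tree."""
    if not offsets:
        return []
    width = len('%x' % max(offsets))
    return ['%0*x' % (width, ofs) for ofs in offsets]


def unpadded_names(offsets):
    """The same names with no padding, for comparison."""
    return ['%x' % ofs for ofs in offsets]
```

With equal-length names, alphabetical order matches numeric order; without padding, '10' would sort before '9'.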
Avery Pennarun [Sun, 20 Feb 2011 02:02:12 +0000 (18:02 -0800)]
Replace 040000 and 0100644 constants with GIT_MODE_{TREE,FILE}
Those constants were scattered in *way* too many places. While we're there,
fix the inconsistent usage of strings vs. ints when specifying the file
mode; there's no good reason to be passing strings around (except that I
foolishly did that in the original code in version 0.01).
Avery Pennarun [Sun, 20 Feb 2011 01:57:48 +0000 (17:57 -0800)]
git.py: rename treeparse to tree_decode() and add tree_encode().
tree_encode() gets most of its functionality from PackWriter.new_tree(),
which is now just a one liner that calls tree_encode(). We will soon want
to be able to calculate tree hashes without actually writing a tree to a
packfile, so let's split out that functionality.
Let's use callback functions explicitly instead of passing around special
objects; that makes the dependencies a bit more clear and hopefully opens
the way to some more refactoring for clarity.
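To show what calculating a tree hash without writing to a packfile can look like, a minimal sketch of git's tree encoding. The signatures below are assumptions and do not reproduce git.py (in particular, the callback-based interface mentioned above is not shown); tree_hash() is a hypothetical helper.

```python
import hashlib

GIT_MODE_TREE = 0o40000
GIT_MODE_FILE = 0o100644


def tree_encode(entries):
    """Encode (mode, name_bytes, sha_20_bytes) tuples as a git tree."""
    def sort_key(entry):
        mode, name, _ = entry
        # git orders tree entries as if directory names ended in '/'.
        return name + b'/' if mode == GIT_MODE_TREE else name

    out = []
    for mode, name, sha in sorted(entries, key=sort_key):
        assert len(sha) == 20
        assert b'/' not in name
        out.append(b'%o %s\0' % (mode, name) + sha)
    return b''.join(out)


def tree_hash(entries):
    """Return the hex object id of the tree, without storing it."""
    content = tree_encode(entries)
    header = b'tree %d\0' % len(content)
    return hashlib.sha1(header + content).hexdigest()
```

The empty tree is a convenient sanity check, since its id is well known.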