Rob Browning [Sun, 6 Jul 2014 18:26:45 +0000 (13:26 -0500)]
test-fuse.sh: set TZ=UTC for ls dates
Thanks to Mark J Hewitt <mjh@idnet.com> and Scott Sugar
<scottsugar@outlook.com> for reporting the problem, and Patrick
Rouleau <prouleau72@gmail.com> for helping test the solution.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
Patrick Rouleau [Thu, 3 Jul 2014 00:58:39 +0000 (20:58 -0400)]
Avoid uid/gid 0 metadata tests when ids don't exist
Cygwin may not have a 0 uid/gid, so skip the relevant tests whenever
it doesn't.
We have to explicitly exclude the gid 0 from other_group, because
/etc/group may define a group named root with the same id as
Administrators. We cannot use "root" instead of 0, because root may
not be defined.
Cygwin users: If you want to define the root group, add this line
at the beginning of /etc/group:
root:S-1-5-32-544:0:
Signed-off-by: Patrick Rouleau <prouleau72@gmail.com>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
Patrick Rouleau [Thu, 3 Jul 2014 00:58:38 +0000 (20:58 -0400)]
test-meta.sh: use user's uid/gid, not 0
Cygwin does not fully support uid and gid of 0. Instead, use the
current user's ids to avoid portability problems. Since this test is
not run for ordinary users, the result is the same.
Signed-off-by: Patrick Rouleau <prouleau72@gmail.com>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Sat, 28 Jun 2014 18:59:19 +0000 (13:59 -0500)]
save-cmd.py: don't test for meta index via access()
Instead just try to open the metadata index and catch any EACCES.
There may be an issue with access on Cygwin
(http://bugs.python.org/issue2528), and even if not, this approach
removes a potential race between access() and open().
Thanks to Mark J Hewitt <mjh@idnet.com> for reporting the problem and
helping track down the cause.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
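The open-and-catch approach described in the save-cmd.py entry above could look roughly like the following. This is a minimal sketch, not the code from the commit; open_meta_index is a hypothetical name, and returning None on EACCES is an assumption made for illustration.

```python
import errno


def open_meta_index(path):
    """Open path for reading, or return None if permission is denied.

    Hypothetical helper: it attempts the open directly rather than
    calling os.access() first, so there is no gap between the check
    and the use.
    """
    try:
        return open(path, 'rb')
    except IOError as e:
        if e.errno == errno.EACCES:
            return None
        # Any other failure (missing file, etc.) is still an error.
        raise
```

Because the permission check and the open are the same system call, nothing can change the file's permissions between them.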
Greg Troxel [Tue, 3 Jun 2014 14:59:56 +0000 (10:59 -0400)]
Avoid using incomplete utimensat implementations.
At least NetBSD 6 has partial support for utimensat, implementing the
function but not providing needed definitions via sys/stat.h as POSIX
requires. Check for needed defines, and if missing, undefine
HAVE_UTIMENSAT.
Rob Browning [Fri, 30 May 2014 02:54:49 +0000 (21:54 -0500)]
fuse-cmd.py: given --meta, report original metadata
Add a new --meta option to "bup fuse", and when specified, report the
original mode, uid, gid, atime, mtime, and ctime for the mounted
paths.
Since negative timestamps cause access errors, and cause "ls -l" to
report question marks for the stat values, set a floor of 0 for all
timestamps reported to python-fuse.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
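The timestamp floor mentioned in the fuse-cmd.py entry above can be illustrated as follows. This is only a sketch; clamp_times and the dictionary layout are hypothetical and are not taken from fuse-cmd.py.

```python
def clamp_times(times):
    """Return a copy of times with every negative value raised to 0.

    times is assumed to be a dict such as
    {'atime': ..., 'mtime': ..., 'ctime': ...} (hypothetical layout).
    """
    return dict((name, max(0, value)) for name, value in times.items())
```

For example, clamp_times({'atime': -5, 'mtime': 10, 'ctime': 0}) leaves mtime and ctime alone and raises atime to 0.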
Patrick Rouleau [Thu, 22 May 2014 03:36:27 +0000 (23:36 -0400)]
midx: close the mmap before erasing an midx file
Linux allows erasing a file even if it is still memory mapped, but
Windows does not. However, Cygwin tries to emulate Linux and, under
certain conditions, it silently moves the file to $RECYCLE.BIN/, but
not in a way where Windows takes care of it when we use the "empty the
recycle bin" command. The erased midx ends up consuming more and more
disk space.
To solve this, we have to close the midx's mmap before erasing
the midx.
Cygwin users: you can use this command to clean up your
$RECYCLE.BIN/ directories:
find /cygdrive/?/\$RECYCLE.BIN/ -name "\.???[[:xdigit:]]*" -print -delete
Signed-off-by: Patrick Rouleau <prouleau72@gmail.com>
[rlb@defaultvalue.org: adjust commit message]
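The ordering described in the midx entry above (close the mapping first, then erase the file) can be sketched like this. The function name and the 4-byte header read are assumptions for illustration, not bup's midx code.

```python
import mmap
import os


def read_header_and_remove(path):
    """Map path, read its first 4 bytes, unmap it, then delete it.

    Hypothetical helper. The point is the order: the mmap is closed
    before os.unlink() runs, so no mapping is still open on the file
    when it is erased.
    """
    with open(path, 'r+b') as f:
        m = mmap.mmap(f.fileno(), 0)
        try:
            header = m[:4]
        finally:
            m.close()  # release the mapping before erasing the file
    os.unlink(path)
    return header
```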
Rob Browning [Sun, 18 May 2014 05:51:34 +0000 (00:51 -0500)]
Use CatPipe, not show, in git_commit_dates()
Import the commit parser from the experimental bup-get branch and use
it along with CatPipe to produce the git commit dates in
git_commit_dates().
This appears to resolve the performance problem with real archives
that was introduced by the use of "git show -s --pretty=format:%ct
..." (cf. 00ba9fb811e71bb6182b9379461bc6b493e3b7a4), and is still much
faster than 0.25 for at least a 1000 branch synthetic repository here.
Thanks to Gabriel Filion <gabster@lelutin.ca> for reporting the
problem, and to him and Patrick Rouleau <prouleau72@gmail.com> for
helping test the solution.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Wed, 21 May 2014 02:40:51 +0000 (21:40 -0500)]
Move VFS cp() to git.py; handle repodir changes
Move the VFS's cp() to git.py and reset the cp() CatPipe whenever the
global repodir changes.
Previously, if two different lib/bup/t/ tests (for example) needed to
use two different repositories, and both happened to indirectly call
cp(), the second test would end up with a CatPipe() connected to the
wrong repository.
In the longer run, we may want to consider further cleanup here, but
this should fix the immediate problem without too much risk.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
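The stale-connection problem described in the cp() entry above can be pictured with a cache keyed on the repository directory. This is a sketch under stated assumptions: FakeCatPipe stands in for the real CatPipe, and passing repo_dir as an argument (rather than reading a global) is a simplification.

```python
class FakeCatPipe(object):
    """Stand-in for a connection bound to one repository (hypothetical)."""

    def __init__(self, repo_dir):
        self.repo_dir = repo_dir


_cp = (None, None)  # (repo_dir the connection was made for, connection)


def cp(repo_dir):
    """Return a cached connection, rebuilding it if repo_dir changed."""
    global _cp
    cached_dir, conn = _cp
    if conn is None or cached_dir != repo_dir:
        conn = FakeCatPipe(repo_dir)
        _cp = (repo_dir, conn)
    return conn
```

With this shape, two callers using different repositories never share a connection, while repeated calls for the same repository reuse one.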
Rob Browning [Sat, 17 May 2014 20:34:32 +0000 (15:34 -0500)]
Fix git_commit_dates() to handle (obvious) duplicates
Since git show will suppress duplicate results, change
git_commit_dates() to detect that situation and repair the damage.
Currently, though, it can only handle cases where the duplicates are
obvious, i.e. have the same exact string in the argument list. It
won't be able to handle cases where two arguments differ, but resolve
to the same underlying hash. For our current purposes, that should be
fine since we only pass in hashes.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
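One way to picture the duplicate repair described in the entry above is to map the deduplicated results back onto the original argument list. This is a sketch, not the git_commit_dates() code: expand_dates is a hypothetical name, and it assumes one date is reported per distinct argument, in order of first appearance.

```python
def expand_dates(refs, unique_dates):
    """Return one date per entry in refs, repeating dates for repeats.

    refs: argument strings, possibly containing exact duplicates.
    unique_dates: one date per distinct string, in first-seen order
    (an assumption about how the duplicates were suppressed).
    """
    dates = iter(unique_dates)
    seen = {}
    result = []
    for ref in refs:
        if ref not in seen:
            seen[ref] = next(dates)
        result.append(seen[ref])
    return result
```

As in the commit message, this only recognizes duplicates that are the same exact string; two different names for the same hash would not be matched.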
Rob Browning [Wed, 14 May 2014 15:24:00 +0000 (10:24 -0500)]
Re-allow backup set names containing "/"
Revert the prohibition because all releases up to now have allowed
"/", and so that's nothing new. Disabling "/" in 0.26 would be a
regression, and one we're not ready to commit to -- in fact, we may
eventually do the opposite, and add comprehensive support for "/".
Gabriel Filion [Sat, 10 May 2014 22:45:37 +0000 (18:45 -0400)]
Move unshared version code from helpers to version-cmd.py
While this is relatively "harmless", the import * directive brings in a
whole lot of unused code into the version subcommand.
At the same time, the three version_* functions are only used in the
version subcommand, so their place inside helpers.py is unjustified. The
same goes for the _version module.
Signed-off-by: Gabriel Filion <gabster@lelutin.ca>
[rlb@defaultvalue.org: adjust commit message]
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Simon Persson [Sun, 11 May 2014 09:06:34 +0000 (17:06 +0800)]
Always line-buffer bup restore stdout
Flush stdout after every line, even when stdio is not a tty, to
provide more regular progress information. Useful for progress
monitoring by a parent process or when watching a logfile with "tail".
Signed-off-by: Simon Persson <simonpersson1@gmail.com>
[rlb@defaultvalue.org: adjust commit message]
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
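The per-line flushing described in the bup restore entry above amounts to something like the following. This is a sketch; emit_line is a hypothetical helper rather than the code in the commit.

```python
import sys


def emit_line(line, out=None):
    """Write line plus a newline and flush immediately.

    Flushing after every line means a parent process or "tail" sees
    each line promptly even when out is a pipe or a file.
    """
    if out is None:
        out = sys.stdout
    out.write(line + '\n')
    out.flush()
```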
Rob Browning [Sat, 10 May 2014 20:51:38 +0000 (15:51 -0500)]
Read ARG_MAX directly via os.sysconf('SC_ARG_MAX').
For now, this code also adds an optional arg_max parameter to
batchpipe(), exclusively for testing.
There appears to be an issue with wvtest at the moment that causes
assignments to something like helpers.arg_max to have no effect from
the importer's perspective.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
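Reading the limit as the entry above describes is a single call to os.sysconf. The wrapper below is a sketch: get_arg_max and its fallback value of 4096 are assumptions for illustration, not values taken from bup.

```python
import os


def get_arg_max(default=4096):
    """Return the system's ARG_MAX, or default if it is unavailable.

    The default is a hypothetical conservative fallback.
    """
    try:
        value = os.sysconf('SC_ARG_MAX')
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing, or the name is unknown on this platform.
        return default
    # sysconf reports -1 when the limit is indeterminate.
    return value if value > 0 else default
```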
Rob Browning [Thu, 8 May 2014 19:01:57 +0000 (14:01 -0500)]
Retrieve the dates for all branches with one bulk git call in the VFS.
Instead of calling "git rev-list" once per branch to get the branch
tip dates for the VFS branch list, make a single call to "git show"
using a suitable pretty format, and pass all the branch refs as
arguments.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Thu, 8 May 2014 18:52:25 +0000 (13:52 -0500)]
Add a batchpipe() command to helpers that behaves somewhat like xargs(1).
Add batchpipe(), which will yield the output produced by calling a
given external command with a given list of arguments.
The resulting output may be provided in chunks, from multiple
invocations of the command, if the limits imposed by ARG_MAX make that
necessary.
See http://www.in-ulm.de/~mascheck/various/argmax/ for details, but
note that batchpipe() takes the additional precaution of adding room
for the argv pointers in addition to the envp pointers.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
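The batching idea behind the batchpipe() entry above can be sketched without running any command: split the argument list so that each invocation stays under a byte limit. This is not the helpers.batchpipe() implementation. batch_args, the 8-byte pointer size, and the env_bytes parameter (assumed to already cover the environment strings and their pointers) are all illustrative assumptions.

```python
POINTER_SIZE = 8  # assumed pointer width, in bytes


def _cost(arg):
    """Bytes one argv entry needs: the string, its NUL, and its pointer."""
    return len(arg.encode('utf-8')) + 1 + POINTER_SIZE


def batch_args(command, args, arg_max, env_bytes=0):
    """Yield lists of args so that command + batch fits within arg_max."""
    # Fixed cost: environment, the command words, and argv's NULL terminator.
    base = env_bytes + sum(_cost(c) for c in command) + POINTER_SIZE
    batch, used = [], base
    for arg in args:
        cost = _cost(arg)
        if base + cost > arg_max:
            raise ValueError('argument cannot fit in any invocation')
        if batch and used + cost > arg_max:
            yield batch
            batch, used = [], base
        batch.append(arg)
        used += cost
    if batch:
        yield batch
```

Each yielded list would then be appended to the command for one invocation, in the spirit of xargs(1).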
Rob Browning [Thu, 8 May 2014 16:50:29 +0000 (11:50 -0500)]
Only define helpers.next() if Python's isn't new enough.
Since helpers.next() now matches the Python built-in's semantics
(since Python 2.6), just rely on the built-in when it's new enough.
Otherwise, set up a matching replacement.
Borrow the fallback next documentation string from Python.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Drop helpers.next() and just use Python's built-in.
The python built-in 'next' function has an optional second parameter
that specifies what to return instead of raising StopIteration.
There is therefore no need for the function in helpers.py.
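The optional second parameter of the built-in next() mentioned above works as follows. first_or_default is a hypothetical helper used only to show the call.

```python
def first_or_default(iterable, default=None):
    """Return the first item of iterable, or default if it is empty."""
    # With a second argument, next() returns it instead of raising
    # StopIteration when the iterator is exhausted.
    return next(iter(iterable), default)
```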
Patrick Rouleau [Sun, 30 Mar 2014 15:04:49 +0000 (11:04 -0400)]
test-ls.sh: handle Cygwin's coupling of timestamp/permission modifications.
In Cygwin, the file's modification date is updated when its
permissions are modified. To make the tests compatible with Cygwin,
create the files first, set their permissions, then set the date.
Signed-off-by: Patrick Rouleau <prouleau72@gmail.com>
[rlb@defaultvalue.org: adjust commit message.]
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
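The ordering described in the test-ls.sh entry above (create, set permissions, then set the date) is shown here in Python rather than shell, as a sketch. make_fixture is a hypothetical helper, not part of the test suite.

```python
import os


def make_fixture(path, mode, mtime):
    """Create an empty file, then set its mode, then its timestamps.

    The timestamp is set last so that a platform that touches mtime
    on chmod cannot overwrite it.
    """
    with open(path, 'w'):
        pass
    os.chmod(path, mode)
    os.utime(path, (mtime, mtime))  # (atime, mtime)
```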
Rob Browning [Sat, 12 Apr 2014 19:17:48 +0000 (14:17 -0500)]
Add VFS Node release() and release nodes during restore, after traversal.
Add a release() method to Node that will drop resources that can (and
will) be automatically restored when required -- though restoring the
resources may have a non-trivial cost. For now, drop the node's
metadata and its children.
Call node.release() in restore-cmd.py after restoring a node. This
substantially decreases the memory required by a restore because the
whole tree is no longer retained in RAM.
Thanks to Patrick Rouleau <prouleau72@gmail.com> for helping track
down the problem, and proposing a slightly different initial patch to
fix it.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Reviewed-by: Patrick Rouleau <prouleau72@gmail.com>
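The release() idea from the entry above, dropping data that can be reloaded on demand, can be sketched with a lazily populated node. LazyNode and its loader callback are hypothetical and much simpler than the VFS Node class.

```python
class LazyNode(object):
    """A node whose children are loaded on first use (hypothetical)."""

    def __init__(self, name, loader):
        self.name = name
        self._loader = loader    # callable that returns the children
        self._children = None
        self.loads = 0           # how many times the loader has run

    def children(self):
        if self._children is None:
            self._children = self._loader()
            self.loads += 1
        return self._children

    def release(self):
        """Drop cached children; they are reloaded if needed again."""
        self._children = None
```

Calling release() after a node has been processed keeps the whole tree from accumulating in memory, at the cost of a reload if the node is visited again.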
Yung-Chin Oei [Mon, 8 Oct 2012 14:08:34 +0000 (15:08 +0100)]
Make bup-split commits appear as files to the vfs layer.
When viewing branches that were generated by bup-split through bup-fuse
(or any other frontend relying on vfs.py), these are presented as trees
of the hashsplitted blobs. This means that bup-split output is only
usefully accessible through bup-join.
This change makes bup-split store named commits such that they appear as
files, named with the last component of their branch name(*). That is,
from the vfs layer, they now appear like so:
branch_name/latest/branch_basename
(*) While bup doesn't currently handle slashes in branch names, patches
to this end are on the mailing list, so this patch should handle
them, in anticipation of their general support in bup.
To address potential concerns: the storage format is changed in subtle
ways, in that the top level tree now contains a "normally" named object,
rather than byte-offset names. However, bup-join doesn't care about
this, and as bup-join was previously the only way to use these commits,
the user experience is not affected.
We also add a test for the new functionality. (The test uses an empty
string as input data, because this is the second way in which this patch
changes the behaviour of bup-split: previously, passing empty strings to
bup-split would make it generate an empty git tree, whereas now it
relies on hashsplit.split_to_blob_or_tree() to make a blob for the empty
string. This is meaningful because vfs.py chokes on empty git trees.)
Signed-off-by: Yung-Chin Oei <yungchin@yungchin.nl>
[rlb@defaultvalue.org: rebase to current master; adjust code indentation.]
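The naming rule described in the bup-split entry above (the file is named after the last component of the branch name) can be written out as a small sketch. vfs_path_for_split is a hypothetical function, not part of vfs.py.

```python
def vfs_path_for_split(branch_name):
    """Return branch_name/latest/branch_basename for a split branch."""
    # The basename is the last "/"-separated component of the branch name.
    basename = branch_name.rsplit('/', 1)[-1]
    return '%s/latest/%s' % (branch_name, basename)
```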
Rob Browning [Fri, 4 Apr 2014 20:55:37 +0000 (15:55 -0500)]
vfs.py: don't redundantly _populate_metadata for Dir()s.
This fix should significantly improve performance. Without it, every
call to node.metadata() caused the VFS to read the entire .bupm file
in order to (re)initialize the metadata for every entry in the
directory. Don't do that.
I thought I'd already implemented this, but apparently only in my
head.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Signed-off-by: Mark J Hewitt <m.hewitt@computer.org>
[rlb@defaultvalue.org: adjust commit message, and include a leading
single quote in what we're looking for, to match c8c1f0341c82f4abf4d3f8eed1fe4ddfa4a48493.]
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Sat, 22 Mar 2014 15:25:32 +0000 (10:25 -0500)]
test-ls.sh: get the group and gid from the filesystem.
Get the expected gid from the filesystem, not "id", because on some
platforms (BSDs, etc.) a new path's gid is taken from the parent
directory, not the effective gid.
Thanks to Thomas Klausner <tk@giga.or.at> for reporting the problem
and to him and Greg Troxel <gdt@ir.bbn.com> for helping craft the
solution.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
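Taking the expected gid from the filesystem, as the entry above describes, means asking the created path itself rather than the current process. The sketch below uses Python instead of shell; expected_gid is a hypothetical helper.

```python
import os


def expected_gid(path):
    """Return the gid actually recorded for path on the filesystem.

    This may differ from os.getegid() on platforms where a new path
    inherits its group from the parent directory.
    """
    return os.lstat(path).st_gid
```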
Rob Browning [Tue, 18 Feb 2014 18:30:31 +0000 (12:30 -0600)]
test-ls.sh: take only the first 10 chars from ls -l's mode string.
Since ls -l's mode string may not be separated from the next field by
a space (i.e. when ACLs, etc. are involved), take only the first 10
characters for now when retrieving the symlink mode string (cf. 30d9027cc5444f038d38927219dc59e3b69fa219).
Thanks to Mark J Hewitt <mjh@idnet.com> for pointing out the problem
and testing the solution.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
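The fixed-width extraction described in the entry above can be shown on a sample line. This is a sketch in Python rather than shell; mode_string is a hypothetical helper and the sample ls output is made up.

```python
def mode_string(ls_line):
    """Return the 10-character mode field from one line of "ls -l" output.

    Slicing by width avoids relying on a space after the mode, which
    may be replaced by a marker such as "+" when ACLs are present.
    """
    return ls_line[:10]
```

For a line such as "lrwxrwxrwx+ 1 user group 3 Jan  1 00:00 link -> foo", this yields "lrwxrwxrwx".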
Rob Browning [Mon, 17 Feb 2014 18:39:05 +0000 (12:39 -0600)]
test-compression.sh: use "tar ... | wc -c" instead of du.
This test was failing on a ZFS system with compression enabled because
du wasn't summing the actual file sizes. To fix that, use tar/wc so
that we'll calculate the actual tree size without having to worry
about the portability of du's --apparent-size argument.
Thanks to Björn Seifert <seifert@oern.de> for reporting the problem
and Greg Troxel <gdt@lexort.com> for suggesting this fix.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
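The difference the entry above relies on is between allocated disk space and actual file sizes. The sketch below sums actual sizes in Python; it is a different technique from the "tar ... | wc -c" used in the test (tar output also includes headers and padding), and apparent_tree_size is a hypothetical name.

```python
import os


def apparent_tree_size(top):
    """Return the sum of the byte sizes of regular files under top.

    Uses st_size (the logical length), not the blocks allocated on
    disk, so filesystem compression does not affect the result.
    """
    total = 0
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.lstat(path).st_size
    return total
```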
Always return a level 0 blob from _splitbuf() for BLOB_MAX sized blobs.
Previously if _splitbuf() returned an offset anywhere in the buffer we
would always use the 'level' of the offset for the resulting blob. So
when the offset was greater than BLOB_MAX, we'd still use the offset's
level, even though we'd be returning a blob that didn't reach that
offset. In some cases, this could cause bup to generate a sequence of
BLOB_MAX sized blobs, each in their own chunk group.
To fix that, set the level of all BLOB_MAX sized blobs to 0, which
makes those blobs behave just as they would have before this patch,
whenever they were found at the end of the buffer.
Test the new behavior -- the last two of these four new tests will
fail if run before the changes, as they check that blobs split by
enforcing BLOB_MAX do not 'inherit' the value of level from a split
later in the buffer.
Signed-off-by: Aidan Hobson Sayers <aidanhs@cantab.net>
[rlb@defaultvalue.org: squash tests commit into _splitbuf() changes commit;
adjust commit message.]
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
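The rule described in the _splitbuf() entry above can be restated as a small decision function. This is a sketch, not the hashsplit code: next_blob is a hypothetical name, and BLOB_MAX is set to a tiny illustrative value rather than bup's real constant.

```python
BLOB_MAX = 16  # illustrative value only, not bup's actual limit


def next_blob(buf_len, split_offset, split_level):
    """Return (blob_size, level) for the next blob, or None to wait.

    split_offset is where a split point was found (None if not found)
    and split_level is the level associated with that split point.
    """
    if split_offset is not None and split_offset <= BLOB_MAX:
        # The blob really ends at the split point, so keep its level.
        return (split_offset, split_level)
    if buf_len >= BLOB_MAX:
        # The blob is cut short at BLOB_MAX and never reaches the
        # split point, so it must not inherit that point's level.
        return (BLOB_MAX, 0)
    return None  # not enough data yet
```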
Rob Browning [Fri, 31 Jan 2014 00:49:36 +0000 (18:49 -0600)]
Fix drecurse relative --excludes, quash duplicates, and add tests.
Previously "bup drecurse --exclude bar foo" wouldn't work because bar
would be expanded to an absolute path, but drecurse would traverse
(and attempt to match against) relative paths.
Have parse_excludes() remove duplicates and sort its result.
Add some initial drecurse tests.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
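Removing duplicates and sorting, as the entry above describes for parse_excludes(), can be sketched as follows. normalize_excludes is a hypothetical function; it applies realpath() to each path, which the log says bup does for --exclude paths, but the rest is an illustration rather than the real parser.

```python
import os


def normalize_excludes(paths):
    """Return the sorted, de-duplicated real paths for paths."""
    # Resolving first means "bar" and "./bar" collapse to one entry.
    return sorted(set(os.path.realpath(p) for p in paths))
```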
This comment was originally added in reference to some confusion over
globbing, but the text was misleading because bup does apply
realpath() to all of the --exclude paths, arguably a form of
expansion.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
itxx00 [Wed, 8 Jan 2014 05:14:26 +0000 (13:14 +0800)]
restore-cmd.py: always initialize root_meta in do_root.
Previously, using bup split with bup restore would cause an error:
UnboundLocalError: local variable 'root_meta' referenced before assignment
Signed-off-by: itxx00 <itxx00@gmail.com>
[rlb@defaultvalue.org: adjust commit message. Initialize root_meta
unconditionally before guard to match other code.]
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
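The UnboundLocalError in the entry above is the usual result of assigning a local variable only inside a conditional branch. The sketch below shows the shape of the fix with hypothetical names and arguments; it is not the do_root code.

```python
def pick_root_meta(meta_entries, has_meta):
    """Return the root metadata, or None when there is none.

    root_meta is assigned before the guard, so it is always bound
    when it is read, even if the guard is false.
    """
    root_meta = None
    if has_meta and meta_entries:
        root_meta = meta_entries[0]
    return root_meta
```

Without the initial assignment, a call where the guard is false would reach the return with root_meta never bound.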