Rob Browning [Tue, 14 Jul 2015 00:33:24 +0000 (19:33 -0500)]
Handle sysconf results more carefully
Sysconf indicates that there's no definite limit by returning -1 and
leaving errno unchanged. In that case, for SC_ARG_MAX, guess 4096. For
SC_PAGE_SIZE, die, since various operations currently require a
page_size.
Thanks to Mark J Hewitt for reporting the issue, and to Mark and Nix for
investigating the cause.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
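The sysconf handling described above can be sketched in Python as follows. This is only an illustration, not bup's code: the helper name sysconf_or_default is hypothetical, and it relies on Python's os.sysconf raising OSError when errno is set, so that a plain -1 return means "no definite limit".

```python
import os

def sysconf_or_default(name, default=None):
    """Return os.sysconf(name), treating -1 as "no definite limit".

    Hypothetical helper: a bare -1 (no exception) is taken to mean the
    value is indeterminate, in which case we fall back to `default`,
    or fail if no default was given.
    """
    value = os.sysconf(name)
    if value != -1:
        return value
    if default is None:
        raise RuntimeError('no definite value for %s' % name)
    return default

# Guess 4096 when there is no definite argument-length limit.
arg_max = sysconf_or_default('SC_ARG_MAX', default=4096)
# A page size is required, so fail if it is indeterminate.
page_size = sysconf_or_default('SC_PAGE_SIZE')
```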
Rob Browning [Sat, 30 May 2015 16:18:26 +0000 (11:18 -0500)]
Eject our pages after save via fmincore
Use fmincore and fadvise_done to eject only pages that have become
resident during our save traversal (i.e. via hashsplitting) in
batches (currently 8MB or one VM page, whichever's larger).
Hopefully this will work better than either universal ejection (our
previous behavior), or no ejection (which dramatically slows down save
operations on some systems, perhaps due to competition with access to
the indexes, etc.).
Thanks to Nimen Nachname for the initial suggestion, and thanks to Tilo
Schwarz for reporting bugs in a previous version of the patch, and for
noting that we shouldn't wait until the end of a large region before
starting to eject it.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Ben Wiederhake <Ben.Wiederhake@gmail.com> Tested-by: Tilo Schwarz <mail@tilo-schwarz.de>
[rlb@defaultvalue.org: bup_fmincore: add missing malloc result check,
and missing free(result) when munmap fails to 437fedd07ee327c14b11cb19f6c0519ef1e50884] Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
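As a rough sketch of the idea (not bup's implementation), the selection of pages to eject can be modeled as a comparison of two residency snapshots. The function name newly_resident_ranges and the list-of-flags representation are assumptions made for illustration; bup obtains residency through its own bup_fmincore helper.

```python
def newly_resident_ranges(before, after, page_size):
    """Yield (offset, length) byte ranges for pages that are resident in
    `after` but were not resident in `before`. Adjacent pages are merged
    into a single range. Both arguments are sequences of per-page flags.
    """
    count = min(len(before), len(after))
    start = None
    for i in range(count):
        ours = bool(after[i]) and not before[i]
        if ours and start is None:
            start = i                      # a new run of "our" pages begins
        elif not ours and start is not None:
            yield (start * page_size, (i - start) * page_size)
            start = None
    if start is not None:                  # run extends to the last page
        yield (start * page_size, (count - start) * page_size)
```

On systems that provide it, each resulting range could then be handed to something like os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED), leaving pages that were already resident before the traversal untouched.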
Rob Browning [Tue, 26 May 2015 01:55:32 +0000 (20:55 -0500)]
Revert "Avoid fadvise (since it doesn't work as expected)"
On some systems (the reported system had 2GB), not forcing out all of
the data traversed during save dramatically slows down save operations,
possibly due to competition with access to the indexes, etc. So restore
the use of fadvise_done() for now.
Rob Browning [Sun, 14 Jun 2015 14:58:32 +0000 (09:58 -0500)]
test.sh: separate index and split/join tests
Move the index tests to test-index.sh and the split/join tests to
test-split-join.sh in order to increase the potential parallelism (and
modularity/isolation).
Here, this brings the "make -j check" time down quite a bit. Before
test.sh was always the last thing running, and the index tests took most
of the time.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Sat, 13 Jun 2015 17:54:21 +0000 (12:54 -0500)]
test-fuse: format save name with python, not bash
Some versions of bash don't support the date expansion we used.
Thanks to pspdevel for reporting the issue.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org> Reviewed-by: Gabriel Filion <gabster@lelutin.ca> Tested-by: Gabriel Filion <gabster@lelutin.ca>
[rlb@defaultvalue.org: add comment above savename()] Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Ben Kelly [Thu, 4 Jun 2015 16:47:27 +0000 (12:47 -0400)]
bup midx: fix --output when used with --auto or --force
This fixes an issue where --output is properly respected only when
neither of these options is used.
Signed-off-by: Ben Kelly <btk@google.com>
[rlb@defaultvalue.org: rebased onto e25363fc58cd906337ecee28d715af8f355fd921] Reviewed-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Sat, 23 May 2015 16:30:36 +0000 (11:30 -0500)]
Update definition lists for pandoc 1.13
It appears that pandoc 1.13 requires a blank line between definition
list items, so add it. See "compact_definition_lists" in the pandoc
README for more information.
Thanks to Robert Edmonds for reporting the problem.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Sun, 10 May 2015 21:33:16 +0000 (16:33 -0500)]
Avoid fadvise (since it doesn't work as expected)
Currently, it appears that at least on Linux posix_fadvise() always
purges the cache given POSIX_FADV_DONTNEED, which is what bup uses, and
does nothing for POSIX_FADV_NOREUSE, which means that before this patch,
a bup save would completely clear the filesystem cache of any file data
traversed during the run. Aside from being completely unintended, this
meant that active VM images, large databases, etc. would probably be
purged with every save.
Since it also looks like at least NetBSD doesn't do anything for
DONTNEED, and tools like tar, cpio, and duplicity don't use it, let's
just drop it. That's simpler, and we can always add it back if/when
someone discovers it helps somewhere relevant.
Thanks to Nimen Nachname for reporting the issue.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org> Acked-by: Greg Troxel <gdt@lexort.com>
Update the example of making a backup to a remote server in README.md.
Replace 'ssh SERVERNAME bup init' with 'bup init -r SERVERNAME'. The
latter initializes not only the remote repository but also the local
one (if it doesn't exist). Augment the 'bup {init,save} -r SERVERNAME'
commands with a path specifier to show that a remote path can be
specified.
Signed-off-by: Tadej Janež <tadej.j@nez.si> Reviewed-by: Gabriel Filion <gabster@lelutin.ca>
[rlb@defaultvalue.org: shorten/adjust commit summary; adjust and change
tense of commit message.] Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Johannes Berg [Thu, 23 Apr 2015 20:41:45 +0000 (22:41 +0200)]
Reject invalid string in --date argument
As parse_date_or_fatal() currently uses atof(), which just returns 0
if the string isn't a valid number, it can never actually be fatal
and will just use "1970-01-01 00:00:00" as the time if the string is
malformed.
Fix that by using float() directly so ValueError() is raised.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
[rlb@defaultvalue.org: adjust commit summary] Reviewed-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
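The difference between the two behaviors can be illustrated with a small sketch. Both function names below are hypothetical stand-ins, not bup's actual helpers: lenient_atof imitates an atof()-style parser, and parse_date_strict shows the float()-based approach.

```python
def lenient_atof(s):
    """Imitate an atof()-style parser: bad input silently becomes 0.0,
    which as a timestamp is the epoch (1970-01-01 00:00:00 UTC)."""
    try:
        return float(s)
    except ValueError:
        return 0.0

def parse_date_strict(s):
    """Use float() directly so that invalid input raises ValueError,
    giving the caller a chance to report a fatal error."""
    return float(s)
```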
Rob Browning [Sat, 28 Mar 2015 20:09:47 +0000 (15:09 -0500)]
Get TZ offset from C localtime, given tm_gmtoff
If we detect that struct tm contains tm_gmtoff, use the system
localtime() to compute timezone offsets. This may help fix problems on
platforms where Python strftime "%z" doesn't report accurate timezone
information.
Thanks to Patrick Rouleau for reporting just such a problem on Cygwin.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
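For comparison only: the commit obtains the offset from C localtime(), but the same field can be inspected from Python 3, whose struct_time exposes tm_gmtoff where the platform supports it. The function name utc_offset_seconds is an assumption for illustration.

```python
import time

def utc_offset_seconds(timestamp):
    """Return the local UTC offset in seconds for `timestamp`, taken from
    localtime()'s tm_gmtoff, or None if the field is unavailable."""
    local = time.localtime(timestamp)
    return getattr(local, 'tm_gmtoff', None)
```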
Rob Browning [Sat, 28 Mar 2015 17:48:19 +0000 (12:48 -0500)]
Adjust sparse restore tests for test fs block size
Change test-sparse-files.sh to detect the test fs block size (when
possible) and adjust its behavior accordingly. If the block size can't
be determined, use a block size of 3MB, which is hoped to be larger than
any block sizes we'll encounter anytime soon.
Previously the tests might fail on filesystems with relatively large
block sizes, like those on the current Debian powerpc and ppc64el build
daemons (64k).
Thanks to Goswin Brederlow for mentioning that the Lustre block size is
1MB, to Robert Edmonds for running a build through the Debian buildds,
which revealed the problem, and to Julien Cristau for reporting the
block size on the failing buildds.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Sun, 15 Mar 2015 21:43:47 +0000 (16:43 -0500)]
Rework write_sparsely() to fix in-buffer zero runs
Fix the sparse restoration of buffers that have non-zero bytes, followed
by a run of zero bytes that's longer than the minimum sparse run
length (currently 512), followed by non-zero bytes.
Previously, the initial non-zero bytes would be *lost*.
In the new code, don't unconditionally output previous zero bytes --
merge them with any leading zeros in the current block.
And allow arbitrarily large sparse regions; use append_sparse_region()
to break up runs that are too large for off_t into a sequence of seeks
of no more than INT_MAX bytes each.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
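The in-buffer case this commit fixes can be pictured with a minimal sketch that splits one buffer into data segments and long zero runs. This is not write_sparsely() itself; the function name split_sparse and the tuple representation are assumptions, and it ignores zero runs that span buffer boundaries.

```python
import re

def split_sparse(buf, min_run=512):
    """Split `buf` into ('data', bytes) and ('zeros', length) segments.
    Only zero runs of at least `min_run` bytes are treated as sparse;
    shorter runs stay inside the surrounding data segment."""
    pattern = re.compile(rb'\x00{%d,}' % min_run)
    segments = []
    pos = 0
    for match in pattern.finditer(buf):
        if match.start() > pos:
            # Non-zero bytes before the run must be written, not dropped.
            segments.append(('data', buf[pos:match.start()]))
        segments.append(('zeros', match.end() - match.start()))
        pos = match.end()
    if pos < len(buf):
        segments.append(('data', buf[pos:]))
    return segments
```

A writer would emit each 'data' segment normally and turn each 'zeros' segment into a seek, so the leading non-zero bytes are preserved.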
Rob Browning [Sun, 15 Mar 2015 21:38:34 +0000 (16:38 -0500)]
Test sparse restore of short in-buffer zero run
Test that restore --sparse handles the case where within one call to
write_sparsely() (one buffer) we have non-zero bytes, followed by a run
of zero bytes that's longer than the minimum sparse run
length (currently 512), followed by non-zero bytes.
Currently, the initial non-zero bytes will be lost, and this test will
fail.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Sat, 21 Mar 2015 22:17:12 +0000 (17:17 -0500)]
Use t/sampledata, not make install, for tests
This should be more efficient, and is intended to fix the problems people
have experienced with the recursive "make install" invocations that some
tests were using to produce input data.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Sat, 21 Mar 2015 19:32:30 +0000 (14:32 -0500)]
Only build _version.py once; remove phony targets
Previously _version.py was a phony target because we couldn't easily
tell when the current git working tree version had changed. This caused
various targets to be rebuilt multiple times (i.e. recursive make
invocations, etc.).
To fix that, just update _version.py once (at startup) if needed, via an
immediate variable assignment that calls a new ./configure-version
command, i.e.
Rob Browning [Sat, 21 Mar 2015 19:57:18 +0000 (14:57 -0500)]
Create t/sampledata/var/ and version it
Maintain a new t/sampledata/var/ (via t/configure-sampledata) that
contains any test data that we don't want to or can't commit to
git (i.e. symlinks, and other dynamically generated data). Move all of
the existing generated data there, and delete var/ entirely on clean.
Control the creation of var/ with make, via the existence of
t/sampledata/var/rev/vN. Whenever we change the content, we'll change
N (currently 0), which will force the directory to be recreated.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Sat, 21 Mar 2015 19:40:48 +0000 (14:40 -0500)]
Don't have ./bup depend on main.py, etc.
Instead create bup_deps (which includes bup) and have everything that
needs a functional ./bup depend on that set of targets.
Previously bup would often be rebuilt unnecessarily (even if it didn't
depend on other phony targets) because it depended on _version.py and
main.py. Since bup ends up symlinked to the latter, and make looks at
symlink target timestamps, and _version.py would often be newer than
main.py, ./bup would be repeatedly rebuilt.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Sat, 14 Mar 2015 15:39:19 +0000 (10:39 -0500)]
Makefile: fix -j race to create ./bup
Previously the bup target might fail if it was run in parallel
(i.e. during make -j). Since bup depends on _version.py, which is
phony, that wasn't unlikely.
Fix the race by ignoring errors while creating the symlink, and then
testing for existence afterward.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Fri, 13 Mar 2015 02:41:50 +0000 (21:41 -0500)]
subtree-hash: handle non-ASCII path chars
Pass "-z" to git ls-files to turn off git's core.quotepath, so we can
handle non-ASCII path characters directly, and rewrite the code in
python to make that easier.
Thanks to Jiří Martínek for reporting the issue -- that
test-redundant-saves.sh would fail when there was a "ý" character in the
source tree path.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
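A minimal sketch of why "-z" helps: with NUL-terminated output, each path arrives as raw bytes and needs no unquoting. The function name parse_nul_separated is hypothetical and is not taken from subtree-hash.

```python
def parse_nul_separated(output):
    """Split NUL-terminated output (as produced by `git ls-files -z`)
    into a list of raw path bytes, dropping the trailing empty item."""
    return [path for path in output.split(b'\0') if path]
```

The input would typically come from something like subprocess.check_output(['git', 'ls-files', '-z']); paths are kept as bytes so that non-ASCII characters such as "ý" pass through unchanged.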
Rob Browning [Sun, 22 Feb 2015 22:42:18 +0000 (16:42 -0600)]
Port import-duplicity to python
Additionally, specify an --archive-dir to all duplicity invocations so
that we don't scribble in ~/.cache/duplicity during the imports, and
unconditionally clean up temporary dirs.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
If we don't specify an --archive-dir to duplicity, it'll scribble in
~/.cache/duplicity during testing. This is only a partial fix, given
the duplicity invocations inside import-duplicity itself.
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Signed-off-by: Zoran Zaric <zz@zoranzaric.de>
[rlb@defaultvalue.org: adjust commit message; update tests during rebase
and move them from test.sh to t/test-import-duplicity.sh following
break-up of test.sh.] Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Alexander Barton [Wed, 25 Feb 2015 14:28:21 +0000 (15:28 +0100)]
t/test-on.sh: Don't quote "wc -l" output
Some wc(1) implementations, for example on OS X, prefix the numeric
output with spaces. So don't quote this output (and don't treat it
as a string) to not confuse WVPASSEQ.
This fixes the following error of "make check":
Comparing:
2
--
2
! t/test-on.sh:25 ' 2' = '2' FAILED
2.108s ok
called from t/test-on.sh:25 WVPASSEQ 2 2
make[1]: *** [runtests-cmdline] Error 1
! Program returned non-zero exit code (2) FAILED
WvTest: 3007 tests, 1 failure, total time 84.013s.
Reported by Pedro Estarque, thanks!
Signed-off-by: Alexander Barton <alex@barton.de>
[rlb@defaultvalue.org: adjust commit message; add WVPASS calls] Signed-off-by: Rob Browning <rlb@defaultvalue.org> Reviewed-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
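The underlying problem is a string comparison applied to a padded number. The actual fix is in shell; the following Python analogue, with hypothetical function names, only illustrates the difference between comparing the text and comparing the value.

```python
def counts_equal_as_strings(a, b):
    """Compare the raw text, so leading padding makes values differ."""
    return a == b

def counts_equal_as_numbers(a, b):
    """Compare numeric values; int() ignores surrounding whitespace."""
    return int(a) == int(b)

wc_output = '       2'   # padded, as some wc(1) implementations print it
```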
Rob Browning [Thu, 17 Jul 2014 17:47:43 +0000 (12:47 -0500)]
hlinkdb.py: clean up temp file more carefully
Make sure to always close the temp file, and if something goes wrong
while preparing the save, make sure to delete it and remove it from
consideration by commit.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Sun, 9 Nov 2014 00:32:55 +0000 (18:32 -0600)]
on: handle remote stdout and stderr via mux
Previously, when running "bup on" (i.e. when in BUP_SERVER_REVERSE
mode), bup would redirect the remote command's stdout to stderr because
stderr is the only remaining avenue back to the local console. It's the
only avenue because in reverse mode, stdout (and stdin) are connected
back to a local "bup server" instance (hence the "reverse").
Of course that makes it impossible to reliably capture the
non-diagnostic output from the remote commands, i.e.
commit_id="$(bup on HOST save -t ...)"
To fix that, in the remote "bup on--server" multiplex the stdout and
stderr from all "bup on" subcommands with "bup mux", and then
demultiplex those streams back to the local stdout and stderr via
DemuxConn() in the receiving "bup on".
Thanks to Alexander Barton for pointing out an error in a previous
version of this commit message, and thanks to Gabriel Filion for
pointing out that we could use the existing mux infrastructure instead
of reinventing the wheel.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
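The general technique of carrying stdout and stderr over one connection can be sketched as follows. The framing here (a one-byte channel id plus a four-byte length) is an assumption chosen for illustration and is not bup's actual mux wire format; mux and demux are likewise hypothetical names rather than the real "bup mux" and DemuxConn() interfaces.

```python
import struct

# Assumed frame header: channel id (1 byte) + payload length (4 bytes).
HEADER = struct.Struct('!BI')
STDOUT, STDERR = 1, 2

def mux(channel, data):
    """Wrap `data` in a frame tagged with its channel."""
    return HEADER.pack(channel, len(data)) + data

def demux(stream):
    """Split a byte string of concatenated frames back into the
    (stdout_bytes, stderr_bytes) that were multiplexed into it."""
    out = {STDOUT: b'', STDERR: b''}
    pos = 0
    while pos < len(stream):
        channel, length = HEADER.unpack_from(stream, pos)
        pos += HEADER.size
        out[channel] += stream[pos:pos + length]
        pos += length
    return out[STDOUT], out[STDERR]
```

With the two streams separated again on the local side, a command substitution such as the commit_id example above captures only the remote command's stdout.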
Rob Browning [Tue, 4 Nov 2014 18:09:37 +0000 (12:09 -0600)]
save: always push parents when entering a subtree
Once we enter the subtree, all of the parents must be on the stack for
subsequent operations to work correctly (like the _pop() in the
following "if not file:" section).
Consider the case where /some/empty/dir was indexed, but removed before
save. Without this fix, the metadata read for "dir" would fail and the
pending directory stack (parts) would include /some/empty instead of
/some/empty/dir.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Rob Browning [Mon, 3 Nov 2014 23:02:03 +0000 (17:02 -0600)]
save-cmd.py: remove redundant _push()
Remove a duplicate _push() from the "first_root" code. This could cause
the creation of a tree that was immediately _pop()ped by the "finish the
current sub-tree" code, and then reintroduced (as a duplicate parent
entry) by the "start a new sub-tree" code.
Instead, just wait for the "start a new sub-tree" code to call the
relevant _push().
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
The documentation for posix_fadvise states that if len is 0 the advice
extends to the end of the file. We currently always pass 0 for
the first two posix_fadvise calls (first because ofs is 0, second
because ofs == BLOB_READ_SIZE == 1024*1024) so we're advising the kernel
to dump any predicted file caching twice per file.
This patch ensures we don't pass a len of 0 in the two scenarios above.
Signed-off-by: Aidan Hobson Sayers <aidanhs@cantab.net>
[rlb@defaultvalue.org: adjust commit summary] Reviewed-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
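A minimal sketch of the guard, assuming a hypothetical wrapper that advises on the range [0, ofs). The `advise` callable is injected so the logic can be exercised without a real file descriptor; it is not part of bup's API.

```python
def fadvise_done(fd, ofs, advise):
    """Tell the kernel that bytes [0, ofs) of `fd` are no longer needed.

    A length of 0 would be interpreted as "through the end of the file",
    so skip the call entirely when there is nothing to advise on.
    Returns True if advice was issued.
    """
    if ofs <= 0:
        return False
    advise(fd, 0, ofs)
    return True
```

On platforms with posix_fadvise, `advise` could be something like lambda fd, offset, length: os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED).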
Rob Browning [Thu, 20 Nov 2014 16:47:58 +0000 (10:47 -0600)]
Skip dependent tests if we can't load loop or fuse
When modprobe is available (or when we know we're on Linux), try to load
the loop(back) and fuse modules before using them, and skip the relevant
tests if the load fails. This allows the tests to proceed on systems
lacking those modules.
Signed-off-by: Rob Browning <rlb@defaultvalue.org> Tested-by: Rob Browning <rlb@defaultvalue.org>
Gabriel Filion [Mon, 17 Nov 2014 17:24:50 +0000 (12:24 -0500)]
Correct claim about number of packs per backup
The current sentence implies that there is only one pack file per backup
run, which is not necessarily correct.
Because of the "constants" max_pack_size and max_pack_objects that are
used as limits to pack sizes, we may end up with more than one pack per
backup if there is a lot of data that needs to be stored.
Signed-off-by: Gabriel Filion <gabster@lelutin.ca>
[rlb@defaultvalue.org: adjust commit summary] Reviewed-by: Rob Browning <rlb@defaultvalue.org>