bup.git
2 months agotest.sh: test bup features python version master
Rob Browning [Sat, 15 Aug 2020 17:10:14 +0000 (12:10 -0500)]
test.sh: test bup features python version

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
2 months agofeatures: show version number of the Python interpreter
Alexander Barton [Sun, 2 Aug 2020 18:10:29 +0000 (20:10 +0200)]
features: show version number of the Python interpreter

This can be handy for debugging, especially if the interpreter used
(like /usr/bin/python3) is a symlink to the actual Python interpreter
(/usr/bin/python3.7) and/or doesn't include the "micro" version at all.

Signed-off-by: Alexander Barton <alex@barton.de>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
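
A minimal sketch of the information described above, not the actual
"bup features" code. The function name interpreter_summary is
hypothetical; it only shows how the full version (including the
"micro" part) and the symlink-resolved interpreter path can be
obtained from the standard library.

```python
import os
import platform
import sys


def interpreter_summary():
    """Return (full version string, resolved interpreter path or None)."""
    # platform.python_version() always includes the micro component.
    version = platform.python_version()
    # sys.executable may be a symlink such as /usr/bin/python3;
    # realpath() follows it to the actual interpreter binary.
    path = os.path.realpath(sys.executable) if sys.executable else None
    return version, path


version, path = interpreter_summary()
print("Python:", version, "(%s)" % path)
```
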
2 months agoMakefile: specify -Wformat=2 rather than error given -Werror
Rob Browning [Sat, 15 Aug 2020 01:13:15 +0000 (20:13 -0500)]
Makefile: specify -Wformat=2 rather than error given -Werror

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
2 months agotest-fsck: test invocation with pack arguments
Rob Browning [Sat, 15 Aug 2020 01:12:07 +0000 (20:12 -0500)]
test-fsck: test invocation with pack arguments

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
2 months agofsck: fix argv_bytes typo
Johannes Berg [Fri, 14 Aug 2020 17:43:58 +0000 (19:43 +0200)]
fsck: fix argv_bytes typo

Reported-by: gkonstandinos@gmail.com
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2 months agotests: tclient: fix buffering behaviour in the test
Johannes Berg [Thu, 13 Aug 2020 13:20:33 +0000 (15:20 +0200)]
tests: tclient: fix buffering behaviour in the test

If the objects we write are small enough to be at least
partially buffered instead of written out to the server
connection, the test will hang in the loop as the data
hasn't made it to the server yet, which therefore can't
make a suggestion yet.

In normal non-testing scenarios this isn't an issue as
the connection is flushed before every new read(), and
thus if the client actually needs to wait for something
from the server, it'll have flushed before. Here, we're
directly poking at the internals by using has_input(),
and nothing causes a flush, causing the above scenario.

Fix this by some more ugly poking at the internals and
flush the connection before we go into the loop.

While at it, also avoid hanging there forever and break
out of the loop if one second passed without receiving
anything from the server - that should be long enough
for it to hash the two objects and respond. Also, add a
time.sleep() there to avoid busy spinning which takes
CPU cycles from the server that needs to be hashing the
object.

Reported-by: Robert Edmonds <edmonds@debian.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
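
The loop described above could look roughly like the following
sketch. It is not the test's actual code: wait_for_input and its
parameters are hypothetical, and the caller is assumed to have
flushed the connection before calling it.

```python
import time


def wait_for_input(has_input, timeout=1.0, interval=0.01):
    """Poll has_input() until it returns True or `timeout` seconds pass.

    Returns True if input arrived, False if the timeout expired first.
    """
    deadline = time.monotonic() + timeout
    while not has_input():
        if time.monotonic() >= deadline:
            return False
        # Sleep briefly instead of spinning, so the peer process
        # is not starved of CPU time while it is still working.
        time.sleep(interval)
    return True
```

The timeout bounds the wait, and the short sleep keeps the polling
side from competing with the server for CPU.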
2 months agochecked_malloc: use %z to format size_t; enable -Wformat=2
Rob Browning [Sun, 9 Aug 2020 18:07:01 +0000 (13:07 -0500)]
checked_malloc: use %z to format size_t; enable -Wformat=2

Reported-by: Greg Troxel <gdt@lexort.com>
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
2 months agotest-help: restrict MANPATH; only test manpages when available
Rob Browning [Sun, 9 Aug 2020 16:59:28 +0000 (11:59 -0500)]
test-help: restrict MANPATH; only test manpages when available

Reported-by: Robert Edmonds <edmonds@debian.org>
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
2 months agoinstall-python-script: don't presume python exists
Rob Browning [Sat, 8 Aug 2020 18:05:39 +0000 (13:05 -0500)]
install-python-script: don't presume python exists

Reported-by: Robert Edmonds <edmonds@debian.org>
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
2 months agobup_mincore: actually use size_t value (not ulonglong) in call
Rob Browning [Sat, 8 Aug 2020 17:27:51 +0000 (12:27 -0500)]
bup_mincore: actually use size_t value (not ulonglong) in call

We'd already done the correct conversion, but weren't using it.

Reported-by: Greg Troxel <gdt@lexort.com>
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
2 months agosave: make the py3 save commit message args match py2
Johannes Berg [Tue, 4 Aug 2020 14:46:16 +0000 (16:46 +0200)]
save: make the py3 save commit message args match py2

We shouldn't use sys.argv, it's missing the actual arguments
now. Also, encode it but drop the "b" on the bytes (if present,
i.e. python3) to make this look the same as on python2.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
[rlb@defaultvalue.org: make formatting decision via py_maj; adjust
 commit message]
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
2 months agoversion: fix --date argument for py3
Johannes Berg [Wed, 5 Aug 2020 19:17:13 +0000 (21:17 +0200)]
version: fix --date argument for py3

This has an issue with str vs. bytes; fix it.

Reported-by: Mark J Hewitt <mjh@idnet.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
2 months agoAdd trivial tests for current help behavior
Rob Browning [Sat, 8 Aug 2020 16:40:54 +0000 (11:40 -0500)]
Add trivial tests for current help behavior

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
2 months agohelp: fix for python3
Johannes Berg [Wed, 5 Aug 2020 07:54:55 +0000 (09:54 +0200)]
help: fix for python3

Fix "bup subcommand --help" to properly compare and start
the man viewer.

Reported-by: Robert Edmonds <edmonds@debian.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
2 months agoftp: handle lack of readline in input
Johannes Berg [Tue, 28 Jul 2020 20:28:11 +0000 (22:28 +0200)]
ftp: handle lack of readline in input

If we don't have readline, print our prompt and read a line
from stdin instead.

Reported-by: Eric Waguespack <eric.w@guespack.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
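
A minimal sketch of such a fallback, assuming a str-based interface
for simplicity. The name read_line and its parameters are
hypothetical and are not taken from bup's ftp command.

```python
import io
import sys


def read_line(prompt, readline_fn=None, stdin=None, stdout=None):
    """Read one line via readline_fn if given, else from plain stdin.

    Returns the line without its trailing newline, or None at EOF.
    """
    if readline_fn is not None:
        return readline_fn(prompt)
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    # No readline available: print the prompt ourselves, then read.
    stdout.write(prompt)
    stdout.flush()
    line = stdin.readline()
    if not line:
        return None
    return line.rstrip("\n")


# Usage with in-memory streams standing in for the terminal.
out = io.StringIO()
print(repr(read_line("bup> ", stdin=io.StringIO("ls -l\n"), stdout=out)))
```
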
2 months agoftp: fix ls arguments for python3
Johannes Berg [Tue, 28 Jul 2020 21:28:46 +0000 (23:28 +0200)]
ftp: fix ls arguments for python3

During the conversion to python 3, ftp ls could no longer
get option arguments, because we're now passing *bytes* to
the option parser that still expects *str*.

Fix this by fsdecode()'ing the bytes into str, which then
causes the option parsing to work properly, and the result
will be fsencode()'d again via argv_bytes() for usage.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
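
The round trip described above can be sketched as follows. This is
an illustration on a POSIX system, where os.fsdecode() uses the
surrogateescape error handler. The helper names are hypothetical,
and os.fsencode() stands in for bup's argv_bytes().

```python
import os


def decode_args_for_parser(args):
    """Convert bytes arguments to str for a str-based option parser."""
    return [os.fsdecode(a) for a in args]


def encode_arg_for_use(arg):
    """Convert a parsed str argument back to its original bytes."""
    return os.fsencode(arg)


# The second argument is deliberately not valid UTF-8.
raw = [b"-l", b"caf\xe9"]
decoded = decode_args_for_parser(raw)
# Undecodable bytes survive the round trip (assumption: POSIX).
assert [encode_arg_for_use(a) for a in decoded] == raw
```
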
2 months agohlinkdb: always load as bytes on python3
Johannes Berg [Mon, 27 Jul 2020 21:15:16 +0000 (23:15 +0200)]
hlinkdb: always load as bytes on python3

We really need this to be bytes, so always load as such.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
[rlb@defaultvalue.org: changed %s to %d]
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
2 months agochange mktemp to be compatible with busybox
Brian Minton [Mon, 20 Jul 2020 18:55:11 +0000 (14:55 -0400)]
change mktemp to be compatible with busybox

Signed-off-by: Brian Minton <brian@minton.name>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
3 months agocirrus: make rdiff-backup completely optional for freebsd tests
Rob Browning [Thu, 23 Jul 2020 23:58:18 +0000 (18:58 -0500)]
cirrus: make rdiff-backup completely optional for freebsd tests

Thanks to Johannes Berg for reporting the problem and tracking down
the likely cause/fix.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agoEliminate redundant check of index start against ctime
Aidan Hobson Sayers [Fri, 17 Jul 2020 21:59:04 +0000 (22:59 +0100)]
Eliminate redundant check of index start against ctime

When (the first version of) this check was added 10 years ago in
b4b4ef116880, it was presumably to ensure that "index; save; touch;
index" in the same second would flag a file as needing saving.

Since then, tmax has been added in the indexing process to solve the
problem by capping timestamps stored in the index. This capping means
that a file with ctime in the same second as an indexing start will
be picked up on both the indexes in the example above - there's no need
to special case the check.

Signed-off-by: Aidan Hobson Sayers <aidanhs@cantab.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
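
To illustrate the reasoning above, here is a small sketch of how
capping makes the special case unnecessary. It is not bup's index
code: the helper names are hypothetical, and both the tmax value
(one second before the indexing start) and the "differs from the
stored value" comparison are assumptions made for the example.

```python
NS = 10**9  # nanoseconds per second


def capped_ctime(ctime_ns, tmax_ns):
    """Timestamp as stored in the index: never later than tmax."""
    return min(ctime_ns, tmax_ns)


def looks_modified(stored_ctime_ns, current_ctime_ns):
    """Flag a file when its ctime differs from the stored value."""
    return stored_ctime_ns != current_ctime_ns


index_start = 1_600_000_000 * NS   # indexing starts in this second
tmax = index_start - NS            # assumed cap, see lead-in
ctime = index_start + NS // 5      # file changed within that second

stored = capped_ctime(ctime, tmax)  # recorded by the first "index"
touched = index_start + NS // 2     # "touch" later in that second
print(looks_modified(stored, touched))
```

Because the stored value is capped below the real ctime, the second
index sees a mismatch whether or not the file was touched again.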
3 months agoweb: don't re-resolve item in listing
Johannes Berg [Fri, 17 Jul 2020 19:42:16 +0000 (21:42 +0200)]
web: don't re-resolve item in listing

We already have an item, we just need its metadata. There's no
need to re-resolve it. Somehow, resolving it again is also very
slow for large directories (perhaps re-reading metadata again
and again?), and this significantly speeds up things.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
3 months agoAdd bup-features to report status; drop ACL warnings
Rob Browning [Sun, 19 Jul 2020 19:28:36 +0000 (14:28 -0500)]
Add bup-features to report status; drop ACL warnings

Add a new "bup features" command that reports information about the
status and capabilities of the current installation.

Given that, and the new summary at the end of the ./configure output,
drop the platform specific warnings, which were a bit odd anyway,
since they wouldn't warn you on say Linux if we found libacl, but it
didn't actually have all the features we required.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
3 months agoconfigure: restore working dir after symlinking python
Rob Browning [Sun, 5 Jul 2020 20:04:04 +0000 (15:04 -0500)]
configure: restore working dir after symlinking python

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
3 months agoAvoid varying git archive content for ref; rework versioning
Rob Browning [Sun, 5 Jul 2020 19:40:40 +0000 (14:40 -0500)]
Avoid varying git archive content for ref; rework versioning

Don't include the log --decorate names in the archive via export-subst
because they can change after the fact when new tags or branches are
added for a given hash -- for example, when we created the 0.30.x
branch after tagging 0.30.1.  Archives retrieved before the branch was
created would have a different set of NAMES in _release.py.

Move _release to source_info and add an optional checkout_info module.
source_info contains the (no longer variable) export-subst commit hash
and date values, and checkout_info contains the same data for a git
checkout.  Automatically update checkout_info whenever we're at the
top level of a git source tree, but don't include it in the archives.

Record the base version in version.py explicitly as either a release
version like 0.31, for an actual release (which must be committed
before tagging the release), or a development version like
0.31~ (indicating a version that's always less than 0.31).

Rework bup version to report the lib/version for an actual release, or
that version suffixed with the commit hash when running a non-release,
and add a "+" when uncommitted modifications are detected.  For
example:

      release: 0.31
  non-release: 0.31~4e4b9ba8689c93702743c8ecd49c5a7808a4d717
     modified: 0.31~4e4b9ba8689c93702743c8ecd49c5a7808a4d717+

Drop the --tag argument from bup version since the tags are variable,
and you can always ask git to describe the hash via

  git describe --always HASH

Add dev/refresh, similar to moreutils sponge, to handle file creation
safely, something we may want to deploy more widely (e.g. instead of
the $$/PID based tempfiles in the Makefile).

Thanks to Greg Troxel for reporting the problem.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
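
The three output forms listed above can be reproduced with a sketch
like the following. describe_version is a hypothetical name, and the
rule "a base ending in ~ is a development version" is an assumption
drawn from the examples, not from bup's version.py.

```python
def describe_version(base, commit, modified=False):
    """Format a version string in the style shown in the examples."""
    if not base.endswith("~"):
        # An actual release: report the base version as-is.
        return base
    # A development version: append the commit hash, then "+" when
    # uncommitted modifications were detected.
    return base + commit + ("+" if modified else "")


commit = "4e4b9ba8689c93702743c8ecd49c5a7808a4d717"
print(describe_version("0.31", commit))
print(describe_version("0.31~", commit))
print(describe_version("0.31~", commit, modified=True))
```
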
3 months agoDrop pwd/gid reentrant function variants for now
Rob Browning [Tue, 30 Jun 2020 23:50:17 +0000 (18:50 -0500)]
Drop pwd/gid reentrant function variants for now

From the POSIX pages, it sounded like _SC_GETPW_R_SIZE_MAX, etc. were
upper limits, but other docs, and observed behavior (reported ERANGE
failures) indicate that it's more likely they're just suggestions.  As
a result, we'd either need larger hard-coded buffers, or a
reallocate/retry loop if we kept them.

Instead, since we're not at all thread-safe and aren't likely to be
anytime soon, the dynamic allocations are just less efficient and more
complex, so back off to the non-reentrant flavors for now.

Thanks to kd7spq for reporting the problem and Johannes Berg for
tracking down the cause and help settling on this solution.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
3 months agocirrus: fill out python 3 tests
Rob Browning [Sat, 27 Jun 2020 15:26:54 +0000 (10:26 -0500)]
cirrus: fill out python 3 tests

Test python 3 on all platforms and to the same extent as python 2.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agodrecurse: use portable S_ISDIR() instead of deriving S_IFMT
Rob Browning [Sat, 27 Jun 2020 16:47:31 +0000 (11:47 -0500)]
drecurse: use portable S_ISDIR() instead of deriving S_IFMT

In some simple testing, there didn't appear to be any notable,
consistent performance difference, and the S_IFMT(0xffffffff) call
fails with an OverflowError on at least macos.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
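
For reference, a minimal sketch of the portable check. The helper
name is_dir_mode is hypothetical; stat.S_ISDIR() is the standard
library call the commit title refers to.

```python
import os
import stat
import tempfile


def is_dir_mode(mode):
    """Return True if a st_mode value describes a directory."""
    # S_ISDIR() hides the platform's file type mask, so no mask has
    # to be derived by calling S_IFMT() on an all-ones value.
    return stat.S_ISDIR(mode)


with tempfile.TemporaryDirectory() as d:
    print(is_dir_mode(os.lstat(d).st_mode))
```
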
3 months agoAdd portable dev/sort-z; link into t/bin/; use in test-meta.sh
Rob Browning [Thu, 25 Jun 2020 06:48:45 +0000 (01:48 -0500)]
Add portable dev/sort-z; link into t/bin/; use in test-meta.sh

Add a dev/sort-z wrapper to handle sorting null terminated lines
across platforms.  Apparently NetBSD supports "-R 000" instead of -z.

Add a t/bin for any commands we want to make available to all tests,
symlink sort-z there, and add it to the PATH in test-meta.sh.

Thanks to Greg Troxel for reporting the problem.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agoconfigure: test for functional readline more carefully
Rob Browning [Sun, 21 Jun 2020 17:46:50 +0000 (12:46 -0500)]
configure: test for functional readline more carefully

Apparently on (cirrus) macos, just testing for the ability to compile
readline.h isn't sufficient because configure ends up selecting the
built-in readline (which is insufficient) rather than the one we
specifically installed via brew.  More specifically, the built-in
readline has the wrong prototype for rl_completion_entry_function
which causes this error:

  _helpers.c:2096:38: error: incompatible function pointer types assigning to 'Function *' (aka 'int (*)(const char *, int)') from 'char *(const char *, int)' [-Werror,-Wincompatible-function-pointer-types]
        rl_completion_entry_function = on_completion_entry;
                                     ^ ~~~~~~~~~~~~~~~~~~~~

So change the test for an acceptable readline to check that
specifically, which is a better test for all the platforms.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
3 months agoconfigure: check for <readline.h> vs <readline/readline.h>
Rob Browning [Sun, 21 Jun 2020 16:47:02 +0000 (11:47 -0500)]
configure: check for <readline.h> vs <readline/readline.h>

The readline docs (man and info pages) indicate that we should use

  #include <readline/readline.h>
  #include <readline/history.h>

Unfortunately, it appears that on a number of platforms pkg-config
--cflags actually returns a -I value that requires

  #include <readline.h>
  #include <history.h>

So make an even bigger mess in config/configure to accommodate either
possibility.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
3 months agoAvoid readline.h -Wstrict-prototype induced failures
Rob Browning [Sun, 21 Jun 2020 16:50:30 +0000 (11:50 -0500)]
Avoid readline.h -Wstrict-prototype induced failures

On some platforms -Wstrict-prototype is now the default, and readline
includes prototypes like "int foo()" rather than "int foo(void)", so
for now, just suppress those warnings for the readline includes.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agoconfigure-sampledata: only create random paths if asked
Rob Browning [Sun, 21 Jun 2020 16:29:23 +0000 (11:29 -0500)]
configure-sampledata: only create random paths if asked

Stop creating randomized paths in t/sampledata/ by default.  I'd
originally just added this to allow some quick testing, and while it
now appears to be fine on Linux/ext4, it's too aggressive to be the
default, so hide it behind a BUP_TEST_RANDOMIZED_SAMPLEDATA_PATHS
environment variable.

Among other things, make-random-paths just crashes on cirrus macos,
and cirrus freebsd was having (different) trouble.  It might also have
been macos where test-import-duplicity.sh failed on compare-trees
mismatches.  Not sure whether that was an issue with bup, rsync, or
duplicity.

We'll want to restore broader randomized path testing, but likely via
a less blunt instrument, since placing the paths in t/sampledata
affects any test that relies on it, and existing testing constructs
like

  WVPASSEQ ... $(... | wc -l)

are completely incompatible.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agocirrus: ensure we use the real readline on macos
Rob Browning [Sat, 20 Jun 2020 19:19:21 +0000 (14:19 -0500)]
cirrus: ensure we use the real readline on macos

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months ago_helpers: fix potential issues -Wshorten-64-to-32 complains about
Rob Browning [Sat, 20 Jun 2020 19:02:37 +0000 (14:02 -0500)]
_helpers: fix potential issues -Wshorten-64-to-32 complains about

These were revealed by -Wshorten-64-to-32 which appears to be the
clang default on cirrus' macos (catalina?).

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agoconfigure: summarize findings
Rob Browning [Sun, 21 Jun 2020 17:13:24 +0000 (12:13 -0500)]
configure: summarize findings

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agoUse pkg-config opportunistically
Rob Browning [Sat, 20 Jun 2020 16:55:53 +0000 (11:55 -0500)]
Use pkg-config opportunistically

Use pkg-config's --cflags and --libs when they're available for
libreadline or libacl, but don't require pkg-config.  When it's not
found, just check for the libraries with a test compile.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agoDon't assume readline always defines _XOPEN_SOURCE
Rob Browning [Sun, 21 Jun 2020 17:09:16 +0000 (12:09 -0500)]
Don't assume readline always defines _XOPEN_SOURCE

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agogrp_struct_to_py(): fix error handling
Johannes Berg [Fri, 19 Jun 2020 05:19:07 +0000 (07:19 +0200)]
grp_struct_to_py(): fix error handling

Both getgrgid_r() and getgrnam_r() *return* an error number
on failures, and don't store it to errno. Thus, rc will not
be less than zero, and we need to set errno before we can
create a python error from it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
3 months agocirrus: macos: use new brew installer
Johannes Berg [Thu, 28 May 2020 22:06:24 +0000 (00:06 +0200)]
cirrus: macos: use new brew installer

The current installer prints that it's deprecated, so use the
one they suggest using now.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
3 months agogitignore: add config/bin
Rob Browning [Sun, 14 Jun 2020 20:26:35 +0000 (15:26 -0500)]
gitignore: add config/bin

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agoDESIGN: describe our adjusted approach to py3
Rob Browning [Sat, 13 Jun 2020 19:30:10 +0000 (14:30 -0500)]
DESIGN: describe our adjusted approach to py3

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agoDESIGN: document the actual hashsplit algorithm
Johannes Berg [Fri, 1 May 2020 22:52:53 +0000 (00:52 +0200)]
DESIGN: document the actual hashsplit algorithm

The hashsplit algorithm, when used for the fanout, has a quirk
that appears to be due to an implementation bug.

In order for splitting to occur, the lowest 13 (BUP_BLOBBITS)
bits of the csum need to be 1. Then, per DESIGN, the next bits
that are 1 are used for the fanout. However, the implementation
doesn't actually work this way. What actually happens is that
the lower 13 bits need to be ones:

 ........1'1111'1111'1111

Then, the DESIGN document states that the next bits that are 1
should be used for the fanout:

 ....'111_'____'____'____

However, the implementation actually ignores the next bit ('x')

 ....'11x_'____'____'____

and it doesn't matter whether that's set to 0 or 1, the fanout
will be based on the next higher bits (marked '.' and '1').

Fix this in the DESIGN documentation rather than changing the
algorithm as the latter would cause a save of an identical file
to completely rewrite the tree objects that make up the file,
due to different fanout behaviour.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
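
The behaviour described above can be sketched in a few lines. This
follows the commit message only and is not a copy of the real
implementation. split_extra_bits is a hypothetical name, and the
return value (a count of fanout bits) is chosen for illustration.

```python
BUP_BLOBBITS = 13


def split_extra_bits(csum):
    """Return None if csum is not a split point, else the number of
    consecutive 1 bits found above the ignored bit."""
    mask = (1 << BUP_BLOBBITS) - 1
    if (csum & mask) != mask:
        # The lowest 13 bits are not all ones: no split here.
        return None
    # Skip the lowest 13 bits plus the one ignored bit ('x').
    rest = csum >> (BUP_BLOBBITS + 1)
    extra = 0
    while rest & 1:
        extra += 1
        rest >>= 1
    return extra


# Bit 13 (the 'x' bit) does not change the result.
print(split_extra_bits(0x1FFF), split_extra_bits(0x3FFF))
```
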
3 months agohelp: fix src tree Documentation location
Rob Browning [Sat, 13 Jun 2020 19:31:40 +0000 (14:31 -0500)]
help: fix src tree Documentation location

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agoindex: fix bytes vs. str (py3) for --long output
Johannes Berg [Sat, 13 Jun 2020 19:21:44 +0000 (21:21 +0200)]
index: fix bytes vs. str (py3) for --long output

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
3 months agometadata: fix test failure with xattrs
Johannes Berg [Tue, 26 May 2020 22:29:23 +0000 (00:29 +0200)]
metadata: fix test failure with xattrs

The test_apply_to_path_restricted_access() is broken because
the expected string doesn't take into account that it's now
a u'' string (on python2) due to path_msg(), but this doesn't
appear on any non-selinux systems because the original file
never has any xattr, so the test passes, just not on my Fedora
system.

Change the expected message and remove the quote entirely as
it's different between python 2 and 3, and also try to set an
xattr in the test - if that fails, just continue as it used
to be without reading/setting it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
3 months agometadata: port ACL support to C
Johannes Berg [Sat, 30 May 2020 21:55:46 +0000 (23:55 +0200)]
metadata: port ACL support to C

Use our own C code instead of the posix1e python bindings, as those
don't have correct 'bytes' support (at least right now), which means
that we cannot use them with arbitrary file, user and group names.
Our own wrappers just use 'bytes' throughout.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
[rlb@defaultvalue.org: adjust to rely on pkg-config]
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
3 months agoSplit src tree python use to config/bin/python and dev/bup-python
Rob Browning [Sat, 30 May 2020 02:50:25 +0000 (21:50 -0500)]
Split src tree python use to config/bin/python and dev/bup-python

Replace cmd/bup-python with config/bin/python, which is just a symlink
to the configured python, and dev/bup-python, which is what "bup
python" used to be.  Adjust all the code to use config/bin/python when
we can (i.e. when we don't need bup modules), and dev/bup-python
otherwise.  Drop "bup python", since we don't need it anymore.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agoAdd bin/bup symlink so bin/ can be added to PATH
Rob Browning [Fri, 29 May 2020 16:14:31 +0000 (11:14 -0500)]
Add bin/bup symlink so bin/ can be added to PATH

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
3 months agoRemove Python 3 guardrail: BUP_ALLOW_UNEXPECTED_PYTHON_VERSION
Rob Browning [Sun, 7 Jun 2020 16:18:11 +0000 (11:18 -0500)]
Remove Python 3 guardrail: BUP_ALLOW_UNEXPECTED_PYTHON_VERSION

Now that the last subcommand (web) has been ported to Python 3, we
have at least some randomized binary test coverage, and we think we've
addressed all the Python 3 issues we know of, remove the
BUP_ALLOW_UNEXPECTED_PYTHON_VERSION guardrail.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
3 months agoStop forcing LC_CTYPE=ISO-8859-1
Rob Browning [Mon, 25 May 2020 19:46:52 +0000 (14:46 -0500)]
Stop forcing LC_CTYPE=ISO-8859-1

Now that we've made adjustments to work around all the Python 3
problems with non-Unicode data (argv, env vars, readline, acls, users,
groups, hostname, etc.), and added randomized binary path and argv
testing, stop overriding the LC_CTYPE since that should no longer be
necessary.

Thanks to Johannes Berg for nudging me to consider whether we might
now be in a position to do this (with a bit more work), and for quite
a bit of help getting all the precursors in place once we thought it
was feasible.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
3 months agosampledata: include random binary paths
Rob Browning [Thu, 28 May 2020 06:56:13 +0000 (01:56 -0500)]
sampledata: include random binary paths

Thanks to Johannes Berg for the xargs -0 addition fix.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
3 months agoBypass Python 3 glibc argv problems by routing args through env
Rob Browning [Sun, 7 Jun 2020 14:05:26 +0000 (09:05 -0500)]
Bypass Python 3 glibc argv problems by routing args through env

Until/unless https://sourceware.org/bugzilla/show_bug.cgi?id=26034 is
resolved by Python or GNU libc, sidestep the problem, which can crash
Python 3 during initialization, with a trivial sh wrapper that diverts
the command line arguments into BUP_ARGV_{0,1,2,...} environment
variables, since those can be safely retrieved.

Add compat.argvb and compat.argv and populate them at startup with the
BUP_ARGV_* values.  Adjust all the relevant commands to rely on those
vars instead of sys.argv.

Although the preamble says "rewritten during install", that's not in
place yet, but will be soon (when we drop LC_CTYPE and rework
bup-python).

Thanks to Johannes Berg for suggesting this, and help figuring out the
details.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
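
A minimal sketch of the retrieval side, assuming the wrapper has
already exported the arguments. Only the BUP_ARGV_ prefix comes from
the message above; argv_from_env and the "stop at the first missing
index" rule are assumptions, not bup's compat code.

```python
def argv_from_env(environ):
    """Collect BUP_ARGV_0, BUP_ARGV_1, ... from a bytes mapping.

    Stops at the first index that is missing from the mapping.
    """
    args = []
    while True:
        value = environ.get(b"BUP_ARGV_%d" % len(args))
        if value is None:
            return args
        args.append(value)


# Example with a plain dict; arbitrary bytes are preserved as-is.
env = {b"BUP_ARGV_0": b"bup", b"BUP_ARGV_1": b"save", b"BUP_ARGV_2": b"\xff"}
print(argv_from_env(env))
```

On POSIX systems the real environment is available as bytes via
os.environb, which could be passed in place of the dict.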
4 months agoimport-rdiff-backup: fix incorrectly named TMPIDX var
Rob Browning [Sun, 7 Jun 2020 14:02:47 +0000 (09:02 -0500)]
import-rdiff-backup: fix incorrectly named TMPIDX var

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
4 months agohashsplit: avoid cat_bytes() if possible
Johannes Berg [Tue, 28 Apr 2020 21:35:33 +0000 (23:35 +0200)]
hashsplit: avoid cat_bytes() if possible

If our current buffer is empty, there's no need to cat_bytes()
it with the new buffer, we can just replace the empty one with
the new one. This saves the memcpy() in many cases. Especially
if the whole file was read in one chunk (by bumping up the read
size) this saves a lot of time.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
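
The idea can be sketched as follows. append_chunk is a hypothetical
helper, and plain bytes concatenation stands in for cat_bytes().

```python
def append_chunk(buf, chunk):
    """Return the buffered data after adding chunk, copying only
    when there is already something in the buffer."""
    if not buf:
        # Empty buffer: adopt the new chunk directly, no copy.
        return chunk
    # Otherwise the two pieces have to be joined (this copies).
    return buf + chunk


chunk = b"whole file read in one go"
print(append_chunk(b"", chunk) is chunk)
```
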
4 months agoweb: fix for python3
Johannes Berg [Sun, 17 May 2020 19:38:14 +0000 (21:38 +0200)]
web: fix for python3

Do the minimal adjustments to make 'bup web' work with python3.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
4 months agotests: web: also add some invalid UTF-8
Johannes Berg [Sun, 17 May 2020 19:43:45 +0000 (21:43 +0200)]
tests: web: also add some invalid UTF-8

The '¡excitement!' really tests only valid UTF-8 since it
comes from the original bash file, add another test that
explicitly creates a byte sequence that is invalid utf-8
and ensures that this is preserved properly as well.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
4 months agotests: web: exit early to avoid indentation
Johannes Berg [Sun, 17 May 2020 19:42:53 +0000 (21:42 +0200)]
tests: web: exit early to avoid indentation

There isn't really much point in assigning a variable if
all we really want is to skip the whole test, just exit
early in those cases.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
4 months agoMigrate ftp, etc. to our _helpers bytes-oriented readline
Rob Browning [Wed, 3 Jun 2020 06:34:26 +0000 (01:34 -0500)]
Migrate ftp, etc. to our _helpers bytes-oriented readline

This allows us to preserve the status quo for now, even without the
LC_CTYPE override, i.e. binary stdin/stdout, etc.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
4 months agoWrap readline ourselves to avoid py3's interference
Rob Browning [Sun, 7 Jun 2020 20:32:10 +0000 (15:32 -0500)]
Wrap readline ourselves to avoid py3's interference

We don't want Python to "help" us by guessing and insisting on the
encoding before we can even look at the incoming data, so wrap
readline ourselves, with a bytes-oriented (and more direct) API.  This
will allow us to preserve the status quo for now (and maintain parity
between Python 2 and 3) when using Python 3 as we remove our LC_CTYPE
override.

At least on Linux, readline --cflags currently defines _DEFAULT_SOURCE
and defines _XOPEN_SOURCE to 600, and the latter conflicts with a
setting of 700 via Python.h in some installations, so for now, just
defer to Python as long as it doesn't choose an older version.

Thanks to Johannes Berg for fixes for allocation issues, etc. in an
earlier version, and help figuring out the #define arrangement.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Rob Browning <rlb@defaultvalue.org>
4 months agobup: don't print subcommands as b'cmd' in help with py3
Rob Browning [Sat, 6 Jun 2020 21:50:55 +0000 (16:50 -0500)]
bup: don't print subcommands as b'cmd' in help with py3

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
4 months agodistutils: handle CFLAGS and LDFLAGS directly
Rob Browning [Sun, 31 May 2020 19:10:11 +0000 (14:10 -0500)]
distutils: handle CFLAGS and LDFLAGS directly

Otherwise it places LDFLAGS in the middle of the link arguments,
before lib/bup/*.o which means we can't add lib
dependencies (e.g. -lreadline).  Pass the libs directly and specify them
via the appropriate extra_* arguments.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
4 months agobup: add own gethostname() wrapper
Johannes Berg [Sat, 30 May 2020 19:10:02 +0000 (21:10 +0200)]
bup: add own gethostname() wrapper

This is necessary because python3 insists that hostnames should
be utf-8, which is a rather questionable assumption.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
[rlb@defaultvalue.org: don't define HOST_NAME_MAX if it's not already]
Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
4 months agopwdgrp: add C helpers to get user/group bytes directly
Rob Browning [Thu, 28 May 2020 05:44:16 +0000 (00:44 -0500)]
pwdgrp: add C helpers to get user/group bytes directly

Thanks to Johannes Berg for fixing some bugs in an earlier revision.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Rob Browning <rlb@defaultvalue.org>
4 months agometadata: don't modify ACL list when writing
Johannes Berg [Sat, 30 May 2020 19:41:10 +0000 (21:41 +0200)]
metadata: don't modify ACL list when writing

ACLs should be stored as a two-entry list on files, and four-entry
list for directories. Unfortunately, when writing, we expand the
two-entry list for files to four, because the metadata format is
always with four entries.

However, on reading, we trim the last two empty entries, so that
we can end up in an inconsistent situation: On a metadata entry
for a file that has been written already, it will still have four
entries, and that won't compare correctly etc.

This isn't an issue today because we only ever do the compare
in restore, where we didn't load from disk but from the metadata
in the repository, which always starts out with four entries.

Still, fix the inconsistency and don't erroneously extend the
list when writing.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
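
The fix above can be pictured with a minimal sketch. The function name and
the use of empty byte strings as padding are assumptions made for
illustration, not bup's actual metadata code. The point is only that the
four-entry form is built as a new list, so the caller's two-entry list is
left untouched:

```python
def acl_fields_for_writing(acl):
    """Return a four-entry list for serialization without touching `acl`.

    Hypothetical helper: files carry two entries, directories four, and
    the on-disk format is assumed to always hold four.
    """
    if len(acl) == 2:
        # List concatenation creates a new list; `acl` itself is unchanged.
        return acl + [b'', b'']
    return acl


file_acl = [b'user::rw-', b'group::r--']
written = acl_fields_for_writing(file_acl)
print(len(file_acl), len(written))
```

Extending the list in place (for example with `acl.extend(...)`) would
instead leave the in-memory entry with four fields after the first write,
which is the inconsistency the commit message describes.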
4 months agoMove cmd to lib/ and reverse symlink
Rob Browning [Mon, 25 May 2020 19:55:46 +0000 (14:55 -0500)]
Move cmd to lib/ and reverse symlink

This prepares for removal of the bup-python wrapper.  Given this
change we'll be able to have the same sys.path in the source tree and
install tree, and so won't have to go back to mangling that during
installs.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
4 months agoREADME: add 0.30.x to CI status table
Rob Browning [Fri, 19 Jun 2020 19:33:13 +0000 (14:33 -0500)]
README: add 0.30.x to CI status table

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
4 months agoUpdate HACKING and README for 0.30.1
Rob Browning [Sat, 23 May 2020 21:37:25 +0000 (16:37 -0500)]
Update HACKING and README for 0.30.1

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
4 months agoAdd release notes for 0.30.1
Rob Browning [Sat, 23 May 2020 21:28:43 +0000 (16:28 -0500)]
Add release notes for 0.30.1

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
5 months agoindex: fix -H option
Johannes Berg [Wed, 13 May 2020 20:22:04 +0000 (22:22 +0200)]
index: fix -H option

hexlify(ent) doesn't work, that needs to be ent.sha. Fix it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
5 months agogit/midx: provide context managers for idx classes
Luca Carlon [Thu, 21 May 2020 20:28:41 +0000 (22:28 +0200)]
git/midx: provide context managers for idx classes

Opening files and then mmap()ing them keeps the files open at the
filesystem level, and then they cannot be fully removed until the
fd is closed when giving up the mapping.

On most filesystems, the file still exists but is no longer visible
in this case. However, at least on CIFS this results in the file
still being visible in the folder, but it can no longer be opened
again, or such. This leads to a crash in 'bup gc' because it wants
to re-evaluate the idx it just tried to delete.

Teach the PackIdx classes the context manager protocol so we can
easily unmap once they're no longer needed, and use that in bup gc
(for now only there).

For consistency, already add the context manager protocol also to
the midx, even if it's not strictly needed yet since bup gc won't
actually do this to an midx.

Signed-off-by: Luca Carlon <carlon.luca@gmail.com>
[add commit message based on error description, add midx part,
remove shatable to avoid live pointers into unmapped region]
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
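
As a rough illustration of the context manager protocol described above,
the following sketch uses a hypothetical class (MappedIdxSketch) rather
than bup's real PackIdx classes. It only shows how leaving the with block
drops the mapping before the file is removed:

```python
import mmap
import os
import tempfile


class MappedIdxSketch:
    """Hypothetical stand-in for an idx reader backed by mmap()."""

    def __init__(self, path):
        with open(path, 'rb') as f:
            # The mapping stays valid after f is closed.
            self.map = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    def close(self):
        if self.map is not None:
            self.map.close()
            self.map = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()


def read_header_then_remove(path):
    with MappedIdxSketch(path) as idx:
        header = bytes(idx.map[:4])
    # The mapping has been released here, so nothing holds the file open.
    os.unlink(path)
    return header


def demo():
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, 'wb') as f:
        f.write(b'demo' + b'\0' * 12)  # arbitrary placeholder contents
    header = read_header_then_remove(path)
    return header, os.path.exists(path)
```

Note that the header is copied out of the mapping before the block exits;
keeping references into the mapped region after close() is exactly what the
bracketed note about removing shatable is meant to avoid.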
5 months agosave: close files immediately
Johannes Berg [Sat, 16 May 2020 07:03:31 +0000 (09:03 +0200)]
save: close files immediately

Use a with statement to close all files immediately after
hashsplitting. There's also no need to have two except
clauses, so unify them to simplify this change.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
5 months agohelpers: use float for format_filesize()
Johannes Berg [Sun, 17 May 2020 21:45:21 +0000 (23:45 +0200)]
helpers: use float for format_filesize()

In format_filesize(), we really do want float division,
in order to display the value correctly. For example, if
there's a file with 45200000 bytes, that should be shown
as 43.1 MB, not 43.0. Fix this by using proper float
division here, not int division.

Fixes: a5809723352c ("helpers: use // not / for division")
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
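
The difference between the two kinds of division can be checked with a
small sketch. It assumes Python 3 semantics for / and //, and the helper
below is a simplified stand-in, not bup's format_filesize():

```python
def format_mib(size_bytes, float_division=True):
    """Render size_bytes in units of 1024*1024 bytes with one decimal."""
    mib = 1024 * 1024
    if float_division:
        value = size_bytes / mib    # true division keeps the fraction
    else:
        value = size_bytes // mib   # floor division drops it
    return "%.1f MB" % value


print(format_mib(45200000))                        # 43.1 MB
print(format_mib(45200000, float_division=False))  # 43.0 MB
```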
5 months agomincore: fix reading information
Johannes Berg [Fri, 1 May 2020 20:16:39 +0000 (22:16 +0200)]
mincore: fix reading information

The _fmincore_chunk_size is typically set to 64MiB, which
makes sense to avoid doing very large mmap() operations
(to save already precious VM on 32-bit systems).

However, since that's in bytes, we cannot divide a size in
pages by it, and expect any useful outcome.

Calculate the number of chunks (chunk_count) properly based
on the size of the file, rather than its number of pages.
Otherwise, chunk_count typically ends up just 1 even for a
very large file (my test file was ~500MiB), and mincore()
is run just once, so we fill the presence information only
for the first 64MiB of the file, even if it was previously
completely in RAM.

Given a large enough test file (and enough RAM to keep it
there), the following should print about the same times
twice:

  cat test > /dev/null ; \
  time cat test > /dev/null ; \
  bup split --noop test ; \
  time cat test >/dev/null

Without the fix, it's evident that the file is evicted from
RAM almost entirely (apart from the first 64MiB) even in
this synthetic case.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
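
The unit mix-up can be reproduced with a little arithmetic. This is only a
sketch: the 4096-byte page size and the rounding-up helper are assumptions,
and the function names are hypothetical rather than taken from bup:

```python
CHUNK_SIZE = 64 * 1024 * 1024   # chunk size in bytes (64MiB, as above)
PAGE_SIZE = 4096                # assumed page size in bytes


def ceil_div(a, b):
    """Integer division rounding up."""
    return (a + b - 1) // b


def chunk_count_from_pages(file_size):
    # Mixes units: a page count divided by a byte count.
    pages = ceil_div(file_size, PAGE_SIZE)
    return ceil_div(pages, CHUNK_SIZE)


def chunk_count_from_bytes(file_size):
    # Consistent units: bytes divided by bytes.
    return ceil_div(file_size, CHUNK_SIZE)


size = 500 * 1024 * 1024  # roughly the ~500MiB test file mentioned above
print(chunk_count_from_pages(size), chunk_count_from_bytes(size))
```

With these assumptions the first function yields 1 chunk and the second 8,
which matches the symptom of only the first 64MiB being examined.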
5 months agoFix bup-web error formatting when port unsupplied
Wyatt Alt [Sun, 19 Apr 2020 21:46:52 +0000 (14:46 -0700)]
Fix bup-web error formatting when port unsupplied

Previously supplying the bup dir to bup web by mistake would result in a
python syntax error.

Signed-off-by: Wyatt Alt <wyatt.alt@gmail.com>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
5 months agogit: create_commit_blob: allow timezones to be specified as 0
Johannes Berg [Sun, 19 Jan 2020 20:18:35 +0000 (21:18 +0100)]
git: create_commit_blob: allow timezones to be specified as 0

Checking "if adate_tz" means that if it's 0 (UTC) then we'll
actually use localtime, which is wrong. Do this only when it's
specified as None.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
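
The pitfall is the usual one of testing truthiness when "not specified" is
meant. A minimal sketch, with hypothetical function names and the offsets
expressed in seconds purely for illustration:

```python
def tz_offset_buggy(tz, local_offset):
    # 0 is falsy, so an explicit UTC offset falls through to local time.
    if tz:
        return tz
    return local_offset


def tz_offset_fixed(tz, local_offset):
    # Only a missing value (None) selects the local offset.
    if tz is None:
        return local_offset
    return tz


local = -18000  # example local offset
print(tz_offset_buggy(0, local), tz_offset_fixed(0, local))
```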
5 months agoFall back to calloc when __builtin_mul_overflow isn't available
Rob Browning [Sun, 19 Apr 2020 20:47:25 +0000 (15:47 -0500)]
Fall back to calloc when __builtin_mul_overflow isn't available

Check for __builtin_mul_overflow at configure time, and fall back to
calloc when it's not found.

Thanks to Greg Troxel for reporting the problem.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
6 months agocirrus: move to FreeBSD 12-1
Johannes Berg [Sat, 18 Apr 2020 21:59:25 +0000 (23:59 +0200)]
cirrus: move to FreeBSD 12-1

There appears to be trouble with the image we were using.  After
testing a few others, 12-1 appears to be OK.

cf. https://github.com/cirruslabs/cirrus-ci-docs/issues/625

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
[rlb@defaultvalue.org: elaborate on rationale in commit message]

6 months agovfs: use None for unknown uid/gid
Johannes Berg [Sat, 8 Feb 2020 21:54:56 +0000 (22:54 +0100)]
vfs: use None for unknown uid/gid

This means we show '?' instead of 0 for unknown UIDs when
numeric output is requested, as it was before.

This also uncovered a forgotten bytes annotation for the
"unknown" string ('?' should be b'?').

Somehow, this new behaviour (of printing 0 instead of ?)
also got quite enshrined in the test suite, fix that too.

And finally, on python 2, fuse doesn't accept None in the
stat struct (but does on python 3, go figure).

Fixes: f76c37383ddb ("Remove vfs (replaced by vfs2)")
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
6 months agogit: add a test for not keeping midx files open
Johannes Berg [Sat, 18 Jan 2020 21:36:36 +0000 (22:36 +0100)]
git: add a test for not keeping midx files open

This test creates a few dummy idx files, generates an midx,
queries a PackIdxList for a non-existent object, unlinks the
midx and checks that we still have it open, but that we close
it at PackIdxList::refresh now.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
6 months agogit: split out idx file writing to a separate class
Johannes Berg [Fri, 3 Jan 2020 12:53:20 +0000 (13:53 +0100)]
git: split out idx file writing to a separate class

Split the idx file writing into a separate class, to make that
kind of action available separately. This will be useful for the
next patch where we use it to test some idx/midx code.

In the future, it'll also be useful for encrypted repositories
since the idx format there will be useful for local caching to
take advantage of midx and bloom code as is, but the packwriter
will of course not be useful.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
6 months agogit: stop using .encode('hex') in MissingObject()
Johannes Berg [Sat, 18 Apr 2020 21:29:22 +0000 (23:29 +0200)]
git: stop using .encode('hex') in MissingObject()

This needs to use hexlify() instead for python3 compatibility.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
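
For reference, a short sketch of the replacement, using only the standard
library. The helper name and the object id value are made up for the
example; under Python 3 bytes objects have no encode() method at all,
which is why the 'hex' codec idiom stops working:

```python
from binascii import hexlify


def hex_oid(oid):
    """Return the hexadecimal form of a binary object id, as bytes."""
    return hexlify(oid)


oid = b'\x00\x01\xfe\xff'      # shortened, made-up id for illustration
print(hex_oid(oid))            # b'0001feff'
print(hasattr(oid, 'encode'))  # False on Python 3
```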
6 months agofsck: stop using .encode('hex')
Johannes Berg [Sat, 18 Apr 2020 21:28:59 +0000 (23:28 +0200)]
fsck: stop using .encode('hex')

An error path is using this still, use hexlify() instead.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
6 months agotest-save-errors: fix shebang for freebsd
Johannes Berg [Mon, 16 Mar 2020 21:09:02 +0000 (22:09 +0100)]
test-save-errors: fix shebang for freebsd

It appears that freebsd doesn't support recursive interpreters,
so you cannot use a shell script directly as one. Instead, to
fix it, invoke it via /usr/bin/env.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
6 months agossh: simplify the code
Johannes Berg [Mon, 13 Jan 2020 19:53:26 +0000 (20:53 +0100)]
ssh: simplify the code

There's no point in shipping PATH to the remote server, since
it will be different there. We can also simplify the loopback
check, and we don't really need to munge the PATH there either
if we just use path.exe() in place of a plain 'bup' for it.

While at it, also fix the formatting instruction for the ints
to %d, instead of %s.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
7 months agogit: overhaul git version checking
Rob Browning [Tue, 11 Feb 2020 02:47:13 +0000 (20:47 -0600)]
git: overhaul git version checking

Rework the git version testing to handle versions like "1.5.2-rc3"
or (apparently) "1.5.2-rc3 (something ...)".  Add tests for parsing of
all the version types in the current git tag history that we need to
support.

Support and document BUP_ASSUME_GIT_VERSION_IS_FINE=1 as an escape
hatch in case the parsing isn't sufficiently comprehensive, or
upstream changes their practices in future releases.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
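
To make the version shapes mentioned above concrete, here is a sketch of a
parser that accepts them. The regular expression and the function name are
assumptions for illustration and are not bup's actual implementation:

```python
import re

_version_rx = re.compile(
    r'^(\d+)\.(\d+)\.(\d+)'   # major.minor.patch
    r'(?:-rc(\d+))?'          # optional release candidate suffix
    r'(?:\s+\(.*\))?\s*$')    # optional trailing parenthesized note


def parse_git_version(text):
    """Return (major, minor, patch, rc) or None if text is not recognized."""
    m = _version_rx.match(text)
    if not m:
        return None
    major, minor, patch, rc = m.groups()
    return (int(major), int(minor), int(patch),
            int(rc) if rc is not None else None)


for candidate in ('1.5.2-rc3', '1.5.2-rc3 (something ...)', '2.28.0'):
    print(candidate, '->', parse_git_version(candidate))
```

Returning None for unrecognized input leaves room for an escape hatch such
as the environment variable described above to decide what happens next.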
7 months agocmd/fsck-cmd.py: Append newline after quoted par2 output
Christian Cornelssen [Tue, 18 Feb 2020 15:17:33 +0000 (16:17 +0100)]
cmd/fsck-cmd.py: Append newline after quoted par2 output

With a recent `par2` installed, `bup fsck -v` was printing messages
like this:

  Unexpected par2 error output
  ''Assuming par2 supports parallel processing

when there was no error output and the exit status was zero.  A
previous commit 19f9faeb0055dadb7f76a953d51acec8373c6edb eliminated
the spurious reporting; now make sure we print a newline after the
par2 output whenever there actually is an error.

Signed-off-by: Christian Cornelssen <ccorn@1tein.de>
[rlb@defaultvalue.org: add information from the pr to commit message]
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
7 months agocmd/fsck-cmd.py: Do not warn about empty par2 output
Christian Cornelssen [Tue, 18 Feb 2020 14:43:51 +0000 (15:43 +0100)]
cmd/fsck-cmd.py: Do not warn about empty par2 output

With a recent `par2` installed, `bup fsck -v` outputs messages like

  Unexpected par2 error output
  ''Assuming par2 supports parallel processing

when there was no error output and the exit status was zero.  Stop
printing the warning in that case.

Signed-off-by: Christian Cornelssen <ccorn@1tein.de>
[rlb@defaultvalue.org: add information from the pr to commit message]
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
7 months agoindex: make --fake-valid match the man page
Johannes Berg [Mon, 17 Feb 2020 20:02:11 +0000 (21:02 +0100)]
index: make --fake-valid match the man page

The index command currently clobbers the hash of a file when
marking it as valid, but the man page states:

    --fake-valid
        mark specified paths as up-to-date even if they aren't.
        This can be useful for testing, or to avoid unnecessarily
        backing up files that you know are boring.

The latter part ("avoid unnecessarily backing up [...]") cannot be
implemented with --fake-valid as is, because of the clobbering of
the hash: the fake invented hash will not exist in the repository,
and thus save checks and saves the file.

Fix this by clobbering the hash only if it's the invalid EMPTY_SHA.

Add a test for this to test-save-smaller, just because that's where
we discovered it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
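
The rule described above fits in a few lines. In this sketch the constant
values and the function name are assumptions chosen for illustration; only
the idea of replacing the hash when it is the invalid EMPTY_SHA comes from
the commit message:

```python
EMPTY_SHA = b'\0' * 20     # assumed marker for "no valid hash"
FAKE_SHA = b'\x01' * 20    # assumed placeholder for a made-up hash


def sha_after_fake_valid(current_sha):
    """Pick the hash to store when a path is marked valid artificially."""
    if current_sha == EMPTY_SHA:
        # Nothing usable is recorded, so a placeholder has to be invented.
        return FAKE_SHA
    # A real hash may already exist in the repository; keep it.
    return current_sha


known = b'\xab' * 20
print(sha_after_fake_valid(known) == known)
print(sha_after_fake_valid(EMPTY_SHA) == FAKE_SHA)
```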
7 months agosave: add test for --smaller, fix DESIGN document
Johannes Berg [Mon, 17 Feb 2020 19:03:22 +0000 (20:03 +0100)]
save: add test for --smaller, fix DESIGN document

Add a test for --smaller, in particular showing that the actual --smaller
behaviour doesn't match what's described in the DESIGN file, which says:

    Another interesting trick is that you can skip backing up files even if
    IX_HASHVALID *isn't* set, as long as you have that file's sha1 in the
    repository.  What that means is you've chosen not to backup the latest
    version of that file; instead, your new backup set just contains the
    most-recently-known valid version of that file.  This is a good trick if you
    want to do frequent backups of smallish files and infrequent backups of
    large ones (as in 'bup save --smaller').  Each of your backups will be
    "complete," in that they contain all the small files and the large ones, but
    intermediate ones will just contain out-of-date copies of the large files.

This ("Each of your backups will be 'complete,' [...]") would seem to indicate
all files should be present, but in fact neither new nor old files are actually
saved by 'bup save --smaller'.

To avoid confusion, also update the DESIGN documentation here.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
7 months agodocumentation: quote literal * in man pages
Johannes Berg [Sun, 16 Feb 2020 21:13:03 +0000 (22:13 +0100)]
documentation: quote literal * in man pages

For "1024*1024" etc. we need to quote the * so that it
doesn't get used as bold markup, fix that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
7 months agobuild: fix C-side dependencies
Johannes Berg [Wed, 5 Feb 2020 19:36:46 +0000 (20:36 +0100)]
build: fix C-side dependencies

We need to depend on bupsplit.h, otherwise changes there
don't force a rebuild.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
7 months agoget: convert opt.source to bytes
Johannes Berg [Tue, 4 Feb 2020 20:27:23 +0000 (21:27 +0100)]
get: convert opt.source to bytes

I noticed this while playing with something else that
didn't just pass the repo_dir to git, but instead used
it with some os.path.join() calls that complain about
mixed unicode/bytes.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
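
The complaint from os.path.join() is easy to reproduce on Python 3. The
sketch below uses os.fsencode() for the conversion, which is an assumption
made for the example (bup has its own helpers for this), and the path
values are made up:

```python
import os


def join_mixed(repo_dir, name):
    # Raises TypeError on Python 3 when one argument is bytes and the
    # other is str.
    return os.path.join(repo_dir, name)


def join_as_bytes(repo_dir, name):
    # Convert any str argument to bytes first so the types agree.
    if isinstance(repo_dir, str):
        repo_dir = os.fsencode(repo_dir)
    if isinstance(name, str):
        name = os.fsencode(name)
    return os.path.join(repo_dir, name)


print(join_as_bytes(b'/tmp/repo', 'objects'))
```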
7 months agocompat: directly assign bytes_from_uint = chr
Johannes Berg [Fri, 31 Jan 2020 21:00:56 +0000 (22:00 +0100)]
compat: directly assign bytes_from_uint = chr

This is significantly faster than the indirection and
seems to reduce the runtime of the test suite by about
3% on my machine with python 2 (76.121s -> 73.725s,
but I tried only once which is clearly not enough.)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
7 months agovfs: remove dead cache_get_revlist_item()
Johannes Berg [Wed, 29 Jan 2020 20:32:39 +0000 (21:32 +0100)]
vfs: remove dead cache_get_revlist_item()

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
7 months agoBaseConn: let _read/_readline raise NotImplementedError
Johannes Berg [Fri, 13 Dec 2019 20:55:53 +0000 (21:55 +0100)]
BaseConn: let _read/_readline raise NotImplementedError

This way, it's easier to understand the code, since these
functions aren't referenced without existing in BaseConn.

Also change has_input() to raise NotImplementedError instead
of trying to instantiate NotImplemented - the latter is just
a singleton to return from the rich comparison methods.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
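
The distinction between the two names is shown in the sketch below. The
class names are hypothetical stand-ins, not the real BaseConn hierarchy:

```python
class BaseConnSketch:
    def has_input(self):
        # NotImplementedError is an exception class, so it can be raised.
        raise NotImplementedError("Subclasses must implement has_input")


class AlwaysReadyConn(BaseConnSketch):
    def has_input(self):
        return True


def calling_notimplemented_fails():
    # NotImplemented is a singleton value, not a class; calling it is a
    # TypeError rather than a way to build an exception.
    try:
        NotImplemented("Subclasses must implement has_input")
    except TypeError:
        return True
    return False


print(AlwaysReadyConn().has_input(), calling_notimplemented_fails())
```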
7 months agogit/client/server: remove rev_list() count support
Johannes Berg [Tue, 28 Jan 2020 23:21:01 +0000 (00:21 +0100)]
git/client/server: remove rev_list() count support

This is obviously not used, as passing count!=None would
crash the client method (client.py doesn't import Integral).
Rather than fixing that, just remove support for it entirely.

While at it, also clean up a duplicate rev_list_invocation()
call in the server.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
7 months agocatpipe: remove useless StopIteration catching
Johannes Berg [Tue, 28 Jan 2020 23:16:54 +0000 (00:16 +0100)]
catpipe: remove useless StopIteration catching

StopIteration cannot escape this loop construct, remove it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
8 months agosave: remove pointless metalist check
Johannes Berg [Tue, 28 Jan 2020 19:41:10 +0000 (20:41 +0100)]
save: remove pointless metalist check

The metalist can never be empty, since at every level we add
at least the directory's own metadata; it may not have actual
metadata, but there's an entry all the time.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>
8 months agosave/vfs: update comments wrt. tree/bupm ordering
Johannes Berg [Sat, 25 Jan 2020 23:17:38 +0000 (00:17 +0100)]
save/vfs: update comments wrt. tree/bupm ordering

After looking into this and thinking about it, the comments here
are a bit misleading - save states the entries must be in a given
order without a rationale, and vfs states that the order is wrong
but gives an explanation that's not quite right.

Update both comments to make this clearer, and to document that
there's no inherent reason for the order: the save code simply picked
one when it was written, which turned out not to be the best.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
8 months agotests: add test for save encountering duplicates
Johannes Berg [Sat, 25 Jan 2020 22:13:17 +0000 (23:13 +0100)]
tests: add test for save encountering duplicates

Add a test for save encountering duplicates in the index, both
for a file and a directory, which was fixed in the previous patch.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
Tested-by: Rob Browning <rlb@defaultvalue.org>