arthur.barton.de Git - bup.git/log

Introduce BUP_DEBUG, --debug, and tone down the log messages a lot.

There's a new global bup option, --debug (-D), that increments BUP_DEBUG. If
BUP_DEBUG >= 1, debug1() prints; if BUP_DEBUG >= 2, debug2() prints too.

We change a bunch of formerly-always-printing log() messages to debug1 or
debug2, so now a typical bup session should be a lot less noisy.

This affects midx in particular, which was *way* too noisy now that 'bup
save' and 'bup server' were running it automatically every now and then.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
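
The debug levels described above can be sketched roughly as follows. This is
an illustration in Python 3, not the code from this commit: the helper names
debug_level() and increment_debug(), and the environ/out parameters, are
assumptions added so the example can run on its own.

```python
import os
import sys


def debug_level(environ=os.environ):
    """Return the numeric debug level from BUP_DEBUG (0 if unset or invalid)."""
    try:
        return int(environ.get("BUP_DEBUG", "0"))
    except ValueError:
        return 0


def increment_debug(environ=os.environ):
    """What a -D/--debug flag might do: bump BUP_DEBUG by one.

    Storing the level in the environment lets subprocesses inherit it.
    """
    environ["BUP_DEBUG"] = str(debug_level(environ) + 1)


def log(msg, out=sys.stderr):
    out.write(msg)


def debug1(msg, environ=os.environ, out=sys.stderr):
    """Print only when BUP_DEBUG >= 1."""
    if debug_level(environ) >= 1:
        log(msg, out)


def debug2(msg, environ=os.environ, out=sys.stderr):
    """Print only when BUP_DEBUG >= 2."""
    if debug_level(environ) >= 2:
        log(msg, out)
```

Passing -D twice would then enable both debug1() and debug2() output.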

client.py,git.py: run 'bup midx -a' automatically sometimes.

Now that 'bup midx -a' is smarter, we should run it automatically after
creating a new index file. This should remove the need for running it by
hand.

Thus, we also remove 'bup midx' from the lists of commonly-used subcommands.
(While we're here, let's take out 'split' and 'join' too; you should be
using 'index' and 'save' most of the time.)

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Rename 'bup rbackup' to 'bup on'

'rbackup' was a dumb name but I couldn't think of anything better at the
time. This works nicely in a grammatical sort of way:

    bup on myserver save -n myserver-backup /etc

Now that we've settled on a name, also add some documentation for the
command.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

rot13 the t/testfile* sample data files.

They were generated by catting bunches of bup source code together, which,
as it turns out, makes 'git grep' super annoying. Let's rot13 them so
grepping doesn't do anything interesting but the other characteristics are
the same.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/midx: --auto mode can combine existing midx files now.

Previously, --auto would *only* create a midx from not-already-midxed .idx
files.  This wasn't optimal since you'd eventually end up with a tonne of
.midx files, which is just as bad as a tonne of .idx files.

Now we'll try to maintain a maximum number of midx files using a
highwater/lowwater mark.  That means the number of active midx files should
now stay between 2 and 5, and you can run 'bup midx -a' as often as you
want.

'bup midx -f' will still make sure everything is in a single .midx file,
which is an efficient thing to run every now and then.

'bup midx -af' is the same, but uses existing midx files rather than forcing
bup to start from only .idx files.  Theoretically this should always be
faster than, and never be worse than, 'bup midx -f'.

Bonus: 'bup midx -a' now works when there's a limited number of file
descriptors.  The previous fix only worked properly with 'bup midx -f'.
(This was rarely a problem since 'bup midx -a' would only ever touch the
last few .idx files, so it didn't need many file descriptors.)

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
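
One way to picture the highwater/lowwater idea from the commit above, as a
small Python 3 sketch. The function name plan_merge() and the choice to merge
the smallest files first are assumptions for illustration; only the 2 and 5
limits come from the commit message.

```python
LOW_WATER = 2   # number of midx files we aim for after combining
HIGH_WATER = 5  # more midx files than this triggers a combine


def plan_merge(midx_sizes, low=LOW_WATER, high=HIGH_WATER):
    """Return the sizes of the midx files to combine into one, or [].

    Nothing happens until the count exceeds the high-water mark.  Merging
    k files into one reduces the count by k - 1, so reaching the low-water
    mark means merging (count - low + 1) files.  Picking the smallest ones
    (an assumption here) keeps the amount of rewritten data down.
    """
    if len(midx_sizes) <= high:
        return []
    k = len(midx_sizes) - low + 1
    return sorted(midx_sizes)[:k]
```

With six files, the five smallest are combined and two files remain, so the
count falls back to the low-water mark.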

Merge branch 'maint'

* maint:
  cmd/midx: use getrlimit() to find the max open files.

cmd/midx: use getrlimit() to find the max open files.

It turns out the default file limit on MacOS is 256, which is less than our
default of 500. I guess this means trouble after all, so let's auto-detect
it.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
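
A minimal sketch of auto-detecting the limit with Python's resource module
(Unix only). The function name max_open_files() and the reserve value are
assumptions; the fallback of 500 mirrors the old default mentioned above.

```python
import resource


def max_open_files(reserve=10, fallback=500):
    """Return how many files we can reasonably open at once.

    'reserve' leaves room for stdin/stdout/stderr and other descriptors
    the process already holds; its value here is only illustrative.
    """
    try:
        soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    except (ValueError, OSError):
        return fallback
    if soft == resource.RLIM_INFINITY:
        return fallback
    return max(1, soft - reserve)
```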

Merge branch 'maint'

* maint:
  index.py: handle uid/gid == -1 on cygwin
  cmd/memtest: use getrusage() instead of /proc/self/stat.
  cmd/index: catch exception for paths that don't exist.
  Don't use $(wildcard) during 'make install'.
  Don't forget to install _helpers.dll on cygwin.

cmd/margin: interpret the meaning of the margin bits.

Maybe you were wondering how good it is when 'bup margin' returns 40 or 45.
Well, now it'll tell you.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

index.py: handle uid/gid == -1 on cygwin

On cygwin, the uid or gid might be -1 for some reason. struct.pack()
complains about a DeprecationWarning when packing a negative number into an
unsigned int, so fix it up first.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
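
To illustrate the kind of fix-up described above, here is a small Python 3
sketch. Wrapping the negative value around to its unsigned 32-bit equivalent
is one possible fix and is an assumption here, as is the helper name
to_unsigned32(). Note that modern Python raises struct.error for a negative
value in an unsigned field rather than issuing a DeprecationWarning.

```python
import struct


def to_unsigned32(n):
    """Map a negative uid/gid (such as -1 on cygwin) into the unsigned range."""
    if n < 0:
        n += 0x100000000
    return n


# Pack a uid and gid as two big-endian unsigned 32-bit integers.
packed = struct.pack("!II", to_unsigned32(-1), to_unsigned32(1000))
```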

cmd/memtest: use getrusage() instead of /proc/self/stat.

Only Linux has /proc/self/stat, so 'bup memtest' didn't work on anything
except Linux. Unfortunately, getrusage() on *Linux* doesn't have a valid
RSS field (sigh), so we have to use /proc/self/stat as a fallback if it's
zero.

Now memtest works on MacOS as well, which means 'make test' passes again.
(It stopped passing because 'bup memtest' recently got added to one of the
tests.)

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/index: catch exception for paths that don't exist.

Rather than aborting completely if a path specified on the command line
doesn't exist, report it as a non-fatal error instead.

(Heavily modified by apenwarr from David Roda's original patch.)

Signed-off-by: David Roda <davidcroda@gmail.com>
Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Documentation/*.md: add some options that we forgot to document.

Software evolves, but documentation evolves... slower.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Rename Documentation/*.1.md to Documentation/*.md

All our man pages end up in section 1 of man anyway, and it looks like that
will probably never change. So let's make our filenames simpler and easier
to understand.

Even if we do end up adding a page in (say) section 5 someday, it's no big
deal; we can just add an exception to the Makefile for it or something.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Don't use $(wildcard) during 'make install'.

It seems the $(wildcard) is evaluated once at make's startup, so any changes
made *during* build don't get noticed.

That means 'make install' would fail if you ran it without first running
'make all', because $(wildcard cmd/bup-*) wouldn't match anything at startup
time; the files we were copying only got created during the build.

Problem reported by David Roda.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Don't forget to install _helpers.dll on cygwin.

We were installing *.so, but not *$(SOEXT) like we should have. Now we do,
which should fix some cygwin install problems reported by David Roda.

Also, when installing *.so and *.dll files, make them 0755 instead of 0644,
also to prevent permissions problems on cygwin, also reported by David Roda.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Merge branch 'maint'

* maint:
  git.py: recover more elegantly if a MIDX file has the wrong version.
  cmd/midx: add a new --max-files parameter.

Conflicts:
  lib/bup/git.py

Merge branch 'guesser'

* guesser:
  _helpers.extract_bits(): rewrite git.extract_bits() in C.
  _helpers.firstword(): a new function to extract the first 32 bits.
  git.py: when seeking inside a midx, use statistical guessing.

git.py: recover more elegantly if a MIDX file has the wrong version.

Previously we'd throw an assertion for any too-new-format MIDX file, which
isn't so good. Let's recover more politely (and just ignore the file in
question) if that happens.

Noticed by Zoran Zaric who was testing my midx3 branch.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/midx: add a new --max-files parameter.

Zoran reported that 'bup midx -f' on his system tried to open 3000 files at
a time and wouldn't work.  That's no good, so let's limit the maximum files
to open; the default is 500 for now, since that ought to be usable for
normal people.  Arguably we could use getrlimit() or something to find out
the actual maximum, or just keep opening stuff until we get an error, but
maybe there's no point.

Unfortunately this patch isn't really perfect, because it limits the
usefulness of midx files.  If you could merge midx files into other midx
files, then you could at least group them all together after multiple runs,
but that's not currently supported.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

_helpers.extract_bits(): rewrite git.extract_bits() in C.

That makes our memtest run just slightly faster: 2.8 seconds instead of 3.0
seconds, which catches us back up with the pre-interpolation-search code.
Thus we should now be able to release this patch without feeling embarrassed
:)

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

_helpers.firstword(): a new function to extract the first 32 bits.

This is a pretty common operation in git.py and it speeds up cmd/memtest
results considerably: from 3.7 seconds to 3.0 seconds.

That gets us *almost* as fast as we were before the whole statistical
guessing thing, but we still enjoy the improved memory usage.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

git.py: when seeking inside a midx, use statistical guessing.

Instead of using a pure binary search (where we seek to the middle of the
area and do a greater/lesser comparison) we now use an "interpolation
search" (http://en.wikipedia.org/wiki/Interpolation_search), which means we
seek to where we statistically *expect* the desired value to be.

In my test data, this reduces the number of typical search steps in my test
midx from 8.7 steps/object to 4.8 steps/object.

This reduces memory churn when using a midx, since sometimes a given search
region spans two pages, and this technique allows us to more quickly
eliminate one of the two pages sometimes, allowing us to dirty one fewer
page.

Unfortunately the implementation requires some futzing, so this actually
makes memtest run about 35% *slower*. Will try to fix that next.

The original link to this algorithm came from this article:
http://sna-projects.com/blog/2010/06/beating-binary-search/

Thanks, article!

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
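
For readers unfamiliar with the technique, here is a generic interpolation
search over a sorted list of integers, written in Python 3. It is a textbook
sketch, not the midx lookup code from this commit; the step counter is added
only to make the comparison with binary search easy.

```python
def interpolation_search(table, want):
    """Search a sorted list of ints; return (index or None, steps taken).

    Instead of probing the middle of the range, probe where 'want' would
    sit if the values were spread evenly between table[lo] and table[hi].
    """
    lo, hi = 0, len(table) - 1
    steps = 0
    while lo <= hi and table[lo] <= want <= table[hi]:
        steps += 1
        if table[hi] == table[lo]:
            mid = lo
        else:
            span = table[hi] - table[lo]
            mid = lo + (want - table[lo]) * (hi - lo) // span
        value = table[mid]
        if value == want:
            return mid, steps
        if value < want:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, steps
```

On perfectly even data the first guess lands on the target, which is why
evenly distributed SHA-1 hashes suit this search so well.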

git.py: keep statistics on how much sha1 searching we had to do.

And cmd/memtest prints out the results. Unfortunately this slows down
memtest runs by 0.126/2.526 = 5% or so. Yuck. Well, we can take it out
later.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/memtest: add a --existing option to test with existing objects.

This is useful for testing behaviour when we're looking for objects
that *do* exist. Of course, it just goes through the objects in order, so
it's not actually that realistic.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/midx: fix SHA_PER_PAGE calculation.

For some reason we were dividing by 200 instead of by 20, which was way off.
Switch to 20 instead. Suspiciously, this makes memory usage slightly worse
in my current (smallish) set of test data, so we might need to revert it
later...? But if we're going to have an adjustment, we should at least make
it clear what for, rather than hiding it in something that looks
suspiciously like a typo.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/margin: add a new --predict option.

When --predict is given, it tries to guess the offset in the indexfile of
each hash, based on the assumption that the hashes are distributed evenly
throughout the file.  Then it prints the maximum amount by which this guess
deviates from reality.

I was hoping the results would show that the maximum deviation in a typical
midx was less than a page's worth of hashes; that would mean the toplevel
lookup table could be redundant, which means fewer pages hit in the
common case.  No such luck, unfortunately; with 1.6 million objects, my
maximum deviation was 913 hashes (about 18 kbytes, or 5 pages).

By comparison, midx files should hit about 2 pages in the common case (1
lookup table + 1 data page).  Or 3 pages if we're unlucky and the search
spans two data pages.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
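
The prediction described above can be sketched as follows (Python 3). The
function name and the use of a fixed-width integer prefix of each hash are
assumptions for illustration, not the actual 'bup margin' code.

```python
def max_prediction_error(hashes, bits=32):
    """Return the largest |guessed index - real index| for sorted hashes.

    'hashes' is a sorted list of ints in the range [0, 2**bits), for
    example the first 32 bits of each SHA-1.  If the values were spread
    perfectly evenly, hash h would sit at index h * n / 2**bits.
    """
    n = len(hashes)
    worst = 0
    for actual, h in enumerate(hashes):
        guess = (h * n) >> bits
        worst = max(worst, abs(guess - actual))
    return worst
```

As a check on the figures quoted above: 913 hashes at 20 bytes each is 18260
bytes, which is about 4.5 pages assuming 4096-byte pages, hence "about 18
kbytes, or 5 pages".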

cmd/memtest: print per-cycle and total times.

This makes it easier to compare output from other people or between
machines, and also gives a clue as to swappiness.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Rename _faster.so to _helpers.so.

Okay, _faster.so wasn't a good choice of names.  Partly because not
everything in there is just to make stuff faster, and partly because some
*proposed* changes to it don't just make stuff faster.  So let's rename it
one more time.  Hopefully the last time for a while!

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

lib/bup/ssh: Add docstrings

Document the code with docstrings.

Also add an "import sys" line since it is used by sys.argv[0] on line 6.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

lib/bup/options: Add docstrings

Document the code with docstrings.

Use one line per imported module as recommended by PEP 8 to make it
easier to spot unused modules.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

import cleanup

Remove unused imported modules.

I started using the pyflakes.vim plugin and it automagically shows a
bunch of problems/uncleanliness in the code. It helped me pull this out
in 15mins.

This change shouldn't have any impact on performance or functionality
but it makes the code cleaner.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

cmd/ftp: don't die if we can't import the ctypes module.

It's only needed on some rare broken versions of readline anyway. If we
can't find the module, chances are the system doesn't have that broken
version of readline.

Based on suggestions by Gabriel Filion and Aaron Ucko.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

lib/bup/vfs: bring back Python 2.4 support

There is currently one test failure when running tests against Python
2.4: a try..except..finally block that's interpreted as a syntax error.
The commit introducing this incompatibility with 2.4 is f77a0829

This is a well known python 2.4 limitation and the workaround, although
ugly, is easy.

With this test passing, Python 2.4 support is back.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

lib/bup/vfs: Add docstrings

Since the vfs module uses the function git._treeparse, it should not be
named as if it was a private function. Rename git._treeparse to
git.treeparse and document it (add a docstring to it).

Also, transform _ChunkReader, _FileReader and Node into new-style
classes.

Finally, remove trailing spaces from lib/bup/vfs.py .

DESIGN: update mentions of stupidsum to reflect new rollsum algorithm.

Pointed out by Gabriel Filion.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

README: typo.

Noticed by Zoran Zaric.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/save: update the progress meter less often.

If you ran 'bup save' in an ssh session, you could end up sending huge
amounts of data back over ssh *just* to update the progress meter after
every single block! Oops. Limit the updates to only about 5 per second,
which is much better.
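
A rate limit like the one described above could be sketched this way in
Python 3. The class name, the write callback, and the injectable clock are
assumptions made so the example is testable; the 0.2 second interval
corresponds to "about 5 per second".

```python
import time


class ProgressMeter:
    """Forward progress messages, but at most once per min_interval seconds."""

    def __init__(self, write, min_interval=0.2, clock=time.monotonic):
        self.write = write
        self.min_interval = min_interval
        self.clock = clock
        self.last = None  # time of the last message actually written

    def update(self, msg):
        """Write msg unless one was written too recently; return True if written."""
        now = self.clock()
        if self.last is not None and now - self.last < self.min_interval:
            return False
        self.last = now
        self.write(msg)
        return True
```

The caller can still invoke update() after every block; the cost of a skipped
update is just one clock read.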

Rename _hashsplit.so to _faster.so, and move bupsplit into its own source file.

A lot of stuff in _hashsplit.c wasn't actually about hashsplitting; it was
just a catch-all for all our C accelerator functions. Now the module name
reflects that.

Also move the bupsplit functions into their own non-python-dependent C
source file so they can be used as part of other projects.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

test.sh: check the return code of 'bup random'

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/{random,memtest}: use the new options.py default value support.

options.py: support for putting default values in [square brackets].

This looks good in the usage message, and is a better place to hardcode such
things than in the code itself.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
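
As an illustration of the idea, a default value could be pulled out of an
option's help text like this (Python 3). The helper name, the regular
expression, and the integer conversion are assumptions, not the options.py
implementation.

```python
import re


def extract_default(helptext):
    """Return the default given in trailing [square brackets], or None.

    Numeric defaults are converted to int; anything else stays a string.
    """
    m = re.search(r"\[([^\]]*)\]\s*$", helptext)
    if m is None:
        return None
    value = m.group(1)
    try:
        return int(value)
    except ValueError:
        return value
```

Because the brackets stay in the help text, the usage message and the parser
cannot drift apart.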

_hashsplit.c: get rid of some warnings indicated by a C++ compiler.

Not hugely important, but might as well fix them.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

_hashsplit.c: replace the stupidsum algorithm with rsync's adler32-based one.

I've been meaning to do this for a while, but a particular test dataset that
really caused problems with stupidsum() (ie. it split things into way more
chunks than it should have) finally screwed me over.  Let's change over to a
"real" checksum algorithm.

Non-annoying datasets shouldn't be noticeably affected, but bad ones (such
as my test case from EQL Data) can be 10x more sensible.  Typical backup
sets now have about 20% fewer chunks, although this has little effect on the
overall repository size.

WARNING: After this patch, all your chunk boundaries will be different from
before!  That means your incremental backups won't be terribly incremental
and your backup repositories will jump in size.  This should only happen
once.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
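
For context, the key property of an rsync-style rolling checksum is that
sliding the window by one byte costs O(1) instead of a full recomputation.
The Python 3 sketch below shows the generic two-part sum; it does not
reproduce the constants or exact arithmetic in _hashsplit.c, and the function
names are hypothetical.

```python
def weak_sum(window):
    """Compute the two 16-bit sums of a window of bytes from scratch."""
    n = len(window)
    a = sum(window) & 0xFFFF
    b = sum((n - i) * x for i, x in enumerate(window)) & 0xFFFF
    return a, b


def roll(a, b, n, out_byte, in_byte):
    """Slide an n-byte window one position: drop out_byte, add in_byte."""
    a = (a - out_byte + in_byte) & 0xFFFF
    b = (b - n * out_byte + a) & 0xFFFF
    return a, b
```

A splitter would typically declare a chunk boundary whenever some number of
low bits of the rolling value match a fixed pattern, so boundaries depend only
on nearby content.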

_hashsplit.c: switch rollsum_roll() to a macro instead of an inline function.

gcc 4.3's optimizer manages to fail at optimizing the inline, but works okay
with the macro.

Mysteriously, if find_ofs() is *not* static (and therefore presumably
*harder* to optimize), the optimizer works either way. But removing the
static is just wrong, so use the macro instead.

The difference in speed is about 53 megs/sec vs 80 megs/sec on my machine
for this command:

    bup random 100M 2>/dev/null | bup split -N --bench

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

_hashsplit.c: refactor a bit, and add a self-test.

In preparation for replacing the stupidsum algorithm with the rsync
adler32-based one.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

make clean: remove some leftover files.

Stuff has moved around a bit recently, and we weren't cleaning up everything
like we should.

cmd/web: hide .dotfiles by default

Make all files beginning with a dot be hidden by default. The hidden
files can be shown by giving the argument "hidden" with a value of 1 in
the URL.

Also, in _compute_dir_contents, remove the line "contents = []" since it
is never used.

Finally add a "Show/Hide hidden files" link on the pages where content
is hidden.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

cmd/ftp: exit cleanly on Ctrl-C

bup ftp currently does not handle KeyboardInterrupt exceptions.

Simply call handle_ctrl_c() at the beginning of the file to make the
command exit without a stacktrace.

cmd/ftp: Hide .dotfiles by default (-a shows them)

Normally in FTP sites, files beginning with a dot are hidden from a list
(ls) command by default. Also, using the argument '-a' makes the list
show hidden files.

The current 'bup ftp' implementation does not behave so. Make it hide
hidden files by default, as expected, and show hidden files when '-a' or
'--all' is specified to the 'ls' command.

All unknown switches will make bup ftp show the ls command usage.

Users can also give 'ls --help' to obtain the usage string.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

lib/options: Add an onabort argument to Options()

Sometimes, we may want to parse a list of arguments and not have the
call to Options.parse() exit the program when it finds an unknown
argument.

Add an argument to the class' __init__ method that can be either a
function or a class (must be an exception class). If calling the
function or class constructor returns an object, this object will be
raised on abort.

Also add a convenience exception class named Fatal that can be
passed to Options() to exclusively catch situations in which
Options.parse() would have caused the program to exit.

Finally, set the default value of the onabort argument to call
sys.exit(97) as was previously the case.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>
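
The onabort mechanism described above can be sketched as follows (Python 3).
This is a stripped-down stand-in, not the real Options class: the constructor
taking a list of known flags and the trivial parse() are assumptions made to
keep the example short.

```python
import sys


class Fatal(Exception):
    """Raised instead of exiting when passed as onabort."""


def _default_onabort(msg):
    sys.exit(97)


class Options:
    def __init__(self, known, onabort=_default_onabort):
        self.known = set(known)
        self._onabort = onabort

    def fatal(self, msg):
        sys.stderr.write("error: %s\n" % msg)
        # onabort may be a function or an exception class; if calling it
        # returns an object, that object is raised.
        e = self._onabort(msg)
        if e:
            raise e

    def parse(self, args):
        for arg in args:
            if arg not in self.known:
                self.fatal("unknown option %r" % arg)
        return list(args)
```

An interactive caller can then wrap parse() in try/except Fatal and show the
usage text without the whole program exiting.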

cmd/ftp: if completion fails due to FileNotFound, just eat it.

Just as bash would do, if you're trying to complete a filename that doesn't
exist, just don't offer any completions. In this case, it only happens if
you try to complete through a broken symlink.

Now that we've fixed this case, enable the printing of exception tracebacks
in case of *other* kinds of completion errors, since we don't expect there
to be any.

[Committed by apenwarr based on an unofficial patch from Gabriel]

Add a mode argument to mkdirp.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>

Don't specify a user or group during "make install".

This makes it possible to install bup as a normal user.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>

Remove Makefile lines that only contain a tab.

Signed-off-by: Rob Browning <rlb@defaultvalue.org>

cmd/ftp: don't let people cd into a non-directory.

This bug was relatively harmless (since you could also cd back out again)
but kind of weird.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

vfs: resolve absolute symlinks inside their particular backup set.

Let's say you back up a file "/etc/motd" that's a symlink to
"/var/run/motd". The file inside the backup repo is actually
/whatever/latest/etc/motd, so the symlink should *actually* point to
/whatever/latest/var/run/motd. Let's resolve it that way automatically in
Symlink.dereference().

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
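
The path arithmetic in the example above can be written out like this
(Python 3). The function name and its arguments are hypothetical; the real
Symlink.dereference() works on vfs nodes rather than on path strings.

```python
import posixpath


def resolve_in_backup_set(set_root, link_path, target):
    """Return the in-repo path that a symlink should point to.

    set_root:  root of one backup set, e.g. '/whatever/latest'
    link_path: path of the symlink itself inside the repo
    target:    the stored link target
    """
    if target.startswith("/"):
        # Absolute targets are re-rooted inside the same backup set.
        return posixpath.normpath(set_root + target)
    # Relative targets resolve against the symlink's own directory.
    return posixpath.normpath(
        posixpath.join(posixpath.dirname(link_path), target))
```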

vfs: try_lresolve() was a bad idea.  Create try_resolve() instead.

Also add some comments to describe the actual differences between resolve()
and lresolve(), and clean things up a bit so that they actually work as
they're supposed to.

Basically, all of lresolve(), resolve(), and try_resolve() depend on
*intermediate* paths being resolvable; all of them will throw an exception
if not.  They only differ in the very last node in the path, when that node
is a symlink:

  resolve() will dereference it or throw an exception if it can't;
  try_resolve() will try to dereference it, but return self if it can't;
  lresolve() will not dereference it at all, like lstat() doesn't.

With that in mind, we can fix up cmd/ftp and cmd/web to use the right calls,
thus fixing an unexpected error in ftp's tab completion reported by Gabriel
Filion, which would happen if you tried to tab complete inside a directory
that contained a broken symlink.  We only care what the symlink points to so
we can decide whether or not to append '/' to the tab completion, so we want
it to fail silently if it's going to fail.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
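
The three behaviours listed above can be demonstrated on a toy tree (Python
3). Everything here is a stand-in for illustration: nested dicts play the
role of directories, the Link class plays the role of a symlink with an
absolute target, and there is no symlink loop detection.

```python
class NoSuchFile(Exception):
    """Raised when a path component cannot be found."""


class Link:
    """A toy symlink holding an absolute target path."""

    def __init__(self, target):
        self.target = target


def _walk(root, path, follow_last):
    node = root
    parts = [p for p in path.split("/") if p]
    for i, name in enumerate(parts):
        if not isinstance(node, dict) or name not in node:
            raise NoSuchFile(path)
        node = node[name]
        is_last = i == len(parts) - 1
        # Intermediate symlinks are always followed; the last only on request.
        if isinstance(node, Link) and (follow_last or not is_last):
            node = _walk(root, node.target, True)
    return node


def resolve(root, path):
    """Dereference a final symlink, or raise NoSuchFile if that fails."""
    return _walk(root, path, True)


def lresolve(root, path):
    """Never dereference a final symlink, like lstat()."""
    return _walk(root, path, False)


def try_resolve(root, path):
    """Dereference a final symlink if possible, else return the link itself."""
    node = lresolve(root, path)
    if isinstance(node, Link):
        try:
            return resolve(root, node.target)
        except NoSuchFile:
            return node
    return node
```

For tab completion, try_resolve() is the natural fit: a broken symlink simply
comes back as a link, so no '/' is appended and no error is shown.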

fix helpers.columnate bug when list is empty

When the list given to the columnate function is empty, the function
raises an exception when determining the max(len of all elements), since
the list given to max is empty.

One indirect example of when this bug is apparent is in the 'bup ftp'
command when listing an empty directory:

    bup> ls backupname/latest/etc/keys
    error: max() arg is an empty sequence

Add a special condition at the beginning of the columnate function that
returns an empty string if the list of elements is empty.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>
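
The failure and the guard can be reproduced with a simplified columnate()
(Python 3). The layout logic and the width parameter below are assumptions
and differ from the real helper; the point is the early return, which avoids
calling max() on an empty sequence.

```python
def columnate(items, prefix="", width=70):
    """Format strings into columns, filling down each column first."""
    items = list(items)
    if not items:
        # Without this check, max() below raises ValueError on an empty list.
        return ""
    clen = max(len(s) for s in items)
    ncols = max(1, (width - len(prefix)) // (clen + 2))
    nrows = (len(items) + ncols - 1) // ncols
    rows = []
    for r in range(nrows):
        cells = items[r::nrows]  # every nrows-th item belongs to this row
        line = "".join(s.ljust(clen + 2) for s in cells).rstrip()
        rows.append(prefix + line)
    return "\n".join(rows) + "\n"
```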

Ignore vim's .sw? files.

Vim names its temp files .filename.sw?. Let's ignore them.

Signed-off-by: Peter McCurdy <petermccurdy@alumni.uwaterloo.ca>

cmd/web: don't die if lresolve() fails.

Some symlinks end up pointing to nonexistent names, which is maybe not
"normal", but certainly is allowed.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Convert 'bup web' directory listing to use tornado templates.

This includes creating a new idea of a "resource path" that currently sits
under the lib dir. Getting resources is supported with a new helper
(resource_path).

Signed-off-by: Joe Beda <joe@bedafamily.com>

Default 'bup web' to serving on localhost only.

Also make command output match man page.

Signed-off-by: Joe Beda <joe@bedafamily.com>

Install our copy of tornado into /usr/lib/bup/tornado.

Signed-off-by: Joe Beda <joe@bedafamily.com>

web: Make output follow html4 standard

Add a doctype to specify which HTML version to use, in our case use the
HTML4.01 transitional doctype.

Close the second <th> tag so that it doesn't appear as 3 columns.

Add a charset definition in the head of the document.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

web: Lowercase tags in output

For stylistic preference, lowercase all tags in the output sent from bup
web.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

Update tornado to revision ad104ffb41

The file lib/tornado/escape.py was forcing users to install a json
library even though "bup web" doesn't use any json functionality.

An issue was opened upstream:

http://github.com/facebook/tornado/issues/closed#issue/114

and the day after it was opened, a fix was committed for it.

Update to the latest revision of tornado so that we can remove a
dependency on json/simplejson.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

Convert web-cmd to use Tornado.

Pleasantly, this mostly just involved deleting code, with a few tweaks.

Signed-off-by: Peter McCurdy <petermccurdy@alumni.uwaterloo.ca>

Add Tornado framework from git, commit 7a30f9f6

I just took the tornado/tornado directory, along with the README.

I'm using tornado's git commit 7a30f9f6eac9aa0cf295b078695156776fd050ce,
since recent versions of Tornado have support for specifying which
address you want to listen to.

Signed-off-by: Peter McCurdy <petermccurdy@alumni.uwaterloo.ca>

Added breadcrumb navigation to bup-web.

Signed-off-by: Zoran Zaric <zz@zoranzaric.de>

git.py: use close_fds=True when starting git cat-file.

Otherwise git could inherit some other file descriptors we're using. This
is particularly relevant in cmd/web, and particularly when applying
pmccurdy's patches to use Tornado.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Add docstrings to lib/bup/helpers.py

Since the split_path function was only used in one place
(lib/bup/index.py), also move the function into that file.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

All HTML attribute values should be enclosed by doublequotes.

Signed-off-by: Zoran Zaric <zz@zoranzaric.de>

Closing a UL-tag doesn't make sense here, the TABLE-tag has to be closed.

Signed-off-by: Zoran Zaric <zz@zoranzaric.de>

Move t/*.py to lib/bup/t/*.py.

Since the tests in that directory are all tests of lib/bup/*.py anyway,
this is a more consistent location for them.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

client.py: raising a particular rare exception caused a syntax error.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Update to latest wvtest.py, wvtest.sh, and wvtestrun from wvtest project.

Imported from wvtest commit a975b39ddcca5c894e2e2b656b8e28c11af36f47.

Because of changes to wvtest.py's chdir() handling, had to make some slight
changes to filenames used by the bup tests themselves - all changes for the
better.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/web: print a nicer message if we can't bind the socket.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/web: tiny fix to make redirects work with Firefox.

Firefox honours Content-Length even for 301 redirects, so if the field isn't
provided, it assumes there's an unlimited amount of data and just hangs.

Also fixed a typo in the man page.

Add new 'bup web' command.

'bup web' starts a web server that allows one to browse the bup repository
from a web browser.

Also reorganized version-cmd to allow easy access to bup version from other
places.

Signed-off-by: Joe Beda <joe@bedafamily.com>

options.py: differentiate unset and set-to-negative options.

Unset options will still be None, but options explicitly set to a negative
will now be 0. This doesn't change semantics for anything currently in bup,
but it could be useful later when applying defaults.

While we're here, clean up the option parsing code to make it
very slightly more efficient.

Signed-off-by: Brandon Low <lostlogic@lostlogicx.com>

cmd/split: minor correction to an error message.

Signed-off-by: Brandon Low <lostlogic@lostlogicx.com>

cmd/ftp: only import readline if necessary.

Apparently on some systems (Mandriva and Slackware at least), importing
the readline library can print some escape sequences to stdout, which screws
things up with the unit tests that run 'bup ftp "cat filename"' and expect
it to be the right data.

Thanks to Eduardo Kienetz for noticing and helping to track down the problem
since I couldn't reproduce it.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

vfs: File.open() needs to do a seek(0) on the cached FileReader.

Otherwise if you open a file, read through it, and close it, then do it
again, you'll get zero bytes the second time.

To make this efficient, change seek() to not discard its _chunkiter every
single time; instead, keep the _chunkiter around until trying to read() from
a location that *isn't* the current offset. Now seeking around in the file
is cheap.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Use common env utility rather than hard coded location for bash

README: one less reason that we suck.

bup fuse and bup ftp can rejoin large files nowadays, so remove that
limitation from the README.

Reported by koo5 @ github.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

README improvement.

Be more specific about how to update the remote PATH.

Inline git.cat() inside server-cmd.py

Since the cat() function in git.py is used only inside the server-cmd.py
script, and since it is a discouraged use of CatPipe, inline the code
inside the server-cmd.py script.

At the same time, make the CatPipe object persistent between calls to
the "cat" command to remove unnecessary deletion/creation of resources.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

Remove trailing spaces from git.py

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

bup-fuse.1: mention how to unmount the filesystem when we're done.

Based on a question from the mailing list.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

vfs: correctly handle reading small files.

After the recent change to let vfs seek around in files, we broke support
for files that were only one chunk. Fix it up, then add some unit tests to
detect such mistakes in the future.

Also, 'bup ftp' now returns nonzero if it catches any exceptions during
execution, making it more suitable for use in scripts... such as the unit
tests :)

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/random: support lengths that aren't a multiple of 1k.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

Makefile: allow PYTHON variable to override python version.

Currently, the Makefile assumes the python command that should be used
is the default python version -- the "python" executable that is found
in PATH. Compiling and testing with a different python version is not
possible without either having a system with another default version, or
by manually changing the link found in PATH.

Correct this situation by using a variable for the python command name,
that can be overridden on the command line like the following:

    make PYTHON=python2.6 test

Signed-off-by: Gabriel Filion <lelutin@gmail.net>

Docstrings for the git.py library

Add docstrings to the module and the public classes and functions of the
git library (eg. the ones that do not start with _ ).

Also rename the AbortableIter class to _AbortableIter since it is used
only inside the git.py library and is not intended to be used elsewhere
for now.

Signed-off-by: Gabriel Filion <lelutin@gmail.com>

make install: don't fail if documentation couldn't be built.

Just silently refuse to install the documentation instead. Reported by Karl
Kiniger.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/newliner: if input starts getting too long, print it out.

This prevents output that doesn't have any newlines from being buffered
forever (eg. the output of 'bup split -vv').

Reported by Karl Kiniger.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/{random,split}: call handle_ctrl_c() for cleaner keyboard interrupts.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>

cmd/{save,split}: add a --bwlimit option.

This allows you to limit how much upstream bandwidth 'bup save' and 'bup
split' will use. Specify it as a number of bytes/second, or with the 'k' or
'M' or (lucky you!) 'G' suffixes for larger values.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
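
Parsing a rate with the suffixes mentioned above might look like this in
Python 3. The function name is hypothetical, and treating 'k' as 1024 rather
than 1000 is an assumption; the commit message does not say which is used.

```python
import re

# Assumption: binary multiples (1k = 1024 bytes).
_UNITS = {"": 1, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_rate(text):
    """Convert strings such as '500', '64k' or '1.5M' to bytes per second."""
    m = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([kMG]?)", text.strip())
    if m is None:
        raise ValueError("invalid rate: %r" % text)
    return int(float(m.group(1)) * _UNITS[m.group(2)])
```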

Work around extra space added by some readline versions.

Apparently some versions of readline (6.0, for me) in some versions of
Python (Ubuntu's python2.6.4-0ubuntu1, for me) have an irritating bug
where they add an extra space to the end of all completions.  This is
particularly annoying for directory completions, as you can't
tab-complete your way into the contents of the directory.  See
http://bugs.python.org/issue5833

This patch, borrowed mostly from Trac, goes in and twiddles the
appropriate variable inside the readline library to make it stop doing
that.  See http://trac.edgewall.org/ticket/8711 for the discussion.

Signed-off-by: Peter McCurdy <petermccurdy@alumni.uwaterloo.ca>

bup ftp: work even if the 'readline' module isn't available.

Suggested by Joe Beda.