From: Johannes Berg
Date: Sat, 25 Jan 2020 23:17:38 +0000 (+0100)
Subject: save/vfs: update comments wrt. tree/bupm ordering
X-Git-Tag: 0.31~103
X-Git-Url: https://arthur.barton.de/cgi-bin/gitweb.cgi?p=bup.git;a=commitdiff_plain;h=273bc50ab1151e25f900521e26d9d7a6e0839d19

save/vfs: update comments wrt. tree/bupm ordering

After looking into this and thinking about it, the comments here
are a bit misleading - save states the entries must be in a given
order without a rationale, and vfs states that the order is wrong
but gives an explanation that's not quite right.

Update both comments to make this clearer, and to document that
there's no inherent reason, just happened to pick something when
the save code was written, which turned out to be not the best.

Signed-off-by: Johannes Berg
---

diff --git a/cmd/save-cmd.py b/cmd/save-cmd.py
index 53cab3b..e0b1e47 100755
--- a/cmd/save-cmd.py
+++ b/cmd/save-cmd.py
@@ -126,8 +126,11 @@ handle_ctrl_c()
 # Since the git tree elements are sorted according to
 # git.shalist_item_sort_key, the metalist items are accumulated as
 # (sort_key, metadata) tuples, and then sorted when the .bupm file is
-# created. The sort_key must be computed using the element's real
-# name and mode rather than the git mode and (possibly mangled) name.
+# created. The sort_key should have been computed using the element's
+# mangled name and git mode (after hashsplitting), but the code isn't
+# actually doing that but rather uses the element's real name and mode.
+# This makes things a bit more difficult when reading it back, see
+# vfs.ordered_tree_entries().
 
 # Maintain a stack of information representing the current location in
 # the archive being constructed. The current path is recorded in
diff --git a/lib/bup/vfs.py b/lib/bup/vfs.py
index ea0b197..9222f8f 100644
--- a/lib/bup/vfs.py
+++ b/lib/bup/vfs.py
@@ -618,10 +618,12 @@ def ordered_tree_entries(tree_data, bupm=None):
 
     """
     # Sadly, the .bupm entries currently aren't in git tree order,
-    # i.e. they don't account for the fact that git sorts trees
-    # (including our chunked trees) as if their names ended with "/",
-    # so "fo" sorts after "fo." iff fo is a directory. This makes
-    # streaming impossible when we need the metadata.
+    # but in unmangled name order. They _do_ account for the fact
+    # that git sorts trees (including chunked trees) as if their
+    # names ended with "/" (so "fo" sorts after "fo." iff fo is a
+    # directory), but we apply this on the unmangled names in save
+    # rather than on the mangled names.
+    # This makes streaming impossible when we need the metadata.
     def result_from_tree_entry(tree_entry):
         gitmode, mangled_name, oid = tree_entry
         name, kind = git.demangle_name(mangled_name, gitmode)
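
The ordering mismatch described in the comments above can be illustrated with
a small standalone sketch. It is not bup's code: tree_sort_key() and mangle()
are hypothetical helpers written for this example, and the mangling rule is
deliberately simplified to the one case that matters here (a hashsplit regular
file stored as a git tree whose name has ".bup" appended). It only shows that
sorting on real names and sorting on mangled names can disagree.

```python
import stat

DIR = stat.S_IFDIR | 0o755
REG = stat.S_IFREG | 0o644


def tree_sort_key(name, mode):
    """Hypothetical helper: compare directories as if named name + b'/'."""
    return name + b"/" if stat.S_ISDIR(mode) else name


def mangle(name, mode, chunked):
    """Hypothetical, simplified mangling (an assumption for this sketch):
    a chunked regular file becomes a tree called name + b'.bup'."""
    if stat.S_ISREG(mode) and chunked:
        return name + b".bup", stat.S_IFDIR
    return name, mode


# (real name, real mode, stored as a chunked tree?)
entries = [
    (b"fo", REG, True),
    (b"fo.a", REG, False),
]

# Order used for the metadata: key computed from the real name and mode.
save_order = sorted(entries, key=lambda e: tree_sort_key(e[0], e[1]))
# Order of the git tree: key computed from the mangled name and git mode.
git_order = sorted(entries, key=lambda e: tree_sort_key(*mangle(*e)))

save_names = [e[0] for e in save_order]
git_names = [e[0] for e in git_order]

print(save_names)  # [b'fo', b'fo.a']
print(git_names)   # [b'fo.a', b'fo']
```

With real names, "fo" sorts before "fo.a"; once "fo" is mangled to the tree
"fo.bup", its key becomes "fo.bup/" and it sorts after "fo.a". A reader walking
the tree and the metadata in lockstep therefore cannot pair entries by
position, which is why the metadata order has to be reconciled when reading
it back.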