When a midx is deleted underneath bup, usually by bup itself running
'bup midx --auto', PackIdxList may keep it open, so its disk space is
not released. This can easily cause bup to run out of disk space,
since these files can be fairly big and can be recreated multiple
times in a backup run.
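
A minimal sketch of the underlying behavior, assuming a POSIX
filesystem: unlinking a file removes only its name, and the data stays
allocated for as long as some descriptor is open. The temporary file
below is a stand-in, not a real midx.

```python
import os
import tempfile

# Assumption: POSIX semantics, where unlink() removes only the name.
fd, path = tempfile.mkstemp(suffix='.midx')  # stand-in file, not a real midx
os.write(fd, b'\0' * 4096)

os.unlink(path)                       # the file is "deleted" ...
name_gone = not os.path.exists(path)  # ... and its name is gone,
size_held = os.fstat(fd).st_size      # but the open descriptor still sees the data

os.close(fd)                          # only now can the space be reclaimed
```

With several large midx files recreated during one run, each stale
descriptor pins another copy on disk until it is closed.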

To fix this, remove any open PackMidx instances from the list and
close them explicitly.

Out of an abundance of caution, also explicitly close the bloom
instance if we have one - the same issue should apply here even if
I couldn't observe it, since the GC isn't guaranteed to clean up
the object immediately.
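
The midx part of the fix can be sketched roughly as follows. FakeMidx,
prune_stale and by_name are hypothetical names used only for
illustration; the real change operates on PackIdxList and
midx.PackMidx as shown in the diff.

```python
class FakeMidx:
    """Hypothetical stand-in for an open midx; only name and close() matter."""
    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


def prune_stale(packs, by_name, on_disk):
    """Close and forget every open midx whose file is no longer on disk."""
    for ix in list(by_name.values()):   # copy: by_name is mutated below
        if not isinstance(ix, FakeMidx) or ix.name in on_disk:
            continue
        del by_name[ix.name]
        ix.close()                      # release the descriptor explicitly
        packs.remove(ix)


keep, stale = FakeMidx(b'a.midx'), FakeMidx(b'b.midx')
packs = [keep, stale]
by_name = {ix.name: ix for ix in packs}
prune_stale(packs, by_name, on_disk={b'a.midx'})
```

Closing explicitly, rather than just dropping the reference, avoids
depending on when the GC gets around to the object.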

I remember debugging this issue years ago without coming to any
good conclusion, and it's been mentioned on the mailing list a few
times as well, e.g.
https://groups.google.com/d/msg/bup-list/AqIyv9n9WPE/-Wl2JVh5AQAJ

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Rob Browning <rlb@defaultvalue.org>
(cherry picked from commit 5c746e43600c059c52b5fd78212499e3e9700946)
Tested-by: Rob Browning <rlb@defaultvalue.org>
The module-global variable 'ignore_midx' can force this function to
always act as if skip_midx was True.
"""
+ if self.bloom is not None:
+ self.bloom.close()
self.bloom = None # Always reopen the bloom as it may have been relaced
self.do_bloom = False
skip_midx = skip_midx or ignore_midx
if os.path.exists(self.dir):
if not skip_midx:
midxl = []
+ midxes = set(glob.glob(os.path.join(self.dir, b'*.midx')))
+ # remove any *.midx files from our list that no longer exist
+ for ix in list(d.values()):
+ if not isinstance(ix, midx.PackMidx):
+ continue
+ if ix.name in midxes:
+ continue
+ # remove the midx
+ del d[ix.name]
+ ix.close()
+ self.packs.remove(ix)
for ix in self.packs:
if isinstance(ix, midx.PackMidx):
for name in ix.idxnames:
d[os.path.join(self.dir, name)] = ix
- for full in glob.glob(os.path.join(self.dir,'*.midx')):
+ for full in midxes:
if not d.get(full):
mx = midx.PackMidx(full)
(mxd, mxf) = os.path.split(mx.name)