arthur.barton.de Git - bup.git/commit

author	Avery Pennarun <apenwarr@gmail.com>
	Sat, 2 Jan 2010 06:46:06 +0000 (01:46 -0500)
committer	Avery Pennarun <apenwarr@gmail.com>
	Sat, 2 Jan 2010 06:46:06 +0000 (01:46 -0500)
commit	295288b18c6cd7bcd92eaa9c73c271ad4178b2b1
tree	113c6451fd12530f607a02bd87f6913e5d2562ff
parent	b8215e5898febb6bd87682e34cf4460bdf526445

'bup split': speed optimization for never-ending blocks.

For blocks that never got split (e.g. huge, endless streams of zeroes), we
would constantly scan and re-scan the same sub-blocks, which made splitting
very slow. In such a bad situation, there's no point in being so careful;
just dump the *entire* input buffer to a chunk and move on. This vastly
speeds up splitting of files with lots of blank space in them, e.g.
VirtualBox images.
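
As a rough illustration of the idea (not the actual hashsplit.py code), the
sketch below buffers input and emits the whole buffer as one chunk once it
reaches a size cap without any split point being found. The names
`find_split`, `split_chunks` and `BLOB_MAX` are hypothetical, and the toy
split rule (split after a 0xFF byte) merely stands in for a real rolling
checksum.

```python
import io

BLOB_MAX = 8192  # hypothetical cap on buffered, still-unsplit data


def find_split(buf):
    """Toy split detector: split right after the first 0xFF byte.

    Stands in for a rolling checksum. Returns an offset, or None when
    the buffer contains no split point at all.
    """
    i = buf.find(b"\xff")
    return None if i < 0 else i + 1


def split_chunks(stream, blob_max=BLOB_MAX):
    """Yield chunks from stream, giving up on hopeless buffers early."""
    buf = b""
    while True:
        data = stream.read(blob_max)
        if not data:
            break
        buf += data
        while buf:
            ofs = find_split(buf)
            if ofs is not None:
                # Normal case: cut at the detected boundary.
                yield buf[:ofs]
                buf = buf[ofs:]
            elif len(buf) >= blob_max:
                # No boundary anywhere in a full buffer: emit all of it
                # instead of re-scanning the same bytes again later.
                yield buf
                buf = b""
            else:
                break  # not enough data yet; read more
    if buf:
        yield buf  # trailing partial chunk
```

With this sketch, a stream of 20000 zero bytes comes out as two full
8192-byte chunks plus a 3616-byte remainder, and each byte is scanned once.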

Also add a cache for git.hash_raw() so it doesn't have to stat() the same
blob files over and over if the same blocks (especially zeroes) occur more
than once.

git.py
hashsplit.py