X-Git-Url: https://arthur.barton.de/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=DESIGN;h=e8aa8d08eb9b2602c595140f0c58f16f100983ea;hb=3306a802a11b8d945af081cb835bb7fe208dc84c;hp=c7eca302bdad3047b3922703976777c1fbbd8612;hpb=1a1163538770067c924f4c36b7b710e08a880908;p=bup.git

diff --git a/DESIGN b/DESIGN
index c7eca30..e8aa8d0 100644
--- a/DESIGN
+++ b/DESIGN
@@ -196,7 +196,7 @@ sequence data.
 As an overhead percentage, 0.25% basically doesn't matter. 488 megs sounds
 like a lot, but compared to the 200GB you have to store anyway, it's
 irrelevant. What *is* relevant is that 488 megs is a lot of memory you have
-to use in order to to keep track of the list. Worse, if you back up an
+to use in order to keep track of the list. Worse, if you back up an
 almost-identical file tomorrow, you'll have *another* 488 meg blob to keep
 track of, and it'll be almost but not quite the same as last time.

@@ -251,7 +251,7 @@ relatively infrequently. (You might think you change your source code
 "frequently" and that git handles much more frequent changes than, say, svn
 can handle. But that's not the same kind of "frequently" we're talking
 about. Imagine you're backing up all the files on your disk, and one of
-those files is a 100 GB database file with hundreds of daily users. You
+those files is a 100 GB database file with hundreds of daily users. Your
 disk changes so frequently you can't even back up all the revisions even if
 you were backing stuff up 24 hours a day. That's "frequently.")

@@ -374,7 +374,7 @@ So that's the basic structure of a bup repository, which is also a git
 repository. There's just one more thing we have to deal with:
 filesystem metadata. Git repositories are really only intended to store
 file contents with a small bit of extra information, like
-symlink targets and and executable bits, so we have to store the rest
+symlink targets and executable bits, so we have to store the rest
 some other way.

 Bup stores more complete metadata in the VFS in a file named .bupm in
@@ -548,7 +548,7 @@ the index and backs up any file that is "dirty," that is, doesn't already
 exist in the repository.

 Determination of dirtiness is a little more complicated than it sounds. The
-most dirtiness-relevant relevant flag in the bupindex is IX_HASHVALID; if
+most dirtiness-relevant flag in the bupindex is IX_HASHVALID; if
 this flag is reset, the file *definitely* is dirty and needs to be backed up.
 But a file may be dirty even if IX_HASHVALID is set, and that's the confusing
 part.