Merge strategy needs some love
Currently the merge strategy seems to be: create a new file with all the data of the merged files combined, then delete the obsolete files.

This doesn't scale.

Example: a full backup uses 51% of the available backup space. Merging any incremental backup into it is impossible, because the method described above would need that 51% twice during the merge operation.
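To put numbers on that, here is a tiny sketch of the space check. The 2% increment size and the assumption that the merged file is about as large as the full backup are mine, purely for illustration; `copy_merge_fits` is a made-up helper, not anything from the product.

```python
def copy_merge_fits(full_pct, incr_pct, merged_pct):
    """Return True if the old files plus the new merged file fit on the volume.

    All arguments are percentages of the total backup space. During a
    copy-style merge the old full backup, the old increment, and the new
    merged file all exist at the same time.
    """
    return full_pct + incr_pct + merged_pct <= 100


# Assumed example: 51% full backup, 2% increment, merged file roughly 51%.
print(copy_merge_fits(51, 2, 51))  # False: 104% would be needed
```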

A solution to this dilemma is non-trivial (and I don't know anything about the backup file format).

One strategy I could imagine: "melt" the older backup into the newer file. Say a full backup and an increment are to be merged:
  • Mark the full backup file as "incomplete, use increment file for restore if selected"
  • Read a chunk from the end of the full backup file
  • Append all the data from that chunk which isn't obsolete as a new chunk to the increment file
  • Truncate the full backup file to a new size, freeing exactly the copied chunk

In case of a crash during this process, the increment is always a valid restore option. The process could be resumed easily.

This would only add the size of one "chunk" to the used backup space during the operation. A chunk size of around 100 MiB might work well. Compression should probably operate on chunks, not on whole backup files.
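A rough sketch of the "melt" loop described above, in Python. Since I don't know the real file format, everything here is hypothetical: the `.incomplete` marker file stands in for the "incomplete" flag, and `keep_live` is a placeholder callback that returns the part of a chunk that isn't obsolete.

```python
import os


def melt(full_path, incr_path, chunk_size, keep_live):
    """Move still-needed data from the full backup into the increment file.

    Works from the end of full_path backwards, so the extra space needed
    at any moment is at most one chunk.
    """
    marker = os.fspath(full_path) + ".incomplete"  # hypothetical marker file
    open(marker, "w").close()                      # "use increment for restore"

    with open(full_path, "r+b") as full, open(incr_path, "ab") as incr:
        size = os.fstat(full.fileno()).st_size
        while size > 0:
            start = max(0, size - chunk_size)
            full.seek(start)
            chunk = full.read(size - start)

            # Append the non-obsolete part and make it durable first ...
            incr.write(keep_live(chunk))
            incr.flush()
            os.fsync(incr.fileno())

            # ... then free exactly the copied chunk in the old file.
            full.truncate(start)
            full.flush()
            os.fsync(full.fileno())
            size = start

    os.remove(full_path)
    os.remove(marker)
```

Two things this sketch glosses over: the chunks end up in the increment file in reverse order, so a real format would need an index, and a crash between the append and the truncate would copy that one chunk twice on resume unless progress is recorded somewhere.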

Thank you for your efforts to create a new backup suite and your generous free launch!

Edit: a similar strategy, but much faster, for file systems supporting hole-punching:

Phase 1:
  • Mark the increment file as "incomplete, use full backup file for restore if selected"
  • Read a chunk from the end of the increment file
  • Append all the data from that chunk to the full backup file
  • Truncate the increment file, freeing exactly the copied chunk

Phase 2:
  • Read a chunk from the full backup file
  • If it contains obsolete data, append the still-needed data to the end of the full backup file, discard the obsolete data, and punch a hole at the old location

Unlike the previous strategy, this one mostly just reads the full backup file and leaves most of its data untouched. Instead of relying on hole-punching, backups could be split into chunk files in general, which would support more file systems.
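For the chunk-file variant, Phase 2 could look roughly like the sketch below. It assumes a layout I made up (one directory per backup, files named `*.chunk`) and the same hypothetical `keep_live` callback that returns the non-obsolete part of a chunk. Instead of appending and punching a hole, each affected chunk file is simply rewritten or deleted.

```python
import os
from pathlib import Path


def compact_chunks(directory, keep_live):
    """Drop obsolete data from chunk files; return how many files changed.

    Chunk files without obsolete data are only read, never rewritten.
    """
    changed = 0
    for path in sorted(Path(directory).glob("*.chunk")):
        data = path.read_bytes()
        live = keep_live(data)
        if live == data:
            continue                    # nothing obsolete, leave untouched
        if not live:
            path.unlink()               # whole chunk is obsolete
        else:
            tmp = path.with_suffix(".tmp")
            with open(tmp, "wb") as f:
                f.write(live)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, path)       # swap in the compacted chunk
        changed += 1
    return changed
```

The extra space needed is again at most one chunk, and Phase 1 would shrink to moving the increment's chunk files into the full backup's directory.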


Messages In This Thread
Merge strategy needs some love - by Korkman - 03-23-2023, 10:50 AM
RE: Merge strategy needs some love - by admin - 03-24-2023, 01:30 PM
