Deduplication based on content-defined chunking is used to reduce the number of bytes stored: each file is split into a number of variable length chunks and only chunks that have never been seen before are added to the repository.

A chunk is considered a duplicate if its id_hash value matches that of a chunk already in the repository. A cryptographically strong hash or MAC function is used as id_hash.

To deduplicate, all the chunks in the same repository are considered, no matter whether they come from different machines, from previous backups, from the same backup or even from the same single file.

Compared to other deduplication approaches, this method does NOT depend on:

* file/directory names staying the same: So you can move your stuff around without killing the deduplication, even between machines sharing a repo.
* complete files or time stamps staying the same: If a big file changes a little, only a few new chunks need to be stored - this is great for VMs.
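
The idea described in this section can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the project's actual implementation: the chunker uses a plain rolling byte sum (a stand-in chosen for brevity, not a production-quality rolling hash), the constants `WINDOW`, `MASK` and `MIN_SIZE` are arbitrary illustrative values, HMAC-SHA256 is picked here as one example of a cryptographically strong MAC for id_hash, and the `Repository` class and `split` function are hypothetical names.

```python
import hashlib
import hmac
import random

WINDOW = 16            # rolling window length in bytes (illustrative value)
MASK = (1 << 6) - 1    # cut where the low 6 bits of the window sum are zero
MIN_SIZE = 16          # never emit chunks shorter than this (except the last)


def split(data: bytes) -> list:
    """Split data into variable-length chunks at content-defined boundaries.

    A boundary is placed wherever the sum of the last WINDOW bytes matches a
    bit pattern, so boundaries follow the content, not absolute offsets.
    """
    out = []
    start = 0
    window_sum = 0
    for i, byte in enumerate(data):
        window_sum += byte
        if i >= WINDOW:
            window_sum -= data[i - WINDOW]  # slide the window forward
        length = i + 1 - start
        if length >= MIN_SIZE and (window_sum & MASK) == 0:
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])  # trailing chunk
    return out


class Repository:
    """Hypothetical in-memory chunk store keyed by id_hash."""

    def __init__(self, key: bytes):
        self.key = key
        self.store = {}  # id_hash -> chunk bytes

    def id_hash(self, chunk: bytes) -> bytes:
        # Assumption for this sketch: HMAC-SHA256 as the MAC function.
        return hmac.new(self.key, chunk, hashlib.sha256).digest()

    def add(self, data: bytes):
        """Store data; return (list of chunk ids, number of new bytes stored)."""
        ids = []
        new_bytes = 0
        for chunk in split(data):
            cid = self.id_hash(chunk)
            if cid not in self.store:  # only never-seen chunks are added
                self.store[cid] = chunk
                new_bytes += len(chunk)
            ids.append(cid)
        return ids, new_bytes

    def read(self, ids) -> bytes:
        return b"".join(self.store[cid] for cid in ids)


rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(20000))
shifted = b"inserted header" + original  # every original byte changes offset

repo = Repository(b"example-key")
ids_a, new_a = repo.add(original)
ids_b, new_b = repo.add(shifted)
print("first backup stored", new_a, "bytes; second stored", new_b, "bytes")
```

In this sketch the second `add` call should store only the few chunks around the inserted bytes, because the chunk boundaries realign shortly after the insertion and the remaining chunks hash to ids that are already in the store. File names never enter the computation, which is why renaming or moving data does not affect deduplication in this model.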