Microsoft's Data Deduplication Eval Tool for 8/7/Vista/etc.

Discussion in 'Windows 8' started by generalmx, Apr 16, 2014.

  1. generalmx

    generalmx MDL Novice

    #1 generalmx, Apr 16, 2014
    Last edited by a moderator: Apr 20, 2017
    Ever wondered whether it's actually worth upgrading to Windows 8 / Server 2012 just for a cool new feature like data deduplication? According to Microsoft's own TechNet article, DDPEVAL.EXE is a portable tool meant to help you evaluate whether an upgrade is worthwhile. For some reason, though, Microsoft doesn't offer it as a separate download, even though they blogged (twice!) about how cool it is that the tool is portable and how you should copy it to other systems and run it there. So rather than making you download Server 2012 and/or extract it from the cabs found on these forums, here it is...oops, I can't attach files yet! Well, maybe someone else will attach it for me?

    Results on my small backup partition:
    Evaluated folder: I:
    Evaluated folder size: 98.39 GB
    Files in evaluated folder: 5629

    Processed files: 1980
    Processed files size: 98.36 GB
    Optimized files size: 60.28 GB
    Space savings: 38.08 GB
    Space savings percent: 38

    Optimized files size (no compression): 77.18 GB
    Space savings (no compression): 21.18 GB
    Space savings percent (no compression): 21

    Files excluded by policy: 3649
    Small files (<32KB): 3649
    Files excluded by error: 0
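
    The report is internally consistent, which is easy to check by hand. Here is a minimal Python sketch that recomputes the savings from the sizes above. The function name and the choice to truncate the percentage (rather than round it) are my own assumptions; truncation just happens to reproduce the printed 38 and 21.

```python
import math

def savings(processed_gb, optimized_gb):
    """Return (GB saved, whole-number percent) for a before/after size pair."""
    saved = round(processed_gb - optimized_gb, 2)
    # Assumption: the report drops the fractional part of the percentage
    # instead of rounding it; that is what matches the figures in the post.
    percent = math.floor(saved / processed_gb * 100)
    return saved, percent

with_compression = savings(98.36, 60.28)     # dedup + compression
without_compression = savings(98.36, 77.18)  # dedup only

# Files skipped by policy = all files minus the ones actually processed.
excluded_by_policy = 5629 - 1980

print(with_compression, without_compression, excluded_by_policy)
```

    The excluded-file count works out to the same 3649 that the tool lists as small files, so every skipped file on this partition was skipped for being under 32KB.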

    If you decide you actually want to install Data Deduplication on Windows 8.1 x64, grab the copy from the MSMG Toolkit thread and use the command below:
    Code:
    dism /online /add-package /packagepath:"Microsoft-Windows-VdsInterop-Package~31bf3856ad364e35~amd64~~6.3.9600.16384.cab" /packagepath:"Microsoft-Windows-VdsInterop-Package~31bf3856ad364e35~amd64~en-US~6.3.9600.16384.cab" /packagepath:"Microsoft-Windows-FileServer-Package~31bf3856ad364e35~amd64~~6.3.9600.16384.cab" /packagepath:"Microsoft-Windows-FileServer-Package~31bf3856ad364e35~amd64~en-US~6.3.9600.16384.cab" /packagepath:"Microsoft-Windows-Dedup-ChunkLibrary-Package~31bf3856ad364e35~amd64~~6.3.9600.16384.cab" /packagepath:"Microsoft-Windows-Dedup-ChunkLibrary-Package~31bf3856ad364e35~amd64~en-US~6.3.9600.16384.cab" /packagepath:"Microsoft-Windows-Dedup-Package~31bf3856ad364e35~amd64~~6.3.9600.16384.cab" /packagepath:"Microsoft-Windows-Dedup-Package~31bf3856ad364e35~amd64~en-US~6.3.9600.16384.cab" && dism /online /enable-feature /featurename:Dedup-Core /all
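
    That one-liner is hard to read, so here is a small Python sketch that rebuilds it from its parts: four packages, each with a language-neutral cab and an en-US cab, followed by the feature enable. The helper names are hypothetical, and the script only prints the command; it assumes, like the original, that the cabs sit in the current directory.

```python
# Package order is taken from the command in the post.
PACKAGES = ["VdsInterop", "FileServer", "Dedup-ChunkLibrary", "Dedup"]
KEY_AND_ARCH = "31bf3856ad364e35~amd64"
BUILD = "6.3.9600.16384"

def cab_name(package, language=""):
    """File name of one cab; an empty language means the neutral package."""
    return f"Microsoft-Windows-{package}-Package~{KEY_AND_ARCH}~{language}~{BUILD}.cab"

def build_command():
    """Assemble the same two chained dism calls used in the post."""
    paths = [
        f'/packagepath:"{cab_name(package, language)}"'
        for package in PACKAGES
        for language in ("", "en-US")
    ]
    add = "dism /online /add-package " + " ".join(paths)
    enable = "dism /online /enable-feature /featurename:Dedup-Core /all"
    return f"{add} && {enable}"

print(build_command())
```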
    

    Edit:

    Enabling Deduplication on my ~1.6TB bulk data volume, which had ~100GB free, gave me roughly ~250GB of additional savings, for ~350GB free in total. More details:
    PS C:\WINDOWS\system32> Get-DedupStatus -Volume H:

    FreeSpace    SavedSpace   OptimizedFiles InPolicyFiles Volume
    ---------    ----------   -------------- ------------- ------
    357.2 GB     345.64 GB    297543         297544        H:


    PS C:\WINDOWS\system32> Get-DedupStatus | Format-List


    Volume : H:
    VolumeId : \\?\Volume{5e55d931-504c-11e2-be88-0014350013a5}\
    Capacity : 1.57 TB
    FreeSpace : 357.2 GB
    UsedSpace : 1.22 TB
    UnoptimizedSize : 1.56 TB
    SavedSpace : 345.64 GB
    SavingsRate : 21 %
    OptimizedFilesCount : 297543
    OptimizedFilesSize : 1.45 TB
    OptimizedFilesSavingsRate : 23 %
    InPolicyFilesCount : 297544
    InPolicyFilesSize : 1.45 TB
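
    For anyone wondering how the two rates relate to the sizes, here is a rough Python cross-check of the listing above. It assumes binary units (1 TB = 1024 GB) and truncated percentages; both are my assumptions, chosen because they reproduce the printed values.

```python
import math

GB_PER_TB = 1024  # assumption: binary units in the cmdlet output

saved_gb = 345.64
used_gb = 1.22 * GB_PER_TB             # UsedSpace
unoptimized_gb = 1.56 * GB_PER_TB      # UnoptimizedSize
optimized_files_gb = 1.45 * GB_PER_TB  # OptimizedFilesSize

# Savings relative to what the volume would hold without dedup.
savings_rate = math.floor(saved_gb / unoptimized_gb * 100)

# Savings relative to only the files that were actually optimized.
optimized_files_savings_rate = math.floor(saved_gb / optimized_files_gb * 100)

# Used space plus saved space should land back on the unoptimized size.
logical_tb = round((used_gb + saved_gb) / GB_PER_TB, 2)

print(savings_rate, optimized_files_savings_rate, logical_tb)
```

    So UnoptimizedSize is just UsedSpace plus SavedSpace, and the two rates differ only in the denominator.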

    However, the space savings came with a tradeoff: dedup now uses 1GB of memory to hold multiple chunk indexes, though all of it is in "standby" (meaning that if another process needs more than the ~10.5GB I have left, that memory gets released).

    (Yeah, I can't even post links to different parts of the forum yet.)
     
  2. murphy78

    murphy78 MDL DISM Enthusiast

    This seems like a good way for big file-hosting setups like servers to keep the size down.
    I'm not sure the average end-user would ever benefit from this, as I think you have to run the dedup passes for quite a stretch every so often.
     
  3. generalmx

    generalmx MDL Novice

    #3 generalmx, Apr 16, 2014
    Last edited: Apr 16, 2014
    (OP)
    There are a few ways Microsoft makes it a bit less strenuous on the average system:
    - Longer initial deduplication for maximum savings.
    - Copy-On-Write deduplication from then on.
    - Weekly low-priority deduplication "scrub" in the background which works with your existing defrag & optimize process.

    Note: The deduplication process does take up a fair amount of memory: it starts at ~350MB and can use up to 25% of total memory for the background process, or up to 50% if you run it manually. It averages about 100GB processed per hour per volume. However, these resource needs apply mainly to scrubbing and the initial creation of the deduplication metadata; otherwise only a very lightweight copy-on-write process runs.
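
    To put the figures in that note into perspective, here is a back-of-the-envelope Python sketch. The function names are made up for illustration, the 16GB of RAM is a hypothetical machine rather than anyone's actual setup, and the 1.45 TB input is simply the optimized size from the first post converted with 1 TB = 1024 GB.

```python
def estimate_hours(data_gb, gb_per_hour=100.0):
    """Rough duration of a full pass at the quoted ~100GB per hour per volume."""
    return data_gb / gb_per_hour

def memory_ceiling_gb(total_ram_gb, manual=False):
    """Upper bound on memory use: 25% for background jobs, 50% for manual ones."""
    share = 0.50 if manual else 0.25
    return total_ram_gb * share

# About 1.45 TB of in-policy data, as in the Get-DedupStatus output earlier.
hours = estimate_hours(1.45 * 1024)

# Hypothetical machine with 16GB of RAM.
background_cap = memory_ceiling_gb(16)
manual_cap = memory_ceiling_gb(16, manual=True)

print(f"{hours:.1f} h, {background_cap} GB background, {manual_cap} GB manual")
```

    In other words, a first pass over a volume that size is roughly a 15-hour job, which is why it matters that the heavy work only happens up front and during scrubs.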

    In addition, the deduplication filter actually makes the NTFS volume work a bit more like Microsoft's newer ReFS, with additional checksumming of data and weekly repairs (from the scrubs), as well as making duplicates of certain critical metadata, thus increasing overall data integrity.

    However, it does have limitations: it doesn't work on boot or system volumes, on small files (<32KB), or on files encrypted with EFS.
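
    If you want a rough idea of how many of your files those limits would rule out before bothering with any of this, here is a minimal Python sketch that walks a folder and sorts files into eligible, too small, or EFS-encrypted. It is not how the dedup filter itself decides anything; the function names are hypothetical, the encrypted check relies on the Windows-only `st_file_attributes` field (treated as 0 elsewhere), and it does not check the boot/system volume restriction.

```python
import os

MIN_SIZE = 32 * 1024               # files under 32KB are skipped, per the post
FILE_ATTRIBUTE_ENCRYPTED = 0x4000  # Windows attribute bit set on EFS files

def classify(path):
    """Label one file as 'encrypted', 'small' or 'eligible'."""
    st = os.stat(path)
    # st_file_attributes only exists on Windows; assume no flags elsewhere.
    attrs = getattr(st, "st_file_attributes", 0)
    if attrs & FILE_ATTRIBUTE_ENCRYPTED:
        return "encrypted"
    if st.st_size < MIN_SIZE:
        return "small"
    return "eligible"

def survey(root):
    """Count files under root by category."""
    counts = {"eligible": 0, "small": 0, "encrypted": 0}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            counts[classify(os.path.join(folder, name))] += 1
    return counts
```

    Calling `survey("I:\\")` on a data volume would give counts comparable to the "Small files (<32KB)" line in the DDPEVAL report, though the real tool remains the better judge of actual savings.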

    I'd check it out if you haven't already :)