Working Deduplication files for Microsoft Windows 10

Discussion in 'Windows 10' started by dreamss, Oct 4, 2014.

  1. MimarSinan

    MimarSinan MDL Novice

    Nov 9, 2019
    10
    4
    0
    Has anyone gotten data dedup working for Windows 11 22H2? I've been trying with the offline installation method for the Dedup_22509 package, after manually installing the certificate in the Microsoft Development Root Certificate Authority 2014.crt file. Finally got the offline installation segments to work without errors after a bit of finagling with scratch folders and image paths (the default CMD script wouldn't work as-is). I also pre-updated the dedup.sys in the Dedup_22509 folder with the one from the dedup.18362.1 package before attempting any of this stuff. But booting back into Win11, net start dedup claims the service cannot be found. And dedup doesn't work anywhere, needless to say. Would love any help/insights folks might have. Thank you!!!
     
  2. MimarSinan

    MimarSinan MDL Novice

    Nov 9, 2019
    10
    4
    0
    Great news! I have this working. It was quite trying, so I thought I'd share my findings here for the benefit of everyone else dropping by. My findings are validated on both the original Windows 11 release and the updated 22H2 release from late this year.

    First and foremost - standing on the shoulders of giants here. This guide makes use of the Dedup_22509 package shared earlier on this thread.

    1. Make sure you have an extra empty drive with sufficient scratch space, e.g. the very drive you intend to deduplicate. Alternatively, you can just shrink your main Windows partition; something like 8 GB should be more than enough. The precise how-to on shrinking is "left as an exercise to the reader", as it is out of scope for this tutorial.

    2. Boot into the Windows PE recovery environment. This is the quickest and easiest way to get offline access to your image. Please note that I did not challenge the assumption stated earlier on this thread that the Dedup_22509 package is meant to be used on an offline image:

    i. Right-click Start, Choose Settings.
    ii. Type recovery, and choose Recovery Options.
    iii. Under Advanced startup, click Restart now, and double confirm by clicking Restart now again.
    iv. Choose Troubleshoot | Advanced options | Command Prompt.

    There, that was easy, wasn't it?

    Alert! Drive letters WILL be shuffled all around!!! Try a few commands like dir c:, dir d:, dir e:, etc. until you have found out which of your drives is the boot drive and which is the spare drive with sufficient scratch space.

    3. Edit these 4 lines in the Dedup_22509 package's CMD file installer_offline.cmd:

    dism /image=%target% /Add-Package /PackagePath=FileServer\update.mum
    dism /image=%target% /Add-Package /PackagePath=FileServer-en\update.mum
    dism /image=%target% /Add-Package /PackagePath=Dedup\update.mum
    dism /image=%target% /Add-Package /PackagePath=Dedup-en\update.mum

    You can just use Notepad for this, as it's already available in your just booted-into WinPE/WinRE instance.

    i) Replace %target% with <windows drive>:\ (*with* slash).
    ii) Add /scratchdir=<scratch drive>:\.

    So for example, if your boot drive is C: and your scratch drive is D:, edit the lines such that they read:

    dism /image=c:\ /Add-Package /PackagePath=FileServer\update.mum /scratchdir=d:\
    dism /image=c:\ /Add-Package /PackagePath=FileServer-en\update.mum /scratchdir=d:\
    dism /image=c:\ /Add-Package /PackagePath=Dedup\update.mum /scratchdir=d:\
    dism /image=c:\ /Add-Package /PackagePath=Dedup-en\update.mum /scratchdir=d:\

    This is by NO MEANS STANDARD. On my systems, I had to use stuff like:

    dism /image=e:\ /Add-Package /PackagePath=FileServer\update.mum /scratchdir=c:\

    etc. Ensure you check because again, drive letters WILL be shuffled around.
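
    For anyone who would rather generate the edited lines than retype them, the two substitutions in step 3 can be sketched in Python. This is only an illustration and not part of the Dedup_22509 package: the helper names drive_root and patch_dism_line are made up here, Python is not available inside WinPE/WinRE (so you would run this beforehand on a full system), and the drive letters you pass in are still whatever you determined in step 2.

```python
def drive_root(letter):
    # Normalise "c", "c:" or "c:\" to "c:\" (*with* the trailing slash).
    return letter.strip().rstrip("\\").rstrip(":") + ":\\"


def patch_dism_line(line, image_drive, scratch_drive):
    """Hard-code the image path and append a scratch dir to one DISM line.

    Lines that do not contain %target% are returned unchanged.
    """
    if "%target%" not in line:
        return line
    patched = line.replace("%target%", drive_root(image_drive))
    # Only append /scratchdir if the line does not already carry one.
    if "/scratchdir" not in patched.lower():
        patched = patched.rstrip() + " /scratchdir=" + drive_root(scratch_drive)
    return patched
```

    Calling patch_dism_line on the first of the four original lines with image drive "c" and scratch drive "d" produces the first edited line shown in the example above.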

    4. Run the CMD file. When it prompts for the drive to process, type <boot drive>: only (*without* slash).

    So for example, at the drive prompt, type:

    C:

    And press enter if you determined in #2 above that C: was your boot drive. In my case, I had to type E: and press enter, because that was the letter assigned to my boot drive. Perfectly normal in its oddity, for sure.

    5. Search for dedup.sys starting in <boot drive>:\Windows and replace both instances (one in drivers, one in some long-named WinSxS folder) with the dedup.sys file from the dedup.18362.1 package shared earlier in this thread. Do this AFTER running the script, or DISM will fail on the pre-applied file. Again, standing on the shoulders of giants here; thanks to everyone who contributed.

    So you will for example type:

    ren c:\windows\system32\drivers\dedup.sys -dedup.sys
    copy <full path to 18362.1 dedup.sys file> c:\windows\system32\drivers\dedup.sys

    And then the same commands again for a much longer folder path, found inside the Windows folder of your boot drive. Need a hint on how to search? Assuming c: is your boot drive, just type:

    c:
    cd c:\windows
    dir dedup.sys /s
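
    The same recursive search can also be sketched in Python, for example if you inspect the Windows folder from a full system rather than from WinPE/WinRE (where Python is not available). The function name find_files is hypothetical, and reading some system folders may need an elevated prompt.

```python
import os


def find_files(root, name="dedup.sys"):
    """Return every path under root whose file name equals name (case-insensitive)."""
    matches = []
    for folder, _dirs, files in os.walk(root):
        for entry in files:
            if entry.lower() == name.lower():
                matches.append(os.path.join(folder, entry))
    return sorted(matches)
```

    For example, find_files(r"C:\Windows") plays the role of dir dedup.sys /s above and should list the two copies that step 5 asks you to replace.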

    That's it! Now you can just type exit at the command prompt, choose to return to Windows 11, and you're done. The service shall already be running by the time you're back in.

    Pitfalls and Gotchas (yes, these burned hours and days and weeks of my time, resulting in my desperate, immediately preceding plea above):

    1. If you don't use a scratch disk, the operation really will fail randomly. DISM already warns you about this. It may seem to work or just overtly fail. Just use a scratch disk and get it over with. You already have another disk you're trying to compress anyways, don't you?

    2. The WinPE/WinRE method I described in #2 above is really the easiest way to boot into an offline environment for your boot disk, without requiring external USB drives, etc. But I suppose if you already have an environment you prefer with the latter, you wouldn't care much for this.

    3. In my tests, the script in #3 above just doesn't work properly until we hard-code the necessary paths, assign the scratch space, etc. Don't get me wrong - it's a great script and we still need it to set the right ACLs, etc. But I just couldn't get it to work without the manual finaglings I've mentioned above. Also pay extra attention to the slashes.

    4. Again for #4 above, be careful. Using the slash here again messes things up. Just type your drive letter and a colon, without a slash.

    5. I did test for this scenario specifically - you must use the 18362.1 version of dedup.sys; without it you will get a certificate error after returning to Windows. And again, you must not pre-replace this file inside the Dedup_22509 package folder in the interest of saving time. DISM will complain and break your process if you do so. Been there, done that, as you can probably tell.

    Some final observations:

    a. There is NO NEED to install the custom certificate (from year 2014 or whatever). In my testing, it made no difference. Maybe it'd help with #5 above, but I really didn't test for that. Installing the certificate was hard enough already - had to disable Windows Defender to get that to work. Best just avoid that step altogether.

    b. Again, this has been tested and found working with the 21H2 and 22H2 releases of Windows 11.

    Enjoy!

    BTW, since you're reading this, chances are you're into data compression. I'm the purveyor of patented transparent disk compression (WIMBOOT based, not data deduplication - although my package also includes a GUI wrapper for data deduplication, so you don't have to install and mess around with "power"shell command"lets"). I'd hate to come across as a spammer here so I won't say more, until one of you asks for it (in which case I'm more than glad to share what I've built and what it does).

    Last but not least, thanks again to everyone - truly standing on the shoulders of giants here - I've been following this thread for almost a decade now adding dedup to my drives (gives me 4:1 compression ratios for my VM library so has become something truly indispensable), real glad to have this opportunity to contribute back to the community.
     
  3. MimarSinan

    MimarSinan MDL Novice

    Nov 9, 2019
    10
    4
    0
    Upgrading a machine today I saw an interesting issue - dism crapped out on the first invocation, checked everything, looking good. Reading through the dism logs saw that it mentioned some issue due to "pendingfilerename"s. It just would not take. Rebooted back into Windows, in hopes the pending renames would be resolved. Came back into the PE/RE environment, tried again, worked just fine.

    Had to share as it had me freaked out for a moment and might happen to others too. Apparently I must add instruction 4.5 above:

    4.5 If you get any errors with dism, abort the batch file with ctrl+c if you can (confirming "y"es for the abort confirmation request) and reboot back into Windows. Start the process over again at step #2, skipping step #3 this time and proceeding straight to step #4 right after #2. Pray it works this time! Loop as many times as you can handle without going insane :p
     
  4. acer-5100

    acer-5100 MDL Guru

    Dec 8, 2018
    3,706
    2,662
    120

    You are partly confused about this argument.

    #1 By default, deduplication does plain deduplication AND compression, using LZX compression (the same thing, introduced with Windows 10, that CompactOS uses).

    #2 There is no such "transparent" compression here: decompression is done transparently, but compression must be done "manually", via the compact command, the deduplication service, or restoring an image using wimlib.

    The old NTFS compression is really transparent: once you mark a disk/folder/file as compressed (the blue files), everything happens automagically, just by saving the file. But that algorithm was designed in the 1990s, so it has a low compression ratio (around 1.3x-1.4x), though it requires negligible computing power.

    With the newer Windows 10 LZX (and Xpress*) approach, more power is needed (especially during compression) and the compression ratio is amazing, from 1.8x to 3x on average, but once you change a single bit of a compressed file, it will be saved uncompressed.

    That's why I use both methods on OS drives: to limit the growth in used space when the OS accesses and rewrites some files.
     
  5. MimarSinan

    MimarSinan MDL Novice

    Nov 9, 2019
    10
    4
    0
    You're preaching to the choir here with a dash of nitpicking on top (I do consider any form of compression that doesn't need manual decompression for data access as transparent).

    Yes, my patented product layers LZNT1, XPRESS*K, LZX, etc. with a view to one-click operation in the least amount of time for maximum space or speed.

    Happy to tell you more or share the patent number so you can read up about it if you're interested in learning about something new.
     
  6. acer-5100

    acer-5100 MDL Guru

    Dec 8, 2018
    3,706
    2,662
    120
    #826 acer-5100, Dec 16, 2022
    Last edited: Dec 16, 2022
    Feel free to consider anything you prefer, in the way you prefer. No one will die either way.

    but

    If we talk about transparent compression, we are talking about... uh... compression, something that is supported natively by the filesystem (NTFS, BTRFS, ZFS and so on).

    The compactOS family of compressors are something different, something "glued" on top of the FS, just like the MS deduplication.

    There is no nitpick involved here, just explaining clearly the differences to people who may read this thread.

    The last time I read a message that sounded like the above quote, it was from a guy who tried to sell some programs as breakthrough novelties when they were just GUIs for well-known MS features. He even used the zipfolder name, which was a revolutionary program back in the days of Win95.
     
  7. pm67310

    pm67310 MDL Guru

    Sep 6, 2011
    2,372
    1,590
    90
    Or, for something more stable, use Windows Server 2022 Standard (converted to workstation, with the Azure Stack HCI certificate) to have a full-featured OS with no activation.
     
  8. MimarSinan

    MimarSinan MDL Novice

    Nov 9, 2019
    10
    4
    0
    Is (s)he sure (s)he feels this way, as (s)he seems to have this compulsion to "correct" me across a decade of time and countless forums online?

    Ah yes, the infamous but, belying what was just stated.

    What follows is the presentation of a subjective view as fact, ostensibly for the public good:

    While in reality, the reparse points used by the "glued" compressors for example, are as much a "native" part of the file system as the three decades old LZNT1 that (s)he is compelled to glorify.

    Why would someone praise an utterly obsolete, performance killing approach to transparent disk compression; which was, by the way, largely inferior in space savings even when compared to its peers of the time (Stacker)?

    Simply because this one must use any and all means at his or her disposal to discredit an opposing view, including:

    Character assassination. Rest assured I'm the same "guy", and having learned my lesson, I started here by establishing that my intent isn't selling any software. My patent from the USPTO establishes the novelty of my product (this one should maybe start a dispute process there next), with the only reason I'm back here today being:
     
  9. MimarSinan

    MimarSinan MDL Novice

    Nov 9, 2019
    10
    4
    0
    #829 MimarSinan, Dec 17, 2022
    Last edited: Dec 17, 2022
    ...to address yet a new problem I'm having today with Windows 11 data deduplication. Repeating my same instructions on literally the same hardware and virtually the same operating system image today where they had worked perfectly before, the dedup service fails to start with a certificate error. So I am inclined to find this method rather unstable at this time indeed - not sure if installing the certificate would help with this today, and even if so, why it failed today without the certificate when it had worked yesterday.

    Sadly using the Server branch is not an option as reinstalling everything currently present on the Windows 10 instance being upgraded to Windows 11 is out of the question (not only because of the work involved but because fresh reinstalls of some line of business apps are no longer possible). Unless you know of a way to go from Windows 10 Pro to Server 2022?

    Edit: DOH! Method exonerated once again. Had forgotten the dedup.sys step. Pitfall alert! If you forget to do so, you will get a certificate error. Didn't test with installing the certificate, just replaced dedup.sys.
     
  10. maeglin

    maeglin MDL Novice

    Dec 14, 2012
    10
    1
    0
    #830 maeglin, Dec 17, 2022
    Last edited: Dec 18, 2022
    All 4 modules install successfully, and I can see both deduplication services enabled in Services, but I can neither use the PowerShell cmdlets nor open the deduplicated files (the files themselves are fine; I can open them in Server 2022). Dedup.sys is 18362.1.

    The error I am getting in PowerShell:
    "Get-DedupStatus: The 'Get-DedupStatus' command was found in the module 'Deduplication', but the module could not be loaded. For more information, run 'Import-Module Deduplication'."
    If I try to run "Import-Module Deduplication", it fails again with "Failed to generate proxies for remote module 'Deduplication'".

    EDIT: The problem is most likely caused by dedup.sys certificate, as I am getting this error when I run `net start dedup`

    `A required certificate is not within its validity period when verifying against the current system clock or the timestamp in the signed file.`

     
  11. acer-5100

    acer-5100 MDL Guru

    Dec 8, 2018
    3,706
    2,662
    120

    I had a few doubts about that, given that, based on my old searches from many years ago, you are the only source that calls the old NTFS compression NT1 (I'm not saying you're wrong here, I just didn't find other sources).

    Still your SW is the proof of how rotten the patent system is.

    Did you perhaps pay the authors of the original (GREAT) "zip folders", or are you just using their fame to mislead the users of your SW?
     
  12. MimarSinan

    MimarSinan MDL Novice

    Nov 9, 2019
    10
    4
    0
    #832 MimarSinan, Dec 18, 2022
    Last edited: Dec 18, 2022
    That's the exact same error I got when I had forgotten to replace dedup.sys. You sound like you're sure you replaced it, but I would suggest you double check that you replaced both files (and maybe confirm both their paths here just to be sure). The process is a bit overly complex so I wouldn't be surprised if one of the many steps went missing, we're all human after all :)
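
    One way to double check that both copies really were replaced is to compare hashes against the dedup.sys from the 18362.1 package. The Python sketch below is an illustration only: the function names are made up for this example, it assumes you run it from a full system with read access to the Windows folder (possibly elevated), and it only reports differences without changing anything.

```python
import hashlib
import os


def sha256_of(path):
    """Hash a file in 1 MiB blocks so large files are not read in one go."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def mismatched_copies(reference, root, name="dedup.sys"):
    """Return paths under root named `name` whose contents differ from `reference`."""
    wanted = sha256_of(reference)
    stale = []
    for folder, _dirs, files in os.walk(root):
        for entry in files:
            if entry.lower() == name.lower():
                candidate = os.path.join(folder, entry)
                if sha256_of(candidate) != wanted:
                    stale.append(candidate)
    return sorted(stale)
```

    An empty list means every dedup.sys found under the given folder matches the reference file; any path it returns is a copy that still needs replacing. Backups renamed to -dedup.sys are ignored because the name no longer matches.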

    Fortunately I haven't been needing to worry about the commandlets because my software does include a GUI to create/manage/uncompress dedup drives without any command line finagling. One nice recent touch is it'll progressively copy a file onto a dedup drive that normally wouldn't have enough space to hold the raw uncompressed file, but does otherwise have enough space to hold the compressed file.

    So let's say your dedup drive size is 100 GB but your file is 150 GB even though you know it compresses down to 50 GB. The tool copies over the 150 GB file in, say, 4 GB increments - recompressing after each increment and adding more data after - hence the file is "massaged in" gradually to the dedup drive.
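
    The arithmetic of that example can be sketched in Python. To be clear, this is not the tool's actual code, which is not shown in this thread; the increments function is hypothetical and only plans the byte ranges, leaving out the copying and the recompression that would run between increments.

```python
def increments(total_bytes, step_bytes):
    """Yield (offset, length) pairs that cover total_bytes in steps of step_bytes."""
    if step_bytes <= 0:
        raise ValueError("step_bytes must be positive")
    offset = 0
    while offset < total_bytes:
        # The final increment may be shorter than the step size.
        length = min(step_bytes, total_bytes - offset)
        yield offset, length
        offset += length


GB = 1024 ** 3
plan = list(increments(150 * GB, 4 * GB))
print(len(plan))  # 38 increments: 37 of 4 GB plus a final one of 2 GB
```

    Whether each pass actually fits on the 100 GB drive depends on how well the already-copied data has compressed by then, which this sketch does not model.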
     
  13. MimarSinan

    MimarSinan MDL Novice

    Nov 9, 2019
    10
    4
    0
    The algorithm employed is officially called LZNT1, but you are correct you wouldn't know. It takes an open mind and tons of hard work to actually find out the truth about things, instead of parroting learned negativity. Don't feel bad though, we're only human after all and can only do what we've been taught. I don't mean that in a sarcastic way but a literal way.

    Ha, nice jab. Maybe you should spend more time doing something that actually means something to you instead of taking a poop on other people's hard work. You'll feel better. Try to fix the broken patent system for example, if you care about that so strongly? In the process, you may even grow as a person.

    I'm not even going to bother to correct you here - you're free to spend your time any way you want, as am I (even if you had a non-zero impact on my bottom line).
     
  14. acer-5100

    acer-5100 MDL Guru

    Dec 8, 2018
    3,706
    2,662
    120

    I'm still waiting for a source different than your website/ spam messages on forums.

    Just provide a link (and stop pretending to be other than a spammer)
    Kid, I've been in IT since the early 80s, and since then I have never stopped learning and sharing my knowledge (for free). For sure I will still do the same until my last day, but for sure I have nothing to learn from people like you.

    People who are just used to living off recycling others' knowledge / fame / copyrights and pretending it is their own.

    You failed even here to link my message where I explained how to install the dedup packages, and I shared the right dedup.sys file.

    Still you refuse to reply to the simple question.

    Did you pay for the rights of the "zipfolder" name or you stole it?

    My English is poor but the question is pretty simple and understandable.
     
  15. MimarSinan

    MimarSinan MDL Novice

    Nov 9, 2019
    10
    4
    0
    So sorry for the delay, I guess I actually have a life.

    You mean you've figured everything out but this?

    Says the 15 year old Python script kiddie. Some people don't age a day, and I don't mean that in a good way. If you really were around back then, you'd know of the analogues of my product and their value.

    So noble of you to stand up for their rights! Really when are you going to get around to filing that patent dispute with the USPTO?

    Oh good! I didn't realize that was yours. Ouch! You must be hurting pretty bad about having inadvertently helped the kind of person you think I am.

    Shush! You are not allowed to mention the name of the product, or people are going to understand you're really just a shill for me, spamming the forum.
     
  16. acer-5100

    acer-5100 MDL Guru

    Dec 8, 2018
    3,706
    2,662
    120
    As usual.

    A wall of rant about people who burst your bubbles. And zero answers.
     
  17. MimarSinan

    MimarSinan MDL Novice

    Nov 9, 2019
    10
    4
    0
    Dude or dudess, you are very confused! I am a spammer, remember, and you don't trust a thing I say (or build)? Why ask me questions begging answers :D
     
  18. acer-5100

    acer-5100 MDL Guru

    Dec 8, 2018
    3,706
    2,662
    120
    Personally, I never liked RAID schemes other than RAID 1. I have seen too many nightmare scenarios in 30+ years of using them for my taste.

    Just stick with RAID 1 and use deduplication, which is robust enough not to worry about. (Obviously deduplication is not effective on multimedia files, but usually losing some downloaded movies is not a tragedy, so maybe it is not worth duplicating them at all, and Drivepool allows you to configure duplication per folder.)

    P.S. If you use Drivepool's duplication + MS' deduplication (duplicating + deduplicating sounds like a joke but it isn't ;)), you must dedupe the pool members, not the pool itself.
     
  19. Alexey_Orbita

    Alexey_Orbita MDL Novice

    Mar 9, 2023
    1
    0
    0
    how to extract dedup files?