I don't know about defrag, but zipping the files up into an archive using 7-Zip with the store option (zero compression) seems promising.
I just ran a test on a 42 GB data set containing 23,000-odd email-related files (msg, eml, ost, pst, etc.). It took about 1 hour to create the archive, about 15 minutes to copy it across, and another 20 minutes or so to unzip it on the other end.
All in all a huge time saving, as previously moving 40 GB of data made up of lots of small files like this could take hours and hours.
Of course one smooth run does not a miracle make, but it is very promising nonetheless.
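For reference, something like this from the 7-Zip command line should reproduce the store-only archive (the -mx0 switch means no compression; the paths are just placeholders for my email set):
c>7z a -mx0 emailset.7z "D:\EmailData\*"
and unpacking on the far end is just:
c>7z x emailset.7z -oD:\Restored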
This sounds like a more generic issue of "small files over network overhead".
That is, it has less to do with the copying software and a lot to do with the nature of the copied files, in which case compressing them (or simply assembling them into a "container") is a big advantage.
The "right" way to do that would be probably to use a TCP pipe, like
http://copy_millions_tiny_files.onlinephpfunctions.com/
But have you considered instead the idea of using an AoE approach?
http//
Or iSCSI
http//
Though I believe that AoE is faster (and simpler to set up/manage).
For the record, since I originally misunderstood this the other way round (oops), in both AoE and iSCSI the "Target" is on the "Server side" and the thing on the "Client side" is an "Initiator".
jaclaz
As an alternative, and one that I've made use of recently in a teaching scenario to move Debian DVD images out to student machines (admittedly running Linux, but there are Windows versions), you could try using NetCat / NCat. You'll need it installed on both the source and the destination, then send directly across the network using TCP:
On the receiver:
c>nc -lp [receiverPort] > myhugeimage.iso
On the sender:
c>nc [receiverIP] [receiverPort] < myhugeimage.iso
This is about as basic as it gets … so all higher-level functions are down to you (compression, hash checking, etc. - note it is TCP, so if it succeeds, it should be uncorrupted; however, if it fails, it won't restart…)
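If you do want a quick hash check afterwards, the built-in certutil on Windows should be enough - run the same thing on both ends and compare the output (the filename here just follows the example above):
c>certutil -hashfile myhugeimage.iso SHA256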
As an alternative,
Not much "alternative". 😯
I just posted about a way to use a TCP tunnel (with the additional piping through cpio or gzip, needed in the case of many small files, which is the issue at hand):
http://copy_millions_tiny_files.onlinephpfunctions.com/
Using ttcp or nc doesn't change things much, I believe.
The point is that transferring a (supposedly biggish) .iso works with *everything* and is fast anyway, while transmitting the same amount of data in the form of zillions of small files causes a lot of overhead in any case and slows the transfer down to a crawl.
Among the "more common" protocols, FTP is of course the most suited in theory, but in practice it becomes slower than HTTP where a large number of small files is involved. ?
jaclaz
As an alternative,
Not much "alternative". 😯
I just posted about a way to use a TCP tunnel (with the additional piping through cpio or gzip, needed in the case of many small files, which is the issue at hand)
Ah, yes, to be fair you did …
I don't know how well one could go about things using something like Cygwin; however, on Linux this would quite possibly work well, piping a tar across the network along the lines of …
On the destination:
nc -l -p [receivingPort] | tar Sxzpf -
and on the source:
tar Sczpf - . | nc [receivingIP] [receivingPort]
As a secure solution, by the way, you could do similar things, piping over ssh rather than nc. That would theoretically compress the traffic as well - however, how much additional compression you would get on top of the gzip in the tar command I don't know - probably not a great deal!
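For the ssh variant, assuming sshd is running on the destination and that [user], [receivingIP] and /destination/path are filled in for your setup, something like this should do it (the -C asks ssh to compress, though on top of the gzip in tar it probably adds little):
tar Sczpf - . | ssh -C [user]@[receivingIP] "tar Sxzpf - -C /destination/path"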
Ah, yes, to be fair you did …
… however, on Linux this would quite possibly work well, piping a tar across the network along the lines of …
…
As a secure solution, by the way, you could do similar things, piping over ssh rather than nc.
….
Actually the given link suggests the use of cpio or gzip because there are possible issues with tar and long filenames, AND it excludes SSH due to its overhead:
http://copy_millions_tiny_files.onlinephpfunctions.com/
Why CPIO? Why TTCP?
CPIO is in my experience slightly faster than TAR, and does not have any problems with long filenames. When I used TAR the filenames became a problem.
TTCP has almost no overhead. It is not secure like SSH, but SSH has lots of overhead and consumes much CPU power (which can become a bottleneck). TTCP was way faster for this task, and it gives you a little extra: the speed results afterwards (TTCP is officially used for bandwidth testing).
Cannot say how accurate that report is today (maybe the "tar issue" pertained to an older version), but the overhead of SSH should remain, and since the OP's transfer is across a local network, I doubt there is any need for it to be "secure".
On Windows, there are (besides Cygwin) "native" gzip, ncat and ttcp ports, and there are also, here and there, some interesting projects (JFYI):
http//
but I never had an occasion to test them, and surely not for this kind of network usage.
jaclaz
P.S. I found a couple of test reports that confirm what was initially suggested, i.e. that among the Windows GUI tools RichCopy seems to be the winner for network transfers:
http//
https://
https://
but archiving the files and transferring the archive might still be faster.
For what it's worth, I just finished moving close to 24 TB of data consisting of hundreds of millions of files… Use rsync. It supports block-level copying and resume/append, retains timestamps and permissions, and can compress (if the destination supports it).
The whole copy finished in a couple of days and it was only 1 command for the whole process. If the copy stops for some reason, just rerun the command and it will pick up where it left off.
Windows ports are available also…
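For anyone wanting to try the same approach, a typical invocation for this kind of job would look something like this (host and paths are placeholders, not my exact command):
rsync -a --partial --progress --compress /source/data/ user@destination:/backup/data/
-a preserves timestamps, permissions and so on, --partial keeps partially transferred files so a rerun can pick up where it left off, and --compress only really helps if the data isn't already compressed.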
…
Use rsync.
…
Windows ports are available also…
The "direct" correspondent to rsync under Windows should be DeltaCopy
http//
(which is a frontend for rsync in a Cygwin port)
jaclaz
… It's slow with many very small files–just like I expect any other tool to be–but with the 2+ GB image files I'm usually transferring, it runs at near wire speed.
Is it slow because, with lots of little files, it has to run a CRC on each one to verify the file's integrity? Or because many smaller files could potentially mean more fragmentation?
Kind of late reply, but no, robocopy is not performing a CRC on anything as far as I know. It does copy over all the metadata as well as the file, and with small files that metadata can be relatively large compared to the file itself. Also, it's a serial process–one thing at a time.
The idea of copying the files into an archive is intriguing. Not sure why you wouldn't compress it too if you were going to take the time to archive it. With today's processors, compression should be quick.
I prefer to copy everything as-is even if it takes longer. That way, I preserve as much as possible, such as the metadata.
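For what it's worth, the robocopy line I'd use for that kind of as-is copy is roughly this (paths are placeholders, and /COPYALL needs suitable privileges for the ownership/auditing parts):
c>robocopy C:\Source \\Server\Share\Dest /E /COPYALL /R:1 /W:1
/E takes subdirectories including empty ones, /COPYALL copies data, attributes, timestamps, security, owner and auditing info, and /R:1 /W:1 just keep it from stalling forever on locked files.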
Yeah, I could probably use some compression, but I didn't really have any idea how long the storing process was going to take, so I just opted for the fastest option. Some of the data is probably highly compressible, so it would make sense to try that as well.
The real problem with creating the archives, though, is space. I'm not always going to have the extra space locally to write the archives to, as in many cases the reason I'm moving the data is to free up space on the local drive.
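One workaround I might try is pointing 7-Zip's output straight at the destination share, so the only big file ever written is the remote one - something along these lines (the share name is just a placeholder):
c>7z a -mx0 \\Server\Share\emailset.7z "D:\EmailData\*"
The small-file reads still happen locally, but the write side of the transfer becomes one large sequential file.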
RichCopy seems to perform well, but it's quite old. TeraCopy works pretty well 99% of the time too, but it doesn't seem quite as fast as RichCopy, possibly due to RichCopy's ability to use multiple threads.