Python script / reverse engineering: Repairing a corrupted Zip-file with known compression and known uncompressed content  

deady1000
(@deady1000)
New Member

Hey guys,

I am working on a solution for a problem with Zip files that end up corrupted when a specific program (a game server) crashes. Each Zip file contains a log file (actually flight data) in txt format, which can be read with a simple text editor such as Notepad or with TacView.

When the exporting program crashes (which happens sometimes), the Zip file is never completed and is therefore corrupted, so no program can read the txt file. In HxD I can see that the Zip file simply ends in the middle of the compressed txt file. So it is not possible to open or extract that file, and the archive is shown as empty.

I know a workaround (using Fiddler off-label, see https://textslashplain.com/2015/12/15/repairing-corrupt-zip-files/) that makes it possible to repair these Zip files and extract the txt file afterwards, but I want to write my own script (ideally in Python using zlib or something similar) to do that automatically. If this software can do it, anyone can do it, right?? The question is just: how do I decompress this text file via Python?

The script should read the bytes of the Zip file, start decompressing after a certain byte string, in this case “61 63 6D 69”, which is ‘acmi’ in text form (the file format of the txt file I want to rescue), and write the output to a newly created text file. The compression is DEFLATE (ultra, for example). The problem is that I currently don’t know how to decompress a block of data without having a header (?). If the script were able to decompress the deflated block, it would be easy to export the uncompressed text to another text file, which could then be read by Notepad and the log program (here TacView). It would be no problem if the text file is incomplete! It is okay if something gets cut off during decompression, if that is necessary!

Please check this example of the first 9 lines from the textfile:

TEXT IS DECODED WITH WINDOWS(ANSI)
HEX IS EXPORTED FROM THE TOOL HxD

UNCOMPRESSED:
-----------------------------------
FileType=text/acmi/tacview
FileVersion=2.2
0,ReferenceTime=2011-04-01T10:50:00Z
0,RecordingTime=2020-11-14T19:34:28.853Z
0,Title=BLUE-FLAG.CAUC.80s.v1.072
0,DataRecorder=DCS2ACMI 1.8.3.200
0,DataSource=DCS 2.5.6.57530
0,Author=PILOT_367366
0,Comments=Welcome to Blue Flag.\
-----------------------------------
EF BB BF 46 69 6C 65 54 79 70 65 3D 74 65 78 74 2F 61 63 6D 69 2F 74 61 63 76 69 65 77 0A 46 69 6C 65 56 65 72 73 69 6F 6E 3D 32 2E 32 0A 30 2C 52 65 66 65 72 65 6E 63 65 54 69 6D 65 3D 32 30 31 31 2D 30 34 2D 30 31 54 31 30 3A 35 30 3A 30 30 5A 0A 30 2C 52 65 63 6F 72 64 69 6E 67 54 69 6D 65 3D 32 30 32 30 2D 31 31 2D 31 34 54 31 39 3A 33 34 3A 32 38 2E 38 35 33 5A 0A 30 2C 54 69 74 6C 65 3D 42 4C 55 45 2D 46 4C 41 47 2E 43 41 55 43 2E 38 30 73 2E 76 31 2E 30 37 32 0A 30 2C 44 61 74 61 52 65 63 6F 72 64 65 72 3D 44 43 53 32 41 43 4D 49 20 31 2E 38 2E 33 2E 32 30 30 0A 30 2C 44 61 74 61 53 6F 75 72 63 65 3D 44 43 53 20 32 2E 35 2E 36 2E 35 37 35 33 30 0A 30 2C 41 75 74 68 6F 72 3D 50 49 4C 4F 54 5F 33 36 37 33 36 36 0A 30 2C 43 6F 6D 6D 65 6E 74 73 3D 57 65 6C 63 6F 6D 65 20 74 6F 20 42 6C 75 65 20 46 6C 61 67 2E 5C
-----------------------------------

COMPRESSED:
-----------------------------------
(cannot be exported here, check HxD)
-----------------------------------
2D CD 4D 4E C3 30 10 86 E1 7D 4F 91 03 90 E9 D8 CE 1F 91 BC 48 53 82 2A 05 81 A8 0B 12 42 42 91 19 8A A5 24 46 8E 53 E0 6C 2C 38 12 57 A0 86 6E E7 7D 34 DF CF D7 77 63 7A 52 9F 6F 24 3D 7D F8 65 A7 07 B3 F4 9D 3E 18 7A 5F 84 74 47 6E 32 76 94 1C F8 02 CF 6E E9 85 1C 8D 9A 94 19 48 72 64 2C C6 24 46 A6 18 96 29 96 88 0F 7F 48 5B F7 6C C6 FD 09 71 8C 8F 8E 25 8A 9D 97 22 29 79 01 45 2A 02 54 C6 F7 24 57 ED EE 22 6E DA EA 12 EA 6A 57 43 81 13 1C 18 60 1E F6 D6 9D EF FE DF 91 93 EB 7A CB AB FA 6A 13 31 28 40 00 47 3C 89 AD 9D 9D A6 D0 23 0E 29 64 90 E6 A9 08 B1 9A FD AB 75 F2 66 D3 5E AB 27 91 E5 22 CB 8E D7 DA 0E 03 8D 7E 92 F7 D4 6B 3B 50 E4 6D B4 EA 67 8A 9A BE DB C3 E3 2F
-----------------------------------

Does anyone have an idea how to decompress the given block of data? The compression is known, as it is simply done via 7z in zip format using DEFLATE/ultra. I don’t have anything else: no headers, no CRC! But I know what it should look like uncompressed, so that’s at least something, I guess.

Is this possible?

What do I need to do?

 

Thank you in advance,

Cheers!

Topic starter Posted : 15/11/2020 3:22 pm
jaclaz
(@jaclaz)
Community Legend

I would first try what happens with existing tools.

Namely, offzip might do the job:

http://aluigi.altervista.org/mytoolz.htm#offzip

jaclaz

 

Posted : 15/11/2020 4:17 pm
deady1000
(@deady1000)
New Member

Whoa, thank you!

It worked with this command:

offzip.exe -a -z -15 -Q *MYFILE*
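
For reference, the `-z -15` in the command above appears to select raw DEFLATE (negative window bits, so no zlib or gzip header is expected), and Python's zlib module accepts the same value as `wbits=-15`. Below is a minimal sketch of decompressing a headerless stream that may be cut off. The function name `inflate_truncated` and the 4096-byte chunk size are illustrative choices and have nothing to do with offzip itself.

```python
import zlib


def inflate_truncated(raw: bytes) -> bytes:
    """Decompress as much of a headerless (raw) DEFLATE stream as possible."""
    # wbits=-15 means raw DEFLATE with a 32 KiB window:
    # no zlib/gzip header is expected and no checksum is verified.
    decomp = zlib.decompressobj(wbits=-15)
    out = bytearray()
    try:
        # Feed small chunks so that text decoded before a damaged spot is kept.
        for pos in range(0, len(raw), 4096):
            out += decomp.decompress(raw[pos:pos + 4096])
            if decomp.eof:  # a complete stream ended; ignore trailing bytes
                break
    except zlib.error:
        pass  # invalid data reached; return what was decoded so far
    return bytes(out)
```

A stream that simply stops early raises no error here; the function just returns the text decoded up to the cut. Since no CRC is checked, the last few lines of the result should be treated as possibly incomplete.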

 

It exported a file which was called something like 000005.fil. After renaming the extension I was able to use this file in Tacview. Amazing. I also wrote a Python script which now automatically repairs my files in a loop and renames them.
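
The script itself was not posted. As a rough sketch of how the same repair could be done in pure Python without offzip, the code below locates the first ZIP local file header, skips the file name and extra field, and inflates the raw DEFLATE data that follows. The names `rescue_first_entry` and `rescue_folder`, the `*.zip` pattern, and the `.rescued.txt` output suffix are assumptions for illustration, and the sketch only handles the first entry of an archive.

```python
import struct
import zlib
from pathlib import Path

# Fixed 30-byte ZIP local file header: signature, version, flags, method,
# mod time, mod date, CRC-32, compressed size, uncompressed size,
# file name length, extra field length (all little-endian).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def rescue_first_entry(zip_bytes: bytes):
    """Return (entry name, recovered bytes) for the first entry of a ZIP."""
    start = zip_bytes.find(b"PK\x03\x04")
    if start < 0 or len(zip_bytes) < start + LOCAL_HEADER.size:
        raise ValueError("no complete ZIP local file header found")
    fields = LOCAL_HEADER.unpack_from(zip_bytes, start)
    method, name_len, extra_len = fields[3], fields[9], fields[10]
    if method != 8:  # 8 = DEFLATE
        raise ValueError(f"unsupported compression method {method}")
    name_start = start + LOCAL_HEADER.size
    name = zip_bytes[name_start:name_start + name_len].decode("utf-8", "replace")
    data = zip_bytes[name_start + name_len + extra_len:]

    decomp = zlib.decompressobj(wbits=-15)  # raw DEFLATE, no header
    out = bytearray()
    try:
        for pos in range(0, len(data), 4096):
            out += decomp.decompress(data[pos:pos + 4096])
            if decomp.eof:
                break
    except zlib.error:
        pass  # keep whatever was decoded before the damage
    return name, bytes(out)


def rescue_folder(folder: str) -> None:
    """Try to rescue every *.zip in a folder (hypothetical naming scheme)."""
    for path in sorted(Path(folder).glob("*.zip")):
        try:
            name, data = rescue_first_entry(path.read_bytes())
        except ValueError as exc:
            print(f"{path.name}: skipped ({exc})")
            continue
        out_path = path.with_name(path.stem + ".rescued.txt")
        out_path.write_bytes(data)
        print(f"{path.name}: recovered {len(data)} bytes of {name}")
```

Parsing the header lengths avoids searching for the ‘acmi’ byte string, which could in principle also occur elsewhere in the file.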

 

Thank you!

Topic starter Posted : 15/11/2020 10:45 pm
jaclaz
(@jaclaz)
Community Legend

Good, you are welcome.

Maybe you should look at the "server side" and see whether it is possible to disable compression of the logs, i.e. have it save them as plain txt and have an external script compress them.
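
Such an external script could be quite small. The sketch below uses Python's standard zipfile module and assumes the server can be configured to write plain txt logs, which has not been confirmed here. The function name `compress_finished_log` and the output naming are hypothetical.

```python
import zipfile
from pathlib import Path


def compress_finished_log(txt_path) -> Path:
    """Zip a finished plain-text log next to the original file."""
    src = Path(txt_path)
    dst = src.with_suffix(".zip")
    # ZIP_DEFLATED is the same DEFLATE method the corrupted archives used.
    with zipfile.ZipFile(dst, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(src, arcname=src.name)
    return dst
```

Because the archive is only written once the log is complete, a server crash would at worst leave a readable, partly written txt file.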

jaclaz

Posted : 16/11/2020 9:02 am