C# - DeflateStream advancing underlying stream to end
I'm trying to read git objects out of a git pack file, following the format for pack files laid out here. Once I hit the compressed data, I run into issues. I'm trying to use System.IO.Compression.DeflateStream to decompress the zlib-compressed objects. I ignore the zlib header by skipping over the first 2 bytes; for the first object, anyway, these 2 bytes are 789C. Now the trouble starts.
1) I only know the size of the decompressed objects. The Read method documentation on DeflateStream states that it "reads a number of decompressed bytes into the specified byte array." That is what I want, but I see people setting count to the size of the compressed data, so one of us is doing it wrong.
2) The data I'm getting back is correct, I think (the human-readable data looks right), but it's advancing the underlying stream I give it all the way to the end! For example, I ask for 187 decompressed bytes and it reads the remaining 212 bytes, all the way to the end of the stream. That is, the whole stream is 228 bytes, and the position of the stream at the end of the deflate read of 187 bytes is 228. I can't seek backwards, because I don't know where the end of the compressed data is, and not all the streams I use are seekable. Is it expected behavior to consume the whole stream?
According to the page you reference (I'm not familiar with this file format myself), each block of data is indexed by an offset field in the index file. Since you know the length of the type and data length fields that precede each data block, and you know the offset of the next block, you also know the length of each data block (i.e. the length of the compressed bytes).
That is, the length of each data block is simply the offset of the next block minus the offset of the current block, minus the length of the type and data length fields (however many bytes that is…according to the documentation, it's variable, but you can compute that length as you read it).
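A minimal sketch of that arithmetic in C#. The names `PackMath`, `ReadObjectHeader`, and `CompressedLength` are made up for illustration. The header layout is my reading of the pack format documentation (first byte: a continuation bit, a 3-bit type, and the low 4 bits of the size; each following byte carries 7 more size bits), so check it against the page you're working from. It also assumes a non-delta object, that you have the offsets sorted by position in the pack, and that for the last object you use the start of the pack trailer as the "next" offset.

```csharp
using System;
using System.IO;

static class PackMath
{
    // Reads the variable-length type/size header at the current position.
    // Returns the number of header bytes consumed.
    public static int ReadObjectHeader(Stream pack, out int type, out long size)
    {
        int b = pack.ReadByte();
        if (b < 0) throw new EndOfStreamException("Pack ended inside an object header.");

        int consumed = 1;
        type = (b >> 4) & 0x7;   // bits 4-6: object type
        size = b & 0x0F;         // bits 0-3: lowest four bits of the decompressed size
        int shift = 4;

        while ((b & 0x80) != 0)  // high bit set: another size byte follows
        {
            b = pack.ReadByte();
            if (b < 0) throw new EndOfStreamException("Pack ended inside an object header.");
            consumed++;
            size |= (long)(b & 0x7F) << shift;
            shift += 7;
        }
        return consumed;
    }

    // Length of the compressed bytes: next offset minus current offset,
    // minus the bytes taken up by the type/size header.
    public static long CompressedLength(long currentOffset, long nextOffset, int headerLength)
    {
        return nextOffset - currentOffset - headerLength;
    }
}
```

Note that `size` here is the decompressed size stored in the header; the compressed length only comes out of the offset subtraction.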
So:
> 1) I only know the size of the decompressed objects. The Read method documentation on DeflateStream states that it "reads a number of decompressed bytes into the specified byte array." That is what I want, but I see people setting count to the size of the compressed data, so one of us is doing it wrong.
The documentation is correct. DeflateStream is a subclass of Stream and, as such, has to follow that class's rules. Since the Read() method of Stream fills the caller's buffer with up to the number of bytes requested, these must be uncompressed bytes.
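Here's a small sketch of reading by decompressed size. `InflateHelper` and `ReadDecompressed` are hypothetical names, and the input stream is assumed to be positioned at the start of raw deflate data (i.e. past the 2-byte zlib header). The loop is there because a single Read() call is allowed to return fewer bytes than requested.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class InflateHelper
{
    // Reads exactly decompressedSize bytes of decompressed output.
    // The count passed to Read() is always a number of DEcompressed bytes.
    public static byte[] ReadDecompressed(Stream deflateData, int decompressedSize)
    {
        var result = new byte[decompressedSize];

        // leaveOpen: true so disposing the DeflateStream does not close the caller's stream.
        using (var inflater = new DeflateStream(deflateData, CompressionMode.Decompress, leaveOpen: true))
        {
            int total = 0;
            while (total < decompressedSize)
            {
                int n = inflater.Read(result, total, decompressedSize - total);
                if (n == 0)
                    throw new EndOfStreamException("Compressed data ended before the expected size was reached.");
                total += n;
            }
        }
        return result;
    }
}
```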
Note that, per the above, you do know the size of the compressed objects. It's not stored in the file, but you can derive that information from the things that are stored in the file.
> 2) The data I'm getting back is correct, I think (the human-readable data looks right), but it's advancing the underlying stream I give it all the way to the end! For example, I ask for 187 decompressed bytes and it reads the remaining 212 bytes, all the way to the end of the stream. That is, the whole stream is 228 bytes, and the position of the stream at the end of the deflate read of 187 bytes is 228. I can't seek backwards, because I don't know where the end of the compressed data is, and not all the streams I use are seekable. Is it expected behavior to consume the whole stream?
Yes, I would expect that to happen. Or at a minimum, I would expect some buffering to happen, so even if it didn't read all the way to the end of the stream, I would expect it to read at least some number of bytes past the end of the compressed data.
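If you want to see the over-read for yourself, something like the following should do. `OverReadDemo` and `PositionAfterInflate` are made-up names. How far past the compressed data the position ends up depends on the runtime's internal buffer size, so I'm not promising any particular value; the only thing you can rely on is that it is not guaranteed to stop exactly at the end of the compressed data.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class OverReadDemo
{
    // Places deflate data followed by unrelated trailing bytes in one stream,
    // inflates decompressedSize bytes, and reports where the underlying stream ended up.
    public static long PositionAfterInflate(byte[] deflateData, byte[] trailing, int decompressedSize)
    {
        var combined = new MemoryStream();
        combined.Write(deflateData, 0, deflateData.Length);
        combined.Write(trailing, 0, trailing.Length);
        combined.Position = 0;

        var output = new byte[decompressedSize];
        using (var inflater = new DeflateStream(combined, CompressionMode.Decompress, leaveOpen: true))
        {
            int total = 0;
            while (total < decompressedSize)
            {
                int n = inflater.Read(output, total, decompressedSize - total);
                if (n == 0) break;   // compressed data ended early
                total += n;
            }
        }

        // Compare this with deflateData.Length to see how much was consumed beyond the block.
        return combined.Position;
    }
}
```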
It seems to me that you have at least a couple of options:
- For each block of data, compute the length of the data (per the above), read that into a standalone MemoryStream object, and decompress the data from that stream rather than the original.
- Alternatively, go ahead and decompress directly from the source stream, using the offsets provided in the index to seek to each data block as you read it. Of course, this won't work for non-seekable streams, which you indicate will occur in your scenario, so this option would not work for all cases.
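The first option could look roughly like this. `PackBlockReader` and `InflateBlock` are hypothetical names. It assumes `compressedLength` was derived from the index offsets as described earlier, and that the block starts with the 2-byte zlib header you are already skipping. Because only the block's own bytes are pulled from the source, the source stream ends up positioned at the next object no matter how much DeflateStream buffers internally, and it works on non-seekable streams too.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class PackBlockReader
{
    public static byte[] InflateBlock(Stream source, int compressedLength, int decompressedSize)
    {
        const int zlibHeaderLength = 2;   // e.g. 78 9C
        if (compressedLength < zlibHeaderLength)
            throw new ArgumentOutOfRangeException(nameof(compressedLength));

        // Copy exactly the compressed block out of the source stream.
        var block = new byte[compressedLength];
        int filled = 0;
        while (filled < compressedLength)
        {
            int n = source.Read(block, filled, compressedLength - filled);
            if (n == 0) throw new EndOfStreamException("Source ended inside the compressed block.");
            filled += n;
        }

        // Decompress from the isolated copy, skipping the zlib header.
        // Any over-read by DeflateStream now only affects this MemoryStream.
        using (var isolated = new MemoryStream(block, zlibHeaderLength, compressedLength - zlibHeaderLength, writable: false))
        using (var inflater = new DeflateStream(isolated, CompressionMode.Decompress))
        {
            var result = new byte[decompressedSize];
            int total = 0;
            while (total < decompressedSize)
            {
                int n = inflater.Read(result, total, decompressedSize - total);
                if (n == 0) throw new EndOfStreamException("Compressed data ended early.");
                total += n;
            }
            return result;
        }
    }
}
```

If you can target .NET 6 or later, System.IO.Compression.ZLibStream understands the zlib header itself, so the manual 2-byte skip would not be needed there; the isolate-then-decompress idea stays the same.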