=== AMD is now known as Nokaji | ||
daftykins | hmm, fascinating situation on a client system once again - i might well have shared how i had worked to batch convert scanned documents at one place to reduce the size on disk by a huge margin | 13:56 |
---|---|---|
daftykins | this time, i found a practice where half the data size of an application was taken up with 6,600 KB .RTF files representing just tiny word processed documents | 13:57 |
daftykins | seems to be something nutty about a typical RTF wherein it's all raw ASCII with no compression - and tonnes of tags for formatting, making up the overall size | 13:57 |
daftykins | i just fed January 2012's 189 .RTF files into LibreOffice batch conversion... input size: 1.25 GB - output size: 92 MB ; 7.17 % of original | 13:58 |
zxmpi | i've had situations like that and compressing works for 99% of the documents fine but there's 1% that when you compress them they become unreadable blurs | 13:58 |
daftykins | heh, yeah the added challenge in this case is that i needed them to keep the original file name *and* pretend to be .RTF still so that they'd open from the program in MS Word, thankfully it works fine and doesn't care they're not really RTFs | 13:59 |
daftykins | the only quirk is it puts spelling error squiggles under loads of normal words, some kind of additions must have been made that make it go squirrely | 14:00 |
zxmpi | yeah, windows hates file extensions and can ignore them for some apps | 14:00 |
penguin42 | daftykins: Sorry, I'm unclear; are you resaving them as rtf or plain txt? | 15:52 |
daftykins | ah i did neglect that part yes, well since i need to preserve their opening, they're noew .docx *but* named .RTF :D | 15:52 |
daftykins | *now | 15:52 |
daftykins | developer refuses to offer any assistance in editing the database to reflect a file extension change | 15:53 |
daftykins | they seem to write into a flat file format with pairs of files named .FS5 and .IDX | 15:53 |
penguin42 | had you tried just resaving them back as rtf? | 15:53 |
daftykins | (in case you've ever heard of those) | 15:53 |
penguin42 | i.e. whether LO's RTF writer is any more concise? | 15:54 |
daftykins | yeah that drops them to a smaller size, but still 3x larger than .docx | 15:54 |
penguin42 | yeh, docx is gzip'd (or is it zip) | 15:54 |
daftykins | mm zip xml aiui | 15:55 |
daftykins | so 6,958 KB RTF, 500 KB .docx, 1,462 KB .RTF resaved | 15:55 |
penguin42 | I wonder if you can tell LO to change the language in the docx so it doesn't try and spell check | 15:56 |
daftykins | it's an odd one, normal words come up as bad and yet i see English UK at the bottom just fine | 15:57 |
daftykins | with LO that is, didn't spend much time in MS Word on their end | 15:58 |
daftykins | i lose all the create/modified dates as well of course due to processing | 15:58 |
penguin42 | restamp those? | 16:03 |
daftykins | i don't know a viable method for that off hand | 16:03 |
* penguin42 wishes Toolstation's order system didn't blatantly lie and say it could process an order in 5mins | 16:04 | |
daftykins | https://www.twitch.tv/nasa | 16:33 |
daftykins | eclipse over in New Mexico | 16:33 |
zxmpi | the cosmic ballet goes on https://www.youtube.com/watch?v=FmoW-gNjjXA | 16:43 |
* penguin42 just walked 3x3m length of trunking home from Toolstation | 17:13 | |
daftykins | arms like jelly now? | 17:33 |
penguin42 | one, yes :-) | 17:33 |
penguin42 | didn't even get too many odd looks either :-) | 17:34 |
daftykins | if anyone glances for too long at whatever i'm carrying, i like to say i'm taking it for a walk | 17:34 |
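Note on the "restamp those?" question at 16:03: one possible approach, sketched here in Python rather than taken from anything said in the channel, is to copy the timestamps from the untouched originals onto the converted files. It assumes the originals were kept in a separate backup folder and that each converted file still has the same name as its original, as described at 13:59. The function names `restamp` and `restamp_tree` and both folder arguments are hypothetical.

```python
import os
from pathlib import Path


def restamp(original: Path, converted: Path) -> None:
    """Copy the access and modification times of `original` onto `converted`."""
    st = original.stat()
    # os.utime only sets access and modification times; it does not
    # change the creation time that Windows tracks separately.
    os.utime(converted, ns=(st.st_atime_ns, st.st_mtime_ns))


def restamp_tree(original_dir: Path, converted_dir: Path) -> list[str]:
    """Restamp every converted file that has a same-named original.

    Both directories are hypothetical: `original_dir` holds the untouched
    files, `converted_dir` holds the smaller replacements with identical names.
    Returns the names that were restamped, so skipped files are easy to spot.
    """
    done = []
    for src in sorted(original_dir.iterdir()):
        dst = converted_dir / src.name
        if src.is_file() and dst.is_file():
            restamp(src, dst)
            done.append(src.name)
    return done
```

Running `restamp_tree(Path("backup"), Path("converted"))` (both paths made up here) would leave the modified date of each converted file matching its original. The created date is a separate matter on Windows and would need a different tool.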
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!