I posted this on /pdf/ a few days back, and I know people have been looking for it, so I'll release it here too.
I was trying to get some decent ebooks of The Gulag Archipelago unabridged, which haven't been published.
I found some versions on Archive.org scanned from the original annotated releases. You can download them in a bunch of formats, all of which are corrupted in some way. I wrote a simple parser that went through the DAISY format and fixed it, then converted the result to EPUB.
https://archive.org/details/TheGulagArchipelago-Threevolumes
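For anyone who wants to try something similar, here is a rough Python sketch of the kind of cleanup pass described above. It is not the parser used for this release. It assumes the DAISY download contains a DTBook-style XML file with `p` paragraph elements and `pagenum` page markers; the function names are made up for illustration, and the EPUB conversion step is left out.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag name without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def collect_text(elem):
    """Gather the text inside elem, skipping the content of page-number markers."""
    parts = []
    if elem.text:
        parts.append(elem.text)
    for child in elem:
        # Assumption: page markers are <pagenum> elements; skip what is inside them.
        if local_name(child.tag) != "pagenum":
            parts.extend(collect_text(child))
        # Text that follows the child still belongs to the paragraph.
        if child.tail:
            parts.append(child.tail)
    return parts


def extract_paragraphs(dtbook_xml):
    """Return cleaned paragraph strings from a DTBook-style XML string."""
    root = ET.fromstring(dtbook_xml)
    paragraphs = []
    for elem in root.iter():
        if local_name(elem.tag) != "p":
            continue
        # Collapse runs of whitespace left over from the scan.
        text = " ".join("".join(collect_text(elem)).split())
        if text:
            paragraphs.append(text)
    return paragraphs
```

The cleaned paragraphs could then be written out as HTML or plain text and handed to whatever converter you prefer.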
I am halfway through the first book and haven't noticed any big issues, so I'll release them here.
What I've seen so far:
"i" and "1" are sometimes (rarely) mixed up, along with other similar scan issues.
Depending on your reader, the scanned text won't match the page layout, so you'll get headers, footers, and annotations in random places. The annoying part is that you might have to go one or two pages forward to read the annotations.
I was too lazy to set the metadata correctly, but I added the Calibre folders.
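If you want to hunt down the i/1 mix-ups yourself, a rough way to find candidates is to flag any word that contains both letters and the digit 1. This is only a sketch with a made-up function name, and it was not part of the original fix. It will also flag legitimate tokens such as "1st", so treat the output as a list to review by hand.

```python
import re

# A token that contains at least one ASCII letter and at least one digit 1.
SUSPECT = re.compile(r"\b(?=\w*[A-Za-z])(?=\w*1)\w+\b")


def find_suspect_tokens(text):
    """Return words that mix letters with the digit 1, in order of appearance."""
    return SUSPECT.findall(text)
```

For example, `find_suspect_tokens("The pr1soners waited unt1l 1945.")` picks out "pr1soners" and "unt1l" but leaves the year alone.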
LINK:
mediafire
Zesty 1 point (+1|-0)
I just started 200 Years by this guy (I think). It's tough to start out with, but I can tell that he writes about important things and sources everything he says, which is amazing.
9302764? 1 point (+1|-0)
There is nothing easy about his books.