How to scan books without ruining them? And also how to upload torrents?

After hating the idea of e-book readers (to be fair, the first couple I tried were slow and had poor picture quality), I've spent the last few years using them extensively, and they are great, so I'm trying to get all of my old books into ebook format. There are some that I can't find online, so I want to scan them in and convert them via OCR. However, many of the books won't open far enough for the whole page to lie flush on the scanner. Is there any solution other than manually removing each page and scanning it on its own? The books aren't valuable or even rare (they're some science fiction books and a few books of collected essays), but I'd rather not destroy them if possible.

Also, if I do scan them in, then I'd like to upload them as a torrent. How do I do this, please?

Thanks for any answers.

Comments

  • You can turn the scanner on its side.. well, some scanners can do that. A CanoScan is one.

    There is an app I saw a while back that uses a phone to scan / OCR.

  • edited June 2015
    fog said:

    You can turn the scanner on its side.. well, some scanners can do that. A CanoScan is one.

    There is an app I saw a while back that uses a phone to scan / OCR.

    Sorry, you've lost me. The problem isn't that my (flatbed) scanner won't scan, it's that the books themselves won't physically open far enough for both pages to lie flat on the lens (or whatever you call the glass sheet) of the scanner. The pages curve back into the spine of the book, so the text near the spine isn't scanned properly. I can solve this by removing each page from the book and scanning each page individually, but then I'm left with just the pages and not the book anymore. This has to be a problem for lots of people when scanning books, so I was hoping that there was a solution that would not involve physically ripping the pages out of the books.

    All but one of the books are paperback, and none of them have a two page size that's larger than A4.
    Post edited by ewgf on
  • Open the book so it's at 90 degrees, take a photo, then OCR the photo.
  • Digital cameras now have enough resolution to compete with scanners for OCR work if the book is precious. My DSLR camera is a few years old now (2010), but it shoots at 18 MP, which is more than enough for OCR.

    Even a 12 MP point-and-shoot camera would do the job. The only problem I have found with this is reflections when using a flash or external lighting on pages that are shiny or have a waxy finish.
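
    As a rough check on those megapixel figures, here is a minimal sketch of the arithmetic. It assumes typical sensor sizes (5184 x 3456 pixels for 18 MP, 4000 x 3000 for 12 MP), a two-page spread no larger than A4 that fills the whole frame, and the common rule of thumb that OCR works well at around 300 dpi. None of these numbers come from the posts above, so treat them as illustrative.

```python
# Estimate the effective scan resolution of a camera photo of a book spread.
# Assumption: the spread is A4 landscape (297 mm x 210 mm) and fills the frame.
MM_PER_INCH = 25.4
A4_LONG_IN = 297 / MM_PER_INCH
A4_SHORT_IN = 210 / MM_PER_INCH


def effective_dpi(px_long, px_short, long_in=A4_LONG_IN, short_in=A4_SHORT_IN):
    """Return the lower of the two per-axis resolutions, in dots per inch."""
    return min(px_long / long_in, px_short / short_in)


# Assumed pixel dimensions for the two cameras mentioned in the thread.
cameras = {
    "18 MP DSLR (5184 x 3456)": (5184, 3456),
    "12 MP point and shoot (4000 x 3000)": (4000, 3000),
}

for label, (px_long, px_short) in cameras.items():
    print(f"{label}: about {effective_dpi(px_long, px_short):.0f} dpi")
```

    Under these assumptions both cameras land above 300 dpi (roughly 418 and 342), but in practice the book rarely fills the frame exactly, so the real figure will be somewhat lower.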
  • beanz said:

    Open the book so it's at 90 degrees, take a photo, then OCR the photo.

    I see, thanks. Sorry Fog, I see what you meant now. Have either of you (Beanz or Fog) tried this? I can see how it could potentially work, but making sure the camera takes a focused, clear enough photo (well, digital image) for OCR to work well might be difficult, especially for someone like me who doesn't know much about photography. The light would have to be right, for a start.

  • Yes, that's how the pros do it. You take a photo of each pair of facing pages with the book opened part way, so the pages stay fairly flat, then use software to warp the image to flatten it.
    After that you train your OCR algorithm by showing little bits of the images to people trying to sign into websites. See also reading house numbers from photos taken from the public road ;)
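
    To give an idea of what the warping step involves, here is a minimal sketch of the simplest part of it: a perspective (keystone) correction that maps the four corners of a photographed page onto an upright rectangle. It does not correct page curl near the spine, which dedicated book-scanning software also handles. The corner coordinates and the function names are made up for illustration.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("corner points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 8 coefficients mapping four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -x * v, -y * v])
        rhs.append(v)
    return solve_linear(rows, rhs)


def warp_point(h, x, y):
    """Apply the perspective transform h to a single pixel coordinate."""
    a, b, c, d, e, f, g, k = h
    w = g * x + k * y + 1
    return ((a * x + b * y + c) / w, (d * x + e * y + f) / w)


# Hypothetical page corners as they appear in the photo (pixels), clockwise
# from top-left, and the upright rectangle they should be mapped onto.
photo_corners = [(120, 80), (980, 140), (1010, 1390), (90, 1310)]
flat_corners = [(0, 0), (1000, 0), (1000, 1400), (0, 1400)]

h = homography(photo_corners, flat_corners)
for corner in photo_corners:
    x, y = warp_point(h, *corner)
    print(corner, "->", (round(x, 3), round(y, 3)))
```

    A real tool would apply the same mapping to every pixel of the photo and interpolate between them; image libraries such as OpenCV provide this kind of perspective warp ready-made.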
  • If you have enough books to do, then the expense of an overhead scanner might be justified:

    http://www.amazon.co.uk/dp/B00EPU73TW