Google turns classic books into free gibberish eBooks
Posted on 27 Aug 2009 at 17:01, by David Ludlow
Google has added free eBook downloads to its Books service.
Google Books was originally conceived as a way to make titles available to everyone online free of charge. Google borrows the original titles and scans them in, so readers can see the books with their original designs and fonts.
Now Google is offering free eBook downloads in ePub format for titles that are out of copyright. It sounds like a good idea in theory, but to create an ePub eBook, Google has had to run optical character recognition (OCR) on the original scans to extract the text.
Sadly, nobody at Google seems to have bothered to check the final version, leaving some texts impossible to understand. It makes a mockery of Google's statement that "Digitizing books allows us to provide more access to great literature for a wider set of the world's population".
Gulliver's Travels is one book that has been through the OCR conversion process and come out completely unreadable, as you can see from this extract.
"The j.-.i;-i had lined it on all sides with the solu.-.?i cloth she could get, well quilted Uih!l; neath, furnished it with her baby's bi d. Provided me with linen and other necessaries. And made every thing as convenient as sl.c could."
This appalling conversion makes no sense and can't compete with the excellent Project Gutenberg, which offers out-of-copyright books in ePub format that have been scanned and properly corrected.
The Project Gutenberg version of Gulliver's Travels renders the same paragraph correctly.
"The girl had lined it on all sides with the softest cloth she could get, well quilted underneath, furnished it with her baby's bed, provided me with linen and other necessaries, and made everything as convenient as she could."
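To put a rough number on the gap between the two extracts above, the short Python sketch below compares them word by word using the standard library's difflib module. The function names word_similarity and mismatched_words are our own, chosen for illustration, and nothing here reflects how Google or Project Gutenberg actually check their texts.

```python
from difflib import SequenceMatcher

# The two extracts quoted in the article.
ocr_text = (
    "The j.-.i;-i had lined it on all sides with the solu.-.?i cloth she "
    "could get, well quilted Uih!l; neath, furnished it with her baby's "
    "bi d. Provided me with linen and other necessaries. And made every "
    "thing as convenient as sl.c could."
)
reference_text = (
    "The girl had lined it on all sides with the softest cloth she could "
    "get, well quilted underneath, furnished it with her baby's bed, "
    "provided me with linen and other necessaries, and made everything as "
    "convenient as she could."
)


def word_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score based on matching runs of words."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def mismatched_words(ocr: str, reference: str) -> list[tuple[str, str]]:
    """List (ocr_fragment, reference_fragment) pairs where the texts differ."""
    ocr_words = ocr.split()
    ref_words = reference.split()
    matcher = SequenceMatcher(None, ocr_words, ref_words)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            pairs.append(
                (" ".join(ocr_words[i1:i2]), " ".join(ref_words[j1:j2]))
            )
    return pairs


if __name__ == "__main__":
    print(f"Word-level similarity: {word_similarity(ocr_text, reference_text):.2f}")
    for ocr_fragment, ref_fragment in mismatched_words(ocr_text, reference_text):
        print(f"{ocr_fragment!r} should read {ref_fragment!r}")
```

A comparison like this only works when a trusted reference text already exists, which is exactly what Project Gutenberg's corrected editions provide.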
Some of Google's other ePub books are better, including Treasure Island, but we still found that it was littered with mistakes and odd characters. Until Google decides it's going to check the text after it's been through the OCR process, Project Gutenberg remains the best destination for free eBooks.