At BookGlutton, we initially supported uploads of multiple file types for conversion into EPUB, the format we use for our book content. In the last year, however, a number of excellent conversion and authoring options for EPUB have become available. We still believe that HTML is the best source type for generating nice-looking EPUB files, so here's some information about the process our own conversion tool uses.
EPUB
What is EPUB?
EPUB is an emerging standard for digital book content. We use it internally, so no conversion is needed for reading and sharing it in the Unbound Reader.
Who uses EPUB?
More and more publishers are experimenting with EPUB, and we think it will eventually be the predominant e-book format. So far it's largely DRM-free, based on open Web technology, and compatible with several major e-book reading systems.
Should I use EPUB for this upload thing?
Well, to get books on BookGlutton, you have to use EPUB. But there are many ways to get your content into it. Using HTML with our conversion tool is the best option for this.
How can I get the nitty AND the gritty about EPUB?
You can read more about EPUB on the EPUB books blog.
HTML and HTML+ZIP
What is HTML+ZIP?
HTML+ZIP is how we refer to any HTML file or files that are compressed into a ZIP format archive. This allows you to bundle several HTML files and images into one easy package. In fact, EPUB uses ZIP compression to do exactly this, so an EPUB file is just a ZIP archive with a few extra-special XML files in it.
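To make the "EPUB is just a ZIP archive" point concrete, here is a minimal Python sketch that builds a bare-bones, EPUB-style archive in memory and lists what is inside. The function names and the chapter file are made up for illustration, and the result is only a skeleton (a real EPUB also needs a package file and more), not something our conversion tool produces.

```python
import io
import zipfile

# Standard EPUB container file: points a reading system at the package document.
# "content.opf" is a hypothetical path used only for this illustration.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub_skeleton():
    """Return the bytes of a ZIP archive laid out like a (very incomplete) EPUB."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The 'mimetype' entry comes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        # Hypothetical book content: an ordinary HTML file.
        archive.writestr("chapter1.html", "<html><body><h1>Chapter 1</h1></body></html>",
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_archive_members(data):
    """List the file names inside ZIP data, in the order they were added."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


def read_mimetype(data):
    """Read the 'mimetype' entry, which identifies the archive as an EPUB."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.read("mimetype")
```

Calling list_archive_members(build_epub_skeleton()) shows ordinary file names, which is all an unzip tool would see if you renamed an .epub file to .zip and opened it.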
How do I create a ZIP file?
There are numerous free programs for making ZIP files on Windows and Linux. We can't recommend one in particular, but ZIP is a free and open format, so you should never have to pay for software to create ZIP files. In Mac OS X, the contextual menu command Create Archive will do it for you (right-click on a file or folder to see this option).
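If you are comfortable with a little scripting, a ZIP file can also be made with Python's standard library. The sketch below is only one possible approach; the function name and the file names in the usage note are hypothetical. It adds files in exactly the order you list them, which matters because our conversion reads the HTML documents in an archive in the order they were added.

```python
import zipfile
from pathlib import Path


def bundle_for_upload(files, archive_path):
    """Write the given files into a new ZIP archive, preserving the order given.

    `files` is a list of paths to HTML documents and images.
    Each file is stored under its bare file name, without parent folders.
    """
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file_path in files:
            file_path = Path(file_path)
            archive.write(file_path, arcname=file_path.name)
    return archive_path
```

For example, bundle_for_upload(["chapter1.html", "chapter2.html", "cover.jpg"], "mybook.zip") would produce mybook.zip with the two chapters in reading order, assuming those files exist in the current folder.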
Why HTML and not PDF or Word docs?
Text in a layout "flows" as it conforms to page boundaries or wraps around images. Text in a browser is said to be "reflowable," because it can be dynamically adjusted to fit unpredictable layouts. PDF, Doc and RTF formats were designed for only one "flow"--the one that happens when the document is fixed in print. EPUB and HTML, on the other hand, are "reflowable" formats, which makes them ideal for the flexible dimensions of a browser window.
How can I get my book to look exactly like the print version in HTML?
It's not going to look like the print version in HTML, and that's okay as long as it looks just as good. Fidelity to a paper copy isn't one of our goals when presenting books on screen, but ease of use, readability and social presence are.
I have a bunch of Word docs here. What am I supposed to do to get them into HTML?
Get your Word docs into HTML by exporting them directly from Word. Many word processors have ways to convert into HTML, usually with options called "Save as Web Page" or "Export to HTML." Make sure, when using these, that the result is a single file with the extension 'htm' or 'html.'
I exported to HTML from Word and now I see crazy question marks and wild characters I've never seen the likes of before. Why did this happen?
The funky characters you see are caused by Word exporting HTML in a character set other than UTF-8. This happens more often in languages other than English, but it can happen in any document that isn't saved as UTF-8.
To fix the problem, you can re-open the resulting HTML file in Word as plain text. Then save it again, as PLAIN TEXT -- not HTML -- with the same .htm or .html extension on the filename, and specify UTF-8 as the character encoding. This link has a good step-by-step guide on how to do that (scroll down to see the image of the dialog box):
http://www.ljmu.ac.uk/cis/webpublishing/81434.htm
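If you'd rather not go back through Word, the same re-save can be sketched in a few lines of Python. This is an illustration, not part of our tool: the function name is made up, and it assumes the exported file is in Windows-1252 (cp1252), which is common for Word on Western-language systems but is only a guess for your file. If the guess is wrong, the characters will still look wrong, or Python will raise a UnicodeDecodeError.

```python
from pathlib import Path


def reencode_to_utf8(source_path, target_path, source_encoding="cp1252"):
    """Read an HTML file in `source_encoding` and write it back out as UTF-8.

    `source_encoding` defaults to cp1252, which is an assumption;
    pass the real encoding if you know it.
    """
    raw_bytes = Path(source_path).read_bytes()
    text = raw_bytes.decode(source_encoding)  # raises UnicodeDecodeError on a bad guess
    Path(target_path).write_bytes(text.encode("utf-8"))
    return text
```

Note that this only changes how the characters are stored. If the HTML contains a meta tag declaring the old character set, you would still want to edit it by hand to say utf-8.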
Isn't there any simpler way?
Google Docs is a service connected to your Gmail account that does an excellent job of converting various office documents to HTML. You can upload any of your Word docs to your Google Docs account, then use your browser to view and save the converted result. Then you can upload that file here or, if your doc is public, just import it from its URL. NOTE: if enough of you like Google Docs and let us know, we'll consider adding a Google Docs import feature to make it even easier for you.
Okay, I have HTML but can't figure out why my chapters don't show up in the book. What am I supposed to do about that?
HTML conversion to EPUB works like this: if it's a single HTML file, we look for headings in it, and create sections of the book from those headings. Headings are any large text in your HTML doc created through the use of <H1>, <H2>, or <H3> tags. There are three other heading tags in HTML, but we don't create sections from those. For HTML+ZIP archives, we do this for each HTML document in the archive, in the order they were added. So three HTML files with three headings each would produce a book with nine sections in the table of contents. Read more about this in our support forums.
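The heading rule described above can be approximated with a short Python sketch. This is not our actual converter, just an illustration with made-up names: it collects the text of every h1, h2 and h3 in each document, in document order, and ignores h4 through h6.

```python
from html.parser import HTMLParser

# Only these heading levels start a new section, per the rule above.
SECTION_TAGS = {"h1", "h2", "h3"}


class HeadingCollector(HTMLParser):
    """Collect the text of h1-h3 headings from one HTML document."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # text pieces of the heading being read, if any

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <H1> and <h1> both match.
        if tag in SECTION_TAGS:
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag in SECTION_TAGS and self._current is not None:
            self.headings.append("".join(self._current).strip())
            self._current = None


def table_of_contents(documents):
    """Return section titles for a list of HTML strings, in the order given."""
    toc = []
    for html in documents:
        collector = HeadingCollector()
        collector.feed(html)
        collector.close()
        toc.extend(collector.headings)
    return toc
```

Feeding it three documents with three qualifying headings each gives a list of nine titles, matching the nine-section example above. If your chapters are missing, a quick check like this can show whether your chapter titles are really marked up as headings or are just large, bold paragraphs.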
Can I create HTML or XHTML documents by hand?
Yes, of course. It's not hard to do. The only tags you really need to know are the <p> tag, for paragraphs, and the <h1> tag, for headings. We'll do the rest. Here's a brief example:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
  <head>
    <title>Your Title Here</title>
    <meta name="author" content="Your Author Name Here" />
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  </head>
  <body>
    <h1>Lorem Ipsum</h1>
    <h2>By Some Ancient Guy</h2>
    <h3>Chapter 1</h3>
    <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Nulla dictum velit sit amet nisl. Maecenas velit. Nulla nibh. Proin ultrices egestas ipsum. Phasellus molestie scelerisque est. Mauris non massa id pede aliquam elementum. Cras in quam. Suspendisse potenti. Aenean nibh. Donec eu diam vitae lectus ultricies venenatis. Vivamus turpis urna, scelerisque tempus, tincidunt ut, volutpat at, lorem. Aenean nunc eros, aliquam imperdiet, facilisis nec, ullamcorper at, pede. Mauris consectetuer, odio non tempus rhoncus, turpis pede consequat dolor, eget suscipit neque ipsum sit amet turpis. Phasellus semper pulvinar massa. Pellentesque dapibus lacinia massa.</p>
    <p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec ultrices. Suspendisse rhoncus nonummy purus. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos hymenaeos. Suspendisse dictum, nibh ut porta mollis, nulla leo sollicitudin risus, at blandit lectus lectus quis augue. Curabitur quis orci. In hac habitasse platea dictumst. Maecenas tortor. Proin sed dolor. Mauris nisl est, aliquam vel, volutpat nec, fringilla ut, neque. Nam sed enim nec pede pretium fermentum. Donec velit orci, vehicula at, pharetra sed, tristique vel, lorem. In quis mauris a tellus pharetra iaculis. Donec dictum leo sed magna. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos hymenaeos. Maecenas tellus neque, tempor quis, tincidunt non, fringilla ac, libero. Mauris eget eros.</p>
    <p>Quisque feugiat. Nulla urna. Donec imperdiet mauris eu nisl. Fusce lobortis tortor ut elit. Praesent nec nunc. Integer quis arcu sit amet felis pulvinar volutpat. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Etiam ligula diam, blandit sit amet, dignissim et, tristique viverra, lacus. Phasellus sed neque imperdiet libero tempor pharetra. Fusce eu nibh dictum felis rutrum sollicitudin. Sed ornare. Nulla eu pede sit amet arcu interdum luctus.</p>
    <p>Sed suscipit feugiat leo. Etiam a pede. Ut ac nisi ut tortor rhoncus ultrices. Nulla facilisi. Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Integer vulputate. Fusce eu tortor ac nunc auctor posuere. Nulla facilisi. In sagittis eros at dolor. Ut a odio rhoncus erat convallis sagittis. Mauris cursus. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae;</p>
    <h2>Chapter 2</h2>
    <p>Mauris ornare lobortis augue. Nulla facilisi. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Fusce sagittis. Maecenas et urna. Pellentesque et erat. Nulla facilisi. Phasellus at felis sed nisi porttitor venenatis. In eget risus. Cras dictum posuere mauris.</p>
  </body>
</html>





