Format Your Own Damned Book Part III -- E-book Formatting Basics
There are two dominant formats for e-books: EPUB and MOBI. MOBI is used for the Amazon Kindle, and EPUB is used for pretty much everything else. Since you can easily convert an EPUB file to a MOBI file using Amazon's free KindleGen utility, EPUB is the only format you'll ever need to deal with.
An EPUB file is like a Zip archive that contains web pages. If you've created any web pages using HTML tags you'll quickly figure out how to format an ebook. If you haven't, read on.
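To make the "Zip archive that contains web pages" idea concrete, here is a minimal sketch using Python's standard zipfile module. The folder name OEBPS, the file name chapter1.xhtml, and the functions build_archive and list_entries are assumptions made up for this illustration. The result is not a complete EPUB: a real one also needs a META-INF/container.xml file and a package file, which are left out here.

```python
import io
import zipfile

# Hypothetical chapter content; a real book would have one file per chapter.
CHAPTER = """<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><h1>Chapter 1</h1><p>It was a dark and stormy night.</p></body>
</html>
"""


def build_archive():
    """Return the bytes of a Zip archive laid out like the start of an EPUB."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # EPUB expects this entry first, stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        # The web page itself (folder and file names are assumptions).
        archive.writestr("OEBPS/chapter1.xhtml", CHAPTER,
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_entries(data):
    """List the file names stored inside the archive bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


if __name__ == "__main__":
    print(list_entries(build_archive()))
```

You can point list_entries at the bytes of any EPUB you own to see the web pages inside it.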
Technically, EPUB pages are XHTML files. XHTML is like HTML, the tag language used for web pages, but the tags must follow stricter rules that make them easier for a computer to process. With HTML you can be sloppy about how you write your tags and it's up to the web browser to figure out how to display the page. It is this looseness that makes it possible for a website that looks good in Internet Explorer to be completely unreadable in Firefox. With XHTML you use the same tags that HTML uses, but you have to write them in a strict, well-formed sequence: every tag that opens must be closed, and tags must be nested properly.
In practice, the tools we use to make web pages enforce that sequence automatically, so the difference between HTML and XHTML isn't that much.
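One rough way to see the stricter rules XHTML imposes is to run a fragment through an XML parser, which rejects markup a web browser would happily display. The sketch below uses Python's standard xml.etree.ElementTree module. The sample fragments and the function name is_well_formed are made up for illustration, and passing this check is only part of what a real EPUB validator looks at.

```python
import xml.etree.ElementTree as ET

# A browser will display this, but the <br> tag is never closed.
LOOSE = "<p>First line<br>second line</p>"

# The same paragraph with a self-closing <br /> tag.
STRICT = "<p>First line<br />second line</p>"

# Tags that overlap instead of nesting properly.
OVERLAPPING = "<p><b>bold <i>both</b> italic</i></p>"


def is_well_formed(fragment):
    """Return True if an XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for name, fragment in [("loose", LOOSE), ("strict", STRICT),
                           ("overlapping", OVERLAPPING)]:
        print(name, is_well_formed(fragment))
```

Only the STRICT fragment gets through; the other two are the kind of thing a browser quietly repairs for you.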
Pretty much every word processor will save a document in HTML format, with the tags in the sequence needed by XHTML, so you shouldn't need to know anything about HTML to make an ebook, right?
Not true. While you will use that feature of your word processor, to get the best results you will need to learn something about HTML. This installment will teach you what you need to know.
HTML is what is known as a markup language. Before we had WYSIWYG word processors and desktop publishing, if you wanted to prepare an article for publication you needed to take the words of the article and insert markup language to tell the page formatting program how you wanted the text to look. The page formatting program would take care of such things as wrapping the words to fit in the margins, generating a table of contents, printing left and right page headings and footers, calculating page numbers (including numbering the introduction to a book with lower case Roman numerals and then restarting the numbering from 1 using Arabic numerals for the book proper), leaving page headings off for pages that begin a new chapter, etc.
So basically you'd provide the words for the book with some markup language and the publishing program would create nicely formatted interior pages for you.
Even though we have WYSIWYG word processors now, the basic concepts of formatting a book have not changed. A book still has a) structural elements and b) styles that determine how those elements should be formatted.
The structural elements for an ebook are much simpler than those for a printed book, so for now I'll just go over those. They are:
H1-H6
Chapter headings and subheadings. A novel will likely have just H1, chapter headings, but a nonfiction book might have subheadings as well.
P
The paragraphs that make up most of the book.
Blockquote
If you quote an extended passage from another book you'll put it in a paragraph by itself, and that paragraph will have wider margins than other paragraphs.
Bulleted List
A list of bullet points.
Numbered List
A list of items with sequence numbers.
Image
An illustration.
Table
Information in a table with rows and columns.
And that's about it.
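For readers who have never seen these elements written out as tags, here is a small sketch of what each one in the list above might look like. The sample text, the image file name cover.png, and the function name top_level_tags are all made up for illustration. The Python around the markup just confirms that the sample is well formed and reports the tags it found.

```python
import xml.etree.ElementTree as ET

# Hypothetical page body with one example of each structural element.
SAMPLE = """<body>
  <h1>Chapter One</h1>
  <p>An ordinary paragraph.</p>
  <blockquote><p>An extended passage quoted from another book.</p></blockquote>
  <ul><li>A bullet point</li><li>Another bullet point</li></ul>
  <ol><li>First item</li><li>Second item</li></ol>
  <img src="cover.png" alt="An illustration" />
  <table><tr><td>Row 1, column 1</td><td>Row 1, column 2</td></tr></table>
</body>"""


def top_level_tags(markup):
    """Return the tag names of the elements directly inside the root."""
    return [child.tag for child in ET.fromstring(markup)]


if __name__ == "__main__":
    print(top_level_tags(SAMPLE))
```

Note that the bulleted list is a ul tag, the numbered list is an ol tag, and the items in both are li tags.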
So what does this mean to you as an author? It means you need to stop thinking of your word processor as a typewriter. Instead of highlighting each chapter heading, making the font larger and fancier, and putting blank lines before and after, you need to give each heading a Style and then modify the Style attributes so that every chapter heading in the book will have the same font, the same font size, the same amount of spacing before and after, etc.
It is especially important to do this with headings and subheadings. If you do this, your word processor will be able to automatically generate a table of contents for your document. I recommend that you try out this feature of your word processor, because doing that will verify that you did your headings and subheadings correctly.
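To show why properly tagged headings make an automatic table of contents possible, here is a minimal sketch that pulls the headings out of an HTML string and indents them by level. It uses Python's standard html.parser module. The class name HeadingCollector, the function table_of_contents, and the sample text are made up for illustration, and this is not how any particular word processor does it.

```python
from html.parser import HTMLParser

# Hypothetical HTML, like what a word processor might save.
SAMPLE = """
<h1>Chapter 1</h1><p>Text.</p>
<h2>A Subheading</h2><p>More text.</p>
<h1>Chapter 2</h1><p>Text.</p>
"""


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every heading tag."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.entries = []
        self._level = None   # level of the heading we are inside, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            self.entries.append((self._level, "".join(self._text).strip()))
            self._level = None


def table_of_contents(html):
    """Return one line per heading, indented two spaces per level."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return ["  " * (level - 1) + text for level, text in collector.entries]


if __name__ == "__main__":
    print("\n".join(table_of_contents(SAMPLE)))
```

A heading that was only made to look like one, with a bigger font on an ordinary paragraph, would never show up in this list, which is the point of using Styles.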
In MS Word there is a Styles ribbon that looks like this:
[Image: the Styles ribbon in MS Word]
For your chapter headings and subheadings, select the text and click on the appropriate button from this ribbon.
LibreOffice has a drop-down list containing the same items.
If you use these Styles to format your book then the HTML generated when you save the book in that format will use the correct tags and will be in good shape to convert to an EPUB.
In the next installment we'll look at what that converted HTML looks like and how to fix it up before making an EPUB out of it.
Published on October 19, 2016 14:35
Bhakta Jim's Bhagavatam Class
If I have any regrets about leaving the Hare Krishna movement it might be that I never got to give a morning Bhagavatam class. You need to be an initiated devotee to do that and I got out before that could happen.
I enjoy public speaking and I'm not too bad at it. Unfortunately I picked a career that gives me few opportunities to do it. So this blog will be my bully pulpit (or bully vyasasana if you like). I will give classes on verses from the Bhagavata Purana (Srimad Bhagavatam). The text I will use is one I am transcribing for Project Gutenberg:
A STUDY OF THE BHÂGAVATA PURÂNA
OR ESOTERIC HINDUISM
BY PURNENDU NARAYANA SINHA, M. A., B. L.
This is the only public domain English translation that exists.
Classes will be posted when I feel like it and you won't need to wake up at 3 AM to hear them.
