Support for Indie Authors discussion

Archived Author Help > Formatting on Amazon and KDP


message 1: by J.D. (new)

J.D. Dudycha | 39 comments I have a question for all of you that have experience in publishing through CreateSpace and KDP.

I am just about to publish my first novel on CreateSpace and wanted to see what I needed to do about formatting for the book, if anything.

Here are my questions: Does CreateSpace provide you with a copyright page, or do you need to make your own and include it? Should I use drop caps at the beginning of each chapter? Do I need to drop the text at the beginning of each chapter so the number stands alone at the top?

Same goes for KDP.

Any answers you have to these questions are much appreciated. Thank you!


message 2: by [deleted user] (new)

I drop-capped my CreateSpace book. I once saw a Bible that drop-capped the chapter number. Personally, I prefer to center the chapter title and drop-cap the first letter of the first word of the first paragraph. I'm unsure what you mean by a copyright page.

KDP? Let me know if you can get drop caps to work on Kindle, and what code you use. I am still hunting for that answer myself.

Best regards, Morris


message 3: by [deleted user] (last edited May 24, 2015 01:07PM) (new)

You provide your own copyright page. Just take a look at some of the books on Amazon using the Look Inside feature and you'll get an idea of what you need. You can also download Smashwords' formatting guide free from Smashwords.com. I don't recommend anything fancy on Kindle for the ebooks; the simpler the better. With paperbacks on CreateSpace you can get as fancy as you want, or as simple as you want, but remember, the fancier it is, the more complicated it can get. Writers tend to go overboard on their first novel (I know I did), when it's not really necessary. Readers just want it to be easy to read. When I issued the 2nd edition of my first novel, I simplified it down to the bare essentials, and it looks better. For the first letter of each chapter, I simply used a slightly larger font size and made it bold. Even that's probably more than you need for a novel.


message 4: by Erin (new)

Erin Zarro | 95 comments You will need to create your own copyright page as well as dedication, Also By, etc.

When I first started 5 years ago, I didn't know to make a copyright page, so all of the poetry chapbooks I sold were without (!!!) one. A few years back I redid everything. Better late than never I guess. ;)


message 5: by Owen (last edited May 24, 2015 06:51PM) (new)

Owen O'Neill (owen_r_oneill) | 1509 comments For KDP, keep it as simple as possible. It's best to go with straight, simple HTML, if you know how to do that.

Basic guidelines to follow for KDP:
Never specify fonts or font sizes.
Use percents for indents and em's for margins.
Limit use of symbol characters (try to use HTML symbol codes) and in-line formatting. (We only use bold and italics -- not even super and subscripts.)
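
As a rough sketch of what those three guidelines might look like in practice (the class name body-text is made up for illustration and is not from Amazon's guide; the indent and margin values are the ones quoted later in this thread):

```html
<style>
  /* No font-family or font-size here: the reader's device settings decide those. */
  p.body-text {
    text-indent: 5%;     /* indent given as a percentage */
    margin-top: 0.6em;   /* margins given in ems */
    margin-right: 0;
    margin-bottom: 0;
    margin-left: 0;
  }
</style>

<!-- Symbols written as HTML codes; only bold and italics used in-line. -->
<p class="body-text">&ldquo;Caf&eacute; au lait,&rdquo; she said <i>firmly</i>.</p>
```

Because nothing in it is an absolute size, the paragraph should scale with whatever font and screen the reader has chosen.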

If this all sounds like Greek, Amazon has a nice, simple (free) formatting guide which is pretty easy to follow, for converting a Word doc into HTML for uploading. (Sorry -- I don't have a link to it handy.)

The reason is that there are many ereaders out there, all different screen sizes, with different font support and defaults, and differing ability to handle symbol sets. Things that often happen if these guidelines aren't followed: erratic shifts in font size and face; strange indents so consecutive paragraphs "step" inwards; paragraphs being centered; wide gaps between paragraphs.

If you try to get fancy, you risk your book being close to unreadable on some devices, and nothing looks more amateurish than bad formatting.

With a PDF for the print version, you can do whatever you like, but as Ken says, it's a good idea not to go overboard. That's another sign of "amateur" production that readers tend to pick up on.

CreateSpace does have templates you can download and there are templates and style sheets for Kindle eBooks, but I've never looked at any of them, so I can't vouch for them.


message 6: by J.D. (new)

J.D. Dudycha | 39 comments Owen wrote: "For KDP, keep it as simple as possible. It's best to go with straight, simple HTML, if you know how to do that.

Basic guidelines to follow for KDP:
Never specify fonts or font sizes.
Use percen..."


Thanks Owen,

I do have one question about the print version PDF. As you see in other novels, there is a large gap between the title of the chapter and the first word of the chapter. Is there a hard and fast rule about how far down your first sentence should start? I assume you have to make that change yourself; it is not something provided by CreateSpace.

JD


message 7: by Owen (new)

Owen O'Neill (owen_r_oneill) | 1509 comments JD wrote: "I do have one question about the print version PDF. As you see in other novels, there is a large gap between the title of the chapter and the first word of the chapter. Is there a hard and fast rule about how far down your sentence should start? ..."

There are no hard and fast formatting rules for fiction that I'm aware of. Different publishers adopt various conventions. For example, it was once more common to always begin a chapter on the right-hand page, leaving a blank page to the left, if necessary. Now that seems to be less common, and in some books I've seen, chapters don't even begin with a new page. The same goes for headers and page numbers (top outside corner vs bottom center). Hardbacks tend to be formatted differently than trade paperbacks. Some get weird, like the book printed in brown ink on cream paper (it was steam-punk) that I found quite difficult to read. (The publisher -- S&S? -- really screwed that one up.)

We went to the library, picked half-a-dozen recent paperback books in our genre, and formatted our print editions based on what we saw. We kept it simple, and didn't do drop-caps or fancy dividers. We also didn't hyphenate, since Word can go nuts about that.

So no rules, just what you like while making your work easy to read.


message 8: by [deleted user] (new)

I just spaced mine to look good and worried only about making every chapter consistent. I used the same file for hardcover, since page sizes are the same.


message 9: by [deleted user] (last edited May 26, 2015 08:20AM) (new)

"Never specify fonts or font sizes." You can, and should, specify font sizes for chapter and title headings. This is done by specifying the size in the style part of your header, using a CSS style.

Like....
p.h1
{
/* applies to paragraphs tagged with class "h1", i.e. p class="h1" */
font-size: 1.5em;
text-indent: 0em;
text-align: center;
font-weight: bold;
}
Note that when I use this h1 class for a heading in my body, it sets the text to 1.5 em, which is 1.5 times the surrounding font size (an em was historically the width of the letter "M"). So when the customer adjusts the font size on their ereader, the titles and chapter headings keep the same proportion to the default font size. You can only get this kind of control by doing your book in Notepad and then converting it to an eBook with a converter later. If you are interested, let me know, and I will get you some pointers and direct you to a tutorial. I also have a template with all of the CSS styles I use, in a Notepad file, that you can use as a starter.

best regards, Morris


message 10: by Owen (new)

Owen O'Neill (owen_r_oneill) | 1509 comments Morris wrote: ""Never specify fonts or font sizes." You can, and should specify font sizes for chapter and title headings. "

You certainly can, as you say, and at first I did, but when I reviewed the results on different devices, I didn't like it. So I took all the sizes out of our style sheets, and was happier. Then I noticed that the Amazon guidelines said the same thing, and I decided they were right on that.

It does depend on what you like and are willing to accept. As long as em's are used, as you show, it won't blow up on you. But if you leave it entirely up to the device and the user, it's nice and simple and you can't go wrong.


message 11: by [deleted user] (last edited May 26, 2015 10:35AM) (new)

Yes, but your chapter headings will be the same size as your text font, which I find unacceptable. Check out my book, "Warzone: Nemesis" on Amazon http://www.amazon.com/dp/B00EC190BI, under "Look Inside." You can see that I have successfully learned to format for Kindle. I have viewed it on other devices and it works fine. As I said before, I am willing to help an author get control of his product. I have some tutorials on how to do it.

I do not like the conversion programs, which often leave formatting junk from your word processor. Yes, it does take discipline to learn how to do this, but it is not too hard. In fact, I have a teaching-by-example book. I formatted my revised edition of "The Curse of Capistrano: The Mark of Zorro." This work was written in 1917 and is in the public domain. So I edited it, made eBook copies and put it on my website. You can download both the PDF and the eBook for free, and email me at morris.graham@sbcglobal.net. I will send you the Notepad file I used to create the eBook file. It will serve as a tutorial for you. My website: http://www.morrisegraham.com. It is on the Downloads tab.


Best regards, Morris


message 12: by Owen (last edited May 26, 2015 10:49AM) (new)

Owen O'Neill (owen_r_oneill) | 1509 comments Morris wrote: "Yes, but your chapter headings will be the same size as your text font, which I find unacceptable."

If H1, H2, H3, etc. tags are used, each device will display them in its own default size. You can of course define your own styles and adjust the sizes if you want to, but we've always been happy with the defaults.


message 13: by J.D. (new)

J.D. Dudycha | 39 comments Owen wrote: "For KDP, keep it as simple as possible. It's best to go with straight, simple HTML, if you know how to do that.

Basic guidelines to follow for KDP:
Never specify fonts or font sizes.
Use percen..."


Can you just upload your Word doc into KDP and have them format it? It does have an option for that.

JD


message 14: by Owen (new)

Owen O'Neill (owen_r_oneill) | 1509 comments JD wrote: "Can you just upload your word doc into KDP and have them format it. It does have an option for that."

Yes, you can, but I have never tried it. I've bought a couple of Kindle books that had fairly severe formatting issues, and I suspect that is what the authors did. If you clean up your Word doc carefully, it might work? But I create a clean HTML file and use that. (I've been working in HTML and designing and building websites professionally since 1996, so that's an easy thing for me to do.)

If you have a print edition on CreateSpace, there is an option to have them make a Kindle edition for you. I've never tried it and I don't know if there is a fee for it, but if it's free, I might be inclined to think that would be a safer route than a Word doc?

The preview function for KDP is pretty good, so you can check basic formatting, but if things get weird in the middle of your book, you might miss it unless you click through every page, which would be time-consuming.


message 15: by [deleted user] (new)

You still get more control of your own product if you do it yourself. That is why I offered to show you how to do it.

You can also right-align or left-align text, which is nifty for a letter, both header and signature, and other neat stuff.

Morris


message 16: by Micah (new)

Micah Sisk (micahrsisk) | 1042 comments One thing that always bugged me about the KDP defaults is that they automatically stick a huge extra line of space between paragraphs. This makes intentional section breaks very difficult for the reader to interpret.

I've looked at a lot of traditionally published books on Kindle and almost all of them remove those spaces in HTML. So I've done the same thing on mine. I think their defaults look fairly clunky, and I've read that sometimes the HTML headers H1 - H6 can perform erratically on different readers. So, I've gotten in the habit of not using them, just setting up my own.

It's all kind of a pain to learn and do for the first time. But once you have a book that works, you can pretty much re-use the formatting between books by just inserting the book's text and making a few tweaks here and there.


message 17: by Micah (new)

Micah Sisk (micahrsisk) | 1042 comments This is a simulation of the line space problem that Amazon uses as a default.

In this scene we have three characters, two of whom are in a conversation and one who will later become our POV character. When that change of POV happens, you want to make it clear. It's still the same setting, same timeline, so putting in a break with "***" or whatever doesn't really make sense. In these cases a simple extra line break is often used.

The Amazon default will look something like:

“What are you going to do, John?” Sue asked.

“Just what I said I’d do.”

“You wouldn’t!”

“Is that right? Well, just you watch me!”


Mary watched them through the kitchen window.


You can see the extra line space if you really pay attention, but it's a LOT clearer if we do this:

“What are you going to do, John?” Sue asked.
“Just what I said I’d do.”
“You wouldn’t!”
“Is that right? Well, just you watch me!”

Mary watched them through the kitchen window.


Personally, I find eBooks that use the KDP default for line spaces pretty annoying, even though this style is standard for HTML pages (and eReaders are really nothing more than specialized web browsers).

Still, these are supposed to be books, not web pages.


message 18: by Igzy (new)

Igzy Dewitt (IgzyDewitt) | 148 comments Micah wrote: "This is a simulation of the line space problem that Amazon uses as a default.

In this scene we have 3 characters, two of whom are in a conversation and 1 who we will later become our POV. When tha..."


Really neat formatting tip. Out of curiosity, what exactly are you adding to your code that takes out that extra line?


message 19: by Micah (new)

Micah Sisk (micahrsisk) | 1042 comments Igzy wrote: "Really neat formatting tip. Out of curiosity, what exactly are you adding to your code that takes out that extra line?"

I just define the normal paragraph properties:

p
{
text-indent: 1.3em;
margin-bottom: 0.2em;
}

I found that setting the bottom margin to 0 seemed a bit too tight, and also thought the default text indent was too large for most eReaders. But you can adjust them as you like.
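
To tie this back to the POV-change example above, here is a minimal sketch of how an intentional break could sit on top of that paragraph rule. The class name scene-break and its values are assumptions for illustration, not something from a published file:

```html
<style>
  /* The normal paragraph rule quoted above. */
  p {
    text-indent: 1.3em;
    margin-bottom: 0.2em;
  }
  /* Hypothetical class for the first paragraph after a change of POV. */
  p.scene-break {
    margin-top: 1.5em;   /* the deliberate extra gap */
    text-indent: 0;      /* flush left; a matter of taste */
  }
</style>

<p>&ldquo;You wouldn&rsquo;t!&rdquo;</p>
<p>&ldquo;Is that right? Well, just you watch me!&rdquo;</p>
<p class="scene-break">Mary watched them through the kitchen window.</p>
```

With ordinary paragraphs kept tight, the one larger gap is what signals the break to the reader.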


message 20: by Micah (new)

Micah Sisk (micahrsisk) | 1042 comments Oh, and another thing which I find a bit odd about Amazon's eBook formatting: if you do not hard code in justified text, Kindle eReaders will justify the text by default.

That's not an issue; however, what may be an issue is that this default justification does NOT show up in their Look Inside feature.

I have not hard coded justified text, so in the Look Inside preview, it appears as if my books have left justified text. But in an actual eReader they show up with fully justified text.

Most people won't notice, but some people see that as a sign of non-professionalism. In other GR groups I found some people trying to rate quality standards on Indie books using the Look Inside feature and specifically looking for unjustified text. Once I pointed out the quirk I note above, they changed that, but it goes to show that some readers do get worked up about these things.

I should probably just change to hard coded justified text.
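
For reference, hard-coding it would presumably be a one-line addition to the paragraph style. This is a sketch only, and it has not been checked against how Look Inside renders it:

```html
<style>
  p {
    text-indent: 1.3em;
    margin-bottom: 0.2em;
    text-align: justify;   /* state full justification explicitly */
  }
</style>
```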


message 21: by Owen (last edited May 26, 2015 07:41PM) (new)

Owen O'Neill (owen_r_oneill) | 1509 comments Igzy wrote: "Really neat formatting tip. Out of curiosity, what exactly are you adding to your code that takes out that extra line?..."

In our case, we specify a "normal" style in our style sheet for all normal text that is:
p.Normal, li.Normal
{margin-top:.60em;
margin-right:0;
margin-bottom:0;
margin-left:0;
text-indent:5%;
}

Our paragraph style (which is hardly ever used) is:
p
{margin-top:0;
margin-right:0;
margin-bottom:0;
margin-left:0;
}

Our heading styles follow this format:
h1
{margin-top:2em;
margin-right:0;
margin-bottom:.30em;
margin-left:0;
border:none;
padding:0;
font-weight:bold;
}
h2
{margin-top:0.6em;
margin-right:0;
margin-bottom:0;
margin-left:0;
border:none;
padding:0;
text-transform:uppercase;
font-weight:bold;
}

h3
{margin-top:0.6em;
margin-right:0;
margin-bottom:0;
margin-left:0;
border:none;
padding:0;
font-weight:bold;
}

h4, h5, h6
{margin-top:0.6em;
margin-right:0;
margin-bottom:0.6em;
margin-left:0;
text-align:right;
font-weight:normal;
font-style:italic;
}


We've never set the text justification manually. I was unaware until Micah mentioned it that anyone would think this "unprofessional", since I would have thought that readers of eBooks would be aware that the Amazon "Look inside" feature formats things differently than eReaders. I don't think I want such people reading our books anyway (our characters often speak and act "unprofessionally" and that might annoy them?)

I have read that some readers find left-justified text easier to read. I suppose there might be some worth in not hard-coding the justification in case their device allows them to select that option?

The one thing that does bug me about Amazon's "Look inside" feature is the way it displays borders. We put a bottom or top border on some headings and we use a "div" with a border to make it full width. The "Look inside" feature displays this as a box border, which does look goofy (it looks fine on eReaders). I suppose we could use a horizontal rule instead, but I haven't bothered yet.
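
A sketch of what the horizontal-rule alternative might look like. The class name heading-rule and the spacing values are illustrative assumptions, and whether "Look inside" displays it any better is untested:

```html
<style>
  /* Hypothetical rule placed under a heading instead of a bordered div. */
  hr.heading-rule {
    border: none;
    border-top: 1px solid;   /* a single line across the full width */
    margin-top: 0.3em;
    margin-bottom: 0.6em;
  }
</style>

<h3>Chapter One</h3>
<hr class="heading-rule">
```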

BTW: I use EditPad Lite for all my coding in preference to Notepad. It has some nifty features.


message 22: by Igzy (new)

Igzy Dewitt (IgzyDewitt) | 148 comments Owen wrote: "Igzy wrote: "Really neat formatting tip. Out of curiosity, what exactly are you adding to your code that takes out that extra line?..."

In our case, we specify a "normal" style in our style sheet ..."


Thank you both, Micah and Owen, for your replies. I appreciate it.


message 23: by Owen (last edited May 26, 2015 08:01PM) (new)

Owen O'Neill (owen_r_oneill) | 1509 comments Igzy wrote: "Thank you both, Micah and Owen, for your replies. I appreciate it. "

Welcome.

BTW: if you mine those codes, you can do about 95% of all the text formatting you would want to. The main thing is to make sure all your paragraphs are tagged correctly before you convert. Especially if you use Word's HTML conversion feature, this can mess you up. (If you do, make sure to save as "filtered HTML" -- Word 2010 seems best for that -- or you'll get a horrid mess!)


message 24: by Igzy (new)

Igzy Dewitt (IgzyDewitt) | 148 comments Owen wrote: "Igzy wrote: "Thank you both, Micah and Owen, for your replies. I appreciate it. "

Welcome.

BTW: if you mine those codes, you can do about 95% of all the text formatting you would want to. The main th..."


If it's not too much trouble could one of you offer an example of two properly formatted paragraphs laid out together? I'd be much obliged if I could see the tags in action, as I'm not familiar with HTML formatting.


message 25: by Owen (new)

Owen O'Neill (owen_r_oneill) | 1509 comments Igzy wrote: "If it's not too much trouble could one of you offer an example of two properly formatted paragraphs laid out together? I'd be much obliged if I could see the tags in action, as I'm not familiar with HTML formatting. "

No problem. Of course GR interprets HTML code, so this example uses parentheses in place of pointy brackets, thus ( = < and ) = > . This is how we begin a chapter, with notes after the ||:

(br clear=all style='page-break-before:always') || This tells the Kindle converter to break the page. Below are the headings: heading 3 for the main, and heading 4 for the sub, which is italic and aligned right.

(h3)(a name="_Toc410049128")Prologue: Zero Day(/a)(/h3) || name IDs the target for the TOC link.
(h4)Janin Station;(br) || br is a linefeed (line break)
Tau Verde, Vulpecula Region(/h4)


(p class=NormalBlock)It was make-and-mend day for the Halith Imperial Navy’s Kerberos Fleet ... ruled the lives of Halith mariners—especially when the fleet was lying up at a comfortable port like Janin. (/p) || This is a paragraph container. HTML designates the beginning of a container with a code, just p here -- h3 above -- and ends it with a / in front of the code: /p. "class" defines the type of paragraph as defined in the style sheet. This is a flush-left paragraph with extra space at the top. The CSS entry for it is below.

(p class=Normal)Watchstanding and sensor sweeps ... guarded by a ring of monitors. (/p) || This is a standard text paragraph.


p.NormalBlock
{margin-top:2.5em; || creates the extra space. Note there is no text indent.
margin-right:0;
margin-bottom:0;
margin-left:0;
}
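
For anyone reading this outside GR, the same chapter opening with the parentheses turned back into pointy brackets and the || notes dropped would look roughly like this (the body text is abbreviated here, and the attributes are copied as given above):

```html
<!-- Forces a page break before the chapter -->
<br clear=all style='page-break-before:always'>

<!-- Main heading, with a named anchor as the TOC link target -->
<h3><a name="_Toc410049128">Prologue: Zero Day</a></h3>
<!-- Sub-heading: italic and right-aligned via the h4 style -->
<h4>Janin Station;<br>
Tau Verde, Vulpecula Region</h4>

<!-- Flush-left opening paragraph with extra space above -->
<p class=NormalBlock>It was make-and-mend day for the Halith Imperial Navy’s Kerberos Fleet ... </p>

<!-- Standard text paragraph -->
<p class=Normal>Watchstanding and sensor sweeps ... guarded by a ring of monitors. </p>
```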

That is 90% of an HTML text doc right there (with the CSS examples above, in the previous posts). Yes, you see a lot of godawful gibberish in the code here and there, like style="much gibberish" or (span style="much gibberish") (/span). Almost all the time, that is unwelcome. Word will put it in to try to mimic the exact look of a doc in IE -- not what you want.

If anyone wants to know the basic anatomy of an HTML doc, I can post something "simple" or feel free to PM. There are of course resources online, and Morris mentions his as well.


message 26: by Steve (new)

Steve King (stking) | 57 comments Question: When I post a review on both Goodreads and Amazon, I find that Goodreads is easy because it has both a Title search and an Author search. I can't find an Author search on Amazon so am stuck looking through 30-40 pages sometimes of titles. Or am I just missing it? thx!


message 27: by Uma (new)

Uma (witcheyez) | 37 comments Steve wrote: "Question: When I post a review on both Goodreads and Amazon, I find that Goodreads is easy because it has both a Title search and an Author search. I can't find an Author search on Amazon so am s..."

Hi Steve. I normally just type the author's name next to the book's title. Works all the time :-)


message 28: by Steve (new)

Steve King (stking) | 57 comments Thank you Uma!!!


message 29: by Steve (new)

Steve King (stking) | 57 comments Uma----it worked! You may have saved me hours! :)


message 30: by Uma (new)

Uma (witcheyez) | 37 comments Steve wrote: "Uma----it worked! You may have saved me hours! :)"

Glad to be of assistance, Steve :-)


message 31: by T.L. (new)

T.L. Clark (tlcauthor) | 727 comments Excellent free download with step by step guides (that even I could follow);
http://www.amazon.co.uk/Building-Your...

It really was very useful. :-)
Good luck.


message 32: by Steve (new)

Steve King (stking) | 57 comments Thank you TL! Funny how we can overlook simple things sometimes. :)


message 33: by Micah (new)

Micah Sisk (micahrsisk) | 1042 comments Owen wrote: "The main thing is to make sure all your paragraphs are tagged correctly before you convert. Especially if you use Word's HTML conversion feature, this can mess you up. (If you do, make sure to save as "filtered HTML" -- Word 2010 seems best for that -- or you'll get a horrid mess!)"

That's why I do all my writing in Word, and then cut and paste it all into Notepad++ to do my HTML formatting. I use macros to get rid of a lot of extraneous junk (spaces at the end of paragraphs, extra carriage returns, etc.).


message 34: by [deleted user] (new)

I think the best way to handle all of that is to type in the Word format, convert to HTML, run your HTML through Calibre, and run that result through the Kindle Previewer (both can be downloaded for free). With those two steps I get a perfectly formatted doc, complete with a linked goto menu, that I can upload directly to Amazon. I use LibreOffice for writing, and I just specify .01 spacing between paragraphs to keep everything evenly spaced. As I recall, the Smashwords handbook tells you not to, but I do it anyway and get good results the first time.


message 35: by T.L. (new)

T.L. Clark (tlcauthor) | 727 comments Steve wrote: "Thank you TL! Funny how we can overlook simple things sometimes. :)"

T.L. wrote: "Excellent free download with step by step guides (that even I could follow);
http://www.amazon.co.uk/Building-Your...
=sr_1_1?ie=UTF8&qid=1432724755&sr=..."


No worries. :-) That be why I linked. It's not overly publicised, but really helpful.


message 36: by Micah (new)

Micah Sisk (micahrsisk) | 1042 comments Ken wrote: "I think the best way to handle all of that is to type in the Word format, convert to HTML, run your HTML through Calibre, and run that result through the Kindle Previewer..."

I've heard others doing that but I'd strongly recommend against it. Amazon has had an on-again, off-again relationship with Calibre. At first they accepted documents converted by Calibre, then they didn't. When I published first in August of 2013, they didn't accept Calibre conversions. That may have changed since then, but if it has, it is by no means a "this will always work" solution.

Plus, I've looked at the HTML of Calibre converted eBooks and the code is littered with all kinds of extraneous metadata specific to Calibre. I don't want that stuff junking up my files, and I'm not proficient enough to manually strip it all out with assurances that no damage will be done by my meddling.

My philosophy is to learn to DIY, and try to make it as clean as possible.


message 37: by [deleted user] (new)

Micah wrote: "I've heard others doing that but I'd strongly recommend against it. Amazon has had an on-again, off-again relationship with Calibre. At first they accepted documents converted by Calibre, then they didn't..."

Possibly, but apparently running the Calibre output through the Kindle Previewer solves all that. What you get is a "converted doc" that Amazon accepts. When I try to upload a .doc file or HTML directly I don't get a "goto" menu.


message 38: by [deleted user] (last edited May 27, 2015 06:46PM) (new)

I finish my documents in Word, so that I can get the full benefit of the word processor. I make sure each paragraph has a space between it and the next, which helps me with the next step. Then I copy the whole thing right into Notepad, which strips all of the Windows junk. Since each paragraph has a space after it, they're all easy to see individually. I take all my curly quotes, ellipses, copyright symbols, subscripts and superscripts, and any other specials and replace them with HTML formatting. I then go through and format all the paragraphs, right-aligned and left-aligned, chapter titles, etc... manually. Finally I add any image references in HTML, and then a TOC that is linked to the chapters. I save it as UTF-8 as a text file, in case I need to update or correct something. Then I open it back up and use "Save As," choosing the "All Files" type and an .html extension. At this point you can open it up as an HTML file in your browser, highlight and copy everything on the page from your browser, drop it back into Microsoft Word, and let the spell checker look for errors. (The browser will not honor paragraph breaks or page breaks, but don't worry about it.) Go back and fix anything you find in your Notepad file. It's also good to double-check the output of your final HTML file in your word processor. Once corrections are made in the Notepad TXT file you built your HTML file from, build another HTML file. I usually cut down on the confusion by naming each day's Word and TXT files with the day's date.
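
A small sketch of what the hand-coded result of those replacement steps might look like. The id values chap01 and chap02 and the sample sentences are made up for illustration, and this is not taken from the actual template file:

```html
<!-- Table of contents: each entry links to an anchor further down the file. -->
<p><a href="#chap01">Chapter 1</a></p>
<p><a href="#chap02">Chapter 2</a></p>

<!-- Curly quotes, apostrophes and ellipses written as HTML codes. -->
<h1 id="chap01">Chapter 1</h1>
<p>&ldquo;Wait&hellip;&rdquo; she said. &ldquo;That&rsquo;s not right.&rdquo;</p>

<!-- Copyright symbol, subscript and superscript. -->
<h1 id="chap02">Chapter 2</h1>
<p>Copyright &copy; 2015. H<sub>2</sub>O and E = mc<sup>2</sup>.</p>
```

Written this way the file contains only plain ASCII characters, so it should survive the trip through Notepad whatever encoding it is saved in.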

Finally, import the HTML file into Mobipocket Creator. Once imported, I add my book cover and then any graphics I want. Then I build the file, save it without DRM enabled, and drop the PRC file into Mobipocket Reader to check for errors. If you spot any errors, go back to the Notepad TXT file and repeat the process.

If it is the final file to upload to Amazon, you need to add the metadata and book settings. This time you will save as DRM, if you wish.

Guido Henkel is a master in the art of eBook formatting. He does, however, use Calibre in some of his steps. His tutorial is http://guidohenkel.com/2010/12/take-p...

And like I said before, I have a working eBook you can look at, and I can send you the file that created it, so you can look and see what the causes and effects are... free of charge.


Best regards, Morris


message 39: by L.J. (new)

L.J. Kendall (luke_kendall) I'd like to thank everyone for their above posts - I know a reasonable amount about this stuff, but learning more about what subset of HTML Kindle accepts, and about various other traps and gotchas, is great.
I had planned to follow a process much like what various people discussed above, and very recently just discovered Calibre will do a lot of that for you - so it was especially interesting to hear that there could be things to be careful about with that approach, too!
I use LibreOffice rather than Word, and although its HTML output is much cleaner than it used to be, it still exposes some edit history and has quite odd (to me) uses of spans.
So another option I'm considering is working from a "reasonably clean" HTML - e.g. Calibre's - and then automating the stripping/converting the redundant or overcomplicated markup. The trouble seems to be how to get clean enough HTML so that last task can become just a regular-expression driven series of search and replace steps.


message 40: by Ken (new)

Ken (kendoyle) | 364 comments Luke wrote: "I'd like to thank everyone for their above posts - I know a reasonable amount about this stuff, but knowing more about what subset of html Kindle accepts, and various other traps and gotchas is gre..."

If you're considering Calibre, take a look at Sigil, a free, open-source ePub editor. You can edit in code view or in preview mode, and it generates the TOC.NCX and OPF files for you. I don't have any affiliation with the developers, but I use it for all my work.


message 41: by L.J. (new)

L.J. Kendall (luke_kendall) Thanks, Ken. I'll have to polish up my github skills and see if I can build Sigil for Linux. Sounds like he didn't have a very happy experience with the Linux community, which is sad (although I can believe it; it does sometimes happen).


message 42: by Micah (last edited Jun 07, 2015 11:43AM) (new)

Micah Sisk (micahrsisk) | 1042 comments Luke wrote: "I'm considering is working from a "reasonably clean" HTML - e.g. Calibre's - and then automating the stripping/converting the redundant or overcomplicated markup. The trouble seems to be how to get clean enough HTML so that last task can become just a regular-expression driven series of search and replace steps..."

That's why I went the route described in the blog written by Guido Henkel (see the link provided above by Morris). Although I didn't use Calibre as he suggests (Calibre files were not accepted by Amazon at the time I did my first publication...they were when that blog was written).

So you write in whatever you want. Then cut/paste it to Notepad++, run some macros and find/replace to clean up the text, and then put it in clean HTML. And ultimately run that through Kindle Previewer.

There are a couple things Henkel's blog doesn't go into (probably because he was relying on Calibre to do it). These are the creation of the .opf and .ncx files needed by Kindle.

The .opf file contains all your book's metadata, hyperlinks to its HTML and JPEG files, and its structure (which files are presented first, second, third, etc.).

The .ncx file is what creates the Table of Contents used by the Kindle devices' built-in Go To function (or whatever it's called), as opposed to the hard-coded HTML TOC.

I've gotten into the habit of breaking my HTML into separate files. I've heard that (especially in longer books) this speeds up searches. It's also the standard way of doing it in ePubs, and MOBI files are technically just specialized ePub files.

So...a folder for one of my books would have a collection of files like:

content.opf
toc.ncx
bookcover.jpg
stylesheet.css
Title.html
Section01.html
...
SectionN.html

I usually do a section for each chapter, one for Author bio, one for Additional Works By, and one for any other stuff I'm adding.

And, yes, I've used Sigil for some of this. Seems a pretty decent program for some uses. Makes chopping up sections pretty easy.


message 43: by Owen (new)

Owen O'Neill (owen_r_oneill) | 1509 comments Micah wrote: "The .ncx file is what creates the Table of Contents to be used by the built in Kindle devices Go To function (or whatever it's called), as opposed to the hard coded HTML TOC..."

So far, I've never bothered with an .ncx or .opf file. I've always just dropped the HTML file on KDP and let the converter do its thing. Is there an advantage to using an .ncx file over the TOC in the HTML file? Once a TOC gets to be 3 levels deep, it displays differently on a Kindle, and I've been curious as to why. (Although I conclude a TOC that elaborate is generally not a good idea.)

Is there an advantage to "rolling" your own for the .opf file? (Outside of the "start" tag, which seems to be idiosyncratic anyway.)

That's an interesting datum on speeding up searches -- does that mean searches within a book on an eReader?

I take it Notepad++ preserves formatting (bold, italics)? (I use EditPad Lite, which doesn't.)


message 44: by Micah (last edited Jun 07, 2015 03:12PM) (new)

Micah Sisk (micahrsisk) | 1042 comments If you don't use an .ncx file, then when you go to the Kindle's built-in TOC (on my Kindle Voyage it's just called Go To) you only get the following navigation options:

Beginning
Page or Location
Cover
Start
End

Which makes navigating in the eBook very impractical.

I've actually seen indie books with no .ncx and no TOC in HTML.

I think Calibre auto-generates the .ncx file.

I've never tried uploading the HTML file directly to KDP. I assume that their converter is used to create the .opf file. I believe I've run an HTML through the Kindle Previewer and I think it worked, but it did not put the book cover in the resulting file. And, of course, if you're using multiple HTML files like I do, that's not going to work in the first place.

The method I use is to run the .opf file through Kindle Previewer and that generates the MOBI file that I then import to KDP.

So with that method, to make everything work correctly, you've got to roll your own .opf.

They're not hard to do once you understand them, or once you have one that works which you can use as a template.

The Kindle Previewer is actually pretty helpful if there are errors: it gives you a pretty clear description of the errors found and, most of the time, hints on where each error happened.

When copying from Word to Notepad++, however, italics, bold, underlining and any other special formatting are not preserved. You have to first run find/replace routines to prepare the document.

I make sure any straight quotes are replaced by smart quotes, double hyphens are replaced by em-dashes, and any "..." is replaced by an ellipsis character. And then I do find/replace on italics (and bold and underline if applicable).
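The same typographic cleanup can be scripted if you would rather not do it by hand in Word. This is only a sketch of the idea in Python: the function name `smarten` is made up, and the rule for deciding which quotes are opening ones (start of text, or after whitespace, an opening bracket, or a dash) is a simplifying assumption that will get some edge cases wrong, such as leading apostrophes in words like 'tis.

```python
import re


def smarten(text):
    """Replace straight quotes, double hyphens and '...' with typographic characters."""
    text = text.replace("...", "\u2026")   # three periods -> ellipsis character
    text = text.replace("--", "\u2014")    # double hyphen -> em-dash

    # A double quote at the start of the text, or after whitespace, an opening
    # bracket, or an em-dash, is treated as an opening quote.
    text = re.sub(r'(^|[\s(\[{\u2014])"', "\\1\u201c", text)
    # Every remaining straight double quote is treated as a closing quote.
    text = text.replace('"', "\u201d")

    # Same idea for single quotes; leftovers (including apostrophes inside
    # words) become the right single quote.
    text = re.sub(r"(^|[\s(\[{\u2014\u201c])'", "\\1\u2018", text)
    text = text.replace("'", "\u2019")
    return text


if __name__ == "__main__":
    print(smarten('"Hello," she said -- "it\'s late..."'))
```

It is worth proofreading the result afterwards, since no simple rule catches every quote correctly.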

For example, in Word I open find/replace, press “Ctrl-i” in the Find box (so it matches italic text), and enter “<i>^&</i>” in the Replace box. Then press “Replace all”.

I also make sure any section breaks not denoted with "***" are marked with some kind of code so that I don't accidentally delete them with my process in Notepad++.

In Notepad++ I trim leading and trailing spaces, then run a macro that adds <p> to the start of every paragraph and </p> at the end. It also replaces all single quotes, double quotes, em-dashes, and ellipses with their HTML special characters.
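For anyone who prefers a script to a Notepad++ macro, here is a hedged Python sketch of that step. It assumes the macro wraps each paragraph in <p> and </p> tags and that each paragraph sits on its own line; the function name `to_paragraphs` and the choice of named entities are assumptions, not a copy of the actual macro.

```python
# Typographic characters and the named HTML entities that replace them.
ENTITIES = {
    "\u2018": "&lsquo;",   # left single quote
    "\u2019": "&rsquo;",   # right single quote / apostrophe
    "\u201c": "&ldquo;",   # left double quote
    "\u201d": "&rdquo;",   # right double quote
    "\u2014": "&mdash;",   # em-dash
    "\u2026": "&hellip;",  # ellipsis
}


def to_paragraphs(text):
    """Trim each line, skip blank ones, swap in entities, wrap in <p>...</p>."""
    paragraphs = []
    for line in text.splitlines():
        line = line.strip()          # trim leading and trailing spaces
        if not line:
            continue                 # blank lines separate paragraphs; drop them
        for char, entity in ENTITIES.items():
            line = line.replace(char, entity)
        paragraphs.append(f"<p>{line}</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    sample = "  \u201cHi\u2026\u201d  \n\nNext\u2014line "
    print(to_paragraphs(sample))
```

The sketch deliberately leaves any existing markup (such as italics tags added earlier in Word) untouched, so a stray "<" or "&" in the prose would still need fixing by hand.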

After that there's a bit of cleanup that usually needs doing (making sure that all the "***" sections are centered, and changing the section breaks that don't use "***" to an HTML line break).

Sounds like a lot but I've created a document that maps the process and it can all be done in a matter of minutes.

The thing is, when I'm done, I've got clean HTML file(s) which I have total control over. No mysterious behind-the-scenes Word (or whatever) funkiness. I don't have to wonder what KDP's converter is actually doing. And running it through Kindle Previewer lets me generate a MOBI which I can check out on my PC, on my Kindle, and in the previewer itself.


message 45: by Micah (new)

Micah Sisk (micahrsisk) | 1042 comments Owen wrote: "That's an interesting datum on speeding up searches -- does that mean searches within a book on an eReader?..."

It means searches within the book on an eReader, and supposedly the amount of time it takes to go to the new location once you've pressed Go (or whatever).

However, I believe you would only notice the difference on modern eReaders if the book was really long.

It makes doing corrections across a whole book a bit more of a pain (although I believe Sigil solves that issue). But it also makes compiling a story collection pretty easy.

My first books were done all in one HTML, but they're both pretty short (300 pg or less).


message 46: by Owen (new)

Owen O'Neill (owen_r_oneill) | 1509 comments Micah wrote: "If you don't use a .ncx file then if you go to the Kindle's built-in TOC (on my Kindle Voyage it's just called Go To) then you only get the following navigation options:

Beginning
Page or Location..."


On the books we've published so far (uploading just an HTML file), the KDP conversion took care of that. Under "goto" on my Kindle you get:
Beginning
Page or Location
Cover
Front Matter (with a drop-down)
A full chapter list (not a link to the TOC)
End.

When I download the zip file, it contains the HTML, opf and image files. No ncx file, but it seems to work fine. (The Kindle previewer does not do this in the mobi file it creates, and ignores the TOC tag. I do like the nice error list it provides.)

The only issue I've seen was with that complicated TOC, in which case the "Goto" shows:
Beginning
Page or Location
Cover
Table of Contents (a link to the TOC)
End.

I suspect building an ncx file for that book might have solved that issue. It seems like the KDP conversion just couldn't handle the nesting, and therefore punted. (Why it didn't produce a dropdown as it did with the front matter in the other book, I'm not clear on.)

I export the HTML file using the save as HTML filtered option. That provides a pretty clean file. I do some clean-up, mostly for myself, so I can read the HTML more easily. At first, I replaced the curly quotes with HTML codes but it didn't seem to matter, so I quit doing that.

Using the emdash and endash codes does seem to matter, so those are the only ones I still replace. (I don't like the ellipsis character, so I use periods and non-breaking spaces, which Word provides.)

I haven’t noticed a difference in response between our short book and our long ones, except in the case of the book with the 140-page glossary with 400+ internal hyperlinks in it. And even that one's not too bad. (We took it out anyway and now offer it separately.)


message 47: by L.J. (new)

L.J. Kendall (luke_kendall) While thinking about automating the above processes as a sed/awk/shell script, it occurred to me that Calibre is supposed to be doing exactly that anyway; and if it worked perfectly in Calibre, every self-publisher who used Calibre would benefit. So I've just now asked the developers if they'd be interested in hearing about specific problems in generating ebooks using the latest version of Calibre. I plan to give it a try, and if I strike problems, contact them and see what they say.

The latest version is 2.3, which I had to manually install under Linux. Although I'm running Ubuntu 14.04, which is quite a recent release, the Calibre version available was only v1.25, and the Calibre website says for Linux "Please do not use your distribution provided calibre package, as those are often buggy/outdated." It also mentions that Calibre is in rapid development, and it's true: they release versions with fixes or new features every week or two!

So it also occurs to me to ask if you remember what version of Calibre you guys used, or how long ago - if it was old, it's possible the various problems mentioned have already been fixed.




message 48: by Micah (new)

Micah Sisk (micahrsisk) | 1042 comments I only tried using Calibre back in 2013. I don't really use it for anything now.


message 49: by L.J. (last edited Jun 08, 2015 12:55AM) (new)

L.J. Kendall (luke_kendall) I contacted the main developer, Kovid Goyal, who gave an extremely helpful reply.

He pointed out that people are welcome to ask for feature enhancements to Calibre, and that it's in very active development with features being added at the request of users. There are of course no guarantees on whether any individual feature is added or not. The general process is discussed here: I want a new feature

An interesting point is that Calibre supports a plug-in architecture, so any programmers here could consider adding their own functionality.

He even addressed some of the points raised in this very discussion thread, noting that what was written about Calibre seemed very out of date, and not at all representative of Calibre now. Here's what he wrote:

Some quick notes:

1) Amazon currently accepts and has always accepted old style MOBI files produced by calibre

2) Amazon may or may not accept new style MOBI (KF8) files produced by calibre -- but they have never told anyone why they refuse them, so it is impossible for us to do anything about it. I suspect they just don't want you using tools from outside their walled garden.

As a result if you want to publish with KDP and calibre and use KF8 the best route is to generate an EPUB file using calibre and send it to amazon. Alternatively use kindlegen on the calibre produced EPUB and send the resulting MOBI to Amazon.

3) Converting from Word will be much better if you directly convert DOCX rather than via the HTML export in Word. http://manual.calibre-ebook.com/conve...

4) Sigil works fine on linux and has prebuilt binaries for it as well.

5) calibre includes its own builtin book editor that edits both EPUB *and* AZW3 files. In my opinion it is a lot better than Sigil, but I am obviously biased :)

6) Old style MOBI files have no support for hierarchical (nested) table of contents. Table of contents in these files are just a normal html page in the book, with a pointer to it in the opf file.

7) Things like the cover and the start location are controlled by "semantics" you create in the OPF file, it has nothing to do with NCX. Both Sigil and the calibre editor have tools to make that easy.

8) Conversion in calibre produces no "extraneous junk". Everything present in the output document will have been present in the input document. The one thing to keep in mind about calibre conversions is that they "flatten" the CSS, making the styles in the output file much harder to edit. Therefore, you should edit the source document and treat conversion as a final "compile" step.
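To make note 7 a little more concrete: in an OPF 2.0 file the start location and the HTML table of contents are normally declared in a guide element, while the cover is pointed to by a meta entry in the metadata. Below is a small Python sketch that builds such a guide element. The helper name `build_guide` and the file names toc.html and Section01.html are hypothetical, and this is not output from calibre or Sigil.

```python
import xml.etree.ElementTree as ET


def build_guide(toc_href, start_href):
    """Return a <guide> element (as text) for pasting into an .opf file."""
    guide = ET.Element("guide")
    # type="toc" points at the HTML table of contents page.
    ET.SubElement(guide, "reference", type="toc",
                  title="Table of Contents", href=toc_href)
    # type="text" marks where the book should open on first read.
    ET.SubElement(guide, "reference", type="text",
                  title="Start", href=start_href)
    return ET.tostring(guide, encoding="unicode")


if __name__ == "__main__":
    print(build_guide("toc.html", "Section01.html"))
```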


For myself, since this is one of Calibre's core functions and because it's in very healthy active development and because the developer was amazingly helpful and responsive, I'm definitely going to be using Calibre for producing my book at the end of this month. I'll let this group (and the developers) know how that goes, on my blog and here.

Oh, and I just noticed that the Calibre FAQ is well written and also looks to cover quite a few of the topics mentioned earlier.




message 50: by Owen (last edited Jun 08, 2015 01:32AM) (new)

Owen O'Neill (owen_r_oneill) | 1509 comments Luke wrote: "3) Converting from Word will be much better if you directly convert DOCX rather than via the HTML export in Word..."

This is certainly true if you don't use the HTML-filtered option. In my experience, it is not true if you do, although the HTML-filtered output still needs some clean up -- replacing the CSS at the beginning being the most important step.

If you aren't familiar with how to do that, going with DOCX, as stated, I think would be better, but there is a good chance there will be some issues.

6) Old style MOBI files have no support for hierarchical (nested) table of contents. Table of contents in these files are just a normal html page in the book, with a pointer to it in the opf file.

Excellent to know! Thanks for posting that!

BTW: I don't want to imply one should not use Calibre. Just that there are a number of options, depending on what you are comfortable with. Much depends on setting up your document file carefully in the first place.

