How to Produce a Beautiful Book from the Command Line

Book Production Framework and Examples on GitHub

Introduction

Over the last couple of years, a number of people have asked me how I produce my books.  Most self-published (excuse me, ‘indie-published’) books have an amateurish quality that is easy to spot, and the lack of attention to detail detracts from the reading experience.  Skimping on cover art can be a culprit, but it rarely bears sole blame — or even the majority of it.   Indie-published interiors often are sloppy, even in books with well-designed covers.  For some reason, many authors give scant attention to the interior layout of their books.  Of course, professional publishers know better.    People judge books not just by their covers, but by their interiors as well.  If the visual appeal of your book does not concern you, then read no further.  Your audience most likely will not.

Producing a visually pleasing book is not an insurmountable problem for the indie-publisher, nor a particularly difficult one. It just requires a bit of attention. Even subject to the constraints of print-on-demand publishing, it is quite possible to produce beautiful-looking books. Ebooks prove more challenging because one has less control over them (due to the need for reflowable text), but it is possible to do as well as the major publishers by using some of their tricks. Moreover, all this can be accomplished from the command-line and without the use of proprietary software.

Now that I’ve finished my fifth book of fiction (and second novel), I figure it’s a good time to describe how I produce my books. I have automated almost the entire process of book and ebook production from the command-line. My process uses only free, open-source software that is well-established, well-documented, and well-maintained.

Though I use Linux, the same toolchain could be employed on a Mac or Windows box with a tiny bit of adaptation. To my knowledge, all the tools I use (or obvious counterparts) are available on both those platforms. In fact, macOS is built on a flavor of Unix, and the tools can be installed via Homebrew or other methods. Windows now has a Linux subsystem (WSL) which allows command-line access as well.

I have made available a full implementation of the system for both novels and collections of poetry, stories, or flash-fiction.   Though I discuss some general aspects below, most of the nitty gritty appears in the github project’s README file and in the in-code documentation.   The code is easily adaptable, and you should not feel constrained to the design choices I made.  The framework is intended as a proof of concept (though I use it regularly myself), and should serve as a point of departure for your own variant.  If you encounter any bugs or have any questions, I encourage you to get in touch.  I will do my best to address them in a timely fashion.

Examples

First, let’s see some examples of output (unfortunately, wordpress does not allow epub uploads, but you can generate epubs from the repository and view them in something like Sigil).  The novel and collection pdfs are best viewed in dual-page mode since they have a notion of recto and verso pages.

Who would be interested in this

If you’re interested in producing a fiction book from the command-line, it is fair to assume that (1) you’re an author or aspiring author and (2) you’re at least somewhat conversant with shell and some simple scripting. For scripting, I use Python 3, but Perl, Ruby, or any comparable language would work. Even shell scripting could be used.

At the time of this writing, I have produced a total of six books (five fiction books and one mathematical monograph) and have helped friends produce several more. All the physical versions were printed through Ingram, and the ebook versions were distributed on Amazon. Ingram is a major distributor as well, so the print versions also are sold through Amazon and Barnes & Noble, and can be ordered through other bookstores. In the past I used Smashwords to port and distribute the ebook through other platforms (Kobo, Barnes & Noble, etc), but frankly there isn’t much point these days unless someone (ex. Bookbub) demands it. We’re thankfully past the point where most agents and editors demand Word docs (though a few still do), but producing one for the purpose of submission is possible with a little adaptation using pandoc and docx templates. However, most people accept PDFs these days.

My books so far include two novels, three collections of poetry & flash-fiction, and a mathematical monograph.  I have three other fiction books in the immediate pipeline (another collection of flash-fiction, a short story collection, and a fantasy novel), and several others in various stages of writing.  I do not say this to toot my own horn, but to make clear that the method I describe is not speculative.  It is my active practice.

The main point of this post is to demonstrate that it  is quite possible to produce a beautiful literary book using command-line, open-source tools in a reproducible way.  The main point of the github project is to show you precisely how to do so.   In fact, not only can you produce a lovely book that way, but I would argue it is the best way to go about it! This is true whether your book is a novel or a collection of works.

One reason why such a demonstration is necessary is the dearth of online examples. There are plenty of coding and computer-science books produced from markdown via pandoc. There are plenty of gorgeous mathematics books produced using LaTeX.   But there are very few examples in the literary realm, despite the typesetting power of LaTeX, and the presence of the phenomenal Memoir LaTeX class for precisely this purpose.  This post is intended to fill that gap.

A couple of caveats

Lest I oversell, here are a couple of caveats.

  • When I speak of an automated build process, I mean for the interiors of books. I hire artists to produce the covers. Though I have toyed with creating covers from the command-line in the past (and it is quite doable), there are reasons to prefer professional help. First, it allows artistic integration of other cover elements such as the title and author. Three of my books exhibit such integration, but I added those elements myself for the rest (mainly because I lacked the prescience to request them when I commissioned the art early on). I’ll let you guess which look better. The second big reason to use a professional artist comes down to appeal. The book cover is the first thing to grab a potential reader’s eye, and can make the sale. It also is a key determinant in whether your book looks amateurish or professional. I am no expert on cover design, and am far from skilled as an artist. A professional is much more likely to create an appealing cover. Of course, plenty of professionals do schlocky work, and I strongly advise putting in the effort and money to find a quality freelancer.  In my experience, it should cost anywhere from $300-800 in today’s dollars.  I’ve paid more and gotten less, and I’ve paid less and gotten more.   My best experiences were with artists who did not specialize in cover design.
  • The framework I provide on github is intended as a guide, not as pristine code for commercial use. I am not a master of any of the tools involved. I learned them to the extent necessary and no more. I make no representation that my code is elegant, and I wouldn’t be surprised if you could find better and simpler ways to accomplish the same things. This should encourage rather than discourage you from exploring my code. If I can do it, so can you. All you need is basic comfort with the command-line and some form of scripting. All the rest can be learned easily. I did not have to spend hundreds of hours learning Python, make, pandoc, and so on. I learned the basics, and googled whatever issues arose. It was quite feasible, and took a tiny fraction of the time involved in writing a novel.

The benefits of a command-line approach

If you’ve come this far, I expect that listing the benefits of a command-line approach is unnecessary. They are roughly the same as for any software project: stability, reproducibility, recovery, and easy maintenance. Source files are plain text, and we can bring to bear a huge suite of relevant tools.

A suggestion vis-a-vis code reuse

One suggestion: resist the urge to unify code. Centralizing scripts to avoid code duplication or creating a single “universal” script for all your books may be enticing propositions. I am sorely tempted to do so whenever I start a new project. My experience is that this wastes more time than it saves. Each project has unforeseeable idiosyncrasies which require adaptation, and changing centralized or universal scripts risks breaking backward compatibility with other projects. By having each book stand on its own, reproducibility is much easier, and we are free to customize the build process for a new book without fear of  unexpected consequences. It also is easier to encapsulate the complete project for timestamping and other purposes. It’s never pleasant to discover that a backup of your project is missing some dependency that you forgot to include.

A typical author produces new books or revises old ones infrequently. The ratio of time spent maintaining the publication machinery to writing and editing the book is relatively small. On average, it takes me around 500 hours to write and edit a 100,000 word novel, and around 100 hours for a 100 page collection of flash-fiction and poetry. Adapting the framework from my last book typically takes only a few hours, much of which is spent on adjustments to the cover art.

Even if porting the last book’s framework isn’t that time consuming, why trouble with it at all? Why not centralize common code? The problem is that this produces a dependency on code outside the project. If we change the relevant library or script, then we must worry about the reproducibility of all past books which depend on it. This is a headache.

Under other circumstances, my advice would be different. For example, a small press using this machinery to produce hundreds of books may benefit from code unification. The improved maintainability and time savings from code centralization would be significant. In that case, backward-compatibility issues would be dealt with in the same manner as for software: through regression tests. These could be strict (MD5 checksums) or soft (textual heuristics) depending on the toolchain and how precise the reproducibility must be. For example, non-visual changes such as an embedded date would alter the hash but not textual heuristics. The point is that this is doable, but would require stricter coding standards and carefully considered change-metrics.
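The difference between a strict and a soft regression check can be sketched in a few lines of Python. This is only an illustration, not code from the framework: the function names, the sample text, and the particular heuristic (hash the words only, ignoring case, whitespace, and digits) are all assumptions made for the example.

```python
import hashlib
import re


def md5_of_bytes(data: bytes) -> str:
    """Strict check: any byte-level change alters this digest."""
    return hashlib.md5(data).hexdigest()


def text_fingerprint(text: str) -> str:
    """Soft check: hash only the words, ignoring case, whitespace, and digits.

    Dropping digits is a crude, hypothetical heuristic so that an embedded
    build date does not register as a change.
    """
    words = re.findall(r"[A-Za-z']+", text.lower())
    return hashlib.md5(" ".join(words).encode("utf-8")).hexdigest()


# Two hypothetical builds that differ only in an embedded date.
old_build = "Chapter One\nIt was a dark night.\nBuilt 2021-03-01\n"
new_build = "Chapter One\nIt was a dark night.\nBuilt 2022-07-15\n"

strict_same = md5_of_bytes(old_build.encode()) == md5_of_bytes(new_build.encode())
soft_same = text_fingerprint(old_build) == text_fingerprint(new_build)
print(strict_same, soft_same)  # False True
```

The strict test flags the rebuilt book as changed, while the soft test treats it as identical. Which behavior is desirable depends on how precise the reproducibility must be.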

The other reason to avoid code reuse is the need for flexibility. Unanticipated issues may arise with new projects (ex. unusually formatted poems), and your stylistic taste may change as well. You also may just want to mix things up a bit, so all your books don’t look the same. Copying the framework to a new book would be done a few times a year at most, and probably far less.

Again, if the situation is different my advice will be too. For example, a publisher producing books which vary only in a known set of layout parameters may benefit from a unified framework. Even in this case, it would be wise to wait until a number of books have been published, to see which elements need to be unified and which parameters vary book to book.

Tools

Here is a list of some tools I use. Most appear in the project but others serve more of a support function.

Core tools

  • pandoc: This is used to convert from markdown to epub and LaTeX. It is an extremely powerful conversion tool written in Haskell. It often requires some configuration to get things to work as desired, but it can do most of what we want.  And no, you do not need to know Haskell to use it.
  • make: The entire process is governed by a plain old Makefile. This allows complete reproducibility.
  • pdfLaTeX: The interior of the print book is compiled from LaTeX into a pdf file via pdfLaTeX. LaTeX affords us a great way to achieve near-total control over the layout. You need not know much LaTeX unless extensive changes to the interior layout are desired. The markdown source text is converted via pandoc to LaTeX through templates. These templates contain the relevant layout information.
  • memoir LaTeX class: This is the LaTeX class I use for everything. It is highly customizable, relatively easy to use, and ideally suited to book production. It has been around for a long time, is well-maintained, has a fantastic (albeit long) manual, and boasts a large user community. As with LaTeX, you need not learn its details unless customization of the book layout is desired.  Most simple things will be obvious from the templates I provide.
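To give a feel for how these pieces fit together, here is a minimal Python sketch that assembles the kind of command lines involved. It is not the Makefile from the project. The file names (chapters, template, cover, metadata) are hypothetical placeholders, and the particular pandoc options are just one plausible selection.

```python
import shlex


def epub_command(sources, metadata_file, cover_image, output):
    """Build a pandoc command line that turns markdown sources into an epub."""
    return [
        "pandoc",
        "--from=markdown",
        "--to=epub",
        "--toc",
        f"--metadata-file={metadata_file}",
        f"--epub-cover-image={cover_image}",
        f"--output={output}",
        *sources,
    ]


def latex_command(sources, template, output):
    """Build a pandoc command line that renders markdown through a LaTeX template."""
    return [
        "pandoc",
        "--from=markdown",
        "--to=latex",
        "--top-level-division=chapter",
        f"--template={template}",
        f"--output={output}",
        *sources,
    ]


def pdf_command(tex_file):
    """Build the pdflatex command line that compiles the generated LaTeX."""
    return ["pdflatex", "-interaction=nonstopmode", tex_file]


# All file names below are hypothetical placeholders.
chapters = ["chapters/01.md", "chapters/02.md"]
for cmd in (
    epub_command(chapters, "metadata.yaml", "cover.jpg", "book.epub"),
    latex_command(chapters, "templates/novel.tex", "book.tex"),
    pdf_command("book.tex"),
):
    print(shlex.join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` or pasted into a Makefile rule. Keeping the commands as data makes it easy to see exactly what a build will run before running it.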

Essential Programs, but can be swapped with comparables

  • python3: I write my scripts in python 3, but any comparable scripting language will do.
  • aspell: This is the command-line spell-checker I use, but any other will do too. It helps if it has a markdown-recognition mode.
  • emacs: I use this as my text editor, but vim or any other text editor will do just fine. As long as it can output plain text files (ascii or unicode, though I personally stick to ascii) you are fine. I also use emacs org-mode for the organizational aspects of the project. One tweak I found very useful is to have the editor highlight anything in quotes. This makes conversation much easier to parse when editing.
  • pdftools (poppler-utils): Useful tools for splitting out pages of pdfs, etc. Used for ebook production. I use the pdfseparate utility, which allows extraction of a single page from a PDF file. Any comparable utility will work.

Useful Programs, but not essential

  • git: I use this for version control. Strictly speaking, version control isn’t needed. However, I highly recommend it. From a development standpoint, I treat writing as I do a software project. This has served me well. Any comparable tool (such as Mercurial) is fine too. Note that the needs of an author are relatively rudimentary. You probably won’t need branching or merging or rebasing or remote repos. Just “git init”, “git commit -a”, “git status”, “git log”, “git diff”, and maybe “git checkout” if you need access to an old version.
  • wdiff, colordiff: I find word diff and colordiff very useful for highlighting changes.
  • imagemagick: I use the “convert” tool for generating small images from the cover art. These can be used for the ebook cover or for advertising inserts in other books. “identify” also can be useful when examining image files.
  • pdftk (free version): Useful tools for producing booklets, etc. I don’t use it in this workflow, but felt it was worth mentioning.
  • ebook-convert: Calibre command-line tool for conversion. Pandoc is far better than calibre for most conversions, in my experience. However, ebook-convert can produce mobi and certain other ebook formats more easily.
  • sigil: This is the only non-command-line tool listed, but it is open-source. Before you scoff and stop reading, let me point out that this is the aforementioned “almost” when it comes to automation. However, it is a minor exception. Sigil is not used for any manual intervention or editing. I simply load the epub which pandoc produces into sigil, click an option to generate the TOC, and then save it. The reason for this little ritual is that Amazon balks at the pandoc-produced TOC for some reason, but seems ok with Sigil’s. It is the same step for every ebook, and literally takes 1 minute. Unfortunately, sigil offers no command-line interface, and there is no other tool (to my knowledge) to do this. Sigil also is useful to visually examine the epub output if you wish. I find that it gives the most accurate rendering of epubs.
  • eog: I use this for viewing images, though any image viewer will do. It may be necessary to scale and crop (and perhaps color-adjust) images for use as book covers or interior images. ImageMagick’s “identify” and “convert” commands are very useful for such adjustments, and eog lets me see the results.

How I write

All my files are plain text. I stick to ascii, but these days unicode is fine too. However, rich-text is not.  Things like italics and boldface are accomplished through markdown.

Originally, I wrote most of my pieces (poems, chapters, stories) in LaTeX, and had scripts which stitched them together into a book or produced them individually for drafts or submissions to magazines. These days, I do everything in markdown  — and a very simple form of markdown at that.

Why not just stick with LaTeX for the source files? It requires too much overhead and gets in the way. For mathematical writing, this overhead is a small price to pay, and the formatting is inextricably tied to the text. But for most fiction and poetry, it is not.

I adhere to the belief that separating format and content is a wise idea, and this has been borne out by my experience. Some inline formatting is inescapable (bold, italics, etc), and markdown is quite capable of accommodating this. On the rare occasions when more is needed (ex. a specially formatted poem), the markdown can be augmented with html or LaTeX directly as desired. Pandoc can handle all this and more. It is a very powerful program.

I still leave the heavy formatting (page layout, headers, footers, etc) to LaTeX, but it is concentrated in a few templates, rather than the text source files themselves.

There also is another reason to prefer markdown. From markdown, I more easily can generate epubs or other formats. Doing so from LaTeX is possible but more trouble than it’s worth (I say this from experience).

What all this means is that I can focus on writing. I produce clear, concise ascii files with minimal format information, and let my scripts build the book from these.
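As a rough illustration of what "building the book" from such files can mean at its simplest, the following Python sketch stitches a few markdown pieces into one source in a chosen order. The file names and contents are invented for the example, and the real scripts in the framework do considerably more.

```python
import tempfile
from pathlib import Path


def stitch(piece_paths):
    """Join markdown pieces in the given order, separated by one blank line."""
    texts = [Path(p).read_text(encoding="utf-8").strip() for p in piece_paths]
    return "\n\n".join(texts) + "\n"


# Hypothetical pieces, written to a temporary directory for the demonstration.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp, "01-arrival.md")
    second = Path(tmp, "02-departure.md")
    first.write_text("# Arrival\n\nShe came *late*.\n", encoding="utf-8")
    second.write_text("# Departure\n\nHe left **early**.\n", encoding="utf-8")
    book = stitch([first, second])

print(book)
```

The stitched text is ordinary markdown, so it can be fed straight to pandoc for either the epub or the LaTeX route.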

To see a concrete example, as well as all the scripts involved, check out the framework on github.

How to Get a Patent in 2 Easy Steps!

1. Expedited Process: [Note: if your name is not Apple, Google, Microsoft, Sony, or Oracle, skip to step 2]:

Scribble a drawing in crayon on a napkin, write ‘for, you know, stuff’ and drop it off at the Patent Commissioner’s house when you have dinner with him and his wife. On the off-chance it isn’t accepted the next day, be polite but firm. The assigned examiner may be new or overworked. Bear in mind, he is NOT your employee. He serves several other large corporations as well.

By the way, don’t forget that the Patent office is running a special this month: you get every 1000th patent free!

2. Standard Process:

(i) spend several months with a team of lawyers (paid out of pocket) carefully researching the state of the art of your field, fleshing out your idea, researching potentially related patents, and constructing unassailable claims of your own. In the course of this, learn a new language called “legalese,” which bears only a superficial resemblance to English — much as its speakers bear only a superficial resemblance to humans.

(ii) assemble a meticulously crafted and airtight application — one which no sane person can find fault with, because it has no fault.

(iii) get rejected by the examiner, who clearly did a sloppy google search for some keywords. He cites several patents which have nothing in common with yours, except for those keywords.

(iv) reply to said patent examiner, patiently explaining why a simple reliance on keyword similarities is insufficient evidence of prior art, and that modern linguistic scholarship has shown different sentences can have words in common.

(v) receive a reply with “final rejection” emblazoned in huge letters, and in what appears to be blood. An attached notice explains that any further communication regarding this patent will result in a late-night visit by three large fellows with Bronx accents. Your lawyers dismiss this as boilerplate, and explain that “final rejection” actually means “we want more patent fees.”

(vi) battle your way through 50 years and $1,000,000 of appeals and rejections as the examiner displays an almost inhuman level of ineptitude, an apparent failure to grasp rudimentary logic, infantile communication skills, and an astonishing ability to contradict himself hour to hour.

(vii) Suspect your patent examiner is planning to run for Congress, where his skills would be better appreciated. Encourage him to do so. Maybe his replacement will be better equipped, possessing both neurons and synapses.

(viii) Eventually you reach the end of the process. There has been one of two outcomes:

  • You passed away long ago, and no longer care about the patent.
  • Your application finally was accepted. Because an accepted patent is valid from the original date of application, yours expired decades ago. But this does not matter, since the idea is long obsolete anyway.

Either way, you should feel privileged. You have participated in one of the great institutions of American Democracy!

Why Your Book Won’t Be an Amazon Success Story

I’m going to be that guy. The one nobody likes at parties. The one who speaks unpleasant truths. If you don’t want to hear unpleasant truths, stop reading.

If you want to be told which self-help books to buy and which things to do and which gurus will illuminate the shining path to fame and fortune, stop reading.

If you want somebody to hold your hand, and nod at all the right moments and ooh and aah about how your writing has come a long way and you’re “almost there,” stop reading.

It doesn’t matter whether you’ve come a long way. It doesn’t matter whether your writing is almost there, is there, or is beyond there. It doesn’t matter what you’re saying or how you’re saying it. You may have written the most poignant 80,000 words in the English language, or you may have another book of cat photos. None of that matters.

Unless you’re a certain type of person saying a certain type of thing in a certain way, none of it matters. And that certain type of person, that certain type of thing, and that certain way changes all the time. Today it’s one thing, tomorrow it will be another.

Statistically speaking, you’re not it.

“But what about all those success stories,” you argue. “I’m always hearing about Amazon success stories. Success, success, success! This book mentioned them and that blog mentioned them and the 12th cousin of my aunt’s best friend’s roommate had one.”

There are two reasons this doesn’t matter.

Most of those stories are part of a very large industry of selling hope to suckers. Any endeavor which appeals to the masses and appears to be accessible to them spawns such an industry. Business, stock picking, sex, dating, how to get a job, how to get into college, and on and on. Thanks to today’s low barrier to entry, self-publishing is the newest kid on that block.

This isn’t a conspiracy, or some evil corporation with a beak-nosed pin-striped CEO, cackling ominously while rubbing his hands. Self-publishing just attracts a lot of people who see an easy way to make money. When there’s a naive, eager audience, a host of opportunists and charlatans purvey snake oil to any sucker willing to pay. They’re predators, plain and simple. Hopefully, I can dissuade you from being prey. Leave that to others. Others unenlightened by my blog. Cynicism may not always be right, but it’s rarely wrong.

Even seemingly reputable characters have become untrustworthy. The traditional publishing industry has grown very narrow and institutional, and life is hard for everyone associated with it. The temptation to go for the easy money, and cast scruples to the winds, is quite strong. Not that denizens of the publishing industry ever were big on scruples. Many individuals from traditionally respectable roles as agents, editors, and publishers find it increasingly difficult to eke out a living or are growing disillusioned with a rapidly deteriorating industry. It is unsurprising that they are bedazzled by the allure of easy money. Unsurprising, and disappointing. This is especially insidious when agents offer paid services which purport to help improve your chances with other agents. The argument is that they know what their kind wants. Anybody see the problem with this? Anybody, anybody, Bueller? It would be like H.R. employees taking money to teach you how to get a job with them. Oh wait, they do. How could THAT possibly go wrong…

I’m not going to delve into the “selling hope to suckers” angle here. That is fodder for a separate post, in which I analyze a number of things which did or did not work for me. For now, I’ll focus on the second reason your book won’t be an Amazon Success Story. Incidentally, I will resist the temptation to assign an acronym to Amazon Success Story. There! I successfully resisted it.

In this post, I’ll assume that ALL those stories you hear are right. Not that they’re 99% bunk or that most actual successes had some outside catalyst you’re unaware of or were the result of survivorship bias (the old coin-flipping problem to those familiar with Malkiel’s book). To paraphrase the timeless wisdom of Goodfellas, if you have to wait in line like everyone else you’re a schnook. If you’re trying what everyone else tries, making the rounds of getting suckered for a little bit here a little bit there, with nothing to show for it — you’re the schnook.

Don’t feel bad, though. No matter how savvy we are in our own neighborhoods, we’re all schnooks outside it. Hopefully, I can help you avoid paying too much to learn how not to be a schnook.

I can’t show you how to be successful, but I can show you how to avoid paying to be unsuccessful. But that’s for another post. We’re not going to deal with the outright lies and deception and rubbish here. Those are obvious pitfalls, if enticing. Like pizza.

In this post, we’re going to assume the success stories are real — as some of them surely are. We’re going to deal with something more subtle than false hope. We’re going to discuss the OTHER reason you won’t be successful on Amazon. It’s not obvious, and it can’t be avoided.

But first, I’m going to make a plea: if you’re the author of one of those breathless, caffeinated “how to be a bzillionaire author like me” books or blogs or podcasts … stop it. Please. Just stop it. Unless you’re cynically selling hope to suckers or mass-producing content-free posts as click-bait. In that case, carry on. I don’t approve of what you do, but I’m not going to waste breath convincing dirtbags not to be dirtbags. However, if you’re even the least bit well-meaning, stop. Maybe you have some highly popular old posts along these lines. Update them. Maybe you’re writing a new series of posts based on what your friend named John Grisham has to say to self-publishing authors. Don’t.

You’re doing everyone a disservice. People will waste money and time and hope. Best to tell them the truth. You may not be that guy. You may be too nice, tactful, maybe even (dare I say) an optimist. I’m not an optimist. I AM that guy. No false hope sold here.

Maybe you’re still reading this and haven’t sky-dived into a volcano or fatally overdosed on Ben & Jerry’s, or turned to one of those cheerful, caffeinated blogs. Shame on you. There are special internet groups for people like you. But you’re still here, and I haven’t driven you away. I must be doing something wrong.

If you’re a true dyed in the wool masochist, I’ll now explain why you won’t be successful. It has to do with a tectonic shift in Amazon’s policies.

Over a year ago, I wrote a post titled “Why NOT to use Amazon Ads for your book,” which many people have written me about. Most found it a useful take on Amazon ads, and one of the few articles which doesn’t regurgitate lobotomized praise for the practice.

I stand by that. Subsequent experiments (to be reported in a future post) have shown that Amazon ads perform even worse now. This led me to wonder why. Why did all the long-tailed keywords and the reviews and the ads make no difference? None of us know the precise inner workings of Amazon ads, but there are strong indications of their behavior.

I now will offer my theory for why there are success stories, why it’s tempting to believe they can be emulated, and why they cannot. To do so, let’s review some basic aspects of Amazon’s algorithms.

There are two algorithms we care about:

(1) The promotion algorithm, which ranks your book. It is responsible for placing it in any top 100 lists, determining its visibility in “customers also bought” entries, when and how it appears in searches, and pretty much any other place where organic (i.e. non-paid) placement is involved.

(2) The ad auction algorithm, which determines whether you win a bid for a given ad placement.

The promotion algorithm determines how much free promotion your book gets, and is critical to success. It has only a couple of basic pieces of information to work with: sales and ratings. The algorithm clearly reflects the timing of sales, and is heavily weighted toward the most recent week. It may reflect the source of those sales — to the extent Amazon can track it — but I have seen no evidence of this. As for ratings, all indications are that the number of ratings or reviews weighs far more heavily than the ratings themselves. This is true for consumers too, as long as the average rating is 3+. Below that, bad ratings can hurt. Buyers don’t care what your exact rating is, as long as there isn’t a big red flag. The number of ratings is seen as a sign of legitimacy, that your book isn’t some piece of schlock that only your grandmother and dad would review — but your mom was too ashamed to attach her name to. Anything from a traditional publisher has 100’s to 1000’s of ratings. A self-published work generally benefits from 15+. More is better.

It makes sense that the promotion algorithm can play a role, but why mention an “ad auction algorithm”? Ad placement should depend on your bid, right? Maybe you can tweak the multipliers and bids for different placements or keywords, but the knobs are yours and yours alone. You might very well think that, but I couldn’t possibly comment. Unlike the ever-diplomatic Mr. Urquhart, I’m too guileless to take this tack. I also don’t use Grey Poupon. I can and will comment. You’re wrong. Amazon’s ad algorithm does a lot more behind the scenes. You may be the highest bidder and still lose, and you may be the lowest bidder and win.

As usual, we must look at incentives to understand why things don’t behave as expected. Amazon does not run ads as a non-profit, nor does it get paid a subscription fee to do so. It only makes money from an ad when that ad is clicked, and it only makes money from a sale when the ad results in a conversion. For sellers, the latter is a commission and for authors it’s the 65% or 30% (depending on whether you chose the 35% or 70% royalty rate) adjusted for costs, etc. In either case, they make money from each sale and they make money from each click.

Amazon loses money if your ad wins lots of impressions, but nobody clicks on it. They would have been happier with a lower bid that actually resulted in clicks. If lots of people click on your ad, but few people buy your book, Amazon would have been happier with a lower bid which resulted in more sales. It’s a trade-off, but there are simple ways of computing these things. When you start fresh, Amazon has no history (though perhaps if you have other books, it uses their performance). It assigns you a set of default parameters representing the average performance of books in that genre. As impressions, clicks, and sales accrue, Amazon adjusts your parameters. This could be done through a simple Bayesian update or periodic regressions or some other method.
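To make the "simple Bayesian update" concrete, here is a minimal Python sketch using a Beta prior over a click-through rate. It is purely illustrative. I have no knowledge of what Amazon actually does, and the class name, the prior (a hypothetical genre default of about 1%), and the observed counts are all made-up assumptions.

```python
from dataclasses import dataclass


@dataclass
class RateEstimate:
    """Beta(alpha, beta) belief about a rate such as click-through."""

    alpha: float
    beta: float

    def update(self, successes: int, trials: int) -> "RateEstimate":
        """Fold in new evidence: successes out of trials (Beta-Binomial update)."""
        return RateEstimate(self.alpha + successes, self.beta + trials - successes)

    @property
    def mean(self) -> float:
        """Expected rate under the current belief."""
        return self.alpha / (self.alpha + self.beta)


# Hypothetical genre default: roughly a 1% click-through rate.
prior = RateEstimate(alpha=2.0, beta=198.0)

# Hypothetical history for one ad: 3 clicks in 1000 impressions.
posterior = prior.update(successes=3, trials=1000)

print(round(prior.mean, 4), round(posterior.mean, 4))  # 0.01 0.0042
```

A new ad starts at the genre default, and the estimate drifts toward the ad's own record as impressions accumulate. The same update would apply to the conversion rate, with sales as successes and clicks as trials.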

When a set of authors bids on an ad, Amazon can compute the expected value of each bid. This looks something like P(click|impression) * ebid + P(sale|click) * P(click|impression) * pnl, where P(click|impression) is your predicted click-through-rate for that placement, P(sale|click) is your predicted conversion rate for that placement, ebid is the effective bid (I’ll discuss this momentarily), and pnl is the net income Amazon would make from a sale of your book. This is an oversimplification, but gets the basic idea across.
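
The expected-value formula above can be written out as a short Python sketch. The function name and every number below are hypothetical, chosen only to show how a lower bid can be worth more to Amazon than a higher one.

```python
def expected_value(p_click: float, p_sale: float,
                   ebid: float, pnl: float) -> float:
    """Expected income per impression under the simplified formula:
    P(click|impression) * ebid + P(sale|click) * P(click|impression) * pnl
    """
    return p_click * ebid + p_sale * p_click * pnl


# Book A: high effective bid, but a weak ad and a weak landing page.
book_a = expected_value(p_click=0.002, p_sale=0.02, ebid=1.00, pnl=1.50)

# Book B: much lower effective bid, but the ad and page both perform well.
book_b = expected_value(p_click=0.010, p_sale=0.10, ebid=0.40, pnl=1.50)

print(f"Book A: {book_a:.5f} per impression")
print(f"Book B: {book_b:.5f} per impression")
```

Under these assumed rates, Book B is worth more than twice as much per impression as Book A despite bidding less than half as much.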

The ebid quantity is your effective bid, what you actually pay if you win the auction. There actually are two effective bids involved. Amazon’s ad auctions are “second-price,” meaning the winning bidder pays only the 2nd highest bid. Suppose there are 5 bids: 1, 2, 3, 4, 5. The bidder who bid 5 wins, but only pays 4. There are game theoretic reasons for preferring this type of auction, as it encourages certain desirable behaviors in bidders. In this case, the effective bid (and what Amazon gets paid) is 4. That is no mystery, and is clearly advertised in their auction rules. What isn’t advertised is the other, hidden effective bid. For the same five bidders, these hidden effective bids may be 3, 2, 4, 2, 3, in which case the third bidder wins. What do they actually pay? I’m not sure, but something less than their actual bid of 3.
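
The five-bidder example can be sketched in Python as follows. The `second_price` function is a textbook second-price auction, not Amazon's actual code, and the hidden-score ranking only picks a winner, since the price paid in that case is unknown.

```python
def second_price(bids: list[float]) -> tuple[int, float]:
    """Return (index of the winner, price paid) for a second-price auction."""
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    # Bidder indices ordered from highest bid to lowest.
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    return order[0], bids[order[1]]


# The advertised auction: the highest bidder wins and pays the second-highest bid.
bids = [1, 2, 3, 4, 5]
winner, price = second_price(bids)
print(f"bidder {winner + 1} wins with a bid of {bids[winner]} and pays {price}")

# The hidden effective bids for the same five bidders (values from the text).
hidden = [3, 2, 4, 2, 3]
hidden_winner = max(range(len(hidden)), key=lambda i: hidden[i])
print(f"ranked by hidden effective bid, bidder {hidden_winner + 1} wins "
      f"(actual bid: {bids[hidden_winner]})")
```

The first case reproduces the bidder of 5 paying 4. The second shows the third bidder winning with an actual bid of only 3.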

Apparently, whatever algorithm Amazon uses guarantees that a bidder never will pay more than their actual bid. It somehow combines the two types of effective bids to ensure this. I am not privy to the precise algorithm (and it constantly changes), so I cannot confirm this. However, I have been informed by an individual with intimate knowledge of the subject that Amazon’s approach provably guarantees no bidder will pay more than their actual bid.

Why would Amazon prefer a lower bid, when they could get 4? As mentioned, they only get paid 4 if the ad of the winning bidder (the 5) gets a click. If the ad makes every reader barf or have a seizure or become a politician, there won’t be a lot of clicks. If it’s the most beautiful ad in human history, but the book’s landing page makes potential buyers weep and tear their hair and gnash their teeth, it probably won’t make many sales. In either case, Amazon would do better with another bidder.

Even without knowing the precise formula, one thing is clear. These algorithms are a big problem for anyone who isn’t already a star.

The problem is that those two algorithms play into one another, generating a feedback loop. If you’re already successful, everything works in your favor. But if you start out unattractive to them, you remain that way. You have few quality ad placements, and get few sales, and this suppresses your organic rank. The organic rank factors into many things which affect P(click|impression) and P(sale|click) — such as the number of reviews, etc. Put simply, once they decide you’re a failure, you become a failure, and remain one. You won’t win quality bids, even if you bid high. If you bid high enough to override the suppression, then you’ll pay an exorbitant fee per click, and it will cost a huge amount to reach the point where success compounds.

I am unsure whether there is cross-pollination between works by a given author, but I strongly suspect so. A new work by a top-ranked author probably starts high and is buoyed by this success. This may be why we see a dozen works by the same author (obviously self-published, and sometimes with very few ratings per book) in the top-100 in a genre.

So how do you get out of this hole? There’s only one accessible way for most people: you cheat. And this is where Amazon’s tectonic policy shift comes into play.

There ARE success stories, like the aforementioned top-ranked self-published authors. But there won’t be any more. To understand why, we must turn to hallowed antiquity before Bezos was revealed to be the latest incarnation of Bchkthmorist the Destroyer, and when Amazon brought to mind a place with trees, snakes, and Sean Connery.

There was a time when the nascent self-publishing industry had really begun to boom, but was poorly regulated. The traditional publishers viewed Amazon, Kindle, and self-publishing as a joke. They relied on their incestuous old-boys network of reviewers from the NY Times, New York Review of Books, and pretty much anything else with New York in the name for promotion. 95% of self-published books were about how to self-publish, and authors who DID self-publish (and were savvy) quickly developed ways to game Amazon.

They COULD pump up their search results, get in top-100 lists, and so on. Usually, this involved getting lots of fake reviews and using keyword tricks to optimize search placement. Once in the top list for a genre, it was easy to stay there — though newcomers with more fake reviews and better keyword antics could displace you. The very top was an unstable equilibrium, but the top 500 or 1000 was not. Once up there, it was easy to keep in that range and then occasionally pop into the very top. Like a cauldron of mediocrity, circulating its vile content into view every now and then. Amazon periodically tweaked its algorithms, but authors kept up.

Then something happened. Amazon decided to crack down on fake reviews. This sounds laudable enough. Fake reviews have the word fake in them, and fake always is bad, right?

There were two problems with HOW Amazon went about it. First, they went way overboard. Overnight, it became well-nigh impossible for an author to get a single new review. If the reviewer had one letter in common with your name, lived in the same hemisphere, or also breathed air, they were deemed connected to you and thus biased.

If this had been applied uniformly, there would be nobody in the top 100 — or it would be random, since nobody would have any tricks they could play. This is where the second problem with Amazon’s approach came in. They didn’t remove legacy fake ratings. Those who cheated before the cutoff got to keep their position. In fact, that position now was secure against all newcomers. A gate had slammed down, and they were firmly on the right side of it. Aside from a few people near the boundary they had nothing to fear. Well, almost nothing to fear.

The only way to break into the top echelon, and thus benefit from the self-reinforcing algorithms which stabilize that position, is to rely on external sources of sales. If you have a million Twitter followers who buy your book, or a massive non-Amazon advertising campaign, you can break in. Then YOU would be very difficult to displace.

Once traditional publishers realized that Amazon is the only de facto bookstore left (outside airport/supermarket sales), they took an interest. THEY have no problem getting a top rank, because they run huge advertising campaigns and have huge existing networks. This is why the top 100 lists are an odd mixture of self-published books you never heard of and traditionally published bestsellers. Eventually it only will be the latter.

So. You. Won’t. Break. In. Amazon created an impenetrable aristocracy, and you’re not it. You won’t be it. You can’t be it. If you use Amazon ads or buy into any of the snake oil sales nonsense, you’ll be the schnook bribing a maitre d’ who knows he’ll never let you in.

Most of those success stories (or at least the real ones) are from before the policy change, as are many of the methods being touted. That path is gone. Amazon ads only work for those who don’t need them, and they work very well for them. They won’t work for you. Becoming a success on Amazon is as unlikely as with a traditional publisher. You’ll always hear stories, but they’re either the few who randomly made it, those with hidden external mechanisms of promotion, or those already entrenched at the top.

That’s the sad truth, or at least my take on it. By all means, waste a few dollars trying. I used to be a statistical trader and know better, but I still buy a lottery ticket when the jackpot’s high enough. It’s entertainment. Two dollars to dream for a day. I just don’t expect to win.

Write what you want, revise, work your butt off, and make it perfect. But do it because you want to, because that’s what makes you happy. Don’t do it expecting success, or hoping for success, or even entertaining the remote possibility of success.

The worst reason to write is for other people. Your work won’t be read, and your work won’t make you money. If you accept that and are happy to write anyway, then write all you want. I urge you to do so. It’s what I do.