Category Archives: HowTo

Review: Substack

Over the past year, I’ve used Substack extensively to serialize my novel The Tale of Rin. I’d like to offer some thoughts about what Substack is and what it isn’t. I think there is a lot of confusion about this.

Let me begin by saying that I like Substack. A lot. I can say that about very few online services, and for good reason. In fact, it is quite possible that Substack is the only “social-media” type of service I actually find tolerable. Yes, my personal website is hosted on WordPress.com — but it would be accurate to say I barely tolerate WordPress. It is the least inconvenient of the alternatives, and I’ve tried many. Though WordPress barely does what I want — and overcharges for it — at least it meets that low bar.

This is not how I feel about Substack. They meet a rather high bar, and I actively like the service. In particular, I like their approach that leaves power in the hands of authors. This is not just some marketing line they tout (like Google’s “don’t be evil”). It permeates every aspect of their service. For starters, you have full control of your subscriber list. That is not true of almost any other social media service.

It is quite possible (perhaps even likely) that Substack one day will shed the character which appeals to me. This does seem to be the lifecycle of dot-coms, especially social-media ones. But right now it is in the honeymoon, the penetration marketing phase that is ideal for users and costly for investors. I have no idea where it will go after that. However, I am pleased with how they operate now. Pleased enough to even invest a small amount of money when they gave users the opportunity to. I don’t invest in Silicon Valley startups. Ever. This was done as a show of support, and I fully expect the money to evaporate.

Let me illustrate why I like them. When I began serializing The Tale of Rin, I knew very little about either Substack or Kindle Vella. So I decided to try both simultaneously. This threw them into stark contrast. Good heavens what a horror-show Kindle Vella is. Aside from the fact that 99.999% of the books on it were “billionaire werewolf superagent falls for middle-aged housewife” type schlock, Amazon itself was thoroughly unpleasant about posting and removing chapters. Each post had to be vetted, which entailed a multi-day delay. There was no subscription list because everything was done by “chapters read” using a token system, and if I made a mistake or needed to edit something there was a whole complicated process. The default assumption seemed to be that authors were out to scam their readers. Judging from the sort of authors the site attracted, I’m guessing this wasn’t far off.

By contrast, on Substack I was able to post all 79 chapters of The Tale of Rin (Book 1: Protege) without issue. I spent my time writing, not wrangling their technology or bureaucracy. It was quick and easy to post things, and (almost) everything operated as expected.

Also important, Substack doesn’t throw up a bunch of obstacles. You don’t need to personally communicate with their reps to get simple things done (though you do for certain major changes, like dispensing with paid subscriptions altogether). The author has plenty of control.

This said, Substack often is depicted as more than it is. So what is Substack? It is a relatively benign site for managing an online newsletter. In return for its benefits, you trade a portion of any paid subscription fees and a lot of web-design flexibility (though you still can do quite a bit with the freedom they give you). This is not WordPress or Squarespace, and it doesn’t pretend to be. Substack is a newsletter service, and that is where it shines.

You cede zero flexibility in the actual management of the newsletter itself, and the tools they provide actually are pretty useful. Their subscription model is flexible as well. Unlike Amazon (or anywhere else I’ve encountered), you can give away paid subscriptions without any hassle. You also can refund subscriptions or adjust them in various ways. Instead of assuming you’re out to scam people, Substack seems to understand you may occasionally need the flexibility to accommodate certain subscribers.

One of my favorite features is the ability to schedule posts far in advance. You can specify a precise date and time to release each post to your paid subscribers — and a separate date and time to release it to your free subscribers. This allows me to fire-and-forget several episodes of my novel in advance.

That doesn’t mean there isn’t room for improvement. One beef I have is that the minimum they allow you to charge paid subscribers still is quite high (when last I looked, it was $5 monthly or $30 annually). For a high-volume newsletter, those numbers may make sense — but for a slowly-serialized novel (mine was originally 4 episodes per month) or a low-volume newsletter they do not. Though it is quite possible that these minimums make sense in the context of the transaction fees charged by the financial system, they nonetheless were a source of frustration to me. I felt I was forced to overcharge (or not charge at all) for what my subscribers got. I’ve since dispensed with paid subscriptions altogether, so this no longer is an issue for me. However, it is something to be aware of.

I also have some minor quibbles concerning their online post-editor. This is what you use to actually type (or in my case, copy and paste) your post. It has some unintuitive and downright frustrating aspects, but the same is true of every other web-browser text-entry box I’ve ever seen. Substack’s certainly is no worse than WordPress’s (though it has many fewer features, of course). For me, the text-entry box wasn’t a major obstacle. I type everything in my own preferred text editor, and then copy and paste it into theirs. Thankfully, their editor appears to recognize markdown (which is what I use for formatting), and that saves me the headache of having to manually implement italics, bold-face, and the like each time.

I highly recommend Substack for both newsletters and novel serialization. However, there are a number of misconceptions which seem to exist concerning it. Some of these can be a huge source of misplaced expectations and frustration. Before deciding whether to use it, you should be aware of them.

  • Substack won’t promote your blog. You will acquire almost no new subscribers on Substack itself, at least until you’re already successful. That is not their purpose. Yes, there is some discovery on Substack. For example, new newsletters appear to be briefly promoted. I’m not entirely sure what the criteria are, but my own visibility quickly diminished. I don’t begrudge them that. If I was famous and new subscribers kept pouring in, maybe they would have kept promoting me. I’m not sure. The point is that if you go into this relying on Substack to promote you or imagining that a steady stream of subscribers magically will materialize, you’ll be disappointed. There are cross-fertilization tools for newsletters to promote one another — but you’re unlikely to avail yourself of these until you become successful in your own right. In this (and only this) regard, it is best to think of Substack as similar to WordPress. You wouldn’t expect your blog to become famous just because it’s hosted on WordPress. Substack is very much BYOB. They supply the venue and the tools, you supply the writing and the readers. What recommends them is that their venue is pleasant and their tools work and they very much stay out of your way.

  • There aren’t very many Serialized Novels on Substack, though there should be. Actual Serialized Novels are few and far between on Substack, but I think it’s a fantastic platform for them. If you Google “substack” and “novel”, most of the results concern promoting an existing novel via a Substack newsletter. But Substack has far greater potential utility to authors than this. Serializing a novel on Substack can be very rewarding, as long as you go in with your eyes open. Here are a few things to note.

    • Working toward a regular deadline and for readers who care is highly motivating. I’ve published several novels whole, but serializing The Tale of Rin is what most consistently got me to the keyboard.

    • Most of the famous serialized novels (by Salman Rushdie, etc) were commissioned (and paid for) by Substack itself. I.e., those authors didn’t decide that Substack is the future of novel-publishing, tell their agents to take a hike, and jump ship. They were paid to produce a specific work on the platform. By all accounts (and I haven’t read the works in question), the resulting novels were not of their usual calibre. Be this as it may, I am disappointed with how Substack went about the whole thing. It probably dissuaded some authors from legitimately using the platform, and that’s a real shame. The gist is that you won’t be “in good company”, or at least in any better company than on any other platform.

    • Many of the “serialized novels” by ordinary authors never see completion. They either run out of steam or (worse) involved a bait and switch to begin with. In my case, The Tale of Rin already had been written and heavily edited (though apparently not heavily enough, since I spent a great deal of additional time tidying each episode before release). It is easy to see how somebody serializing a novel on the fly could become overwhelmed or write themselves into a corner or just lose interest. Sometimes, it’s also a bit less innocent than that. Apparently, certain authors begin serializing a novel as a trick to hook readers. There’s nothing wrong with this, if that’s what is being peddled. If the readers know they will get 10 episodes for free and then have to buy the book, no harm no foul. But some authors apparently don’t do that. Instead, they “suddenly” change their mind at some point and decide to publish an ordinary ebook instead. All the people who invested themselves in reading the first however-many chapters now have to buy that ebook. This is downright dishonest, and it sours readers to serialized novels in general. Though I also published The Tale of Rin (Book 1: Protege) as a book, I did so only after making the whole thing available for free on Substack. If you do serialize, I strongly urge you to stay the course — and certainly don’t demand money from readers who were led to expect otherwise.

  • A few newsletters dominate the scene. Like every other venue for subscriptions, ebooks, or products, less than 1% of the players account for 99% of the success. These are not merely some “in-crowd” who receive favorable treatment (though I can’t say for sure there isn’t an element of that). Most of them probably arrived with an already-successful franchise or were early to the table and rode the wave just the right way. Whatever the reason, they are the established “winners” and almost certainly will stay that way. However, this shouldn’t dissuade you. Since you’re not relying on Substack promotion and it’s not a zero-sum game, your chances of success probably aren’t hurt much by theirs. It always is hard to succeed, but the success of others doesn’t stack the deck against you like elsewhere. For example, with Kindle sales, Amazon’s algorithms entrench the top sellers and make it well-nigh impossible for a latecomer to succeed. Substack just hosts the party. Whether people come is up to you. In fact, the success of others on Substack actually can buoy you — but only if you become successful enough on your own for other newsletters to take note and recommend you. Then Substack will promote you to their readers, and their readers will see your Notes, etc.

  • The “Notes” feature is worthless unless you already are successful. Substack sometimes makes it sound like you can add your voice to the conversation and everyone will hear, and the interface can feed this illusion. You see a bunch of notes in your feed, many from big-name newsletters. It is tempting to think that if you post a note, they’ll see it in kind. They won’t. A user sees notes only from the newsletters they are subscribed to and those recommended by such newsletters. I.e., you get one extra layer of linkage from Notes. If you can get a big newsletter to recommend you, lots of new people will see your notes and you may get a boost in subscriptions. But this won’t happen. Big newsletters get tons of requests to recommend others, and they probably ignore them all. Nor will they reciprocate if you recommend them, because frankly they don’t care about you and your 12 subscribers (comprising your mom, your parakeet, and your ten sisters). With this in mind, you should resist the temptation to post notes or try to participate in the “conversation”. Your mic is off. Don’t waste time on Notes unless one or more big newsletters already have recommended you. Literally nobody other than your subscribers will see them.

  • I strongly advise against using a paid subscription model unless you (i) already have lots of subscribers and (ii) need the money desperately. If you do go this route, it probably is best to keep a free tier as well. You can distinguish the two by delaying the free posts or posting paid-only special additional content.

  • Though Substack provides seemingly detailed statistics on subscriber engagement, these need to be taken with a grain of salt. For example, they only count opened emails. They detect “opened” emails using the usual tricks, so if the reader’s software configuration doesn’t automatically download online assets (which many security-minded people prefer not to) then there’s no way to detect that the email was opened. Also, “opened” doesn’t mean “read”. Lots of people just skim their emails to clear their inbox. And finally, some subscribers prefer to read posts on the Substack site itself. Maybe they read them in batches and don’t want to dig through old emails or maybe they prefer the full-web version to reading them in their email-browser. Regardless, my guess is that their statistics aren’t particularly accurate.

Kindle Scribe — An Interesting Device Crippled by Bad Software

I’m a big fan of Kindle ereaders. However, since the long-defunct Kindle DX there hasn’t been one with sufficient screen real estate to read scientific papers and pdfs. The Kindle Scribe boasts a (slightly) bigger screen than the Kindle DX and allows writing as well. Couple that with the insanely long battery life of a Kindle ereader and it sounds like a dream machine, right? Wrong. Unfortunately, Amazon decided to follow Apple and Garmin down the path of crippling great hardware. As a result, the writing function is all but worthless to sensible users.

There were warning signs about whom the Kindle Scribe was intended for. I tried to find the screen dimensions (or resolution), and the top few pages of search results showed only the device dimensions or the diagonal (10.2 in). If you’re curious, the screen measures around 6×8 in (6.1×8.2 in to be precise, but there’s probably a tiny border) and probably is 1800×2400 pixels (given their 300 dpi spec), constituting a 3:4 aspect ratio. Such information is pretty basic and useful, but even Amazon’s spec page didn’t list it. It did, however, list the pen’s dimensions, the Wi-Fi security protocols it supports, the wattage of the power adapter, and the (sole) available color. I fail to see how any of those are more relevant than the screen dimensions.
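The arithmetic behind those figures is easy to check. Here is a minimal sketch, assuming a 3:4 aspect ratio (my guess above, not a published spec) and taking the 10.2 in diagonal and 300 dpi at face value. The function name `screen_size` is just for illustration.

```python
import math

def screen_size(diagonal_in, aspect_w, aspect_h, dpi):
    """Return (width_in, height_in, width_px, height_px) for a rectangular screen."""
    # Length of the diagonal measured in aspect-ratio units (5 for a 3:4 screen).
    diag_units = math.hypot(aspect_w, aspect_h)
    scale = diagonal_in / diag_units
    width_in = aspect_w * scale
    height_in = aspect_h * scale
    return width_in, height_in, round(width_in * dpi), round(height_in * dpi)

# Assumed inputs: 10.2 in diagonal, 3:4 aspect ratio, 300 dpi.
w_in, h_in, w_px, h_px = screen_size(10.2, 3, 4, 300)
print(f"{w_in:.2f} x {h_in:.2f} in, roughly {w_px} x {h_px} px")
```

Under those assumptions the screen comes out at about 6.1×8.2 in and 1836×2448 pixels, which is in line with the rounded 1800×2400 estimate.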

Another warning sign is that finding any useful info about the Kindle Scribe is well-nigh impossible. Granted, it’s only been available for a few days. However, the many sites purporting to review it or to provide helpful tips contain nothing more than caffeinated regurgitation of marketing info. Even the “criticisms” read as thinly-veiled praise. You almost can hear the authors’ voices trembling, though whether from excitement or fear is unclear.

“Pro: Everything about everything. All is for the best in the best of all possible worlds, and Amazon IS that world.”

“Con: Amazon can’t get any more amazing than it currently is. But I’m sure it somehow will!!! Please don’t hurt my family …”

Frankly, I wouldn’t be surprised if they’re mostly shills. That seems to be standard practice in online marketing these days.

Marketing practices and information availability aside, the real problem is that there is no sensible way to extract notebooks from the Kindle Scribe. It is easy to write on the device, and the process is quite pleasant. It’s really a lot like a paper notebook. I’m not a user of existing writeable e-ink tablets or of Wacom devices, but I found the Kindle Scribe’s hardware perfectly suitable for my purposes. Sadly, hardware alone maketh not the product.

As far as I can tell, Kindle Scribe notebooks aren’t stored as pdfs or in any standard format. Rather, they appear to be stored as sequences of strokes in some sort of non-standard SQLite db. How can a SQLite db be “non-standard”? When I try to examine the db schema with the sqlite3 tool, I get a “malformed” error. Most likely, Amazon relies on some SQLite plugin that encrypts the notebooks or otherwise obfuscates them.
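For anyone who wants to repeat the experiment, here is a minimal sketch of that kind of inspection using Python's built-in sqlite3 module. The helper name `list_tables` and the file name are hypothetical, and the demo runs against a junk file rather than a real notebook, so the exact error text will depend on how a given file deviates from the standard format.

```python
import os
import pathlib
import sqlite3
import tempfile

def list_tables(db_path):
    """Try to read the schema of a file as an ordinary SQLite database.

    Returns (table_names, None) on success or (None, error_message) on failure.
    """
    # Open read-only so the inspection can never modify the file.
    uri = pathlib.Path(db_path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.Error as exc:
        return None, str(exc)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows], None
    except sqlite3.DatabaseError as exc:
        # Raised for encrypted, obfuscated, or otherwise unreadable files.
        return None, str(exc)
    finally:
        conn.close()

# Demo: a file full of junk stands in for an unreadable notebook file.
with tempfile.TemporaryDirectory() as tmp:
    bogus = os.path.join(tmp, "notebook.db")  # hypothetical file name
    with open(bogus, "wb") as fh:
        fh.write(b"not really a database " * 100)
    tables, error = list_tables(bogus)
    print(tables, error)
```

A file that SQLite can actually read yields a list of table names and no error, which is what one would hope to see from a notebook stored in a standard way.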

None of this is worthy of criticism in itself; in fact, the ability to undo and redo my writing stroke-by-stroke is quite nifty. However, Amazon offers no sensible way to export a pdf from a notebook. The only solution is to email it to yourself as a pdf using their Whispernet. From what I’ve read, Whispernet pretty much acts like malware by logging all sorts of activity. To my mind, the best policy is to keep it off all the time. Why not just fire it up briefly to export a notebook? It is quite possible that the Kindle logs activity locally and then sends the whole log to Amazon the moment Whispernet is activated. I have no evidence that this actually happens, but I wouldn’t be surprised.

However, there is a much bigger issue with Whispernet. Even if you trust Amazon’s intentions, there are a lot of other parties involved and a lot of potential data leaks. Anything you write is uploaded to Amazon’s servers, then converted to a pdf on their end, and finally emailed unencrypted to your email address. I.e., there are multiple insecure steps, not least of which is the receipt of that unencrypted email by your email service. Any security-minded individual would have an issue with this, and it precludes the use of notebooks for anything sensitive or by any professional required to adhere to security protocols.

One of the aforementioned hypercaffeinated blogs gushed over the ability of the author’s law-firm to collaboratively sign NDAs using Kindle Scribes. My response is simple: “Are you out of your f-ing head?” I’ll mark down the name of that firm to make sure I never do business with it (though frankly, my experience is that most lawyers are terrible with technology and security — which is odd in a profession whose bread and butter is sensitive information). Why is this a bad idea? Let us count the ways. First, it would require that all participants have Kindle Scribes and Amazon accounts. I can count on one finger the number of people I know who have one — and probably ever will. Second, all the info would be passed unencrypted between the various parties and Amazon and whatever email servers are involved for each and every individual. Third, how is this innovative or necessary? Plenty of secure and simple collaborative signature mechanisms exist. Whatever you may think of Docusign, it’s a sight better than passing unencrypted PDFs around like this. I’d guess that iPads have had suitable functionality for some time (and with much better security) as well. And more than one person I know has those.

The inability to directly export PDFs may or may not be an oversight, but I suspect not. I tried a few obvious workarounds and was stymied at every turn. Sometimes it felt quite deliberate.

First, I tried extracting the relevant files. As mentioned, they appear to be in some proprietary variant of SQLite. There is no obvious way to extract useful info except by reverse engineering them. Supposedly, it is possible to view notebooks offline using an Amazon reader — but I have had no luck with this. Besides, Amazon doesn’t really support Linux (even though they use it on all their servers AND the Kindle Scribe) — so such functionality probably isn’t available to me without firing up a VM.

Second, I created a PDF notebook of blank pages using pdftk and a free college-ruled template I found online. My plan was to annotate it as a pdf rather than use the official notebook feature. Presumably, PDF annotations provide the same interface. I sideloaded the PDF file via USB, but annotations were unavailable. Apparently, they only are available for PDFs loaded via the “send-to-kindle” email mechanism. I can’t think of any benign reason for such a limitation.
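A blank notebook of this kind can be assembled by repeating a single template page. Here is a minimal sketch, assuming pdftk is installed and that the first page of a (hypothetical) `ruled-template.pdf` is the ruled page; the file names, page count, and helper name are all placeholders, not the exact command I ran.

```python
import os
import shutil
import subprocess

def build_pdftk_command(template_pdf, page_count, output_pdf):
    """Build a pdftk command that repeats page 1 of the template page_count times."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    # "A=" assigns a handle to the input file; "A1" selects its first page.
    return (
        ["pdftk", f"A={template_pdf}", "cat"]
        + ["A1"] * page_count
        + ["output", output_pdf]
    )

cmd = build_pdftk_command("ruled-template.pdf", 50, "blank-notebook.pdf")
print(" ".join(cmd[:6]), "...")

# Only run the command if pdftk is installed and the template actually exists.
if shutil.which("pdftk") and os.path.exists("ruled-template.pdf"):
    subprocess.run(cmd, check=True)
```

Building the argument list separately from running it makes it easy to inspect the command before anything touches the disk.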

Third, I tracked down the abstruse details of how to “send-to-kindle” and sent the PDF to my Kindle Scribe using the “convert” command in the subject line (as recommended). The Kindle Scribe wouldn’t even open it. Apparently, Amazon’s “convert” machinery is incapable of even converting a blank notebook.

Fourth, I did the same but with a blank subject line. Presumably, this passes the pdf along unchanged. Now, the Kindle Scribe both opened the file and allowed annotations. Success!!! Or not. Like notebooks, annotations apparently are stored as individual strokes in some proprietary SQLite db. And apparently, there also is no mechanism to export them without going through the same email process as for a notebook. Unsurprising, but annoying. You’d figure Amazon at least would allow modified PDFs to be saved as … well … PDFs.
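The third and fourth attempts differ only in the subject line of the email. Here is a minimal sketch of composing such a message with Python's standard email library. Both addresses are hypothetical, the PDF bytes are a placeholder, and the message is only built, not sent (sending would need smtplib and your own mail server details).

```python
from email.message import EmailMessage

def build_kindle_email(sender, kindle_address, pdf_bytes, filename, convert=False):
    """Build (but do not send) an email carrying a PDF to a Kindle address."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = kindle_address
    # A subject of "convert" requests conversion; a blank subject does not.
    msg["Subject"] = "convert" if convert else ""
    msg.set_content("")  # empty body; the attachment is what matters
    msg.add_attachment(
        pdf_bytes, maintype="application", subtype="pdf", filename=filename
    )
    return msg

# Placeholder bytes stand in for a real PDF file read from disk.
fake_pdf = b"%PDF-1.4 placeholder"
msg = build_kindle_email(
    "me@example.com",        # hypothetical sender address
    "my-scribe@kindle.com",  # hypothetical device address
    fake_pdf,
    "blank-notebook.pdf",
    convert=False,
)
attachment = next(msg.iter_attachments())
print(repr(msg["Subject"]), attachment.get_filename(), attachment.get_content_type())
```

Passing `convert=True` reproduces the third attempt; the default reproduces the fourth, which is the one that worked for me.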

Put simply, Amazon appears to have made it impossible to securely or locally export notebooks or annotations. The writing feature therefore remains largely unusable by anyone with a sensible regard for privacy or security. The year is 2022, and the best solution is to photograph the notebook pages and transfer the photos to your computer. Yay technology! If only I had to send the photos to be developed and then scanned, my jubilation would be complete.

Given the house of horrors which is AWS, the ridiculosity surrounding Kindle Scribe Notebooks should come as no surprise. After all, building your own computer from vacuum tubes and then writing an OS for it is easier than setting up an AWS instance. Sadly, the institutional maladies which plague AWS (and Amazon author services, for anyone unfortunate enough to use those) seem to have spread to one of their few functional and innovative divisions.

“But hey — aren’t you being a bit harsh?” you may ask. “The thing just came out. Give them some time to fix the bugs.”

For anyone hoping that firmware updates will “solve” these problems, don’t count on it. Even for those issues with enormous user support, Amazon has been notoriously slow to provide obvious fixes, and this one is a niche problem that the typical “privacy? what’s privacy?” post-gen-x’er couldn’t care less about. Not to mention, Amazon just laid off much of their Kindle staff. If they followed the standard American corporate playbook, this included all the people who actually knew what they were doing.

Right now I’m debating whether to return the Kindle Scribe. Its size may be useful enough even without the writing features, but I doubt it. There’s also the remote possibility that Amazon will provide local export functionality. But I’m not holding my breath.

 

How to Produce a Beautiful Book from the Command Line

Book Production Framework and Examples on GitHub

Introduction

Over the last couple of years, a number of people have asked me how I produce my books.  Most self-published (excuse me, ‘indie-published’) books have an amateurish quality that is easy to spot, and the lack of attention to detail detracts from the reading experience.  Skimping on cover art can be a culprit, but it rarely bears sole blame — or even the majority of it.   Indie-published interiors often are sloppy, even in books with well-designed covers.  For some reason, many authors give scant attention to the interior layout of their books.  Of course, professional publishers know better.    People judge books not just by their covers, but by their interiors as well.  If the visual appeal of your book does not concern you, then read no further.  Your audience most likely will not.

Producing a visually-pleasing book is not an insurmountable problem for the indie-publisher, nor a particularly difficult one. It just requires a bit of attention. Even subject to the constraints of print-on-demand publishing, it is quite possible to produce beautiful-looking books. Ebooks prove more challenging because one has less control over them (due to the need for reflowable text), but it is possible to do as well as the major publishers by using some of their tricks. Moreover, all this can be accomplished from the command-line and without the use of proprietary software.

Now that I’ve finished my fifth book of fiction (and second novel), I figure it’s a good time to describe how I produce my books. I have automated almost the entire process of book and ebook production from the command-line. My process uses only free, open-source software that is well-established, well-documented, and well-maintained.

Though I use Linux, the same toolchain could be employed on a Mac or Windows box with a tiny bit of adaptation. To my knowledge, all the tools I use (or obvious counterparts) are available on both those platforms. In fact, MacOS is built on a flavor of unix, and the tools can be installed via Homebrew or other methods. Windows now has a unix subsystem which allows command-line access as well.

I have made available a full implementation of the system for both novels and collections of poetry, stories, or flash-fiction. Though I discuss some general aspects below, most of the nitty-gritty appears in the GitHub project’s README file and in the in-code documentation. The code is easily adaptable, and you should not feel constrained to the design choices I made. The framework is intended as a proof of concept (though I use it regularly myself), and should serve as a point of departure for your own variant. If you encounter any bugs or have any questions, I encourage you to get in touch. I will do my best to address them in a timely fashion.

Examples

First, let’s see some examples of output (unfortunately, WordPress does not allow epub uploads, but you can generate epubs from the repository and view them in something like Sigil). The novel and collection pdfs are best viewed in dual-page mode since they have a notion of recto and verso pages.

Who would be interested in this

If you’re interested in producing a fiction book from the command-line, it is fair to assume that (1) you’re an author or aspiring author and (2) you’re at least somewhat conversant with shell and some simple scripting. For scripting, I use Python 3, but Perl, Ruby, or any comparable language would work. Even shell scripting could be used.

At the time of this writing, I have produced a total of six books (five fiction books and one mathematical monograph) and have helped friends produce several more. All the physical versions were printed through Ingram, and the ebook versions were distributed on Amazon. Ingram is a major distributor as well, so the print versions also are sold through Amazon and Barnes & Noble, and can be ordered through other bookstores. In the past I used Smashwords to port and distribute the ebook through other platforms (Kobo, Barnes & Noble, etc), but frankly there isn’t much point these days unless someone (e.g., BookBub) demands it. We’re thankfully past the point where most agents and editors demand Word docs (though a few still do), but producing one for the purpose of submission is possible with a little adaptation using pandoc and docx templates. However, most people accept PDFs these days.
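For the rare Word-doc request, pandoc can take its styling from an existing .docx file via its `--reference-doc` option. Here is a minimal sketch, assuming a markdown source; the file names and the helper name are placeholders, and my own framework may need more adaptation than this suggests.

```python
import os
import shutil
import subprocess

def build_docx_command(source, output_docx, reference_docx=None):
    """Build a pandoc command that converts a manuscript to .docx."""
    cmd = ["pandoc", source, "-o", output_docx]
    if reference_docx:
        # --reference-doc tells pandoc to take styles from an existing .docx file.
        cmd.append(f"--reference-doc={reference_docx}")
    return cmd

# Hypothetical file names for illustration only.
cmd = build_docx_command("manuscript.md", "submission.docx", "submission-template.docx")
print(" ".join(cmd))

# Only run the conversion if pandoc is installed and the source file exists.
if shutil.which("pandoc") and os.path.exists("manuscript.md"):
    subprocess.run(cmd, check=True)
```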

My books so far include two novels, three collections of poetry & flash-fiction, and a mathematical monograph.  I have three other fiction books in the immediate pipeline (another collection of flash-fiction, a short story collection, and a fantasy novel), and several others in various stages of writing.  I do not say this to toot my own horn, but to make clear that the method I describe is not speculative.  It is my active practice.

The main point of this post is to demonstrate that it is quite possible to produce a beautiful literary book using command-line, open-source tools in a reproducible way. The main point of the GitHub project is to show you precisely how to do so. In fact, not only can you produce a lovely book that way, but I would argue it is the best way to go about it! This is true whether your book is a novel or a collection of works.

One reason why such a demonstration is necessary is the dearth of online examples. There are plenty of coding and computer-science books produced from markdown via pandoc. There are plenty of gorgeous mathematics books produced using LaTeX.   But there are very few examples in the literary realm, despite the typesetting power of LaTeX, and the presence of the phenomenal Memoir LaTeX class for precisely this purpose.  This post is intended to fill that gap.

A couple of caveats

Lest I oversell, here are a couple of caveats.

  • When I speak of an automated build process, I mean for the interiors of books. I hire artists to produce the covers. Though I have toyed with creating covers from the command-line in the past (and it is quite doable), there are reasons to prefer professional help. First, it allows artistic integration of other cover elements such as the title and author. Three of my books exhibit such integration, but I added those elements myself for the rest (mainly because I lacked the prescience to request them when I commissioned the art early on). I’ll let you guess which look better. The second big reason to use a professional artist comes down to appeal. The book cover is the first thing to grab a potential reader’s eye, and can make the sale. It also is a key determinant in whether your book looks amateurish or professional. I am no expert on cover design, and am far from skilled as an artist. A professional is much more likely to create an appealing cover. Of course, plenty of professionals do schlocky work, and I strongly advise putting in the effort and money to find a quality freelancer.  In my experience, it should cost anywhere from $300-800 in today’s dollars.  I’ve paid more and gotten less, and I’ve paid less and gotten more.   My best experiences were with artists who did not specialize in cover design.
  • The framework I provide on github is intended as a guide, not as pristine code for commercial use. I am not a master of any of the tools involved. I learned them to the extent necessary and no more. I make no representation that my code is elegant, and I wouldn’t be surprised if you could find better and simpler ways to accomplish the same things. This should encourage rather than discourage you from exploring my code. If I can do it, so can you. All you need is basic comfort with the command-line and some form of scripting. All the rest can be learned easily. I did not have to spend hundreds of hours learning Python, make, pandoc, and so on. I learned the basics, and googled whatever issues arose. It was quite feasible, and took a tiny fraction of the time involved in writing a novel.

The benefits of a command-line approach

If you’ve come this far, I expect that listing the benefits of a command-line approach is unnecessary. They are roughly the same as for any software project: stability, reproducibility, recovery, and easy maintenance. Source files are plain text, and we can bring to bear a huge suite of relevant tools.

A suggestion vis-a-vis code reuse

One suggestion: resist the urge to unify code. Centralizing scripts to avoid code duplication or creating a single “universal” script for all your books may be enticing propositions. I am sorely tempted to do so whenever I start a new project. My experience is that this wastes more time than it saves. Each project has unforeseeable idiosyncrasies which require adaptation, and changing centralized or universal scripts risks breaking backward compatibility with other projects. By having each book stand on its own, reproducibility is much easier, and we are free to customize the build process for a new book without fear of unexpected consequences. It also is easier to encapsulate the complete project for timestamping and other purposes. It’s never pleasant to discover that a backup of your project is missing some dependency that you forgot to include.

A typical author produces new books or revises old ones infrequently. The ratio of time spent maintaining the publication machinery to writing and editing the book is relatively small. On average, it takes me around 500 hours to write and edit a 100,000-word novel, and around 100 hours for a 100-page collection of flash-fiction and poetry. Adapting the framework from my last book typically takes only a few hours, much of which is spent on adjustments to the cover art.

Even if porting the last book’s framework isn’t that time consuming, why trouble with it at all? Why not centralize common code? The problem is that this produces a dependency on code outside the project. If we change the relevant library or script, then we must worry about the reproducibility of all past books which depend on it. This is a headache.

Under other circumstances, my advice would be different. For example, a small press using this machinery to produce hundreds of books may benefit from code unification. The improved maintainability and time savings from code centralization would be significant. In that case, backward-compatibility issues would be dealt with in the same manner as for software: through regression tests. These could be strict (MD5 checksums) or soft (textual heuristics) depending on the toolchain and how precise the reproducibility must be. For example, non-visual changes such as an embedded date would alter the hash but not textual heuristics. The point is that this is doable, but would require stricter coding standards and carefully considered change-metrics.
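To make the strict versus soft distinction concrete, here is a minimal Python sketch. It is not part of the framework. The function names, the sample text, and the assumption that the only volatile detail is an embedded ISO date are all hypothetical choices for illustration.

```python
import hashlib
import re


def strict_fingerprint(data: bytes) -> str:
    """Strict check: MD5 of the raw bytes. Any change at all alters it."""
    return hashlib.md5(data).hexdigest()


def soft_fingerprint(text: str) -> str:
    """Soft check: hash the text after dropping volatile details.

    Assumption for this sketch: the only volatile detail is an ISO date
    (YYYY-MM-DD), and runs of whitespace are not significant.
    """
    without_dates = re.sub(r"\d{4}-\d{2}-\d{2}", "<DATE>", text)
    normalized = " ".join(without_dates.split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Two hypothetical builds of the same book, differing only in the build date.
old_build = "Chapter 1\n\nIt was a dark night.\n\nBuilt 2021-03-01\n"
new_build = "Chapter 1\n\nIt was a dark night.\n\nBuilt 2022-07-15\n"

strict_same = strict_fingerprint(old_build.encode("utf-8")) == strict_fingerprint(
    new_build.encode("utf-8")
)
soft_same = soft_fingerprint(old_build) == soft_fingerprint(new_build)

print("strict match:", strict_same)
print("soft match:", soft_same)
```

The strict test flags the date change, while the soft test lets it pass. Which one is appropriate depends on how exact the reproducibility needs to be.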

The other reason to avoid code reuse is the need for flexibility. Unanticipated issues may arise with new projects (e.g. unusually formatted poems), and your stylistic taste may change as well. You also may just want to mix things up a bit, so all your books don’t look the same. Copying the framework to a new book would be done a few times a year at most, and probably far less.

Again, if the situation is different my advice will be too. For example, a publisher producing books which vary only in a known set of layout parameters may benefit from a unified framework. Even in this case, it would be wise to wait until a number of books have been published, to see which elements need to be unified and which parameters vary book to book.

Tools

Here is a list of some tools I use. Most appear in the project but others serve more of a support function.

Core tools

  • pandoc: This is used to convert from markdown to epub and LaTeX. It is an extremely powerful conversion tool written in Haskell. It often requires some configuration to get things to work as desired, but it can do most of what we want. And no, you do not need to know Haskell to use it.
  • make: The entire process is governed by a plain old Makefile. This allows complete reproducibility.
  • pdfLaTeX: The interior of the print book is compiled from LaTeX into a pdf file via pdfLaTeX. LaTeX affords us a great way to achieve near-total control over the layout. You need not know much LaTeX unless extensive changes to the interior layout are desired. The markdown source text is converted via pandoc to LaTeX through templates. These templates contain the relevant layout information.
  • memoir LaTeX class: This is the LaTeX class I use for everything. It is highly customizable, relatively easy to use, and ideally suited to book production. It has been around for a long time, is well-maintained, has a fantastic (albeit long) manual, and boasts a large user community. As with LaTeX, you need not learn its details unless customization of the book layout is desired. Most simple things will be obvious from the templates I provide.
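As a rough illustration of how the core tools fit together, the following Python sketch builds the two pandoc invocations described above: markdown to epub, and markdown to LaTeX through a template. It is not taken from the framework. The file names (book.md, book.epub, interior.tex, interior.latex) and the helper pandoc_command are hypothetical. In the real workflow a Makefile would issue commands of this kind.

```python
import os
import shutil
import subprocess


def pandoc_command(source, output, template=None):
    """Build a pandoc command line as a list of arguments.

    pandoc infers the output format from the output file's extension.
    """
    cmd = ["pandoc", "--from", "markdown", source, "--output", output]
    if template is not None:
        # --template tells pandoc which layout template to fill in.
        cmd.append("--template=" + template)
    return cmd


# Hypothetical file names, for illustration only.
epub_cmd = pandoc_command("book.md", "book.epub")
latex_cmd = pandoc_command("book.md", "interior.tex", template="interior.latex")

print(" ".join(epub_cmd))
print(" ".join(latex_cmd))

# Only run the conversion if pandoc is installed and the source file exists.
if shutil.which("pandoc") and os.path.exists("book.md"):
    subprocess.run(epub_cmd, check=True)
```

The resulting interior.tex would then be compiled with pdfLaTeX in a separate step.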

Essential Programs, but can be swapped with comparables

  • python3: I write my scripts in python 3, but any comparable scripting language will do.
  • aspell: This is the command-line spell-checker I use, but any other will do too. It helps if it has a markdown-recognition mode.
  • emacs: I use this as my text editor, but vim or any other text editor will do just fine. As long as it can output plain text files (ascii or unicode, though I personally stick to ascii) you are fine. I also use emacs org-mode for the organizational aspects of the project. One tweak I found very useful is to have the editor highlight anything in quotes. This makes conversation much easier to parse when editing.
  • pdftools (poppler-utils): Useful tools for splitting out pages of pdfs, etc. Used for ebook production. I use the pdfseparate utility, which allows extraction of a single page from a PDF file. Any comparable utility will work.

Useful Programs, but not essential

  • git: I use this for version control. Strictly speaking, version control isn’t needed. However, I highly recommend it. From a development standpoint, I treat writing as I do a software project. This has served me well. Any comparable tool (such as Mercurial) is fine too. Note that the needs of an author are relatively rudimentary. You probably won’t need branching or merging or rebasing or remote repos. Just “git init”, “git commit -a”, “git status”, “git log”, “git diff”, and maybe “git checkout” if you need access to an old version.
  • wdiff, colordiff: I find wdiff and colordiff very useful for highlighting changes.
  • imagemagick: I use the “convert” tool for generating small images from the cover art. These can be used for the ebook cover or for advertising inserts in other books. “identify” also can be useful when examining image files.
  • pdftk (free version): Useful tools for producing booklets, etc. I don’t use it in this workflow, but felt it was worth mentioning.
  • ebook-convert: Calibre command-line tool for conversion. Pandoc is far better than calibre for most conversions, in my experience. However, ebook-convert can produce mobi and certain other ebook formats more easily.
  • sigil: This is the only non-command-line tool listed, but it is open-source. Before you scoff and stop reading, let me point out that this is the aforementioned “almost” when it comes to automation. However, it is a minor exception. Sigil is not used for any manual intervention or editing. I simply load the epub which pandoc produces into sigil, click an option to generate the TOC, and then save it. The reason for this little ritual is that Amazon balks at the pandoc-produced TOC for some reason, but seems ok with Sigil’s. It is the same step for every ebook, and literally takes 1 minute. Unfortunately, sigil offers no command-line interface, and there is no other tool (to my knowledge) to do this. Sigil also is useful to visually examine the epub output if you wish. I find that it gives the most accurate rendering of epubs.
  • eog: I use this for viewing images, though any image viewer will do. It may be necessary to scale and crop (and perhaps color-adjust) images for use as book covers or interior images. ImageMagick’s “identify” and “convert” commands are very useful for such adjustments, and eog lets me see the results.

How I write

All my files are plain text. I stick to ascii, but these days unicode is fine too. However, rich-text is not. Things like italics and boldface are accomplished through markdown.

Originally, I wrote most of my pieces (poems, chapters, stories) in LaTeX, and had scripts which stitched them together into a book or produced them individually for drafts or submissions to magazines. These days, I do everything in markdown, and a very simple form of markdown at that.

Why not just stick with LaTeX for the source files? It requires too much overhead and gets in the way. For mathematical writing, this overhead is a small price to pay, and the formatting is inextricably tied to the text. But for most fiction and poetry, it is not.

I adhere to the belief that separating format and content is a wise idea, and this has been borne out by my experience. Some inline formatting is inescapable (bold, italics, etc), and markdown is quite capable of accommodating this. On the rare occasions when more is needed (e.g. a specially formatted poem), the markdown can be augmented with HTML or LaTeX directly as desired. Pandoc can handle all this and more. It is a very powerful program.

I still leave the heavy formatting (page layout, headers, footers, etc) to LaTeX, but it is concentrated in a few templates, rather than the text source files themselves.

There also is another reason to prefer markdown. From markdown, I more easily can generate epubs or other formats. Doing so from LaTeX is possible but more trouble than it’s worth (I say this from experience).

What all this means is that I can focus on writing. I produce clear, concise ascii files with minimal format information, and let my scripts build the book from these.

To see a concrete example, as well as all the scripts involved, check out the framework on GitHub.

Book Production Framework and Examples on GitHub

Why Your Book Won’t Be an Amazon Success Story

I’m going to be that guy. The one nobody likes at parties. The one who speaks unpleasant truths. If you don’t want to hear unpleasant truths, stop reading.

If you want to be told which self-help books to buy and which things to do and which gurus will illuminate the shining path to fame and fortune, stop reading.

If you want somebody to hold your hand, and nod at all the right moments and ooh and aah about how your writing has come a long way and you’re “almost there,” stop reading.

It doesn’t matter whether you’ve come a long way. It doesn’t matter whether your writing is almost there, is there, or is beyond there. It doesn’t matter what you’re saying or how you’re saying it. You may have written the most poignant 80,000 words in the English language, or you may have another book of cat photos. None of that matters.

Unless you’re a certain type of person saying a certain type of thing in a certain way, none of it matters. And that certain type of person, that certain type of thing, and that certain way changes all the time. Today it’s one thing, tomorrow it will be another.

Statistically speaking, you’re not it.

“But what about all those success stories,” you argue. “I’m always hearing about Amazon success stories. Success, success, success! This book mentioned them and that blog mentioned them and the 12th cousin of my aunt’s best friend’s roommate had one.”

There are two reasons this doesn’t matter.

Most of those stories are part of a very large industry of selling hope to suckers. Any endeavor which appeals to the masses and appears to be accessible to them spawns such an industry. Business, stock picking, sex, dating, how to get a job, how to get into college, and on and on. Thanks to today’s low barrier to entry, self-publishing is the newest kid on that block.

This isn’t a conspiracy, or some evil corporation with a beak-nosed pin-striped CEO, cackling ominously while rubbing his hands. Self-publishing just attracts a lot of people who see an easy way to make money. When there’s a naive, eager audience, a host of opportunists and charlatans purvey snake oil to any sucker willing to pay. They’re predators, plain and simple. Hopefully, I can dissuade you from being prey. Leave that to others. Others unenlightened by my blog. Cynicism may not always be right, but it’s rarely wrong.

Even seemingly reputable characters have become untrustworthy. The traditional publishing industry has grown very narrow and institutional, and life is hard for everyone associated with it. The temptation to go for the easy money, and cast scruples to the winds, is quite strong. Not that denizens of the publishing industry ever were big on scruples. Many individuals from traditionally respectable roles as agents, editors, and publishers find it increasingly difficult to eke out a living or are growing disillusioned with a rapidly deteriorating industry. It is unsurprising that they are bedazzled by the allure of easy money. Unsurprising, and disappointing. This is especially insidious when agents offer paid services which purport to help improve your chances with other agents. The argument is that they know what their kind wants. Anybody see the problem with this? Anybody, anybody, Bueller? It would be like H.R. employees taking money to teach you how to get a job with them. Oh wait, they do. How could THAT possibly go wrong…

I’m not going to delve into the “selling hope to suckers” angle here. That is fodder for a separate post, in which I analyze a number of things which did or did not work for me. For now, I’ll focus on the second reason your book won’t be an Amazon Success Story. Incidentally, I will resist the temptation to assign an acronym to Amazon Success Story. There! I successfully resisted it.

In this post, I’ll assume that ALL those stories you hear are right. Not that they’re 99% bunk or that most actual successes had some outside catalyst you’re unaware of or were the result of survivorship bias (the old coin-flipping problem to those familiar with Malkiel’s book). To paraphrase the timeless wisdom of Goodfellas, if you have to wait in line like everyone else you’re a schnook. If you’re trying what everyone else tries, making the rounds of getting suckered for a little bit here a little bit there, with nothing to show for it — you’re the schnook.

Don’t feel bad, though. No matter how savvy we are in our own neighborhoods, we’re all schnooks outside it. Hopefully, I can help you avoid paying too much to learn how not to be a schnook.

I can’t show you how to be successful, but I can show you how to avoid paying to be unsuccessful. But that’s for another post. We’re not going to deal with the outright lies and deception and rubbish here. Those are obvious pitfalls, if enticing. Like pizza.

In this post, we’re going to assume the success stories are real — as some of them surely are. We’re going to deal with something more subtle than false hope. We’re going to discuss the OTHER reason you won’t be successful on Amazon. It’s not obvious, and it can’t be avoided.

But first, I’m going to make a plea: if you’re the author of one of those breathless, caffeinated “how to be a bzillionaire author like me” books or blogs or podcasts … stop it. Please. Just stop it. Unless you’re cynically selling hope to suckers or mass-producing content-free posts as click-bait. In that case, carry on. I don’t approve of what you do, but I’m not going to waste breath convincing dirtbags not to be dirtbags. However, if you’re even the least bit well-meaning, stop. Maybe you have some highly popular old posts along these lines. Update them. Maybe you’re writing a new series of posts based on what your friend named John Grisham has to say to self-publishing authors. Don’t.

You’re doing everyone a disservice. People will waste money and time and hope. Best to tell them the truth. You may not be that guy. You may be too nice, tactful, maybe even (dare I say) an optimist. I’m not an optimist. I AM that guy. No false hope sold here.

Maybe you’re still reading this and haven’t sky-dived into a volcano or fatally overdosed on Ben & Jerry’s, or turned to one of those cheerful, caffeinated blogs. Shame on you. There are special internet groups for people like you. But you’re still here, and I haven’t driven you away. I must be doing something wrong.

If you’re a true dyed-in-the-wool masochist, I’ll now explain why you won’t be successful. It has to do with a tectonic shift in Amazon’s policies.

Over a year ago, I wrote a post titled “Why NOT to use Amazon Ads for your book,” which many people have written me about. Most found it a useful take on Amazon ads, and one of the few articles which doesn’t regurgitate lobotomized praise for the practice.

I stand by that. Subsequent experiments (to be reported in a future post) have shown that Amazon ads perform even worse now. This led me to wonder why. Why did all the long-tailed keywords and the reviews and the ads make no difference? None of us know the precise inner workings of Amazon ads, but there are strong indications of their behavior.

I now will offer my theory for why there are success stories, why it’s tempting to believe they can be emulated, and why they cannot. To do so, let’s review some basic aspects of Amazon’s algorithms.

There are two algorithms we care about:

(1) The promotion algorithm, which ranks your book. It is responsible for placing it in any top 100 lists, determining its visibility in “customers also bought” entries, when and how it appears in searches, and pretty much any other place where organic (i.e. non-paid) placement is involved.

(2) The ad auction algorithm, which determines whether you win a bid for a given ad placement.

The promotion algorithm determines how much free promotion your book gets, and is critical to success. It has only a couple of basic pieces of information to work with: sales and ratings. The algorithm clearly reflects the timing of sales, and is heavily weighted toward the most recent week. It may reflect the source of those sales — to the extent Amazon can track it — but I have seen no evidence of this. As for ratings, all indications are that the number of ratings or reviews weighs far more heavily than the ratings themselves. This is true for consumers too, as long as the average rating is 3+. Below that, bad ratings can hurt. Buyers don’t care what your exact rating is, as long as there isn’t a big red flag. The number of ratings is seen as a sign of legitimacy, that your book isn’t some piece of schlock that only your grandmother and dad would review — but your mom was too ashamed to attach her name to. Anything from a traditional publisher has 100’s to 1000’s of ratings. A self-published work generally benefits from 15+. More is better.

It makes sense that the promotion algorithm can play a role, but why mention an “ad auction algorithm”? Ad placement should depend on your bid, right? Maybe you can tweak the multipliers and bids for different placements or keywords, but the knobs are yours and yours alone. You might very well think that, but I couldn’t possibly comment. Unlike the ever-diplomatic Mr. Urquhart, I’m too guileless to take this tack. I also don’t use Grey Poupon. I can and will comment. You’re wrong. Amazon’s ad algorithm does a lot more behind the scenes. You may be the highest bidder and still lose, and you may be the lowest bidder and win.

As usual, we must look at incentives to understand why things don’t behave as expected. Amazon does not run ads as a non-profit, nor does it get paid a subscription fee to do so. It only makes money from an ad when that ad is clicked, and it only makes money from a sale when the ad results in a conversion. For sellers, the latter is a commission and for authors it’s the 65% or 30% (depending on whether you chose the 35% or 70% royalty rate) adjusted for costs, etc. In either case, they make money from each sale and they make money from each click.

Amazon loses money if your ad wins lots of impressions, but nobody clicks on it. They would have been happier with a lower bid that actually resulted in clicks. If lots of people click on your ad, but few people buy your book, Amazon would have been happier with a lower bid which resulted in more sales. It’s a trade-off, but there are simple ways of computing these things. When you start fresh, Amazon has no history (though perhaps if you have other books, it uses their performance). It assigns you a set of default parameters representing the average performance of books in that genre. As impressions, clicks, and sales accrue, Amazon adjusts your parameters. This could be done through a simple Bayesian update or periodic regressions or some other method.
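For readers who want to see what a “simple Bayesian update” of this kind might look like, here is a minimal Python sketch using a Beta-Binomial model of click-through rate. This is a textbook technique chosen for illustration, not Amazon’s confirmed method. The class name RateEstimate, the genre default of 1%, and the impression and click counts are all made-up assumptions.

```python
from dataclasses import dataclass


@dataclass
class RateEstimate:
    """Beta(alpha, beta) belief about a rate such as click-through."""

    alpha: float  # pseudo-count of successes (e.g. clicks)
    beta: float   # pseudo-count of failures (e.g. impressions without a click)

    @property
    def mean(self) -> float:
        """Current best estimate of the rate."""
        return self.alpha / (self.alpha + self.beta)

    def update(self, successes: int, trials: int) -> "RateEstimate":
        """Fold observed successes out of trials into the belief."""
        return RateEstimate(self.alpha + successes, self.beta + trials - successes)


# Hypothetical genre default: 1% click-through, weighted like 1,000 impressions.
prior = RateEstimate(alpha=10.0, beta=990.0)

# Hypothetical history for a new book: 4,000 impressions, only 8 clicks.
posterior = prior.update(successes=8, trials=4000)

print("default estimate:", prior.mean)
print("after history:   ", posterior.mean)
```

With these numbers the estimate falls from the genre default toward the book’s own observed rate of 0.2%. The more history accrues, the less the default matters.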

When a set of authors bids on an ad, Amazon can compute the expected value of each bid. This looks something like P(click|impression)*ebid + P(sale|click)P(click|impression)*pnl, where P(click|impression) is your predicted click-through-rate for that placement, P(sale|click) is your predicted conversion rate for that placement, ebid is the effective bid (I’ll discuss this momentarily), and pnl is the net income Amazon would make from a sale of your book. This is an oversimplification, but gets the basic idea across.
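The formula above is easy to play with in a few lines of Python. This sketch only restates the oversimplified expression from the text. The function name and every number in it are hypothetical, chosen to show how a low bid from a strong performer can be worth more to Amazon than a high bid from a weak one.

```python
def expected_value(p_click, p_sale, ebid, pnl):
    """Expected revenue per impression.

    p_click: P(click|impression), predicted click-through rate
    p_sale:  P(sale|click), predicted conversion rate
    ebid:    effective bid paid per click
    pnl:     net income from one sale of the book
    """
    return p_click * ebid + p_sale * p_click * pnl


# Hypothetical established author: modest bid, strong click and conversion rates.
star = expected_value(p_click=0.02, p_sale=0.10, ebid=0.30, pnl=1.50)

# Hypothetical newcomer: triple the bid, weak click and conversion rates.
newcomer = expected_value(p_click=0.002, p_sale=0.02, ebid=0.90, pnl=1.50)

print("star:    ", star)
print("newcomer:", newcomer)
```

Under these made-up numbers the established author is worth roughly five times as much per impression, despite bidding a third as much.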

The ebid quantity is your effective bid, what you actually pay if you win the auction. There actually are two effective bids involved. Amazon’s ad auctions are “second-price,” meaning the winning bidder pays only the 2nd highest bid. Suppose there are 5 bids: 1,2,3,4,5. The bidder who bid 5 wins, but only pays 4. There are game theoretic reasons for preferring this type of auction, as it encourages certain desirable behaviors in bidders. In this case, the effective bid (and what Amazon gets paid) is 4. That is no mystery, and is clearly advertised in their auction rules. What isn’t advertised is the other, hidden effective bid, which reflects how Amazon values each bidder’s ad. For the same five bidders, these hidden effective bids may be 3,2,4,2,3, in which case the third bidder wins. What do they actually pay? I’m not sure, but something less than their actual bid of 3.
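The advertised part of the mechanism, a plain second-price auction, can be sketched in a few lines of Python. The function name is hypothetical, and this covers only the public rule, not the hidden adjustment.

```python
def second_price_auction(bids):
    """Return (index of the winner, price paid) for a second-price auction.

    The highest bidder wins but pays only the second-highest bid.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    # Indices of the bidders, ordered from highest bid to lowest.
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    winner, runner_up = order[0], order[1]
    return winner, bids[runner_up]


winner, price = second_price_auction([1, 2, 3, 4, 5])
print(winner, price)  # 4 4  (the fifth bidder, index 4, wins and pays 4)
```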

Apparently, whatever algorithm Amazon uses guarantees that a bidder never will pay more than their actual bid. It somehow combines the two types of effective bids to ensure this. I am not privy to the precise algorithm (and it constantly changes), so I cannot confirm this. However, I have been informed by an individual with intimate knowledge of the subject that Amazon’s approach provably guarantees no bidder will pay more than their actual bid.
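One textbook rule with this property is the quality-weighted second-price auction: rank bidders by a hidden score, and charge the winner the smallest bid that still would have won. I do not know whether Amazon uses this rule. The Python sketch below is only an illustration, under the assumption that each hidden score equals the actual bid times a positive quality factor. The function name is hypothetical, and the numbers are the ones from the example above.

```python
def quality_weighted_auction(bids, scores):
    """Return (index of the winner, price paid).

    Bidders are ranked by hidden score. Assumption for this sketch:
    score = bid * quality, with quality > 0. The winner pays the smallest
    bid that would still have beaten the runner-up's score.
    """
    order = sorted(range(len(bids)), key=lambda i: scores[i], reverse=True)
    winner, runner_up = order[0], order[1]
    # The runner-up's score never exceeds the winner's, so this price
    # can never exceed the winner's actual bid.
    price = bids[winner] * scores[runner_up] / scores[winner]
    return winner, price


bids = [1, 2, 3, 4, 5]    # actual bids
scores = [3, 2, 4, 2, 3]  # hidden effective bids
winner, price = quality_weighted_auction(bids, scores)
print(winner, price)  # 2 2.25  (the third bidder, index 2, wins and pays 2.25)
```

Under this rule the third bidder pays 2.25, which is indeed less than their actual bid of 3. The cap holds for any inputs, because the ratio of the runner-up’s score to the winner’s score is at most 1.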

Why would Amazon prefer a lower bid, when they could get 4? As mentioned, they only get paid 4 if the ad of the winning bidder (the 5) gets a click. If the ad makes every reader barf or have a seizure or become a politician, there won’t be a lot of clicks. If it’s the most beautiful ad in human history, but the book’s landing page makes potential buyers weep and tear their hair and gnash their teeth, it probably won’t make many sales. In either case, Amazon would do better with another bidder.

Even without knowing the precise formula, one thing is clear. These algorithms are a big problem for anyone who isn’t already a star.

The problem is that those two algorithms play into one another, generating a feedback loop. If you’re already successful, everything works in your favor. But if you start out unattractive to them, you remain that way. You have few quality ad placements, and get few sales, and this suppresses your organic rank. The organic rank factors into many things which affect P(click|impression) and P(sale|click) — such as the number of reviews, etc. Put simply, once they decide you’re a failure, you become a failure, and remain one. You won’t win quality bids, even if you bid high. If you bid high enough to override the suppression, then you’ll pay an exorbitant fee per click, and it will cost a huge amount to reach the point where success compounds.

I am unsure whether there is cross-pollination between works by a given author, but I strongly suspect so. A new work by a top-ranked author probably starts high and is buoyed by this success. This may be why we see a dozen works by the same author (obviously self-published, and sometimes with very few ratings per book) in the top-100 in a genre.

So how do you get out of this hole? There’s only one accessible way for most people: you cheat. And this is where Amazon’s tectonic policy shift comes into play.

There ARE success stories, like the aforementioned top-ranked self-published authors. But there won’t be any more. To understand why, we must turn to hallowed antiquity before Bezos was revealed to be the latest incarnation of Bchkthmorist the Destroyer, and when Amazon brought to mind a place with trees, snakes, and Sean Connery.

There was a time when the nascent self-publishing industry had really begun to boom, but was poorly regulated. The traditional publishers viewed Amazon, Kindle, and self-publishing as a joke. They relied on their incestuous old-boys network of reviewers from the NY Times, New York Review of Books, and pretty much anything else with New York in the name for promotion. 95% of self-published books were about how to self-publish, and authors who DID self-publish (and were savvy) quickly developed ways to game Amazon.

They COULD pump up their search results, get in top-100 lists, and so on. Usually, this involved getting lots of fake reviews and using keyword tricks to optimize search placement. Once in the top list for a genre, it was easy to stay there — though newcomers with more fake reviews and better keyword antics could displace you. The very top was an unstable equilibrium, but the top 500 or 1000 was not. Once up there, it was easy to keep in that range and then occasionally pop into the very top. Like a cauldron of mediocrity, circulating its vile content into view every now and then. Amazon periodically tweaked its algorithms, but authors kept up.

Then something happened. Amazon decided to crack down on fake reviews. This sounds laudable enough. Fake reviews have the word fake in them, and fake always is bad, right?

There were two problems with HOW Amazon went about it. First, they went way overboard. Overnight, it became well-nigh impossible for an author to get a single new review. If the reviewer had one letter in common with your name, lived in the same hemisphere, or also breathed air, they were deemed connected to you and thus biased.

If this had been applied uniformly, there would be nobody in the top 100 — or it would be random, since nobody would have any tricks they could play. This is where the second problem with Amazon’s approach came in. They didn’t remove legacy fake ratings. Those who cheated before the cutoff got to keep their position. In fact, that position now was secure against all newcomers. A gate had slammed down, and they were firmly on the right side of it. Aside from a few people near the boundary they had nothing to fear. Well, almost nothing to fear.

The only way to break into the top echelon, and thus benefit from the self-reinforcing algorithms which stabilize that position, is to rely on external sources of sales. If you have a million Twitter followers who buy your book, or a massive non-Amazon advertising campaign, you can break in. Then YOU would be very difficult to displace.

Once traditional publishers realized that Amazon is the only de facto bookstore left (outside airport/supermarket sales), they took an interest. THEY have no problem getting a top rank, because they run huge advertising campaigns and have huge existing networks. This is why the top 100 lists are an odd mixture of self-published books you never heard of and traditionally published bestsellers. Eventually it only will be the latter.

So. You. Won’t. Break. In. Amazon created an impenetrable aristocracy, and you’re not it. You won’t be it. You can’t be it. If you use Amazon ads or buy into any of the snake oil sales nonsense, you’ll be the schnook bribing a maitre d’ who knows he’ll never let you in.

Most of those success stories (or at least the real ones) are from before the policy change, as are many of the methods being touted. That path is gone. Amazon ads only work for those who don’t need them, and they work very well for them. They won’t work for you. Becoming a success on Amazon is as unlikely as with a traditional publisher. You’ll always hear stories, but they’re either the few who randomly made it, those with hidden external mechanisms of promotion, or those already entrenched at the top.

That’s the sad truth, or at least my take on it. By all means, waste a few dollars trying. I used to be a statistical trader and know better, but I still buy a lottery ticket when the jackpot’s high enough. It’s entertainment. Two dollars to dream for a day. I just don’t expect to win.

Write what you want, revise, work your butt off, and make it perfect. But do it because you want to, because that’s what makes you happy. Don’t do it expecting success, or hoping for success, or even entertaining the remote possibility of success.

The worst reason to write is for other people. Your work won’t be read, and your work won’t make you money. If you accept that and are happy to write anyway, then write all you want. I urge you to do so. It’s what I do.