This is just a brief look at the Step-by-Step guide, with a few additional hints, suggestions and explanations that might make it easier to understand if you are still struggling with the basics.
Use the -h flag after a command to get a description of how to use it, e.g. se create-draft -h
se roman2dec and se titlecase can be mapped to keyboard shortcuts.
Open the script editor (you might have to adjust the location of the tools):
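tell application "BBEdit"
	tell window 1
		-- Copy the selected text and read it back from the clipboard
		copy selection
		set theString to the clipboard
		-- quoted form of protects quotes and other shell characters in the selection
		set outputtext to do shell script "/Users/YOUR_USER_NAME/.local/bin/se titlecase " & quoted form of theString
		-- Replace the selection with the title-cased text
		set the clipboard to outputtext
		paste
	end tell
end tell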
Most of the se commands need to act on a file or directory. For example, this runs split-file on the body.xhtml file:
se split-file src/epub/text/body.xhtml
But most of them need to run on the whole project. As long as your terminal is in the proper directory (e.g. pierre-choderlos-de-laclos_dangerous-liaisons_thomas-moore), adding the . at the end tells the command to act on the whole directory. Sometimes the Step-by-Step tells you to run a command (such as se find-mismatched-diacritics) without specifically mentioning that you have to run it with the dot at the end:
se find-mismatched-diacritics .
Just something to note.
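A quick way to confirm the terminal is in the right place before passing the dot is to check for the folders the tools expect. This is only a minimal sketch: it assumes a project root always contains src/epub/text (the path used in the split-file example), and looks_like_project_root is a hypothetical helper, not part of the se toolset.

```python
from pathlib import Path


def looks_like_project_root(path="."):
    """Return True if `path` contains the src/epub/text folder.

    Assumption: a project root always has this folder, as implied by
    the src/epub/text/body.xhtml path used with split-file.
    """
    root = Path(path).resolve()  # "." resolves to the current working directory
    return (root / "src" / "epub" / "text").is_dir()
```

Calling looks_like_project_root() with no argument checks the directory the terminal is currently in, which is the same directory the dot refers to.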
You will be strongly encouraged to pick a book off the First Production List as your first project. Many people come to the list because they have a project in mind and balk at this restriction. After what is often a lot of argument, some reluctantly agree to do it the Standard Ebooks way and some depart unhappily.
The reason for this restriction is two-fold. As a producer you will discover in some cases it is harder than it looks to balance the editorial and structural demands of a text (even little things like letters or posters can pose some tough semantic and coding problems). And length is a big factor in being able to balance the many dimensions of producing a quality project — many projects are abandoned each year because the producer just couldn’t sustain the effort. The other main reason is it is highly likely you will make errors on your first attempt and the reviewer assigned to you is left having to scrutinize not only the basics, but all that extra length and/or complexities involved in a longer, more complicated book.
So stick to something safe for the first book (and, if you are new to coding, maybe the first couple of books) before you strike out in an attempt to bring your favourite piece of literature to the world. The good news is that a first production was defined as 40,000 words when I started and is now, thanks to improvements in the toolset, set at under 100,000 words.
Remember though, even given the restrictions you are free to use the methodology and toolset to produce your own book; it just won’t be eligible for inclusion in the Standard Ebooks corpus.
As a Canadian, finding texts (especially for cover searches) can often be problematic. Because of geo-blocking due to territorial rights and differing copyright periods, books that are freely available in the U.S. are often not available to Canadian producers. I find Google Books especially frustrating on that score and rarely used it until I downloaded the Tor Browser and forced it to use an American-based location.
There are often multiple sources for original texts, and sorting through them to find the best copyright-free version is just another challenge you should be prepared to face.
I used to use only the online version of the scans, but more and more I have been downloading the PDF and having it open when I proof. Often this can be faster and more convenient for comparison purposes than using the online version. But either way works.
Standard form:
se create-draft --author="Pierre Choderlos de Laclos" --title="Dangerous Liaisons" --pg-id=45512
With translator:
se create-draft --author="Pierre Choderlos de Laclos" --translator="Thomas Moore" --title="Dangerous Liaisons" --pg-id=45512
Remember you can always use the help files for adding translators etc.
se create-draft -h
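If you script your drafts, the two forms above can be assembled from the same pieces. This is a rough sketch that uses only the flags shown in the commands above (--author, --translator, --title, --pg-id); create_draft_command is a hypothetical helper name, not something the toolset provides.

```python
def create_draft_command(author, title, pg_id=None, translator=None):
    """Build the argument list for `se create-draft`.

    Only the flags shown in the examples are used; the translator and
    the Project Gutenberg ID are optional.
    """
    command = ["se", "create-draft", f"--author={author}"]
    if translator:
        command.append(f"--translator={translator}")
    command.append(f"--title={title}")
    if pg_id is not None:
        command.append(f"--pg-id={pg_id}")
    return command


print(create_draft_command(
    "Pierre Choderlos de Laclos",
    "Dangerous Liaisons",
    pg_id=45512,
    translator="Thomas Moore",
))
```

Keeping the command as a list means it can be handed to subprocess.run without worrying about shell quoting of names that contain spaces.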
How rough can it get? You mainly have to get rid of all the cruft and leave only the basic body of the content including headers and titles.
Be sure to get rid of tables of contents, copyright pages and statements, title pages etc.
Leave in forewords, epigraphs, endnotes, afterwords etc. Occasionally you will end up dropping (or even adding in) extra material later, but as long as it resides in its own file this can be done painlessly.
If you are working on something that isn’t basic chapters (let’s say they are named letter-1, letter-2 etc.), you can use se split-file to change the name (among other things):
se split-file -f letter-%n.xhtml src/epub/text/body.xhtml
Hint: If you pull out any prefaces or forewords into separate files before you run split-file, then the remaining files will be numbered correctly and you can build the front matter files later.
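To see what the -f template produces, here is a tiny sketch of the naming pattern. It assumes %n simply stands for the running number of each split file, which is how the option appears to behave; expand_filename_template is a hypothetical name used only for illustration.

```python
def expand_filename_template(template, number):
    """Replace the %n placeholder with a sequence number.

    Assumption: %n is the running number of the split file.
    """
    return template.replace("%n", str(number))


# File names for the first three letters
print([expand_filename_template("letter-%n.xhtml", n) for n in range(1, 4)])
```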
Sometimes it’s broke, so you have to fix it. Running se clean . successfully relies on well-formed XHTML files. If there is an error somewhere in your project then it won’t run. Occasionally I have had to run it, find and fix an error, run it again, find and fix another error, and so on several times before everything works.
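When the run-fix-run cycle gets tedious, it can help to list every file that fails to parse in one pass. The following is a minimal sketch using Python's standard XML parser, not the check that se clean itself performs; find_malformed is a hypothetical helper, and it assumes the files live in a single folder such as src/epub/text. Files that use named HTML entities would also be reported, since a plain XML parser does not know them.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed(text_dir):
    """Return (file name, error message) pairs for files that fail to parse."""
    problems = []
    for path in sorted(Path(text_dir).glob("*.xhtml")):
        try:
            ET.parse(path)
        except ET.ParseError as error:
            # The message includes the line and column of the first problem
            problems.append((path.name, str(error)))
    return problems
```

For example, find_malformed("src/epub/text") returns an empty list when every file in that folder parses cleanly.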
Pretty simple and straightforward. Make sure you run it more than just once, especially if you make changes.
Make note of any typographical changes you revert by hand (sometimes there are a few), since you will have to do it again every time you run typogrify.
Not for beginners. More on this when I have the chance.
Single quotes? WTF mate?
Often old texts will have the use of quotes reversed from how we usually do it here in North America, with single quotes around quotations. Running se british2american . will switch them to double.
This will never be perfect so you will have to double check manually.
Coming soon
How far to go? And when? Generally speaking you can rely on the tools to take care of the modernization. Occasionally you might run across a word that isn’t part of the tools and you can suggest it to the list for addition.
The other thing that is harder to judge is compound words and hyphenation. In a very, very broad generalization, the English language seems to have a pattern when it comes to making new words out of others. Initially it is an open compound: “well being”. Once it becomes more common it tends to gain a hyphen: “well-being”. And eventually, when it becomes part of the everyday vernacular, it usually ends up being a closed compound: “wellbeing”. Of course, sometimes they start as hyphenated words. Lots of words from the 19th century are like this: motorcycle, steamroller etc.
It’s not necessary (and often dangerous) to go fiddling around with compounds unless you are well versed in grammar (hyphens can often mean it’s being used adjectivally) and the history of the compound in question. The Modernize Spelling section of the Step-by-Step points out a bunch of ambiguous and difficult terms like “everything” vs. “every thing” and “anyone” vs. “any one” and explains how to handle them correctly.
But it’s something to note, as occasionally a word has changed how it is compounded since the text you are using was published and might warrant modernizing. Always check with Merriam-Webster before making any changes!
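Before touching a compound it can help to see which forms the text actually uses. This is a rough sketch only, not one of the se tools: count_compound_forms is a hypothetical helper, and the sample sentence is made up for illustration.

```python
import re


def count_compound_forms(text, first, second):
    """Count open, hyphenated and closed spellings of a two-part compound."""
    a, b = re.escape(first), re.escape(second)
    patterns = {
        "open": rf"\b{a}\s+{b}\b",
        "hyphenated": rf"\b{a}-{b}\b",
        "closed": rf"\b{a}{b}\b",
    }
    return {
        form: len(re.findall(pattern, text, flags=re.IGNORECASE))
        for form, pattern in patterns.items()
    }


# Made-up sample containing one of each form
sample = "Her well-being mattered; his well being did not. Wellbeing for all."
print(count_compound_forms(sample, "well", "being"))
```

A count like this only shows what is in the text; whether a given form should change is still a judgment call for the dictionary and the Step-by-Step.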
Pretty straightforward. Can’t think of anything to say except that at this moment the Step-by-Step doesn’t have a copyable command, so remember to add the “dot” at the end: se find-mismatched-diacritics .
But I already did this? I have a bad habit of doing the titles earlier in the process, and if the title is just Chapter 1 or Chapter 2 there isn’t anything wrong with this. But if you start to get into more complex books that have structures like:
<section id="letter-10" epub:type="chapter">
	<hgroup>
		<h2>
			<span epub:type="label">Letter</span>
			<span epub:type="ordinal">10</span>
		</h2>
		<h3 epub:type="title">The Marchioness de Merteuil, to Viscount Valmont</h3>
	</hgroup>
	…
</section>
…then you are much better off leaving the titles until this step and using the tool to make sure they are correctly formed.
If you have done everything correctly then these steps are pretty straightforward.
Remember the spine order controls how the book is ordered, so make sure that the command did it correctly: e.g. if you have a break between chapters it will usually be placed at the end instead of where it actually belongs. Just cut and paste it into the correct order.
Also pretty straightforward if you have done everything correctly. If it doesn’t work the issue is almost always with how you set up headings and titles rather than the tool itself. But just almost always… so check and double check.
What, you haven’t done this yet?
Clean often. After a while you will find yourself cleaning after most every text change, just to keep the text/code more organized and easier to decipher.
Lint is always changing and improving. Each iteration gives us better clues as to what errors exist and where they can be found. It can be very, very frustrating at times, but trust me, it is your best friend. If all else fails you can post the lint error to the list and have fresh eyes look at it. Make sure your repo is up to date so potential helpers can check out your code.
While the Step-by-Step suggests you build a proof version without the --check flag, I generally skip to the bottom and run
se build --output-dir=$HOME/dist/ --kindle --kobo --check .
so I can see if there are any errors that show up at that point. It will give you some errors (as you haven’t finished the process yet) but sometimes it will point out ones that you should have corrected by this point.
Be sure to run it again without the --check flag so you can get an epub version to proof with (if that is what you are using).
Yes, he’s serious; no, you can’t do that; yes it’s hard.
OS X’s Preview can:
Ask and you shall receive. Quite a few members of the list have access to sophisticated software (e.g. Gigapixel) that can upsize smaller images without too much distortion. If you can’t find an image source that is big enough (minimum 2100 pixels high) then ask, and generally someone will offer to help out by making it larger.
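To check a candidate image against the 2100-pixel minimum without opening an image editor, the height can be read straight from the file header. This sketch handles PNG files only (a JPEG would need a different parser or an imaging library), and png_size and tall_enough are hypothetical helper names.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_HEIGHT = 2100  # minimum source height mentioned above


def png_size(data):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are two big-endian 32-bit integers after the chunk type
    return struct.unpack(">II", data[16:24])


def tall_enough(data, minimum=MIN_HEIGHT):
    """Return True if the PNG is at least `minimum` pixels high."""
    _, height = png_size(data)
    return height >= minimum
```

Only the first 24 bytes of the file are needed, so reading them with open(path, "rb").read(24) is enough for either function.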
More to Come
Lots and lots to come
You can search subject headings here: http://id.loc.gov/authorities/subjects.html
You can also search the authority.nacoaf if Wikipedia doesn’t list the link: https://id.loc.gov/authorities/names.html
You need to use escaped HTML text for the long description in the content.opf file. You also need to ensure that each paragraph is indented by 3 tabs. This might be corrected by the latest version of lint.
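Escaping and indenting by hand is error-prone, so here is a small sketch of the two steps using Python's standard html module. It is not how the toolset does it; format_long_description is a hypothetical helper and the sample paragraph is made up.

```python
import html


def format_long_description(paragraphs, indent="\t\t\t"):
    """Escape HTML paragraphs and indent each one by three tabs."""
    return "\n".join(
        indent + html.escape(paragraph, quote=False)
        for paragraph in paragraphs
    )


# Made-up paragraph for illustration
print(format_long_description(["<p>A tale of letters & schemes.</p>"]))
```

Passing quote=False leaves quotation marks alone and escapes only the ampersand and the angle brackets.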
Only a few things to look out for here.
and on digital scans available at the<br/>
Internet Archive (<a href="https://archive.org/details/dangerousconnec00laclgoog">Volumes 1 and 2</a> and <a href="https://archive.org/details/DangerousConnectionsVol.3">Volume 3</a>).
Always, always, always run clean, typogrify, semanticate, modernize-spelling, and build-toc on your project before presenting it for review. If you know there are some changes that might have to be redone if you do so, then run them on a copy of the project. But generally that is the first thing a reviewer will do, and if anything pops up they will just ask you to fix any issues before they go any further.
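The pre-review routine is easy to forget a step of, so it can be wrapped in a few lines. This is a sketch under the assumption that each of the tools named above accepts the project directory as its only argument, as the other commands in these notes do; the function names are hypothetical.

```python
import subprocess

# The tools named above, in the order listed
PRE_REVIEW_TOOLS = ["clean", "typogrify", "semanticate", "modernize-spelling", "build-toc"]


def pre_review_commands(project_dir="."):
    """Return one `se` command (as an argument list) per tool."""
    return [["se", tool, project_dir] for tool in PRE_REVIEW_TOOLS]


def run_pre_review(project_dir=".", runner=subprocess.run):
    """Run each command in turn, stopping at the first failure."""
    for command in pre_review_commands(project_dir):
        # check=True raises an error if a tool exits with a failure code
        runner(command, check=True)
```

Pointing run_pre_review at a copy of the project keeps any hand-reverted typography in the original safe.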
Also remember to publish any commits that might not yet be uploaded to GitHub.
More Coming soon
Ignore this, it’s a red herring! It’s very unlikely you will do this unless it’s just a project for personal use. This is the last thing Alex does before posting it to the website.
Modified December 07, 2022 — 07:37 AM