a gemtext converter
NOTE: this readme only sort of describes the use of clarence. i'm working on it.
gemtext is almost perfect---it's only missing a few things, roughly in order of disruption to the existing spec:
- lightweight inline markup: strong, emphasis, code, etc.
- "smart" punctuation, like “...”, —, etc.
- extra verbatim properties and filters
- a handful of extra line types
- line continuation (?)
clarence attempts to address these issues and provide a compatible way to export gemtext documents to richer formats, like html. in this project my explicit goals are as follows:
- provide a way for gemtext to "scale up" to more expressive formats like HTML or LaTeX while staying readable as plain gemtext
- provide a gemtext normalization function, so a build step can strip clarence's breaking changes and leave plain gemtext
- allow for all extra functionality to be turned off
- be reasonably extensible (maybe)
using clarence
clarence [OPTIONS] < FILE [> FILE]
- FORMAT is one of gemtext or html.
- -o FILE specifies the output file. You can also pipe clarence's output wherever.
- if a FILE is given on the command line, it's read as input; otherwise clarence reads standard input.
installing clarence
you'll need
- sbcl (i have version 2.1.11) -- or possibly any CL impl (not tested)
- common lisp libraries CL-PPCRE, CLINGON
- I think that's it..?
make install
or do the sbcl commands yourself, i don't care.
we're not on quicklisp yet, womp
rationale
inline markup
i think one of the main things markdown got right was how it made writing inline markup like bold and italic text much easier than plain HTML. adding this functionality to gemtext is pretty straightforward, and i think it degrades gracefully the same way it does in markdown. to keep the line-based nature of gemtext, the inline tags can't span lines, making the implementation even simpler: basically a search and replace for \*[^*]+\* and the like.
what works:
- * bold * , _ italic _ , = code = (without spaces)
> bold, italic, code
what doesn't:
- no escaping logic at the moment, which came up just now while writing the list above
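
a minimal sketch of the search-and-replace idea, written in python rather than clarence's common lisp, so it is not clarence's actual code. the tag choices (strong, em, code), the name render_inline, and the rule that a marker only counts when the text inside it doesn't start or end with a space are all assumptions made for illustration. like the real thing, it has no escaping logic.

```python
import html
import re

# marker character -> html tag (assumed mapping, for illustration only)
INLINE_TAGS = {"*": "strong", "_": "em", "=": "code"}


def _pattern(marker):
    """Match marker, text with no leading/trailing space, marker."""
    m = re.escape(marker)
    return re.compile(rf"{m}([^{m}\s](?:[^{m}]*[^{m}\s])?){m}")


PATTERNS = [(_pattern(marker), tag) for marker, tag in INLINE_TAGS.items()]


def render_inline(line):
    """Convert inline markers on a single line to html tags."""
    # escape html first so the tags added below are the only markup
    out = html.escape(line, quote=False)
    for pattern, tag in PATTERNS:
        out = pattern.sub(rf"<{tag}>\1</{tag}>", out)
    return out


print(render_inline("*bold*, _italic_, =code="))
```

because each call only ever sees one line, markup can't span lines. a known weakness of this sketch is that names like snake_case_words get italicized, which is exactly the kind of thing escaping would have to solve.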
extra verbatim properties and filters
another thing that I personally think is a good idea for gemtext as an authoring format is extension through filtering, that is, using properties on ```-delimited text to call other programs on preformatted blocks for nicer output in output formats. the example that immediately comes to mind is tables:
```|table
| State | Capital |
| New York | Albany |
| Tennessee | Nashville |
| Maine | Augusta |
```
In this example, table is the name of some Unix filter that will convert the table shown above into, say, an HTML table or a LaTeX table or what-have-you. A benefit is that the table is still fairly readable in a plain gemtext context as well -- though the table utility could also align columns for plaintext output.
verbatim filter syntax
the trickiest part is figuring out the syntax of verbatim filters. features i want:
- |filter means pipe the contents of the verbatim block to the filter, allowing other verbatim alt-text lines to still have meaning
- some way to pass the current output to the program -- i'm thinking env vars
- i've thought that having directives to only filter a block if it's a given output format, or hide it altogether, would be good --- but i'm starting to think that's against the spirit of gemini
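
the first two wishes could be sketched like this, in python rather than clarence's common lisp. none of it is clarence's real interface: the function names and the CLARENCE_OUTPUT_FORMAT variable are hypothetical, and reading "the current output" as the output format is an assumption too.

```python
import os
import shlex
import subprocess


def parse_filter(alt_text):
    """Return the filter command from alt text like '|table', else None."""
    alt_text = alt_text.strip()
    if alt_text.startswith("|"):
        return shlex.split(alt_text[1:])
    return None  # ordinary alt text keeps its usual meaning


def run_filter(command, block_text, output_format):
    """Pipe a verbatim block through a filter program and return its output."""
    # hypothetical env var telling the filter what format is being produced
    env = dict(os.environ, CLARENCE_OUTPUT_FORMAT=output_format)
    result = subprocess.run(
        command, input=block_text, capture_output=True,
        text=True, env=env, check=True,
    )
    return result.stdout


def render_verbatim(alt_text, block_text, output_format="html"):
    """Filter the block if its alt text asks for it, else pass it through."""
    command = parse_filter(alt_text)
    if not command:
        return block_text
    return run_filter(command, block_text, output_format)
```

with this shape, render_verbatim("|table", block) would need a program called table on the PATH, and a block with plain alt text comes back unchanged.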
smart punctuation [NOT IMPLEMENTED]
i think smart punctuation can be somewhat overblown, but it can also be useful. i especially find -- as en-dash and --- as em-dash useful, and really, the smart quotes thing is only marred by the fact that the regular rules aren't intuitive (to me). double quotes are easy: they enclose the text, so if whitespace is on the left it's a left quote, and otherwise it's a right quote.
single quotes are more complex though: they're also used as apostrophes, which share the same appearance as right quotes even when beginning a word. thus, the quote in "'twas" does not look the same as the opening quote in "he said, 'are you sure?'". because the first use is rarer, i'm going to do this:
- ' is for "default" smart quotes -- that is, whitespace on left means it's a left quote, otherwise it's a right quote.
- ` is for "reverse" quotes -- it reverses the meaning of '
so 'twas becomes `twas. not the most beautiful, but functional.
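
since this part isn't implemented, here is only a rough sketch of the rules as described, in python rather than clarence's common lisp. the function name smarten is made up, and treating the start of a line like whitespace is an assumption the text doesn't spell out.

```python
LEFT_DOUBLE, RIGHT_DOUBLE = "\u201c", "\u201d"
LEFT_SINGLE, RIGHT_SINGLE = "\u2018", "\u2019"
EN_DASH, EM_DASH = "\u2013", "\u2014"


def smarten(line):
    """Apply the quote and dash rules to one line of text."""
    # replace the longer dash run first so "---" isn't read as "--" plus "-"
    line = line.replace("---", EM_DASH).replace("--", EN_DASH)
    out = []
    for i, ch in enumerate(line):
        # assumption: the start of the line counts as whitespace on the left
        at_left_edge = i == 0 or line[i - 1].isspace()
        if ch == '"':
            out.append(LEFT_DOUBLE if at_left_edge else RIGHT_DOUBLE)
        elif ch == "'":
            out.append(LEFT_SINGLE if at_left_edge else RIGHT_SINGLE)
        elif ch == "`":
            # the reverse quote flips the rule used for '
            out.append(RIGHT_SINGLE if at_left_edge else LEFT_SINGLE)
        else:
            out.append(ch)
    return "".join(out)


print(smarten("`twas the night, he said, 'are you sure?'"))
```

the whitespace rule is simple but has gaps: a single quote right after an opening double quote has no whitespace on its left, so it comes out as a right quote. verbatim blocks and link urls would also need to be skipped.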
extra line types [NOT IMPLEMENTED]
horizontal rules
sometimes it's nice to have a thematic break. in clarence gemtext, --- makes a <hr> in html while staying as-is in gemtext, which gets the meaning across just fine.
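
as a tiny sketch of that behavior (python, not clarence's common lisp; render_line and its output_format argument are hypothetical names, and requiring the line to be exactly --- is an assumption):

```python
def render_line(line, output_format):
    """Render one text line; only the horizontal rule is handled here."""
    # assumption: the rule must be the whole line, ignoring trailing whitespace
    if line.rstrip() == "---" and output_format == "html":
        return "<hr>"
    return line  # gemtext output, and every other line, passes through as-is


print(render_line("---", "html"))
```

a check like this would have to run before any smart punctuation pass, otherwise the --- would already have been turned into a dash.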
metadata lines
in an earlier iteration i had =:= introduce a metadata line, structured like a link but containing metadata. the other day i had a thought that maybe having page metadata all in its own index file would be a better idea.... also metadata doesn't add anything really to the document's view, which would be needed for gemtext compat. the same goes for comments as well...
escaping
the conventional wisdom with escaping text lines that start with one of the special characters is to prepend it with a space. is this what i want? do i want to use a backslash? should i remove one prepended space, all prepended spaces, ... ?
line continuation [NOT IMPLEMENTED]
gemtext is a line-based format, with but one state variable: verbatim-ness. i don't want to break that, but at the same time i want the ability to split a paragraph across lines or continue list items for readability ...
or now that i'm thinking about it, do i? i'm writing this document in emacs with everything as one big line, and it's fine! i have visual-line-mode, visual-fill-column-mode, and adaptive-wrap-prefix-mode all on, and it's doing exactly what i want it to do, to be honest.
contact
uh, questions? comments? email me at acdw@acdw.net