Tip for making a book from markdown documents (also RIP sphinx/rst, long live pandoc/md)

While uploading a new PyPI package I noticed that the new « default » format for documentation is no longer RST (reStructuredText) but md (markdown).

I am NOT gonna rant about changes, but it makes me think fondly of how code documentation has evolved over time: from txt files in /usr/share/doc, man pages and info to perldoc, rst and markdown.

I pretty much like them all: since day 1, tech writing has been influenced by the hardcore nerds of typography, often resulting in nice output formats.

As much as I like man pages, the nroff syntax used to write them was hyper verbose, unforgiving and tough on the brain.
We then saw perldoc, docgen, phpdoc ... in each language, each self-documenting its own peculiar fields of interest. I must admit that among these third-generation tools I loved perldoc for its variety of output formats and lean results, and pydoc + reStructuredText, even though setting up API autodocumentation in sphinx/rst is a tad convoluted.

And with git came GFM: GitHub Flavored Markdown. As with any monoculture, markdown may lack a lot compared to sphinx/rst, but it has become a de facto standard whose tooling improved with the arrival of pandoc.

Pandoc is probably the external tool that revolutionized the modern way of documenting code; it can read and write almost any format, making multi-origin/multi-target docs (man pages, html docs, pdf) a breeze to generate.
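As a rough illustration of the multi-target idea, the sketch below renders one markdown source to both HTML and a man page. The directory, file name and title are hypothetical, and the conversions only run if pandoc happens to be installed.

```shell
# Hypothetical source file, for illustration only.
mkdir -p demo_targets
cat > demo_targets/tool.md << 'EOF'
# NAME

tool - a hypothetical command used for this example

# SYNOPSIS

tool [options]
EOF

if command -v pandoc > /dev/null; then
    # -s (--standalone) emits a complete document rather than a fragment.
    # -M sets a metadata field; here it gives the outputs a title.
    pandoc -s -M title="tool" demo_targets/tool.md -t html -o demo_targets/tool.html
    pandoc -s -M title="tool" demo_targets/tool.md -t man -o demo_targets/tool.1
fi
```

The same source feeds both targets; only the `-t` writer and the output file change.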

I have never been as free as I am now, being able to choose perldoc to self-document bash scripts (and python), while making a unified markdown out of all the heterogeneous sources.
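For readers who have not seen perldoc inside a bash script, here is a minimal sketch of one common way to do it; it is not necessarily how my own scripts are laid out. The POD is hidden from bash inside a here-document fed to the no-op `:` command, using `=cut` as the delimiter so that the same line ends both the here-document and the POD block. The script name `hello.sh` and its content are hypothetical.

```shell
mkdir -p demo_pod
cat > demo_pod/hello.sh << 'SCRIPT'
#!/usr/bin/env bash
# The ":" builtin does nothing, so bash skips everything up to "=cut".
: << '=cut'

=head1 NAME

hello.sh - print a greeting

=head1 SYNOPSIS

  hello.sh [name]

=cut

echo "Hello, ${1:-world}!"
SCRIPT

# The script still runs normally.
bash demo_pod/hello.sh pandoc   # prints "Hello, pandoc!"

# pod2text ships with perl; it renders the embedded documentation as text.
if command -v pod2text > /dev/null; then
    pod2text demo_pod/hello.sh
fi
```

To get markdown instead of plain text, a POD-to-markdown converter such as `pod2markdown` (from the Pod::Markdown CPAN module, assumed to be installed separately) can take the place of `pod2text`.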


The bonus gift of pandoc: making your own PDF book



Just THREE pandoc tricks enable you to transform a lot of what-you-see-is-what-you-mean (WYSIWYM) text into a very clean PDF.

I am the first one to look at LaTeX output and say, wow, it is definitely smarter. I know it's false, but it still strikes me as more « learned ».
Would the ACM be able to publish that much nonsense without being called out as a fraud if it came in the same boring format as an IETF RFC?
Here are the tricks to make a book out of a bunch of markdown files.

The metadata trick



When you add the following metadata at the beginning of your file, pandoc interprets it as a bibliographic record and generates a « cover »:
% title
% author(s) (separated by semicolons)
% date
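Here is a minimal sketch of a file carrying such a title block, with a made-up title, authors and date. As far as I know the title block belongs to pandoc's own markdown (the `pandoc_title_block` extension) rather than to GFM, so the file is read here with pandoc's default markdown reader; treat that as an assumption to check against your pandoc version.

```shell
mkdir -p demo_cover
# The three "%" lines must be the very first lines of the file.
cat > demo_cover/book.md << 'EOF'
% My Hypothetical Book
% Jane Doe; John Doe
% 2024-01-01

# First chapter

Some text.
EOF

if command -v pandoc > /dev/null; then
    # -s asks for a standalone document, which is where the title,
    # authors and date show up in the output.
    pandoc -s demo_cover/book.md -o demo_cover/book.html
fi
```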

The multi-markdown file trick



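# Concatenate every markdown source plus the API doc into one file
cat ../*md API.md > _index.md
# Normalize the GitHub-flavored input into pandoc markdown (-s: standalone)
pandoc -f gfm -s _index.md -o ../index.md
# Prepend the title block: "-" makes cat read the here-document first
cat - ../index.md << EOF > index.with_cover.md
% title
% author(s) (separated by semicolons)
% date

EOF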
This will correct all your creative section numbering and produce one consistent markdown file from a bunch of markdown files.
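One detail is worth knowing when prepending a header with a here-document: cat only reads its standard input when it is given no file argument, or an explicit `-`. The sketch below, with hypothetical file names, shows the difference.

```shell
mkdir -p demo_cat
printf 'body line\n' > demo_cat/body.md

# Without "-", cat reads only the named file; the here-document is ignored.
cat demo_cat/body.md << 'EOF' > demo_cat/without_dash.md
% title
EOF

# With "-", cat reads standard input (the here-document) first, then the file.
cat - demo_cat/body.md << 'EOF' > demo_cat/with_dash.md
% title

EOF
```

After running this, `without_dash.md` contains only the body, while `with_dash.md` starts with the `% title` line followed by the body.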


The TOC trick



pandoc ../index.with_cover.md --toc -V documentclass=report   -o FAIM.pdf
This will create a PDF with a minimalist cover and a hyperlinked TOC that also gives the page numbers.
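If you want a bit more control over the table of contents, pandoc has a few extra options that combine with `--toc`. The sketch below is a variation on the command above, not the one I use: the sample file is made up, and the PDF is only built when both pandoc and pdflatex (pandoc's default PDF engine) are available.

```shell
mkdir -p demo_toc
cat > demo_toc/book.md << 'EOF'
% My Hypothetical Book
% Jane Doe
% 2024-01-01

# First chapter

## A section

Some text.

# Second chapter

More text.
EOF

if command -v pandoc > /dev/null && command -v pdflatex > /dev/null; then
    # --toc-depth limits how many heading levels appear in the TOC;
    # --number-sections lets pandoc number the headings itself.
    # LaTeX installs vary, so report a failure instead of aborting.
    pandoc demo_toc/book.md \
        --toc --toc-depth=2 \
        --number-sections \
        -V documentclass=report \
        -o demo_toc/book.pdf || echo "PDF build failed" >&2
fi
```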

Sure, you could do it with sphinx, but it would be a lot more work.

The full script for my doc generation is here, along with the resulting PDF.

Conclusion


What is fun in the apparent victory of markdown over all the other documentation formats is that I perceive it as happening at the same time as pandoc usage grows.

I am not sure whether markdown has become the unrivaled documentation format because of its terseness, or because pandoc is slowly replacing cumbersome tools such as sphinx and perldoc.

It's not the format of documentation that is slowly evolving: it's the practices of documentation that are becoming more flexible, and I embrace this future.
