I write drafts on paper. It would be nice if the electronic versions of documents looked similar to the drafts. It would be even better if the same paper-like form could be used for a web site with all its menus and an eBook with all its chapters. In this blog, I will go over the technology stack I use to present the same simple text on a web site and in an eBook. This works for a static web site (things are added only once in a while). One needs to be comfortable with open source software and with tweaking other people's code a little.

HTML, CSS, and JavaScript

HTML felt like a miracle in the mid-1990s. With a small collection of tags committed to memory, I could create a web site with a text editor (emacs at the time, but I have since switched to vim). I have fond memories of those days.

The next step was Cascading Style Sheets (CSS). CSS added a visual style to all the elements. I used someone else's CSS collection, avoiding the details, which looked dull.

For web sites that need to change on the fly for small browsers like phones, one needs the combination of CSS and JavaScript. By shrinking a browser window into a narrow column, you can tell if this sort of software is on a site. Nothing changes here at Science20 at the moment, but the New York Times navigation gets simpler as the browser narrows. JavaScript has to manage all the silly differences between browsers, a dead dull subject to me.

Markdown - A simple subset of HTML

John Gruber's blog back in 2004 starts:

> Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you
> to write using an easy-to-read, easy-to-write plain text format, then convert
> it to structurally valid XHTML (or HTML).

> Thus, “Markdown” is two things: (1) a plain text formatting syntax; and (2) a
> software tool, written in Perl, that converts the plain text formatting to
> HTML. See the Syntax page for details pertaining to Markdown’s formatting
> syntax. You can try it out, right now, using the online Dingus.

> The overriding design goal for Markdown’s formatting syntax is to make it as
> readable as possible. The idea is that a Markdown-formatted document should be
> publishable as-is, as plain text, without looking like it’s been marked up with
> tags or formatting instructions. While Markdown’s syntax has been influenced by
> several existing text-to-HTML filters, the single biggest source of inspiration
> for Markdown’s syntax is the format of plain text email.

[end blog]

If interested, read the decade-old blog. I appreciate a clear vision.

# This is a title in markdown

## This is a section

### Guess this should be a subsection.

Lists
* look like
* what you would
* guess

For an ordered list, go from a * to a 1.

The full collection of marks fits on a page.
There is not an international panel deciding the markdown standard. The hope is to keep things more fluid so adaptation can be faster than what is seen with HTML.


More than markdown

What if you need more than what is available, for example, superscripts? You can use HTML tags directly to write E = m c<sup>2</sup>. I installed the subscript and superscript extensions, which convert ^2^ to <sup>2</sup> and ~2~ to <sub>2</sub>. Doing LaTeX math remains an open issue for me. I have not dived into it. For now I use a Mac program called LaTeXiT to export each equation as a png file.
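To make the superscript and subscript marks concrete, here is a minimal sketch in Python of the kind of substitution such an extension performs. It is not the code of the extensions themselves. The function name and the regular expressions are my own assumptions, and the output uses the standard HTML tags sup and sub.

```python
import re

# Hypothetical patterns for the ^text^ and ~text~ marks described above.
# The marked text may not contain whitespace or the mark character itself.
SUPERSCRIPT = re.compile(r"\^([^\^\s]+)\^")
SUBSCRIPT = re.compile(r"~([^~\s]+)~")


def convert_scripts(text: str) -> str:
    """Replace ^x^ with <sup>x</sup> and ~x~ with <sub>x</sub>."""
    text = SUPERSCRIPT.sub(r"<sup>\1</sup>", text)
    text = SUBSCRIPT.sub(r"<sub>\1</sub>", text)
    return text


print(convert_scripts("E = m c^2^ and H~2~O"))
```

A real extension also has to leave code spans and escaped characters alone, which this sketch does not attempt.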

I wanted to add a gallery for one section of a site. I found a site, FancyApps.com, that had FancyBox, a combination of CSS and JavaScript that would do all the work. The tool is free for non-commercial sites. The instructions were not too hard to follow. Now I have the fancy gallery look for the few pages that need it. Nice.

From a site or blogs to markdown


Markdown is simpler than HTML or a collection of blogs. That means there are programs that can translate an investment in HTML into markdown. I used a Python program, html2text, to convert quaternions.com to a collection of markdown files.

mkdocs - From markdown to a freely hosted web site

mkdocs is a Python program that takes care of so many details of creating and maintaining a static documentation web site that it appears amazing. Here is the directory structure of my main site, quaternions.com:


~/Documents/Q> tree -d docs
docs
├── About
│   └── Pop_science
├── Classical_physics
├── EM
├── Gravity
│   └── Measurement-101
├── Math
├── QM
├── SR

What does it take to make all the navigation elements? Just one text file, mkdocs.yml.

>head -30 mkdocs.yml
site_name: Q
site_author: sweetser@alum.mit.edu
theme_dir: spacelab-mod

markdown_extensions:
    include:
    subscript:
    superscript:
    footnotes:

pages:
- [index.md, '']
- [Math/math.md, Math, The Math]
- [Math/overview.md, Math, Overview]
- [Math/history.md, Math, History]
- [Math/numbers_101.md, Math, Numbers 101]
- [Math/multiplying.md, Math, Easy multiplying]
- [Math/products.md, Math, Products]
- [Math/scalars_vectors.md, Math, 'Scalars, vectors, tensors and all that']
- [Math/analysis.md, Math, Analysis]
- [Math/topology.md, Math, Topology]
- [Math/fit.md, Math, Where quaternions fit]
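
One way to read the pages list above: each entry is a file path, then a menu name, then a page title. The Python sketch below groups a few of those entries by menu to show how one flat list can describe the whole navigation. This is only an illustration of the idea, not how mkdocs is implemented, and the function name group_by_menu is made up for the example.

```python
from collections import OrderedDict

# A few entries copied from the pages list: [path, menu, title].
pages = [
    ["index.md", ""],
    ["Math/math.md", "Math", "The Math"],
    ["Math/overview.md", "Math", "Overview"],
    ["Math/history.md", "Math", "History"],
]


def group_by_menu(entries):
    """Group page entries by menu name, keeping the order of the file."""
    menus = OrderedDict()
    for entry in entries:
        path = entry[0]
        menu = entry[1] if len(entry) > 1 else ""
        # Fall back to the file path when an entry has no title.
        title = entry[2] if len(entry) > 2 else path
        menus.setdefault(menu, []).append((title, path))
    return menus


for menu, items in group_by_menu(pages).items():
    print(menu or "(top level)")
    for title, path in items:
        print("   ", title, "->", path)
```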

The documentation on mkdocs.org is great, as you might expect. If interested, click through all the pages, which should take about 15 minutes. Building the web site - transforming from markdown to HTML - requires one command: mkdocs build --clean. To see it, run mkdocs serve. It will appear on localhost:8000 unless a different port is supplied. On my Mac, I paid for a program called LaunchControl to always start up my local version of quaternions.com without doing a thing. I also paid for Marked 2, which allows me to look at the markdown file as a browser would see it. After every file save, it automatically updates.


To the web via GitHub


Git is a program to manage all the text files that go into a software development project. In particular, Linus Torvalds wrote it for the Linux kernel. Every old version ever committed can be retrieved as long as one knows the right key value (technically the SHA-1 hash of the commit). GitHub.com is a commercial web site that is indeed the biggest hub providing git services for free. Every repository is public, unless you pay them a fee to keep it private.

An additional free service GitHub provides is to host a web site based on a git branch called "gh-pages". So long as there is an index.html page at the top of the gh-pages branch, it will be on the web.

The URL is not the stuff people pay for. In my case it is http://dougsweetser.github.io/Q. This is made of my user name and the repository name, Q. Now I tell the web site where I pay for the name quaternions.com to point to the github page.

So how do I get all this text there? Wait for it...

> mkdocs gh-deploy

In a minute, the update is live. But the situation is really better than that. The site is a git repo. Hack away locally. When happy, commit it and then deploy it. If you want to undo anything, that is the power of git. I don't have to log into my web page server ever. All the details are handled with gh-deploy. Nice.

Leanpub.com - a path to eBooks

What group of people are going to be the very best with software to make books? The writers who have to deliver computer books. Being the first author on a new type of software makes a huge difference. Speed matters. Write the book as the flavor-of-the-moment framework is being put together. In such an evolving situation, the book would need to evolve along with the latest implementation of the software.

Because authors face this situation over and over again, there is now a push to create a superset of markdown that can handle the creation of eBooks and pdfs. Books have extra things like a title page and table of contents.

In leanpub.com's effort to fill this need, they start with the manuscript directory. Since this is just the location of the markdown files, I symbolically linked manuscript to docs. The most significant difference is that the web site has small markdown files in directories and subdirectories, while the eBook needs one markdown file. Fortunately Dave Hein has written a program to go collect all the small markdown files to make one book-size file. The program is called "mdmerge" and is discussed in his blog. All one needs is a text file that has the location of all the markdown files for the book. This also means some markdown files can either be skipped or be added.
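The core idea of merging from a list of files fits in a few lines. Here is a minimal sketch in Python, assuming an index file with one markdown path per line, relative to the index file's own directory. It is not mdmerge, which does much more, and the function name merge_markdown is hypothetical.

```python
from pathlib import Path


def merge_markdown(index_file, output_file):
    """Concatenate the markdown files named in index_file into output_file.

    Returns the number of files merged.
    """
    index_path = Path(index_file)
    base = index_path.parent
    parts = []
    for line in index_path.read_text(encoding="utf-8").splitlines():
        name = line.strip()
        if not name:
            continue  # ignore blank lines in the index
        chapter = (base / name).read_text(encoding="utf-8")
        parts.append(chapter.rstrip("\n"))
    # Put one blank line between chapters and end with a newline.
    Path(output_file).write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return len(parts)
```

Leaving a file out of the index is all it takes to leave it out of the book.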

Getting this to work did require a little bit of shell scripting. The biggest issue was getting the directory paths to the image files right. For the web site, they lived in sub-sub-directories of the docs directory. For the manuscript, one starts at the same upper level. I used a bunch of sed commands to get things straight.

> % cat book_edit.sh
#!/bin/bash
pushd .
cd ~/Documents/Q/docs
cat book_to_merge.txt | mdmerge -o doing_physics.0.md --book -
sed 's|\.\.\/\.\.\/\.\./images|images|g' doing_physics.0.md | tee doing_physics.1.md
sed 's|\.\.\/\.\./images|images|g' doing_physics.1.md | tee doing_physics.2.md
...
sed '/./,/^$/!d' doing_physics.10.md | tee doing_physics.md

perl -e '@f = 0..10; for $f (@f){unlink("doing_physics.$f.md");}'
popd

A better solution would be to write a pre-mdmerge program that finds all the image files and their locations and adjusts the paths as need be. Still, it took only about 20 minutes to do the job with sed, and it works.
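
As a step toward that better solution, the chain of sed commands could be folded into a single pass. The Python sketch below assumes that every image path should end up starting at a top-level images directory, however many "../" segments precede it, and that runs of blank lines should shrink to one. The function names are hypothetical, and the elided sed steps in my script may do other things this does not cover.

```python
import re

# One or more "../" segments followed by "images", as in the sed patterns.
RELATIVE_IMAGES = re.compile(r"(?:\.\./)+images")
# Three or more newlines in a row, meaning two or more blank lines.
EXTRA_BLANK_LINES = re.compile(r"\n{3,}")


def fix_image_paths(markdown: str) -> str:
    """Rewrite ../images, ../../images, and so on to plain images."""
    return RELATIVE_IMAGES.sub("images", markdown)


def squeeze_blank_lines(markdown: str) -> str:
    """Collapse each run of blank lines into a single blank line."""
    return EXTRA_BLANK_LINES.sub("\n\n", markdown)


sample = "![a](../../../images/EM/a.png)\n\n\n\n![b](../images/b.png)\n"
print(squeeze_blank_lines(fix_image_paths(sample)))
```

The blank line squeeze is roughly what the last sed command does, although that command also drops blank lines at the very top of the file.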

One additional bell I added was to call the shell program (book_edit.sh in my case) as part of every git commit. Even though I had not done a git hook before, it turns out to be easy. Go into the directory .git/hooks. There are sample files. Copy pre-commit.sample to pre-commit, edit that file to call the program, make sure the file is executable, and then it will be called with each commit. A new book markdown file is generated with each commit.
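
For anyone who would rather script the hook setup than copy files by hand, here is a minimal sketch in Python. It assumes the hook to use is git's pre-commit hook. The function name install_hook and the paths passed to it are hypothetical.

```python
import stat
from pathlib import Path


def install_hook(repo_dir, script_path, hook_name="pre-commit"):
    """Write a tiny shell hook into .git/hooks that runs script_path."""
    hook = Path(repo_dir) / ".git" / "hooks" / hook_name
    hook.parent.mkdir(parents=True, exist_ok=True)
    body = (
        "#!/bin/sh\n"
        "# Rebuild the merged book file on every commit.\n"
        'exec "{}"\n'.format(script_path)
    )
    hook.write_text(body, encoding="utf-8")
    # Git only runs hooks that are marked executable.
    mode = hook.stat().st_mode
    hook.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook
```

Calling install_hook with the repository directory and the full path to book_edit.sh would write the hook file. Note that it overwrites any existing hook of the same name.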

Here is another crazy thing: leanpub.com has a service at github so that if a commit is made to the file that creates the book (doing_physics.md in my case), then a new ebook will be created, without me doing a thing. Both my web site and my ebook will be updated with each push of the committed files.

The printed book

I have not looked into it. Please feel free to comment on the subject if you
have experience.

Back in control

I used to update quaternions.com only once every few years, when I felt bad enough about some aspect of the site. There were all kinds of tags and links and navigation and image files, all on a server I rarely visited. Now the site looks like a bunch of pretty simple text files on my own computer. It is trivial for me to push modifications out to github for both the web site and the ebook. This got me to update the quaternion EM section considerably. That did require a few weeks of writing, but that is the nature of technical writing.