Lucky Friday (the 13th): From org to HTML without bugs

Two weeks ago, I received a message from my web host, OVH: the version of PHP used on my website would soon become obsolete. The website was managed through a content management system (CMS) called Spip.

As I did not use the CMS much, I decided to go back to basics: static web pages, generated from org mode. It is a mode in emacs, my favorite text editor, for:

It is a good solution for someone who is used to:

The main drawbacks are the following:

In the future, to address this last problem, I plan to develop some basic features as CGI scripts written in the high-level language Python.

For the development, two web sources were especially useful, and I reused open source code from both:

The development took five hours, even though I have never formally learned HTML, CSS or JavaScript and have only very limited experience with these languages.

Configuration of emacs

First, org mode must be configured to generate HTML files from org files. The org sources and the generated HTML files are assumed to live in two distinct directories, src and html. Add the following lines to the file .emacs.

;; Note: in Org 8 and later the library is named ox-publish and the
;; HTML publishing function is org-html-publish-to-html.
(require 'org-publish)
(setq org-publish-project-alist
      '(("org-notes"               ;Used to export .org file
         :base-directory "/home/login/Public/siteWeb/src"  ;directory holds .org files
         :base-extension "org"     ;process .org file only
         :publishing-directory "/home/login/Public/siteWeb/html"    ;export destination
         ;:publishing-directory "/ssh:user@server" ;export to server
         :recursive t
         :publishing-function org-publish-org-to-html
         :headline-levels 4               ; Just the default for this project.
         :auto-preamble t
         :auto-sitemap nil                ; Does not generate sitemap.org automagically...
         :sitemap-filename "sitemap.org"  ; ... call it sitemap.org (it's the default)...
         :sitemap-title "Sitemap"         ; ... with title 'Sitemap'.
         :export-creator-info nil    ; Disable the inclusion of "Created by Org" in the postamble.
         :export-author-info nil     ; Disable the inclusion of "Author: Your Name" in the postamble.
         :auto-postamble nil         ; Disable auto postamble
         :table-of-contents t        ; Set this to "t" if you want a table of contents, set to "nil" disables TOC.
         :section-numbers nil        ; Set this to "t" if you want headings to have numbers.
         :html-postamble "    <p class=\"postamble\">Last Updated %d.</p> " ; your personal postamble
         :style-include-default nil) ;Disable the default css style
        ("org-static"                ;Used to publish static files
         :base-directory "/home/login/Public/siteWeb/src"
         :base-extension "css\\|js\\|png\\|jpg\\|gif\\|pdf\\|mp3\\|ogg\\|swf"
         :publishing-directory "/home/login/Public/siteWeb/html"
         :recursive t
         :publishing-function org-publish-attachment)
        ("org" :components ("org-notes" "org-static")))) ;combine "org-notes" and "org-static" into one function call

The locations need to be updated (search for login). The postamble of each page can also be configured (see :html-postamble).

Second, in order to get highlighted syntax for source code, the package htmlize must be installed. Add the following lines to configure the list of package archives that are used.

(require 'package) ;; You might already have this line
(add-to-list 'package-archives
             '("melpa" . "http://melpa.org/packages/") t)
(when (< emacs-major-version 24)
  ;; For important compatibility libraries like cl-lib
  (add-to-list 'package-archives '("gnu" . "http://elpa.gnu.org/packages/")))
(package-initialize) ;; You might already have this line

(require 'htmlize)

Then install the package.

M-x package-install [RET] htmlize [RET]
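
The same installation can also be scripted in .emacs, so that a new machine fetches htmlize without a manual step. This is only a minimal sketch: it assumes the package archives configured above are reachable, and it must be placed after (package-initialize) and before (require 'htmlize).

```elisp
;; Sketch: install htmlize at start-up if it is not installed yet.
;; Uses only standard package.el functions.
(unless (package-installed-p 'htmlize)
  (package-refresh-contents)   ; download the archive index first
  (package-install 'htmlize))
```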


  • After any modification of the file .emacs, you need to reload the file.

    M-x load-file [RET] ~/.emacs [RET]
  • Read the buffer *Messages* to check that the configuration is correct.
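
The two steps above can be wrapped in one command. The sketch below is an illustration, not part of the original setup: the name my/reload-init-file is hypothetical, and it assumes that emacs was started normally, so that user-init-file points to .emacs. After reloading, it writes a line to *Messages* for each project whose :base-directory does not exist.

```elisp
;; Hypothetical helper: reload the init file, then report missing
;; source directories of the publishing projects in *Messages*.
(defun my/reload-init-file ()
  "Reload the init file and check the :base-directory of each project."
  (interactive)
  (load-file user-init-file)
  (dolist (project (bound-and-true-p org-publish-project-alist))
    ;; Each entry is (NAME . PLIST); meta-projects such as "org"
    ;; have no :base-directory and are skipped.
    (let ((dir (plist-get (cdr project) :base-directory)))
      (when (and dir (not (file-directory-p dir)))
        (message "Project %s: missing directory %s" (car project) dir)))))
```

Run it with M-x my/reload-init-file [RET].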

Homepage of the website: an HTML file

See the source code of the homepage. It follows the style and architecture of Joshua Eckroth's homepage. Two CSS files are used: css/styleIndex.css and css/styleSite.css. All the pages of the website consistently use the second file, css/styleSite.css.

The homepage is at the root of a tree of web pages. Each page has a link to the root and a link to its immediate parent. The depth of the tree is two or three. The tree contains a static part, which evolves slowly, and a dynamic part, which evolves more quickly through the regular addition of posts.

  • Homepage

    • Research Activity
      • Publications
    • Teaching
    • Software development

    • Bio

    • Email
    • Contact

    • Posts
      • Post 0

Generation of web pages from org files

Except for the homepage, all web pages are generated from org files.

They have the following header.

#+SETUPFILE: /path/css/config.orgcss      
#+LINK_UP: parent_url

The configuration file css/config.orgcss is automatically included in the org file. It contains basic properties that are shared:

  • the location of the style file used (css/styleSite.css),
  • the location of the shortcut icon,
  • the link towards the homepage,
  • other options.

#+STYLE: <link href="url/css/styleSite.css" rel="stylesheet" type="text/css"> <link rel="shortcut icon" href="url/images/logoSite.png">
#+LINK_HOME: url/index.html
#+OPTIONS: html-postamble:auto html-preamble:t tex:t

The LINK_UP target must be defined for each page, according to the position of the page in the tree.

The body is a standard org file with its tree structure. Links can be easily defined.

Opening and closing tags for structural elements (like #+BEGIN_SRC and #+END_SRC pairs) can be easily added to a text, via specific selectors. See the template selectors.

Finally, to generate the HTML files, launch the following command.

M-x org-publish-project [RET] org [RET]

Sometimes, changes in dependencies go undetected (specifically, changes to .emacs or to css/config.orgcss), so that some pages are not regenerated: in that case, it is possible to force a complete generation with a prefix argument.

C-u M-x org-publish-project [RET] org [RET]
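
Both commands can be folded into a single one. The following is a minimal sketch, assuming the project is named "org" as in the configuration above; the name my/publish-site is hypothetical. The project entry is looked up with assoc because older versions of org expect the entry itself rather than its name.

```elisp
;; Hypothetical wrapper around org-publish-project.
(defun my/publish-site (&optional force)
  "Publish the \"org\" project.
With a prefix argument FORCE, regenerate every file."
  (interactive "P")
  (org-publish-project (assoc "org" org-publish-project-alist) force))
```

M-x my/publish-site [RET] publishes only what has changed; C-u M-x my/publish-site [RET] forces a complete generation.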

Generation of an index

It is possible to generate an index from entries written as follows in org pages.

#+INDEX: level 1
#+INDEX: level 1!level 2

Only two levels are allowed. To get special symbols, such as accented letters (é, è), use a LaTeX command of the form \command{}. See the list of available commands for special symbols.

To activate the index generation, add to the .emacs file the following declaration:

(require 'org-publish)
(setq org-publish-project-alist
      '(("org-notes"               ;Used to export .org file
         ;; ... other properties unchanged ...
         :makeindex t)             ; index generation (any non-nil value)
        ("org-static"                ;Used to publish static files
         ;; ... properties unchanged ...
         )
        ("org" :components ("org-notes" "org-static")))) ;combine "org-notes" and "org-static" into one function call

The generation produces a file theindex.org, which can be customized. It includes a file theindex.inc, which contains all the index entries. Finally, the file theindex.html is generated from theindex.org.

Address obfuscation

To prevent email addresses from being harvested by spam-bots, I have developed a solution based on a strict discipline and an obfuscation technique.

  • Discipline: email addresses are forbidden, except in a limited set of locations, where specific measures are applied.
  • Obfuscation: an email address is obfuscated in the source code and requires a computation to be discovered. The computation depends on an input from the user, the first time the page is visited.

More in a forthcoming post.
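
Until then, here is a toy illustration of the general idea of an address that can only be recovered through a computation depending on a key. It is not the scheme used on this website, which this post does not describe: the function names, the XOR encoding and the integer key are all assumptions made for the example. On a real page, the reveal step would have to run in the visitor's browser.

```elisp
;; Toy illustration only: NOT the scheme used on this website.
(defun my/obfuscate-address (address key)
  "Return ADDRESS as a list of character codes, each XORed with the integer KEY."
  (mapcar (lambda (char) (logxor char key)) address))

(defun my/reveal-address (codes key)
  "Rebuild the address from CODES, given the same integer KEY."
  (apply #'string (mapcar (lambda (code) (logxor code key)) codes)))

;; Round trip with a made-up address and key:
;; (my/reveal-address (my/obfuscate-address "user@example.org" 42) 42)
;; evaluates to "user@example.org".
```

Only the list of codes would appear in the source of the page; without the key, which stands here for the input from the user, the address is not readable as plain text.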

Version history: v1: 2015-03-13.
Comments or questions: Send a mail.
The webpage content is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.