Pandoc is a command line tool that transforms one text format (like Markdown or reStructuredText) into another (like HTML, PDF, or Word). I primarily use it to transform my markdown source files into pretty things like web pages or PDFs that are ready to be printed out. Pandoc is more suitable to my needs as an academic than other markdown tools because it allows for necessary extensions of markdown like footnotes, bibliographies, and tables without becoming too unwieldy. Pandoc’s template system also means that I have full control over my output—although the defaults are pretty good when I just want to spit something out in an unusual format.
After installing Pandoc, you can simply run it from a command line (like Terminal or iTerm in OSX or cmd.exe or Cygwin in Windows). The basic usage is simple:
pandoc -o myOutputFile.docx myInputFile.md
The -o flag just means “output,” and the file path after it names the file you want to create. Pandoc automatically detects the input and output formats from the file extensions you give it. In my example, I’m going from a Markdown file to a Word file.
So far so good, but what if you want to take advantage of those cool things like automatically formatted bibliographic footnotes? Simply add the appropriate flags with the necessary information.
pandoc --csl=myChicagoDefault.csl --bibliography=myBibliography.bib -o myOutputFile.docx myInputFile.md
In this example the --csl flag specifies a file with the CSL style specification for how footnotes and bibliographic entries should be formatted, and the --bibliography flag specifies a file containing my BibTeX database of bibliography information. You can get more information on my exact set up in this post. Other flags I often include are --smart for transforming straight quotes into curly quotes and hyphens into dashes and --chapters for making first level headers in markdown into chapters in my dissertation. (Pandoc 2.0 and later replace these two with the smart extension and --top-level-division=chapter.)
At this point it becomes somewhat unmanageable to type all these flags in every time you want to get a Word version of the essay you’ve been writing. Luckily, I rarely open up the command line for Pandoc. Instead, I just save the command I use together with all the flags as a “build system” in Sublime Text.
Here’s the build system that I use as a Gist:
Using this I just press Ctrl+Shift+B and I get a menu with the various output formats. I just select one, using either the mouse or the arrow keys, and out pops a new version of the file that I have focused in Sublime. The new file has the same name as the markdown version except that the extension is different.
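For anyone not using Sublime Text, the same idea (save the flags once, reuse them for every file) can be sketched as a small shell function. This is only a minimal sketch, not the build system from the Gist: the function names out_name and md_to are made up for illustration, and the CSL and BibTeX file names are the ones from the example command earlier in this post.

```shell
#!/bin/sh
# Hypothetical helpers; not the author's Sublime Text build system.

# out_name FILE EXT: swap FILE's extension for EXT (essay.md -> essay.docx).
out_name() {
    printf '%s.%s\n' "${1%.*}" "$2"
}

# md_to EXT FILE: convert FILE to the format implied by EXT, keeping the
# same base name, with the citation flags from the earlier example.
# Assumption: Pandoc 2.11 or later also needs --citeproc on this command
# line before citations are processed.
md_to() {
    pandoc \
        --csl=myChicagoDefault.csl \
        --bibliography=myBibliography.bib \
        -o "$(out_name "$2" "$1")" \
        "$2"
}
```

Running `md_to docx myInputFile.md` would then write myInputFile.docx next to the source, which mirrors how the build system names its output.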
I’m really impressed with what Shawn Graham over at Electric Archaeology is doing with the Sublime Zettelkasten plugin. He is using Pykwiki to create a static site from his local folder of markdown files so that he can host an open notebook. You should really go over and check it out.
Sascha Fast said that I should write up a blog post explaining why I find my association of notes through Git useful. Admittedly, I only use this kind of association rarely. When I do, however, it proves to be just what I need.
I tend to work on my notes in creative bursts during which my mind is deeply immersed in the material. While I am putting all the notes and citations from (e.g.) an article where they go, I’m also opening totally fresh topics and thinking about seemingly unrelated things. For example, thoughts about Socrates’s comments on the real self, the soul, and conversation in the Alcibiades will make me think about something connected to Martin Buber and Gabriel Marcel. When I’m done with a burst I commit in Git.
Most of the time, I navigate around my notes by either:
- using the quick panel in Sublime because I know exactly where I want to go,
- using direct links between explicitly associated notes, or
- searching for words or citekeys because I know the general topic I am looking for.
But these ways of moving from one note to others will not capture the association between Socrates and Martin Buber: there is no citekey in the latter note, since the article I was reading had nothing to do with Martin Buber; there are no obvious key words to search between the two files; and the random inspiration during the burst did not cause me to explicitly associate the notes with a hard link. Nevertheless, a year later I start to see the deep pattern that initially led me to think of Martin Buber while reading about Socrates. When I had the original inspiration, my grasp of this underlying pattern was totally inchoate. But now, I’m starting to see many little ideas across years of reading form one big constellation. As I work on this, it is super helpful to pull up a list, in under ten keystrokes, of all the notes I edited when I was working on this particular sentence of my note on the Alcibiades.
Christian Tietze had this to say on Twitter:
Date-based IDs in the file name do the same but only upon creation—the tech hurdle for Git is high, though.
I wanted to point out that I think everyone should be using Git anyway if they are doing anything in plain text. With an appropriate plugin, there is really very little to learn (no need to get into the CLI or any advanced features for our purposes). I also want to point out that Christian is exactly right. With date-of-creation timestamps or date-of-modification timestamps, you only get one point at which to place this note near others. With Git, you get nearness based on each change—both to the starting note and to its “change neighbors.”
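The lookup described earlier in this post, going from one sentence to every note edited alongside it, can also be approximated from the command line. The sketch below is an assumption about how one might do it with plain Git rather than with the Sublime plugin: sentence_neighbors is a hypothetical name, and it relies on Git's -S search, which finds commits where the number of occurrences of a phrase in a file changed.

```shell
#!/bin/sh
# Hypothetical helper; run it from inside the notes repository.

# sentence_neighbors PHRASE NOTE: list every other file that was committed
# together with a change that added or removed PHRASE in NOTE.
sentence_neighbors() {
    phrase=$1
    note=$2
    # -S finds commits where the count of PHRASE in NOTE changed.
    git log --format=%H -S"$phrase" -- "$note" |
    while read -r commit; do
        # Names of all files touched by that commit (--root covers the first commit).
        git diff-tree --no-commit-id --name-only -r --root "$commit"
    done | sort -u | grep -v -x -F -- "$note" || true
}
```

For example, `sentence_neighbors "real self" alcibiades.md` (both the phrase and the file name are made up here) would print the notes that were committed together with edits to that sentence.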
After telling someone that they should use definition lists, I noticed that I had ignored them in my papers because I was too lazy back then to set up a nice format in LaTeX. Definition lists are great for philosophers (especially analytic philosophers) because they provide a good semantic structure for defining a proposition or example. So I updated the source of my papers to reflect my own advice. A good example of this is in my paper “Why Molinism Does not Help with the Rollback Argument.” You can see the raw source of this paper here, the PDF output here, and the HTML output here.
To get the PDF output to style nicely, I added this to the header of the LaTeX template that I use for Pandoc:
To get the HTML output for this site to style nicely, I used this CSS:
Hope this helps!
Yesterday, I was helping another colleague get set up with markdown for his dissertation and realized that I did not have a convenient way of giving him the CSL file that I use to automagically format my footnotes according to the Chicago Manual of Style. So here is a link to this file posted to Gist.
CSL is an open standard that defines how bibliographic elements are put together (e.g. parentheses versus footnotes). You can use this with many tools, but I use it with Pandoc. To get it to work, you need to define two files when you run Pandoc:
- You need the --bibliography flag to point to a BibTeX file with your bibliographic information so that Pandoc knows which author wrote which book. (This is the format that BibDesk and JabRef save in automatically.)
- You need the --csl flag to point to the CSL file so that Pandoc knows how you want things to look.
An example command might look like this:
pandoc --bibliography="$HOME/Dropbox/mybib.bib" --csl="$HOME/Dropbox/chicago.csl" -o test.html test.md
You can have multiple CSL files for different formats, say one for author–date and one for footnotes. Then, on a project-by-project basis you can easily switch between them without having to change your source document. The source will just contain a Pandoc citation that looks like this: [@gerson03 67]. It will get formatted differently based on which CSL you use.
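Switching styles per project can be sketched as a pair of shell functions. This is only an illustration under stated assumptions: csl_for and cite_as are hypothetical names, chicago.csl and mybib.bib are the files from the example command, and chicago-author-date.csl is a made-up file name standing in for whatever second style you keep around.

```shell
#!/bin/sh
# Hypothetical helpers for choosing a citation style by name.

# csl_for STYLE: print the CSL file to use for a named style.
csl_for() {
    case $1 in
        footnotes)   printf '%s\n' "$HOME/Dropbox/chicago.csl" ;;
        # Assumed file name; substitute your own author-date style.
        author-date) printf '%s\n' "$HOME/Dropbox/chicago-author-date.csl" ;;
        *) echo "unknown style: $1" >&2; return 1 ;;
    esac
}

# cite_as STYLE FILE: render FILE to HTML with the chosen style.
# The source document and its [@citekey] citations stay untouched.
cite_as() {
    csl=$(csl_for "$1") || return 1
    pandoc --bibliography="$HOME/Dropbox/mybib.bib" --csl="$csl" \
        -o "${2%.*}.html" "$2"
}
```

With this in place, `cite_as footnotes test.md` and `cite_as author-date test.md` would render the same source in two different styles.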
TheClearHorizon asked an interesting question in a comment thread over at Zettelkasten.de about a particular advantage of Luhmann’s index card system. While responding to his post, I realized that there was a certain structural feature that I had reproduced using Git without reflecting that I had done this. I would like to make two points, first a theoretical point about the architecture of a Zettelkasten and second a practical point about using Git to achieve this.
There are different kinds of association between ideas, and the architecture of a note-taking system can be better or worse at capturing these connections. Luhmann’s Zettelkasten system is brilliant because it captures many types of association very efficiently. Each index card has an ID that looks something like this: 143b/3c/2. The first number, 143, stands for a particular topic, say Sartre. Letters indicate branching within that topic, so 143b may stand for Sartre > No Exit. A slash followed by a number stands for a continuation (which is necessary because of the physical size of the index cards), so 143b/3 would stand for the third index card of Sartre > No Exit. (My apologies if this is not a strictly accurate representation of Luhmann’s actual system.)
The disadvantage of this system is that the numbers are rather cryptic. On an electronic version we have the space, so we can just write Sartre > No Exit rather than 143b. We can still capture the two kinds of association between ideas that matter most: (i) explicit reference and (ii) hierarchical nesting. Subtly, however, we have just lost an important form of association: (iii) the sequential relationship between 142 and 143. 142 might stand for Drama and we would not immediately think to associate Drama and Sartre. They are associated in the numbering system because the person creating the cards first opened a new topic for Drama and then opened a new topic for Sartre. The two ideas are associated for the researcher because of the chronological nearness of the work.
One of the great advantages of working in a plain-text format like Markdown is that you can use powerful tools like Git rather than whatever your software happens to ship with. Many change-tracking features in software like Word or Dropbox happen automatically. Each time you save or make a change, the software keeps track of what you did. This is fine if all you want is to keep from losing your work, and you can make Git work this way too. But there is a better way: make intentional commits with brief, descriptive messages that log what you have done at logical intervals in your work. For years, I have been doing this for all my writing as a matter of course.
Now the realization: I have also been using Git to get the same kind of association that I thought might be lost by switching from numbers to descriptive titles. Each time you make a commit in Git, you have a group of files that have been changed at the same time. They are “change neighbors” so to speak. With a Git plugin in Sublime Text, it takes only a few keystrokes to pull up a log of all the times a particular file has been changed. By selecting one of these commits, I can—within seconds—see all the “change neighbors” of this particular file at any stage of its development. In other words, I can easily see all the notes (whether I would later think to relate them or not) that I was working on at the same time as this note.
This method of association is actually more powerful than the association between 142 and 143 above. Those numbers were close to each other only because those top-level topics were created at the same time. Association through Git commits, however, establishes nearness each time the note is changed, not just at creation.
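Outside the Sublime plugin, the two lookups described in this post (the log of one note, then the “change neighbors” of one of its commits) can be sketched with plain Git commands. The function names note_history and change_neighbors are hypothetical, and this is only one way to get the same information, not necessarily what the plugin runs.

```shell
#!/bin/sh
# Hypothetical helpers; run them from inside the notes repository.

# note_history NOTE: one line per commit that changed NOTE, newest first.
note_history() {
    git log --follow --date=short --format='%h %ad %s' -- "$1"
}

# change_neighbors COMMIT NOTE: the other files changed in COMMIT,
# i.e. the notes worked on at the same time as NOTE.
change_neighbors() {
    git diff-tree --no-commit-id --name-only -r --root "$1" |
        grep -v -x -F -- "$2" || true
}
```

A typical session would call `note_history` on a note, pick a short hash from its output, and pass that hash to `change_neighbors` together with the same file name.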
So I have a colleague who is using the LaTeX template that I developed to go from Markdown to PDF using Pandoc and he wants to put his name and class information at the top of an essay. Pandoc expects metadata like this to be at the top of your markdown file in a YAML block like this:
---
author: Dan Sheffler
title: Example Title
class: PHI 735
semester: Fall 2015
---
So how do we get this to render in the PDF via LaTeX? I used Pandoc’s conditional template tags to check and see if each piece of the metadata is there (that way you could leave something blank and it won’t choke). If a piece of the metadata is there, then it goes into the document on a new line with no paragraph indent.
\noindent
$if(author)$$author$$endif$\\
$if(title)$$title$$endif$\\
$if(class)$$class$$endif$\\
$if(semester)$$semester$$endif$\\
Make sure you copy and paste this LaTeX bit into your default.latex template somewhere below the \begin{document} command and above the \doublespace command (if you are using the spacing package). If you wanted to extend this, it would be simple to add another variable to both the YAML and the template, following the same pattern.
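As a rough sketch of such an extension, the script below writes a sample essay and the matching template line for one extra field. The field name professor, its value, and both output file names are hypothetical, chosen only for illustration; the other four fields are the ones from the YAML example in this post.

```shell
#!/bin/sh
# Hypothetical example: "professor" is an invented field, not part of the
# original template.

# A sample essay whose YAML block carries the extra field.
cat > example-essay.md <<'EOF'
---
author: Dan Sheffler
title: Example Title
class: PHI 735
semester: Fall 2015
professor: Dr. Example
---

Body of the essay.
EOF

# The matching template line, in the same pattern as the existing four.
# Paste it into default.latex after the semester line.
cat > professor-snippet.tex <<'EOF'
$if(professor)$$professor$$endif$\\
EOF
```

The quoted 'EOF' keeps the shell from touching the dollar signs and backslashes. Once the line is in the template, a command like `pandoc --template=default.latex -o example-essay.pdf example-essay.md` should pick up the new field.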
When all is done, the four fields in the YAML example above should produce something at the top of the PDF that looks like this:
Dan Sheffler
Example Title
PHI 735
Fall 2015
Hope this helps!
MK over at the Taking Note Blog just commented on my recent post about one thought per note. He helpfully adds several more links on the same idea and mentions that this is “one of the most basic features” of his note-taking.
I have mentioned the Taking Note Blog before, and I highly recommend it. A few years ago I took a Saturday and read through every single post in the archives, and I have kept up with every post since. This has probably been the single greatest influence on the way that I think about managing information. So head over there and start taking notes!
This week Christian Tietze over at Zettelkasten.de wrote up two helpful reviews of my note taking methods: (i) this post discusses the advantages of keeping notes in a notebook and (ii) this post discusses my recent description of moving from reading to organized notes on the computer.
I discovered Christian and Sascha’s work on this site only recently, but I have already learned a lot, so head over there and check out what they have to say about taking notes.
I’ve recently been asked how I process my notes from my notebook into the computer. I make a distinction between the low-friction handwritten notes that I create for the purpose of engaging with my reading or a lecture and those more organized and polished notes that help me actually remember material years down the road. See here for details.
While reading a book or article, I make a note every time I have a substantial thought or observation. I begin by writing the page number I am currently on, then the thought. As much as possible, I try to write in my own words, critically evaluating what I am reading rather than merely echoing. If, however, I think that I will need to quote an author explicitly, I copy down the quote making sure to place directly quoted material in quotation marks. I then double check the quote for accuracy. All of this should take a minimum of effort and organization while still maintaining clarity and depth of thought.
When I come to organize these notes, I slowly work through each page reference one at a time. First, I ask myself whether it is important that I remember this bit of information long-term. If not, I move on immediately. Second, I ask myself, “What is the single idea here?” trying to remember the principle of one thought per note. So I have two possibilities: (1) I need to create a file that captures this single idea or (2) I already have a file. Let’s take both these possibilities in turn:
1. This option is more involved since I have to create a new idea note. (This is accomplished easily with my Sublime Text plugin.) Let’s break this down into steps with a couple screen-shots:
   - I begin by populating the title of the note.
   - I then create a bullet list off the top of my head out of every other note that has a conceptual connection to this new idea I am working on. If the connection is not completely obvious, I add a one-line description of the connection. I then follow all these links and add back-links in each of those files with the same procedure. While I am in those files, I keep an eye out for links to other files I did not remember in the first step. I keep adding to the list of connections and adding back-links until I have exhausted the connections to the best of my knowledge. This sounds time consuming, but it usually only ends up taking a minute or two and it greatly helps with keeping things tidy once the Zettelkasten expands to hundreds of notes.
   - Now the hardest part: I think very carefully about the single idea that prompted the creation of this note file. I try to write out my own view as clearly as possible in about a paragraph. In all likelihood, I will copy and paste this paragraph into a rough draft of something I am writing, so I try to put in the effort up front to make the writing level as good as possible while the ideas are fresh in my mind. If this ends up going on to more than 300 words or so, I likely have more than one idea and I need to split it up into multiple files.
   - I then add a “References” section beneath my written paragraph with subsections for each source contributing to my understanding of the topic. Since this is a new note from reading, I will just create one reference to the current author. In this section I put important quotes or page citations that I may need when I go to transform my humble paragraph into part of a real document.
2. I find the file that accurately captures the single thought. I review its contents and think about whether this new bit of information alters my views on the matter. If so, I edit my description of the idea. Either way, I add a link at the top of the note in the Connections section to the relevant reading note file with a one-line description of what this reading in particular contributes to my understanding. If there is a quote involved or a more extended description of an author’s position, I may also create a subsection at the end of the file in the section called “References.”
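The skeleton of a new idea note, as described in the steps above, can be sketched as a shell function. Everything specific here is an assumption: new_note and needs_split are hypothetical names, the Markdown headings and the title-based file name are guesses at a reasonable layout, and this is not the Sublime Text plugin mentioned above. The word count is also only a rough check, since it counts the whole file rather than just the paragraph.

```shell
#!/bin/sh
# Hypothetical helpers; the note layout is an assumed one.

# new_note TITLE: create "TITLE.md" with a title, a Connections section at
# the top, room for the one-paragraph idea, and a References section.
new_note() {
    title=$1
    file="$title.md"
    if [ -e "$file" ]; then
        echo "note already exists: $file" >&2
        return 1
    fi
    cat > "$file" <<EOF
# $title

## Connections

## References
EOF
    printf '%s\n' "$file"
}

# needs_split FILE: succeed if FILE has grown past roughly 300 words,
# a hint that it may hold more than one idea.
needs_split() {
    words=$(wc -w < "$1")
    [ "$((words))" -gt 300 ]
}
```

For example, `new_note "Socrates on the real self"` would create the file and print its name, and `needs_split` could be run on it later as a reminder to break up a note that has outgrown a single thought.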