An Emacs lisp example - Everything2.com

This is a simple Emacs Lisp function for converting a C++ program's printf calls to Java System.out.print calls. One small limitation is that only the first %s/%d/etc. pattern in each call is replaced by its argument. Additionally, only simple %s/%d/etc. patterns are handled -- field widths and precisions aren't allowed for.

You should know a bit of Lisp here; this example shows how to use the Emacs-specific stuff like searching, inserting, and deleting. Additionally, this example makes heavy use of regular expressions. Perl hackers should feel right at home. Like Perl, Emacs Lisp becomes much more powerful -- and difficult to read -- with regular expressions.

Here is the whole example:

(defun printf-to-println-region ()
  "Convert printf calls in the region to Java System.out.print calls."
  (interactive)
  (save-excursion
    (save-restriction
      (narrow-to-region (region-beginning) (region-end))

      ;; start at top and search for all printf's ending with \n
      (goto-char (point-min))
      (while (re-search-forward "[\n\t ]\\(printf\\) *(\"\\([^\"]\\|\\\\\"\\)*\\(\\\\n\\)\"" (point-max) t)
        (goto-char (match-beginning 3))
        (delete-char (length "\\n"))
        (goto-char (match-beginning 1))
        (delete-char (length "printf"))
        (insert "System.out.println"))

      ;; go back to start and search for all remaining printf's
      (goto-char (point-min))
      (while (re-search-forward "[\n\t ]\\(printf\\) *(" (point-max) t)
        (goto-char (match-beginning 1))
        (delete-char (length "printf"))
        (insert "System.out.print"))

      ;; go through and move the first %s, %d, etc. argument into the
      ;; format string
      (goto-char (point-min))
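      (while (re-search-forward "System\\.out\\.print\\(ln\\)? *(\"\\(\\([^\"]\\|\\\\\"\\)*\\)\"\\(, *\\)\\([^,)]*\\)" (point-max) t)
        ;; remember the match positions; the inner search below replaces them
        (let ((fmt-end (match-end 2))
              (sep-beg (match-beginning 4))
              (sep-end (match-end 4))
              (arg-beg (match-beginning 5))
              (arg-end (match-end 5)))
          (goto-char (match-beginning 2))
          ;; look for the pattern inside this format string only, and
          ;; leave the call alone if it has none
          (when (re-search-forward "%[^%]" fmt-end t)
            (kill-region arg-beg arg-end)
            (delete-region sep-beg sep-end)
            (delete-char -2)
            (insert "\" + ")
            (yank)
            (insert " + \"")))))))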

It starts off with defun; that's how a function is defined in Lisp. Next it calls save-excursion and save-restriction. Almost all of my Emacs Lisp functions begin with these two calls. save-excursion saves the current cursor position (the point in Emacs-speak) so after the user calls the function, his or her cursor won't be in an unexpected place. Older versions of Emacs also saved the mark, and with it the area of selected text; since Emacs 25, save-excursion no longer does. save-restriction does the same thing for the restriction. Emacs allows a function to pretend only a particular part of the buffer is there; the cursor won't move outside of the restricted area. Since we are going to change the restriction, this will make sure it gets restored when we are done.
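
To see the effect of save-excursion in isolation, here is a minimal sketch that runs in a temporary buffer. The function name my-save-excursion-demo is made up for this illustration and is not part of the example above.

```elisp
;; Hypothetical demo, not part of the original function.
(defun my-save-excursion-demo ()
  "Return (POINT-INSIDE POINT-AFTER) to show that point is restored."
  (with-temp-buffer
    (insert "hello world")
    (goto-char 3)
    (let (inside)
      (save-excursion
        ;; Move somewhere else while inside the excursion.
        (goto-char (point-max))
        (setq inside (point)))
      ;; Back outside, point is where it was before save-excursion.
      (list inside (point)))))

(my-save-excursion-demo)   ; => (12 3)
```

Inside the excursion point is at the end of the buffer (position 12); once the form finishes, point is back at 3.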

Next it calls narrow-to-region. This restricts the function from moving to or touching anything outside the current selection. If this line were removed, the function would operate on the whole buffer instead of just the selected area. Narrowing changes the restriction, and that's why we needed to call save-restriction.
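
The same kind of sketch works for narrowing. The name my-narrowing-demo and the sample text are assumptions for illustration only; the positions 5 and 8 simply bracket the word "two" in that sample.

```elisp
;; Hypothetical demo, not part of the original function.
(defun my-narrowing-demo ()
  "Return (NARROWED-TEXT RESTORED-TEXT) to show the effect of narrowing."
  (with-temp-buffer
    (insert "one\ntwo\nthree\n")
    (let (narrowed)
      (save-restriction
        ;; Restrict the buffer to the word "two" (positions 5 up to 8).
        (narrow-to-region 5 8)
        ;; buffer-string only sees the accessible (narrowed) part.
        (setq narrowed (buffer-string)))
      ;; Leaving save-restriction undoes the narrowing.
      (list narrowed (buffer-string)))))

(my-narrowing-demo)   ; => ("two" "one\ntwo\nthree\n")
```

While the buffer is narrowed, point-min and point-max refer to the edges of the narrowed part, which is why the example can use them as "top" and "bottom" of the selection.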

Now we can get down to business. First the function will search for calls to printf where the format string ends in \n. These will be replaced by System.out.println and the \n is chopped off the end of the string. (goto-char (point-min)) goes to the top of the buffer (or in this case, the top of the restricted area). It then loops, doing a regular expression search until the expression doesn't match anymore. By default re-search-forward signals an error if no matching text can be found, so we pass in the third argument 't' so the function just returns nil on failure instead.
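
Here is a small sketch of that search loop on its own, plus what happens when the third argument is left out. Both function names are invented for this illustration. The sketch passes nil as the bound, which means "search to the end of the accessible part of the buffer", the same thing (point-max) expresses in the example.

```elisp
;; Hypothetical demos, not part of the original function.
(defun my-count-matches-demo (regexp text)
  "Return how many times REGEXP matches in TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((count 0))
      ;; With a non-nil third argument a failed search returns nil
      ;; instead of signaling an error, and that ends the loop.
      ;; (REGEXP must not match the empty string, or this never ends.)
      (while (re-search-forward regexp nil t)
        (setq count (1+ count)))
      count)))

(defun my-search-error-demo ()
  "Return the symbol `search-failed' when a plain search finds nothing."
  (with-temp-buffer
    (insert "abc")
    (goto-char (point-min))
    (condition-case nil
        (re-search-forward "zzz")        ; no third argument
      (search-failed 'search-failed))))

(my-count-matches-demo "printf" "printf(\"a\"); printf(\"b\");")   ; => 2
(my-search-error-demo)                                             ; => search-failed
```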

The regular expression is "[<nl><tab><sp>]\\(printf\\) *(\"\\([^\"]\\|\\\\\"\\)*\\(\\\\n\\)\"", where <nl>, <tab>, and <sp> stand for a newline, a tab, and a space.

In Emacs regular expressions, the parentheses used for grouping and selecting are written \( and \), and inside a Lisp string each of those backslashes must itself be doubled, which, unfortunately, makes the expression very hard to read. There are three subexpressions enclosed in parentheses: the word printf, the group that matches the characters of the format string before the \n, and finally the \n itself.
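
To check which text the groups pick up, the expression can be tried against a plain string with string-match. This is only a sketch: my-printf-groups-demo is a made-up name, and it reports just groups 1 and 3, the two that the function actually uses.

```elisp
;; Hypothetical demo, not part of the original function.
(defun my-printf-groups-demo (line)
  "Return (GROUP-1 GROUP-3) for LINE, or nil if there is no match."
  (let ((re "[\n\t ]\\(printf\\) *(\"\\([^\"]\\|\\\\\"\\)*\\(\\\\n\\)\""))
    (when (string-match re line)
      (list (match-string 1 line) (match-string 3 line)))))

;; The C source text  printf("Hello\n");  written as a Lisp string:
(my-printf-groups-demo " printf(\"Hello\\n\");")   ; => ("printf" "\\n")

;; A format string that does not end in \n is not matched:
(my-printf-groups-demo " printf(\"Hello\");")      ; => nil
```

Note that "\\n" in the result is a two-character string, a backslash followed by the letter n, exactly as it appears in the C source.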

(goto-char (match-beginning 3)) moves the cursor to the start of the third parenthesized subexpression, which happens to be the trailing \n. It is then deleted from the buffer with delete-char. The same is done with the first subexpression, "printf", which is replaced with "System.out.println".
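
The order of those two edits matters: match positions are plain numbers, so the later text is changed first and the position of the earlier group stays valid. A minimal sketch of this step on a single line, using an invented function name and a temporary buffer:

```elisp
;; Hypothetical demo, not part of the original function.
(defun my-println-demo (line)
  "Return LINE with a printf ending in \\n rewritten as System.out.println."
  (with-temp-buffer
    (insert line)
    (goto-char (point-min))
    (when (re-search-forward
           "[\n\t ]\\(printf\\) *(\"\\([^\"]\\|\\\\\"\\)*\\(\\\\n\\)\"" nil t)
      ;; Group 3 comes later in the buffer, so delete it first;
      ;; the recorded position of group 1 is not disturbed.
      (goto-char (match-beginning 3))
      (delete-char (length "\\n"))          ; two characters: \ and n
      (goto-char (match-beginning 1))
      (delete-char (length "printf"))
      (insert "System.out.println"))
    (buffer-string)))

(my-println-demo " printf(\"Hello\\n\");")
;; => " System.out.println(\"Hello\");"
```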

Next, we do the same for printfs that don't end with a newline. This one is a bit easier; it just returns to the top of the restriction and replaces every remaining printf with System.out.print.

Finally, the last part searches for the first %s/%d/etc. pattern in the format string and moves the argument it stands for into its place. For example, from printf("Hello %s!", "world") we want System.out.print("Hello " + "world" + "!"). We kill subexpression 5 (the replacement, or "world" in the example), which deletes it and copies it into the kill ring like C-w would. Then we delete subexpression 4, the comma and whitespace after the format string. Finally, we replace the %s pattern with the replacement by yanking it off the kill ring.
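
Killing and yanking works, but it also changes the user's kill ring. As an alternative sketch (not what the function above does), the argument can be copied with match-string and inserted directly. The name my-inline-first-arg-demo is hypothetical, and the sketch handles a single call in a temporary buffer.

```elisp
;; Hypothetical variant, not part of the original function.
(defun my-inline-first-arg-demo (line)
  "Return LINE with its first % pattern replaced by the first argument.
This sketch leaves the kill ring untouched."
  (with-temp-buffer
    (insert line)
    (goto-char (point-min))
    (when (re-search-forward
           "System\\.out\\.print\\(ln\\)? *(\"\\(\\([^\"]\\|\\\\\"\\)*\\)\"\\(, *\\)\\([^,)]*\\)"
           nil t)
      (let ((arg (match-string 5))
            (fmt-end (match-end 2))
            (tail-beg (match-beginning 4))   ; start of ", "
            (tail-end (match-end 5)))        ; end of the argument
        (goto-char (match-beginning 2))
        ;; Search only inside the format string.
        (when (re-search-forward "%[^%]" fmt-end t)
          (let ((pat-beg (match-beginning 0))
                (pat-end (match-end 0)))
            ;; Remove the later text first so pat-beg and pat-end stay valid.
            (delete-region tail-beg tail-end)
            (delete-region pat-beg pat-end)
            (goto-char pat-beg)
            (insert "\" + " arg " + \"")))))
    (buffer-string)))

(my-inline-first-arg-demo "System.out.print(\"Hello %s!\", \"world\")")
;; => "System.out.print(\"Hello \" + \"world\" + \"!\")"
```

A call whose format string has no % pattern is returned unchanged, since nothing is deleted until the pattern has been found.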

That's it! You can experiment with Emacs lisp by getting a buffer in Lisp Interaction Mode and typing away. Pressing C-j in Lisp Interaction Mode causes your statement to be executed. Happy Lisping!
