Formatting information

A beginner's introduction to typesetting with LATEX

Chapter 2 — Using your editor to create documents

Peter Flynn

Silmaril Consultants
Textual Therapy Division

v. 3.6 (March 2005)



This edition of Formatting Information was prompted by the generous help I have received from TEX users too numerous to mention individually. Shortly after TUGboat published the November 2003 edition, I was reminded by a spate of email of the fragility of documentation for a system like LATEX which is constantly under development. There have been revisions to packages; issues of new distributions, new tools, and new interfaces; new books and other new documents; corrections to my own errors; suggestions for rewording; and in one or two cases mild abuse for having omitted package X which the author felt to be indispensable to users.

I am grateful as always to the people who sent me corrections and suggestions for improvement. Please keep them coming: only this way can this book reflect what people want to learn. The same limitation still applies, however: no mathematics, as there are already a dozen or more excellent books on the market (as well as other online documents) dealing with mathematical typesetting in TEX and LATEX in finer and better detail than I am capable of.

The structure remains the same, but I have revised and rephrased a lot of material, especially in the earlier chapters where a new user cannot be expected yet to have acquired any depth of knowledge. Many of the screenshots have been updated, and most of the examples and code fragments have been retested.

As I was finishing this edition, I was asked to review an article for The PracTEX Journal, which grew out of the Practical TEX Conference in 2004. The author specifically took the writers of documentation to task for failing to explain things more clearly, and as I read more, I found myself agreeing, and resolving to clear up some specific problem areas as far as possible. It is very difficult for people who write technical documentation to remember how they struggled to learn what has now become a familiar system. So much of what we do is second nature, and a lot of it actually has nothing to do with the software, but more with the way in which we view and approach information, and the general level of knowledge of computing. If I have obscured something by making unreasonable assumptions about your knowledge, please let me know so that I can correct it.

Peter Flynn is author of The HTML Handbook and Understanding SGML and XML Tools, and editor of The XML FAQ.



Using your editor to create documents


  1. Markup
  2. Quick start for the impatient
  3. Editors
  4. LATEX commands
  5. Special characters
  6. Quotation marks
  7. Accents
  8. Dimensions, hyphenation, justification, and breaking
  9. Mathematics

LATEX documents are all plain-text files.1 You can edit them with any editor, and transfer them to any other computer system running LATEX and they will format exactly the same. Because they are plain text they cannot corrupt your system, as they cannot be used for hiding or transporting virus infections as binary wordprocessor files can. Everything you can see is in the file and everything in the file is there for you to see: there is nothing hidden or secret and there are no manufacturers' proprietary ‘gotchas’ like suddenly going out of date with a new version.

  1. ‘Plain-text’ originally meant just the 95 printable characters of the American Standard Code for Information Interchange (ASCII) — see Table C.1 — but now more commonly includes both these and the relevant 8-bit characters from one (only) character set such as ISO-8859-1 (Western Latin–1) or ISO-8859-15 (Western Latin plus the Euro sign). These are international standards which work everywhere: you should avoid using manufacturers' proprietary character sets like Microsoft Windows–1252 or Apple Macintosh Roman–8, because they may make your documents unusable on some other systems.

2.1 Markup

In a LATEX document, you type your text along with markup which identifies the important parts of your document by name, for example ‘title’, ‘section’, ‘figure’, etc. LATEX does all the formatting for you automatically, using the markup to guide its internal rules and external stylesheets for typesetting.

You do not need to format any of your text in your editor, because LATEX does it all by itself when it typesets. You can of course regularise or neaten its appearance in your editor for ease of editing (for example, keeping each item in a list on a separate line), but this is not required.

You will often hear LATEX markup referred to as ‘commands’ or sometimes ‘control sequences’ (the proper TEXnical term for them). For all practical purposes these terms all mean the same thing.

This course assumes that users have one of TEXshell, TEXnicCenter, WinShell, or WinEdt (Windows only), or Emacs or LYX (any platform) installed. These are discussed briefly in section 2.3, and the menus and toolbars for running LATEX are explained in Chapter 4.

2.2 Quick start for the impatient

If you already know all this stuff about editors and plain-text files and running programs, and you know your system is already correctly installed (including your editor), you'd probably like to type something in and see LATEX do its job. If you don't, then skip forward to section 2.4 and read a bit more about LATEX first.

Figure 2.1 Sample document
\documentclass[12pt]{article}
\usepackage{palatino,url}
\begin{document}

\section*{My first document}

This is a short example of a \LaTeX\ document I wrote 
on \today. It shows a few simple features of automated 
typesetting, including

\begin{itemize}
\item setting the default font size to 12pt;
\item specifying `article' type for formatting;
\item using the Palatino typeface;
\item adding special formatting for URIs;
\item formatting a heading in `section' style;
\item using the \LaTeX\ logo;
\item generating today's date;
\item formatting a list of items;
\item centering and italicizing;
\item autonumbering the pages.
\end{itemize}

\subsection*{More information}

This example was taken from `Formatting Information',
which you can download from 
and use as a teach-yourself guide.

\begin{center}
\textit{Have a nice day!}
\end{center}

\end{document}


    Up and running in a few minutes
  1. Install the software

    Make sure you have a properly-installed LATEX system and a copy of a suitable editor.

  2. Create a sample document

    Open your editor and type in the text exactly as shown in Figure 2.1. Do not make any changes or miss anything out or add anything different at this stage.

  3. Save the document

    Save the document as demo.tex

  4. Run LATEX or pdfLATEX

    Click on the LATEX or pdfLATEX toolbar icon or the TEX → TEX File menu item; or type latex demo or pdflatex demo in a command window.

  5. Preview the typesetting

    Click on the DVI or PDFview toolbar icon or the TEX → TEX View menu item; or type your previewer command in a terminal shell.

    (Note that there may be a pause the first time you use your DVI viewer, while WYSIWYG font files are created.2)

  6. Print it

    Click on the Print toolbar icon within the viewer, or use the TEX → TEX Print menu item, or type dvips -f demo | lpr (Unix/Linux).

If you encounter any errors, it means you do need to study this chapter after all!

  2. DVI previewers use their own font files specially created from the font outlines (TrueType, OpenType, PostScript, METAFONT, etc). The first time you use a font at a size not used before, there will be a second or two's pause while it gets created. The more you use LATEX, the less this happens.

2.3 Editors

All the text of your documents can be typed into your LATEX document from a standard keyboard using any decent plain-text editor. However, it is more convenient to use an editor with special features to make using LATEX easier. Some of the most popular are TEXshell, WinShell, TEXnicCenter, and WinEdt (Windows only); and LYX and Emacs (all platforms).

2.3.1 LYX

The LYX document editor (all platforms) is a special case, as it uses the What You See Is What You Mean (WYSIWYM) model of synchronous typographic editing as opposed to What You See Is What You Get (WYSIWYG), and many users prefer this interface (but see the reservations in the item ‘Synchronous typographic displays’ in section 3).

LYX makes a strong case for using synchronous typographical editing: it is possible to create even quite large and complex documents without seeing a backslash very often, although with math or complex macros there is probably no way to avoid having to do some manual insertion of LATEX code.

The free availability on multiple platforms makes this a clear answer to the myth of ‘having to edit like a programmer’, and as it is an Open Source project, there is constant improvement, both to the facilities and to the interface.

Probably the only real reservation is that it does not save native LATEX files by default. It uses its own internal format, and it can export LATEX for use in other editors, but the exported files are not designed for human legibility, only for LATEX processing. In a co-operative environment this would be a serious drawback, but for the individual user this interface is an excellent tool.

2.3.2 TEXshell

Figure 2.1 The TEXshell editor

This is one of the simplest of all the plaintext Windows editors, but it has most of the tools needed to begin with. Sectioning, lists, and graphics can be inserted from the menus, and there are buttons for running LATEX on the open document and for previewing the typeset document.

The syntactic highlighting distinguishes between commands and your text, and it comes with options for spellchecking (you need to install ispell), and for adding math, Greek (math), and some symbol characters from a pickchart. The typeset display is done using your installed DVI viewer (there is no provision for PDF, although as it is configurable, that could probably be edited into the menus).

Download the .tar.gz file from CTAN in the support/TeXshell/ directory and unwrap it into somewhere like C:\Program Files\TeXshell\.3 There is a tsconfig program in the same directory on CTAN, which is designed to help with reconfiguring TEXshell.

  3. Yes, I know I said don't use directory names with spaces in them (and so you shouldn't, for your LATEX documents), but Windows programs are usually safe enough in them.

2.3.3 WinShell

Figure 2.1 The WinShell editor

This is another free Windows editor for beginners with LATEX. Despite its simplicity, it is capable of a considerable amount of document management and assistance with editing. As well as handling stand-alone LATEX files, you can create a ‘Project’ for larger documents, which helps you keep track of additional files like separate chapters, illustrations, diagrams, indexes, etc.

You run LATEX direct from the toolbar icons or with F-key shortcuts. Both standard LATEX and pdfLATEX are supported, as well as creation and previewing of PostScript and PDF output. There are additional toolbars for math characters, and there is a ‘Table Wizard’ for handling tables. The syntax highlighting distinguishes between commands (in blue) and delimiters (in red), leaving your text in black.

Download the WinShellnn.exe program (self-contained setup: the nn changes with the version) from CTAN in the systems/win32/winshell/ directory and double-click it to start the setup.

2.3.4 TEXnicCenter

TEXnicCenter is a powerful Windows editor suitable both for the beginner and the more advanced user. Its ‘Project’ environment keeps track of multiple files, and the processing function (the bit which actually runs LATEX, here called ‘Build’) tries to ensure that all the files you need for a large or complex document are in place before you start typesetting, to avoid errors like missing illustrations.

It's a much more wordprocessor-like control interface, with configurable toolbars and button-controls for lists, math, tables, and previewing options.

Download the TXCSetupxxx.exe program (self-contained setup: the xxx bit changes with the version) from CTAN in the systems/win32/TeXnicCenter/ directory and double-click it to start the setup.

2.3.5 WinEdt

Figure 2.1 The WinEdt editor

WinEdt is a highly configurable plain-text editor for Windows. It comes with a host of special functions and shortcuts for TEX and LATEX, based on the MikTEX distribution. It is supplied on the TEX Collection 2004 DVD and the proTEXt CD. You can also download it from — in either case there's a 1-month free trial, then it reminds you to buy it.

WinEdt uses a built-in toolbar of configurable buttons, preset for use with LATEX, and it provides syntactic coloring of LATEX commands. Both the positioning and effect of the buttons can be changed, using an editable file of icons and a configuration panel. This flexibility lets you bind a program and arguments (equivalent to a typed command) to a particular icon.

There are default buttons on the toolbar for one-click typesetting, previewing, and PostScript or PDF generation from LATEX documents, and it manages multi-file document projects like most of the other editors. WinEdt is also used by many people for normal plaintext file-editing tasks, in preference to more limited programs like Notepad. If you're using the fpTEX which came with the 2003 TEX Collection, some editing of the menus is required (explained in the local installation document) because the default setup is for MikTEX/proTEXt.

2.3.6 GNU Emacs

Figure 2.1 Emacs editing LATEX

Emacs is a product of the GNU Project.4 Versions are available for all makes and models of computer, and it has a LATEX-mode which provides syntactic colouring (‘fontification’ in Emacs-speak) and mouseclick processing from a menu or toolbar.

Emacs is a very large and powerful editor, with ‘modes’ (plug-ins) to handle almost everything you do on a computer. Many users run Emacs once on logging in, and never leave it for the rest of the day — or month. As well as edit, you can use it to read your mail, browse the Web, read Usenet news, do wordprocessing and spreadsheets, compile programs, help you write in any computer language — including XML and LATEX — and it provides a few games as well.

Emacs knows about LATEX and how to process it, so it comes with a menu full of LATEX operations to click on. If you are editing complex documents with mathematics, there is a mode (AUCTEX) which has even more functionality. LATEX support is well-developed, and there is a hierarchy of newsgroups for Emacs support.

Because Emacs runs on Microsoft Windows, Macs, Linux, and most other platforms, many LATEX users who have multiple machines (and those who have multiple users to support) prefer it to other editors because it provides the same environment regardless of which platform they are using.

It's sometimes criticised for a steep learning curve, but in fact it's no worse in this respect than any other editor, given the power that it provides, and it is significantly better than most which lack many of the authorial tools available in Emacs.

  4. ‘GNU's Not Unix’ (GNU) is a project to make a computing environment completely free of restrictions.

2.3.7 Mac editors

Mac users will be disappointed that I haven't included any of the Mac interfaces here. It's simple: I don't have a Mac right now to try them out on. I hope to remedy this for a future edition.

2.4 LATEX commands

LATEX commands all begin with a backslash (\)5 and are usually made up of lowercase letters only, for example:

\tableofcontents

The \tableofcontents command is an instruction to LATEX to insert the Table of Contents at this point. You would usually use this in a book or report (or perhaps a very long article) somewhere close to the beginning. You don't have to do anything else. Provided that you have used the sectioning commands described in section 3.5, all the formatting and numbering for the Table of Contents is completely automated.
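
As a minimal sketch of this, assuming the standard article class (the section titles are invented purely for illustration), a complete document using \tableofcontents might look like this:

```latex
\documentclass{article}
\begin{document}

\tableofcontents   % the Table of Contents is inserted at this point

\section{Introduction}   % hypothetical section titles
Some text.

\section{Conclusion}
Some more text.

\end{document}
```

You normally need to run LATEX twice before the entries appear, because the first run only records the headings in an auxiliary file.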

  5. Do not confuse the backslash with the forward slash (/). They are two different characters. The forward slash is used on the Web and on Unix systems to separate directory names and filenames. The backslash is used in Microsoft Windows (only) for the same purpose, and in LATEX to begin a command.

2.4.1 Simple commands

Simple one-word commands like \tableofcontents must be separated from any following text with white-space. This means a normal space, or a newline [linebreak] or a TAB character. For example either of these two forms will work:

\tableofcontents Thanks to Aunt Mabel for all her help 
with this book.

\tableofcontents
Thanks to Aunt Mabel for all her help with this book.

If you forget the white-space, as in the following example, LATEX will try to read it as a command called \tableofcontentsThanks. There's no such command, of course, so LATEX will complain at you by displaying an error message (see section

\tableofcontentsThanks to Aunt Mabel for all her help 
with this book.

LATEX swallows any white-space which follows a command ending in a letter. It does this automatically, so you don't get unwanted extra space in your typeset output, but it does mean that any simple command which ends in a letter and has no arguments (see below) must be followed by white-space before normal text starts again, simply to keep it separate from the text.
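
A short sketch of the effect, using the \LaTeX logo command as the simple command. The backslash-space and the empty pair of curly braces are two common ways of getting the word-space back when you do want one after the command:

```latex
\documentclass{article}
\begin{document}

\LaTeX is easy.     % the space is swallowed: prints "LaTeXis easy."

\LaTeX\ is easy.    % backslash-space puts a word-space back

\LaTeX{} is easy.   % an empty pair of braces has the same effect

\end{document}
```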

2.4.2 Commands with arguments

Many LATEX commands are followed by one or more arguments, a term from the field of Computer Science, meaning information to be acted upon. Here are two examples:

\chapter{Poetic Form}
\label{pform}

Such arguments always go in {curly braces} like those shown above. Be careful not to confuse the curly braces on your keyboard with round parentheses ( ), square brackets [ ], or angle brackets < >. They are all different and they do different things.

With commands that take arguments you do not need to use extra white-space after the command, because there is an argument following it which will keep it separate from any normal text which follows after that. The following is therefore perfectly correct (although unusual because it's harder to edit: normally you'd leave a blank line between the chapter title or label and the start of the first paragraph).

\chapter{Poetic Form}\label{pform}The shape of poetry 
when written or printed distinguishes it from prose.
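
Here is the same fragment laid out the usual way, as a sketch of a complete document. It assumes the report class (which provides \chapter), and the final sentence uses \ref, a command not covered in this section, simply to show what the label is for:

```latex
\documentclass{report}
\begin{document}

\chapter{Poetic Form}
\label{pform}

The shape of poetry when written or printed
distinguishes it from prose.

% \ref prints the number of the chapter carrying the label `pform'
% (an illustration only: cross-references are not explained here)
This is the subject of Chapter~\ref{pform}.

\end{document}
```

As with the Table of Contents, the cross-reference needs a second run of LATEX before the number appears.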

2.4.3 White-space in LATEX

In LATEX documents, all multiple spaces, newlines (linebreaks), and TAB characters are treated as if they were a single space or newline during typesetting. LATEX does its own spacing and alignment using the instructions you give it, so you have extremely precise control. You are therefore free to use extra white-space in your editor for optical ease and convenience when editing.

The following is therefore exactly equivalent to the example in the preceding section:

\chapter        {Poetic 
Form}\label{pform}

The shape of poetry when written or printed 
distinguishes it from prose.

That is, it will get typeset exactly the same. In general, just leave a blank line between paragraphs and a single space between words and sentences. LATEX will take care of the formatting.
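
If you want to see this for yourself, a small test file like the following sketch (the wording is invented) typesets as two perfectly ordinary paragraphs, however untidy it looks in your editor:

```latex
\documentclass{article}
\begin{document}

These     words   are
spread    over several
lines with      uneven gaps.



Several blank lines in a row still mean just
one paragraph break.

\end{document}
```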

2.5 Special characters

There are ten keyboard characters which have special meaning to LATEX, and cannot be used on their own except for the following purposes:

Key  Meaning                      If you need the actual     Character
                                  character itself, type:
\    The command character        \textbackslash             \
$    Math typesetting delimiter   \$                         $
%    The comment character        \%                         %
^    Math superscript character   \^{}                       ^
&    Tabular column separator     \&                         &
_    Math subscript character     \_                         _
~    Non-breaking space           \textasciitilde            ~
#    Macro parameter symbol       \#                         #
{    Argument start delimiter     $\{$                       {
}    Argument end delimiter       $\}$                       }

These characters were deliberately chosen, either because they are rare in normal text, or (in the case of $, #, &, and %) they already had an established special meaning on computers as metacharacters (characters standing as symbols for something else) by the time TEX was written, and it would have been misleading to choose others.
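
The following sketch prints all ten characters in running text. It assumes the standard commands \textbackslash and \textasciitilde for the backslash and tilde, puts an empty pair of braces after \^ so that it is not taken as an accent for the next letter, and types the braces as \{ and \} directly, which LATEX also accepts outside math mode:

```latex
\documentclass{article}
\begin{document}

\noindent
\textbackslash\ is the backslash\\
\$ is the dollar sign\\
\% is the percent sign\\
\^{} is the caret\\
\& is the ampersand\\
\_ is the underscore\\
\textasciitilde\ is the tilde\\
\# is the hash mark\\
\{ and \} are the curly braces

\end{document}
```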

2.5.1 Using the special characters

We have already seen (the first paragraph in section 2.4) how to use the backslash to start a command, and curly braces to delimit an argument. The remaining special characters are:


$
    Because of the special mathematical meaning LATEX uses for the dollar-sign on its own, if you want to print $35.99 you type \$35.99

%
    The comment character makes LATEX ignore the remainder of the line in your document, so you can see it in your editor, but it will never get typeset. For example:

    Today's price per kilo is £22.70 % get Mike to update this

    If you want to print 45% you need to type 45\%

^
    The caret sign lets you type \(E=mc^2\) to get E=mc². If you need the circumflex accent on a letter like ê, just type the letter or use the symbolic notation \^e.

&
    The ampersand is used in tables to separate columns (see section 6.3). If you want to print AT&T you need to type AT\&T.

_
    The underscore lets you type \(r_2\) for r₂. If you want to underline text (extremely rare in typesetting) see the last paragraph in section 8.2.3.

~
    The tilde prints as a space, but prevents a linebreak ever occurring at that point. It's often used between a person's initials and their surname, eg Prof D.E.~Knuth

#
    If you want a hash mark (the octothorpe or American number or ‘pound’ [weight] sign) you type \#. For a pound (sterling) sign £, now nearly obsolete except in the UK and some of its former dependencies, use your £ key or type \textsterling.

While we're on the subject of money, an unusual but interesting serif-font Euro sign is got with the \texteuro command from the textcomp package. The standard sans-serif needs the marvosym package and is done with the \EUR command.6

  6. The European Commission has specified that everyone use the sans-serif design, even in serif text, but this is amazingly ugly and most designers rightly ignore it.
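
To pull the section together, here is a sketch of a small document which uses each of these characters in context. The prices and the issue number are invented, and textcomp is loaded only for \texteuro:

```latex
\documentclass{article}
\usepackage{textcomp}   % provides \texteuro
\begin{document}

The book costs \$35.99 at AT\&T, a discount of 45\%.  % this comment is never typeset

Einstein wrote \(E=mc^2\); the second radius is \(r_2\).

Write to Prof.\ D.E.~Knuth about issue \#3, price \textsterling 22.70
or \texteuro 25.00.

\end{document}
```

The space typed after \textsterling and \texteuro only ends the command name; it is swallowed in the usual way, so the currency sign sits right against the figure.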

2.6 Quotation marks

Do not use the unidirectional typewriter keyboard " key for quotation marks. Correct typographic quotes are got with the ` key and the ' key, doubled if you want double quotes:

He said, ``I'm just going out.''

He said, “I'm just going out.”

This ensures you get real left-hand and right-hand (opening and closing) quotes (usually shaped like tiny 66 and 99 or as symmetrically-balanced strokes). If you are using Emacs as your editor, the " key is specially programmed in LATEX-mode to think for itself and produce correct `` and '' characters (so this is one occasion when you can use the " key).

If you are reading this in a browser, or if you have reprocessed the file using different fonts, it may not show you real quotes (some old browser fonts are defective) and the \thinspace below may be too wide. Download the typeset (PDF) version of this document to see the real effect.

When typing one quotation inside another, there is a special command \thinspace which provides just enough separation between double and single quotes (a normal space is too much and could allow an unwanted linebreak):

He said, `Her answer was ``never''\thinspace'.

He said, ‘Her answer was “never” ’.
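
Both examples can be tried in a complete test file along these lines (a minimal sketch, assuming the article class):

```latex
\documentclass{article}
\begin{document}

He said, ``I'm just going out.''

He said, `Her answer was ``never''\thinspace'.

\end{document}
```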

2.7 Accents

For accented letters in western European languages7 or other Latin-alphabet character sets just use the accented keys on your keyboard, if you have the right ones. You must also tell LATEX what character repertoire (‘input encoding’) you are using. You specify this by using the inputenc package8 in your preamble with the relevant option. For example, to tell LATEX you will be typing ISO Latin–1 accented characters, use:

\usepackage[latin1]{inputenc}

If you have a real Unicode editor, which lets you insert any letter or symbol from any language on the planet (for example, mixed European, Asian, and other languages), use utf8 instead of latin1. The encoding definitions that are available on your system are in /texmf/tex/latex/base (all files ending in .def).

If you don't have accented letter keys on your keyboard, you'll need to use your operating system's standard keyboard Ctrl or Alt key combinations to generate the characters (see the panel ‘If you don't have accented letters’ in this section).

If you cannot generate accented characters from your keyboard at all, or if you need additional accents or symbols which are not in any of the keyboard tables, you can use the symbolic notation in Table 2.1. In fact, this can be used to put any accent over any letter: if you particularly want a g̃ you can have one with the command \~g (and Welsh users can get ŵ with \^w).

If you use this symbolic method only, you do not need to use the inputenc package. Before the days of keyboards and screens with their own real accented characters, the symbolic notation was the only way to get accents, so you may come across a lot of older documents (and users!) using this method all the time: it does have the advantage in portability that the LATEX file remains plain ASCII, which will work on all machines everywhere, regardless of their internal encoding, and even with very old TEX installations.9

Table 2.1 Built-in LATEX accents
Accent                  Example   Characters to type
Acute (fada)            é         \'e
Grave                   è         \`e
Circumflex              ê         \^e
Umlaut or diæresis      ë         \"e
Tilde                   ñ         \~n
Macron                  ō         \=o
Bar-under               o̱         \b o
Dot-over (séımhıú)      ṁ         \.m
Dot-under               ṣ         \d s
Breve                   ŭ         \u u
Háček (caron)           ǔ         \v u
Long umlaut             ő         \H o
Tie-after               o͡o        \t oo
Cedilla                 ç         \c c
O-E ligature            œ, Œ      \oe, \OE
A-E ligature            æ, Æ      \ae, \AE
A-ring                  å, Å      \aa, \AA
O-slash                 ø, Ø      \o, \O
Soft-l                  ł, Ł      \l, \L
Ess-zet (scharfes-S)    ß         \ss

Irish and Turkish dotless-ı is done with the special command \i, so an í-fada (which is normally typed with í) requires \'\i if you need to type it in the long format, followed by a backslash-space or dummy pair of curly braces if it comes at the end of a word and there is no punctuation, because of the rule that LATEX control sequences which end in a letter (see the first paragraph in section 2.4.1) always absorb any following space. So what you might see as Rí Teaṁraċ has to be R\'\i\ Tea\.mra\.c when typed in full (there are not usually any keyboard keys for the dotless-ı or the lenited characters). A similar rule applies to dotless-j and to uppercase Í.
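
As a sketch of the symbolic notation in use, the following test file is pure ASCII, so it needs no inputenc package. The sample words are chosen only to exercise the accents, and a space or punctuation mark ends each command which is made of letters:

```latex
\documentclass{article}
\begin{document}

caf\'e, na\"\i ve, fa\c cade, ma\~nana, \AA ngstr\"om

R\'\i\ Tea\.mra\.c   % dotless-i with fada, then backslash-space

Stra\ss e, \oe uvre, \ae sthetic, Sm\o rrebr\o d

\end{document}
```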

  7. ISO 8859-1 (Latin–1, Western European) to 8859-15 (includes the Euro).
  8. We haven't covered the use of packages yet. Don't worry, see section 5.1 if you're curious.
  9. Remember not everyone is lucky enough to be able to install new software: many users on corporate and academic networks still have to use old versions of TEX because their system administrators are too busy to install new ones.

2.8 Dimensions, hyphenation, justification, and breaking

LATEX's internal measurement system is extraordinarily accurate. The underlying TEX engine conducts all its business in units smaller than the wavelength of visible light, so if you ask for 15mm space, that's what you'll get — within the limitations of your screen or printer, of course. Most screens cannot show dimensions of less than 1/96″ without resorting to magnification or scaling; and on printers, even at 600dpi, fine oblique lines or curves can still sometimes be seen to stagger the dots.

At the same time, many dimensions in LATEX's preprogrammed formatting are specially set up to be flexible: so much space, plus or minus certain limits to allow the system to make its own adjustments to accommodate variations like overlong lines, unevenly-sized images, and non-uniform spacing around headings.

TEX uses a very sophisticated justification algorithm to achieve a smooth, even texture to normal paragraph text. The programming for this has been borrowed by a large number of other DTP systems, and users of these are often quite unaware that they are in fact using a significant part of TEX in their work. Occasionally, however, you will need to hand-correct an unusual word-break or line-break, and there are facilities for doing this on individual occasions as well as throughout a document.

2.8.1 Specifying size units

Most people in printing and publishing habitually use points and picas and ems. Some designers use cm and mm. Many English-language speakers still use inches. You can specify lengths in LATEX in any of these units, plus some others (see Table 2.1).

Table 2.1 Units in LATEX
Unit  Size

Printers' fixed measures
pt    Anglo-American standard points (72.27 to the inch)
pc    pica ems (12pt)
bp    Adobe ‘big’ points (72 to the inch)
sp    TEX ‘scaled’ points (65,536 to the pt)
dd    Didot (European standard) points (67.54 to the inch)
cc    Ciceros (European pica ems, 12dd)

Printers' relative measures
em    ems of the current point size (historically the width of a letter ‘M’ but see below)
ex    x-height of the current font (height of letter ‘x’)

Other measures
cm    centimeters (2.54 to the inch)
mm    millimeters (25.4 to the inch)
in    inches

The em can cause beginners some puzzlement because it's based on the ‘point size’ of the type, which is itself misleading. The point size refers to the depth of the metal body on which foundry type was cast in the days of metal typesetting, not the printed height of the letters themselves. Thus the letter-size of 10pt type in one face can be radically different from 10pt type in another (look at the table in section 8.2, where all the examples are 10pt). An em is the height of the type-body in a specific size, so 1em of 10pt type is 10pt and 1em of 24pt type is 24pt.

Another name for a 1em space is a ‘quad’, and LATEX has a command \quad for leaving exactly that much horizontal space. A special name is given to the 12pt em, a ‘pica’ em, as it has become a fixed measure in its own right.

If you are working with other DTP users, watch out for those who think that Adobe points (bp) are the only ones. The difference is only .27pt per inch, but in 10″ of text (a full page of A4) that's 2.7pt, which is nearly 1mm, enough to be clearly visible if you're trying to align one sample with another.
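
To compare the units on paper, you could typeset a test file like this sketch. It uses the standard \rule and \hspace commands (not otherwise covered in this section) to draw lines and leave gaps of stated sizes:

```latex
\documentclass{article}
\begin{document}

\noindent
\rule{72.27pt}{1pt} 72.27pt\\
\rule{72bp}{1pt} 72bp\\
\rule{1in}{1pt} 1in\\
\rule{2.54cm}{1pt} 2.54cm\\
\rule{25.4mm}{1pt} 25.4mm\\
\rule{6pc}{1pt} 6pc (72pt, so very slightly shorter)

A\hspace{15mm}B (a 15mm gap) \qquad C\quad D (a 1em gap)

\end{document}
```

The first five lines should come out the same length, one inch. The arithmetic behind the warning about Adobe points is that 10 × 0.27pt = 2.7pt, and 2.7pt × 25.4 ÷ 72.27 is about 0.95mm.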

2.8.2 Hyphenation

LATEX hyphenates automatically according to the language you use (see section 2.8.6). To specify different breakpoints for an individual word, you can insert soft-hyphens (discretionary hyphens, done with \-) wherever you need them, for example:

When in Mexico, we visited Popoca\-tépetl by helicopter.

To specify hyphenation points for all occurrences of a word, use the \hyphenation command in your preamble (see the panel ‘The Preamble’ in section 3.4) with one or more words in its argument, separated by spaces. This will even let you break ‘helico-
pter’ correctly. In this command you use normal hyphens, not soft-hyphens.

\hyphenation{helico-pter Popoca-tépetl}

If you have frequent hyphenation problems with long, unusual, or technical words, ask an expert about changing the value of \spaceskip, which controls the flexibility of the space between words. This is not something you would normally want to do, as it can change the appearance of your document quite significantly.

If you are using a lot of unbreakable text (see next section and also section 6.6.1) it may also cause justification problems. One possible solution to this is shown in section 9.3.
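
A sketch of the two methods together in one file. To keep the file plain ASCII, the accent is typed here in symbolic notation, and only the unaccented word is given to \hyphenation, on the assumption that accented letters in that list need the right input and font encodings to be set up first:

```latex
\documentclass{article}
\hyphenation{helico-pter}   % in the preamble: applies to every occurrence
\begin{document}

% the soft-hyphen applies to this one occurrence only
When in Mexico, we visited Popoca\-t\'epetl by helicopter.

\end{document}
```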

2.8.3 Unbreakable text

To force LATEX to treat a word as unbreakable, use the \mbox command: \mbox{pneumonoultramicroscopicsilicovolcanoconiosis}.
This may have undesirable results, however, if you change margins or the width of the text: pneumonoultramicroscopicsilicovolcanoconiosis...

To tie two words together with an unbreakable space (hard space), use a tilde (~) instead of the space (see the item ‘~’ in section 2.5.1). This will print as a normal space but LATEX will never break the line at that point. You should make this standard typing practice for things like people's initials followed by their surname, as in Prof. D. E. Knuth: Prof.\ D.~E.~Knuth.

Note that a full point after a lowercase letter is treated as the end of a sentence, and creates more space before the next word. Here, after ‘Prof.’, it's not the end of a sentence, and the backslash-space forces LATEX to insert just an ordinary word-space because it's OK to break the line after ‘Prof.’, whereas it would look wrong to have initials separated with Prof. D.
E. Knuth broken over a line-end.
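
The fragment below gathers these techniques in one place as a small sketch. The chapter and page references are illustrative additions, not examples taken from the text.

```latex
% Backslash-space after the abbreviation gives an ordinary word-space;
% the tildes stop the initials being separated from each other or the surname
Prof.\ D.~E.~Knuth

% Other places where a tie is commonly used (illustrative)
see Chapter~3 on page~42

% The whole word is kept in one piece, even if it sticks out into the margin
\mbox{pneumonoultramicroscopicsilicovolcanoconiosis}
```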

2.8.4 Dashes

For a long dash — what printers call an ‘em rule’ like this — use three hyphens typed together, like~--- this, and bind them to the preceding word with a tilde to avoid the line being broken before the dash. It's also common to see the dash printed without spaces—like that: the difference is purely æsthetic. Never use a single hyphen for this purpose.

Between digits like page ranges (35–47), it is normal to use the short dash (what printers call an en-rule) which you get by typing two hyphens together, as in 35--47. If you want a minus sign, use math mode (section 2.9).
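
As a quick reference, the four marks described in this section can be typed as in the sketch below; the sample wording is made up for illustration.

```latex
% Hyphen: a single keystroke, for compound words
a well-known book

% En rule: two hyphens, between digits
pages 35--47

% Em rule: three hyphens, tied to the preceding word with a tilde
a long dash~--- like this

% Minus sign: a hyphen typed in math mode
a change of \(-30\) degrees
```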

2.8.5 Justification

The default mode for typesetting is justified (two parallel margins, with word-spacing adjusted automatically for the best optical fit). In justifying, LATEX will never add space between letters, only between words. There is a special package called soul, whose \so (‘space-out’) command provides special effects like letter-spacing, but these are best left to the expert.

There are two commands \raggedright and \raggedleft which set ragged-right (ranged left) and ragged-left (ranged right). Use them inside a group (see the panel ‘Grouping’ in section 8.2.2) to confine their action to a part of your text.
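
A minimal sketch of confining \raggedleft to one paragraph with a group is shown below. The sample wording is illustrative; the point to notice is that the paragraph has to end (here with \par) before the group closes.

```latex
{\raggedleft
This paragraph is set ragged-left (ranged right).
\par} % end the paragraph inside the group, otherwise the setting is lost before it is used

This paragraph is outside the group, so it is justified as usual.
```

LATEX applies the paragraph settings in force at the moment the paragraph ends, which is why the \par comes before the closing brace.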

These modes also exist as ‘environments’ (see the last paragraph in section 3.2) called raggedright and raggedleft which are more convenient when applying this formatting to a whole paragraph or more, like this one.

These modes also exist as environments called raggedright 
and raggedleft which are more convenient when applying this 
formatting to a whole paragraph or more, like this one.

Ragged setting turns off hyphenation. There is a package ragged2e which retains hyphenation in ragged setting, useful when you have a lot of long words.
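
As a hedged sketch, ragged2e can be used as follows. \RaggedRight is the package's counterpart to \raggedright; the document class and sample text are assumptions for illustration.

```latex
\documentclass{article}
\usepackage{ragged2e}
\begin{document}
{\RaggedRight
Long words such as pneumonoultramicroscopicsilicovolcanoconiosis can
still be hyphenated here, which keeps the right-hand edge less uneven.
\par}
\end{document}
```

The package provides \RaggedLeft in the same way for ragged-left setting.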

2.8.6 Languages

LATEX can typeset in the native manner for several dozen languages. This affects hyphenation, word-spacing, indentation, and the names of the parts of documents used as headings (e.g. Table of Contents).

Most distributions of LATEX come with US English and one or more other languages installed by default, but it is easy to add the babel package and specify any of the supported languages or variants, for example:

\usepackage[french]{babel}

Changing the language with babel automatically changes the names of the structural units and identifiers like ‘Abstract’, ‘Index’, etc. to their translated version. For example, using French as above, chapters will start with ‘Chapitre’.10

  10. Note that the babel package also sets the hyphenation patterns provided your version of LATEX has them precompiled (see the start of your log files for a list). For other languages you need to set the hyphenation separately (outside the scope of this book).
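
To see the translated names in action, a complete test file along the following lines can be typeset. It assumes the babel option is spelled french (older installations may use a different spelling), and the report class and T1 encoding are illustrative choices.

```latex
\documentclass{report}
\usepackage[T1]{fontenc}   % assumption: T1 fonts for accented letters
\usepackage[french]{babel} % assumption: the option name on your system is `french'
\begin{document}
\tableofcontents           % the heading is printed as `Table des matières'
\chapter{Introduction}     % the chapter heading starts with `Chapitre 1'
Bonjour.
\end{document}
```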

2.9 Mathematics

As explained in the Preface, TEX was originally written to automate the typesetting of books containing mathematics. The careful reader will already have noticed that mathematics is handled differently from normal text, which is why it has to be typeset specially. This document does not cover mathematical typesetting, which is explained in detail in many other books and Web pages, so all we will cover here is the existence of the math mode commands and some characters which have special meaning, so that they don't trip you up elsewhere.

In addition to the 10 special characters listed in section 2.5, there are three more characters which only have any meaning inside mathematics mode:

Key Meaning
| Vertical bar
< Less-than
> Greater-than

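If you type any of these in normal text (i.e. outside math mode), you are likely to get the wrong characters in your output: with the default font encoding, for example, < and > typically print as ¡ and ¿. If you need to print these characters, you must type them using math mode, or use the text commands \textbar, \textless, and \textgreater. The textcomp package additionally provides the related symbols \textbrokenbar, \textlangle, and \textrangle (a broken bar and angle brackets, which are not the same characters).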
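
A minimal sketch of two ways to print these characters in running text follows. The \textbar, \textless and \textgreater commands are standard LATEX text commands; the sample sentences are illustrative.

```latex
% Typed in math mode
The symbols \(|\), \(<\) and \(>\) are set in math mode here.

% Typed with text commands; the empty braces preserve the following space
The same symbols are \textbar, \textless{} and \textgreater{} in running text.
```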

The hyphen also has an extra meaning in math mode: it typesets as a minus sign, so if you want to write about negative numbers you need to type the number in math mode so the minus sign and the spacing come out right.

To use math mode within a paragraph, enclose your math expression in \( and \) commands. You can get the much-quoted equation E=mc² by typing \(E=mc^2\), and to get a temperature like −30° you need to type \(-30\)°.11

To typeset a math expression as ‘displayed math’ (centered between paragraphs), enclose it in the commands \[ and \].12

\[\bar n^*_j(s)=\frac{\left\{s\sum_{i=1}^k

Displayed equations can be auto-numbered with the equation environment instead of the \[ and \] commands.
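
For instance, a numbered version of the equation used earlier could be typed as in this sketch. The label name eq:energy is hypothetical, and the \label and \ref commands are added here only to show how the automatic number can be referred to.

```latex
\begin{equation}
E = mc^2
\label{eq:energy} % hypothetical label name
\end{equation}
Equation~\ref{eq:energy} is numbered automatically.
```

The file usually needs to be typeset twice before the reference shows the right number.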

  11. Bear in mind that the degree symbol is a non-ASCII character, so you must specify what input encoding you are using if you want to type it: see the example of the inputenc package in section 2.7. If you don't want to use non-ASCII characters (or if you are using a system which cannot generate them), you can use the command \textdegree to get the degree sign.
  12. You will also see dollar signs used for math mode. This is quite common but deprecated: it's what plain TEX used in the days before LATEX, and the habit got ingrained in many mathematicians. It still works as a convenient shorthand like $x=y$, as do double-dollars for display-mode math like $$E=mc^2$$, but they are only mentioned here to warn readers seeing them in other authors' work that \(...\) and \[...\] are the proper LATEX commands.
