The e-TeX format source file "etex.src" (V2.0)

The primary e-TeX format source file, "etex.src", is in principle merely a wrapper for "plain.tex", providing modified definitions for some Plain TeX commands (at present, just one: \tracingall), improving and generalising the register allocation mechanism, and adding new commands.

In so doing, we have taken the opportunity to (a) provide intrinsic support for multiple-language typesetting (by deferring the processing of patterns and exceptions until a rudimentary language-handling environment has been defined), (b) provide a local as well as a global register allocation mechanism, (c) provide for the allocation of blocks of registers as well as single registers, (d) provide a means of allocating and accessing vectors (monodimensional arrays) of registers, and finally (e) provide a simple but effective module-handling system, to allow e-TeX ancillary source files to be structured as libraries rather than as flat linear text files.

As the new commands and other features are not documented elsewhere, a brief explanation of their syntax and semantics is provided here.

\tracingall
The definition is augmented to enable tracing for the new e-TeX tracing primitives \tracingassigns, \tracinggroups, \tracingifs, \tracingnesting and \tracingscantokens; the numeric value assigned to the TeX primitives \tracingcommands and \tracinglostchars is increased as e-TeX will report additional detail in these circumstances.

\eTeX
A simple implementation of the e-TeX logo; a more sophisticated version, capable of being used in maths sub/superscripts for example, may find its way into etexdefs.lib in due course.

\loggingall
This command is equivalent to the sequence \tracingall \tracingonline = 0 .

\tracingnone
This command restores the initial state of the various \tracing... primitives following use of \tracingall or \loggingall.
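By way of illustration (a minimal sketch; the material being traced is of course arbitrary), a suspect fragment might be bracketed as follows so that the diagnostics go to the log file alone:
        \loggingall      % everything that \tracingall enables, but with \tracingonline = 0
        \setbox 0 = \hbox {A fragment whose construction we wish to inspect}
        \tracingnone     % restore the initial state of the \tracing... primitives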

\newmarks
As e-TeX provides 2^15 \marks rather than the single \mark of TeX, an allocator mechanism is required; we believe that \marks are closer to \boxes than to (say) \counts or \dimens and so have provided an analogous allocation mechanism, in that \newmarks <control sequence or active character> assigns a numeric value to the parameter rather than making it a synonym for an actual \mark; this numeric value can then be used to access individual \marks, \topmarks, \splitbotmarks, etc., as in
        \newmarks \rectomarks
         . . .
        \marks \rectomarks {This may form part of the recto running head}
         . . .
        \leftline {\topmarks \rectomarks}
It should be noted that as \marks 0 is synonymous with \mark, \newmarks will never allocate that particular value.
Note: This command was called \newmark in V1.1; the alternative spelling has been retained for compatibility, although it is now classed as deprecated.

\globbox, \globcount, \globdimen, \globmarks, \globmuskip, \globskip, \globtoks
Analogous to TeX's \newbox (etc), these commands globally allocate registers from e-TeX's extended register pool (i.e. from the register range from 2^0 (for \marks) or from 2^8 (for all other classes) to 2^15-1). Registers are globally allocated from the lower end of the range.
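For example (a sketch only: \chaptertally and \figurestore are hypothetical names, and we assume, as the analogy suggests, that the resulting control sequences are used exactly as those produced by \newcount and \newbox are):
        \globcount \chaptertally        % a \count register from the extended pool
        \globbox \figurestore           % as with \newbox, \figurestore denotes a box number
        \chaptertally = 0
        \advance \chaptertally by 1
        \setbox \figurestore = \hbox {Held over for a later page}
        \copy \figurestore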

\locbox, \loccount, \locdimen, \locmarks, \locmuskip, \locskip, \loctoks
Analogous to \globbox (etc), these commands locally allocate registers from e-TeX's extended register pool (i.e. from the register range from 2^0 (for \marks) or from 2^8 (for all other classes) to 2^15-1). Registers are locally allocated from the upper end of the range.
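A local allocation is most naturally made within a group, as in the following sketch (\scratchwidth is a hypothetical name; we assume, as the word "locally" implies, that both the name and the register are released when the group ends):
        \begingroup
          \locdimen \scratchwidth
          \scratchwidth = 12pt
          \advance \scratchwidth by 3pt
          \message {Scratch width is \the \scratchwidth}
        \endgroup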

\globboxblk, \globcountblk, \globdimenblk, \globmarksblk, \globmuskipblk, \globskipblk, \globtoksblk
These commands extend \globbox (etc) by globally allocating contiguous blocks of registers from e-TeX's extended register pool. The syntax used is:
        \glob(whatever)blk <control sequence or active char> n
where n specifies the length of the desired block. As n is parsed as an undelimited parameter, it must be expressed as a balanced text if it exceeds a single token. The <control sequence or active character> will be \mathchardef'd to the ordinal of the lowest register allocated.
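The following sketch (in which \tallybase is a hypothetical name) allocates a block of twelve \count registers and then addresses the first and the last of them; note the braces around 12, which exceeds a single token, and the use of e-TeX's \numexpr to form the offset from the lowest register:
        \globcountblk \tallybase {12}
        \count \tallybase = 0                           % lowest register of the block
        \count \numexpr \tallybase + 11 \relax = 99     % highest register of the block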

\locboxblk, \loccountblk, \locdimenblk, \locmarksblk, \locmuskipblk, \locskipblk, \loctoksblk
These commands extend \locbox (etc) by locally allocating contiguous blocks of registers from e-TeX's extended register pool. The syntax used is:
        \loc(whatever)blk <control sequence or active char> n
where n specifies the length of the desired block. As n is parsed as an undelimited parameter, it must be expressed as a balanced text if it exceeds a single token. The <control sequence or active character> will be \mathchardef'd to the ordinal of the lowest register allocated.

\globcountvector, \globdimenvector, \globmuskipvector, \globskipvector, \globtoksvector
An extension to \globcountblk (etc), these commands use e-TeX's arithmetic expression capabilities to globally allocate vectors of counts (etc) from e-TeX's extended register pool. The syntax used is:
        \glob(whatever)vector <control sequence or active char> n
where n specifies the length of the desired block. As n is parsed as an undelimited parameter, it must be expressed as a balanced text if it exceeds a single token. Once the vector has been defined, element m can be accessed in both left- and right-hand contexts as
        <control sequence or active char> m
where 0 <= m < n. As with n, m must be expressed as a balanced text if it exceeds a single token.
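For instance (a sketch under the syntax given above; \tally is a hypothetical name):
        \globcountvector \tally {20}     % elements 0 to 19
        \tally 3 = 7                     % element 3 in a left-hand context
        \tally {12} = \tally 3           % braces are needed for the two-token index 12
        \message {Element 12 holds \the \tally {12}}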

\globboxvector, \globmarksvector
Analogous to but subtly different from the above, these commands use e-TeX's arithmetic expression capabilities to globally allocate vectors of boxes or marks from e-TeX's extended register pool. The syntax used is:
        \glob(box-or-marks)vector <control sequence or active char> n
where n specifies the length of the desired block. As n is parsed as an undelimited parameter, it must be expressed as a balanced text if it exceeds a single token. Once the vector has been defined, element m can be accessed in left-hand contexts as
        <box-or-marks-referencer> <control sequence or active char> m
and in right-hand contexts as
        <box-or-marks-dereferencer> <control sequence or active char> m
where 0 <= m < n. As with n, m must be expressed as a balanced text if it exceeds a single token.

The significance of <box-or-marks-(de)referencer> is that boxes and marks are unlike other registers in that there exists a whole family of (de)referencers, one of which must be used in order to access the particular element required. For boxes, the sole referencer is \setbox, whilst the possible dereferencers include \box, \copy, \unhbox, \unvbox, \unhcopy and \unvcopy. For marks, the sole referencer is \marks, whilst the possible dereferencers include \topmarks, \firstmarks, \botmarks, \splitfirstmarks and \splitbotmarks.
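A box vector might thus be used as follows (a sketch only; \panel is a hypothetical name):
        \globboxvector \panel {4}                      % elements 0 to 3
        \setbox \panel 2 = \hbox {The third panel}     % \setbox is the referencer
        \copy \panel 2                                 % dereference, leaving the element intact
        \box \panel 2                                  % dereference, emptying the element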

\loccountvector, \locdimenvector, \locmuskipvector, \locskipvector, \loctoksvector
An extension to \loccountblk (etc), these commands use e-TeX's arithmetic expression capabilities to locally allocate vectors of counts (etc) from e-TeX's extended register pool. The syntax used is:
        \loc(whatever)vector <control sequence or active char> n
where n specifies the length of the desired block. As n is parsed as an undelimited parameter, it must be expressed as a balanced text if it exceeds a single token. Once the vector has been defined, element m can be accessed in both left- and right-hand contexts as
        <control sequence or active char> m
where 0 <= m < n. As with n, m must be expressed as a balanced text if it exceeds a single token.

\locboxvector, \locmarksvector
Analogous to but subtly different from the above, these commands use e-TeX's arithmetic expression capabilities to locally allocate vectors of boxes or marks from e-TeX's extended register pool. The syntax used is:
        \loc(box-or-marks)vector <control sequence or active char> n
where n specifies the length of the desired block. As n is parsed as an undelimited parameter, it must be expressed as a balanced text if it exceeds a single token. Once the vector has been defined, element m can be accessed in left-hand contexts as
        <box-or-marks-referencer> <control sequence or active char> m
and in right-hand contexts as
        <box-or-marks-dereferencer> <control sequence or active char> m
where 0 <= m < n. As with n, m must be expressed as a balanced text if it exceeds a single token.

The significance of <box-or-marks-(de)referencer> is that boxes and marks are unlike other registers in that there exists a whole family of (de)referencers, one of which must be used in order to access the particular element required. For boxes, the sole referencer is \setbox, whilst the possible dereferencers include \box, \copy, \unhbox, \unvbox, \unhcopy and \unvcopy. For marks, the sole referencer is \marks, whilst the possible dereferencers include \topmarks, \firstmarks, \botmarks, \splitfirstmarks and \splitbotmarks.

\reserveinserts
As there are now so many registers available, there is a risk that a macro package may allocate so many that there are none of the first 255 left for use by insertions (which cannot use the extended register set). \reserveinserts n allows a package writer or user to reserve an additional n insertions above and beyond those already allocated. The syntax used is:
        \reserveinserts n
n must be expressed as a balanced text if it exceeds a single token.
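A cautious user might therefore write something like the following before loading a register-hungry package (a sketch only: the package name bigmacros and the insertion class \marginnotes are hypothetical):
        \reserveinserts {12}      % braces are needed, as 12 exceeds a single token
        \input bigmacros          % may now allocate freely without exhausting the low registers
        \newinsert \marginnotes   % one of the insertions reserved above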

\load
Although (Plain) TeX provides facilities for either \inputting a complete file or for \reading a file line-by-line, it makes no provision for any intermediate level of file access. In e-TeX, we provide facilities for \inputting one or more named modules from a suitably structured library file. The syntax used is:
        \load <module>[, <module>...] from <file>
whilst the library file itself should be structured as:
        %% e-TeXlib Vx.y
        \module {<name>}
         . . .
        \endmodule

        \module {<name>}
         . . .
        \endmodule
   
         etc.  
The %% header is required, and the actual values in Vx.y must correspond to the current version/revision of e-TeX; for the current release, the header must therefore be:
        %% e-TeXlib V2.0
If a library file is changed during the lifetime of the system, it is recommended (but not required) that this amendment be recorded in a cycle number appended to the header; a cycle number is of the form ;digit[s], and thus a valid header for the current release of e-TeX might be any of:
        %% e-TeXlib V2.0
        %% e-TeXlib V2.0;1
        %% e-TeXlib V2.0;247
etc. The necessity for a perfect match between the library header and the current version/revision of e-TeX may be relaxed in a future release if it transpires that no changes in the structure of user libraries are required for compatibility with future versions of e-TeX.
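To make this concrete, a small private library might look as follows (a sketch only: the file name mylib.lib, the module names and the macros defined within them are all hypothetical):
        %% e-TeXlib V2.0;1
        \module {credits}
          \def \NTSgroup {the NTS group}
        \endmodule

        \module {rules}
          \def \thinrule {\hrule height 0.4pt}
        \endmodule
A document requiring both modules would then load them with:
        \load credits, rules from mylib.lib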

Modules in the standard library (etexdefs.lib)
e-TeX is distributed with a standard library which provides mnemonic names for the various values which can be returned by the new primitives. The library contains four modules: grouptypes, nodetypes, interactionmodes and iftypes. The standard format source, etex.src, loads interactionmodes by default; the others can be loaded using the appropriate one of the following:
	\load grouptypes from etexdefs.lib
	\load nodetypes from etexdefs.lib
	\load iftypes from etexdefs.lib
Once the relevant module has been loaded, the numeric values associated with each of the possible types/modes can be retrieved using one of the following commands with one of the parameters shewn:
	\grouptypes
			{simple}
			{hbox}
			{adjustedhbox}
			{vbox}
			{vtop}
			{align}
			{noalign}
			{output}
			{math}
			{disc}
			{insert}
			{vcenter}
			{mathchoice}
			{semisimple}
			{mathshift}
			{mathleft}

	\nodetypes
			{char}
			{hlist}
			{vlist}
			{rule}
			{ins}
			{mark}
			{adjust}
			{ligature}
			{disc}
			{whatsit}
			{math}
			{glue}
			{kern}
			{penalty}
			{unset}
			{maths}

	\conditionaltypes
			{charif}
			{catif}
			{numif}
			{dimif}
			{oddif}
			{vmodeif}
			{hmodeif}
			{mmodeif}
			{innerif}
			{voidif}
			{hboxif}
			{vboxif}
			{xif}
			{eofif}
			{trueif}
			{falseif}
			{caseif}
			{definedif}
			{csnameif}
			{fontcharif}

	\interactionmodes
			{batch}
			{nonstop}
			{scroll}
			{errorstop}
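As an illustration of how these mnemonics might be used (a sketch only: it assumes that \grouptypes {hbox} and \interactionmodes {batch} expand to something that TeX will accept as a <number>, which the description above implies but does not state):
        \load grouptypes from etexdefs.lib
        \ifnum \currentgrouptype = \grouptypes {hbox}
          \message {Currently inside an \string\hbox}%
        \fi
        \interactionmode = \interactionmodes {batch}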

Multiple language typesetting
When TeX gained \language and \setlanguage primitives with the advent of TeX 3.0, no change was made to the Plain TeX source code to really exploit these features with the single exception of the \newlanguage command. In "etex.src", we defer the loading of patterns and hyphenation exceptions until a rudimentary language handling environment has been defined. We now assume that the user (or rather the format-creator) will, if required, modify the file called "language.def" by adding the various languages to be supported by the format. Each entry apart from the last in "language.def" is of the form:
        \addlanguage {<language>}
                     {<patterns file>}
                     {<exceptions file>}
                     {<left hyphen min>}
                     {<right hyphen min>} %%% shewn wrapped for clarity
The first line must be:
        \addlanguage {USenglish}{hyphen.tex}{}{2}{3}
whilst the last must be
        \uselanguage {USenglish}
to ensure that legacy documents not explicitly specifying a language process in a manner identical to TeX (that is, using American English patterns, exceptions and left- and right-hyphen minima). In the absence of a suitable language.def file, the default fallback mode (USenglish, with the canonical patterns, exceptions and left- and right-hyphen minima for TeX) will be used.
Within the user document, \uselanguage {<some language>} should be used to switch languages, which will have the effect of loading the appropriate patterns, exceptions, and left- and right-hyphen minima. To allow the use of more powerful language-handling environments (e.g. Babel), the \uselanguage command finishes by testing whether the putative control sequence \uselanguage@hook is defined; if it is, then it is expanded, passing as parameter the name of the language to be used.
It should be noted that \uselanguage is automatically invoked during the expansion of \addlanguage prior to the reading of patterns; a further hook, \addlanguage@hook, is invoked in an identical manner after the reading of patterns and exceptions so as to allow (for example) category-codes to be changed for the duration of the pattern- and exception-loading régime.
This code is still classed as experimental, and if it transpires that a superior mechanism would improve the interface to Babel or LaTeX, it may be enhanced in the future.
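A complete "language.def" supporting one additional language might therefore read as follows (a sketch only: the language name UKenglish, the availability of a patterns file called ukhyphen.tex, and the minima chosen for it are assumptions made for illustration):
        \addlanguage {USenglish}{hyphen.tex}{}{2}{3}
        \addlanguage {UKenglish}{ukhyphen.tex}{}{2}{3}
        \uselanguage {USenglish}
A package wishing to be told of each language switch might then define the hook along the following lines (again a sketch; the body of the hook is arbitrary, and the category code of @ is changed only so that the hook's name can be typed):
        \catcode `\@ = 11
        \def \uselanguage@hook #1{\message {[switching to #1]}}
        \catcode `\@ = 12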

The e-TeX format source "etex.src" is a product of the NTS group.


Please report any errors in this document to its creator.
Last updated and validated 24-MAR-1998 19:45:12 /PT