mf2pt3 (c) Copyright 1998 Apostolos Syropoulos [email protected]

This is mf2pt3, a Perl script that generates a PostScript Type 3 font corresponding to a METAFONT font description. To achieve its goal the program relies on another program: mfplain (METAPOST with the mfplain base preloaded), which generates an EPSF file for each character. This document assumes that the reader is familiar with Type 3 font terminology and structure. For more information one should consult the ``PostScript User Manual'', or any good book on PostScript such as ``PostScript by Example'', by Henry McGilton and Mary Campione, published by Addison-Wesley Pub Co.

The program operates as follows. First, we generate an EPSF file for each character of the METAFONT font. Next, we collect the BoundingBox information for each character, as it is vital to the construction of the Type 3 font. Then we construct the font itself. Finally, we delete some unnecessary files and output the line the user must add to the psfonts.map file in order to use the Type 3 font when creating a PostScript file from a DVI file.

<*>=
#!/usr/bin/perl
#
#(c) Copyright 1998 Apostolos Syropoulos
#                   [email protected]
# 
<Initialization of constants>
<Command line argument handling>
<Generation of EPSF files>
<Construction of Type 3 font>
<Delete unnecessary files>
print "\n$MFfile $MFfile <$MFfile.pt3\n";

This code is written to a file (or else not used).

Since we don't know on which system the program will be used, we must make sure it calls the GhostScript and METAPOST programs in the proper way. Moreover, we supply each command with the proper command line switches. The magnification is set to 100, as the usual design size is 10 pt. The BoundingBox information is kept in a compact format in an array.

<Initialization of constants>=
   $mfplain="mfplain \'\\mode=localfont; \\batchmode; ";
   <BoundingBox initialization>
   <Encoding array initialization>

Used above.

Since Perl does not provide record structures, we use the pack function to create a structure that holds the BoundingBox information. Each BoundingBox consists of four numbers: llx, lly, urx, and ury. If any of the 256 character slots is undefined, each of these four numbers is set to zero. For efficiency reasons each BoundingBox structure contains one more piece of information: an ASCII character that indicates whether the corresponding character is defined ("d") or undefined ("u"). All the BoundingBox information is kept in an array, which initially contains only undefined characters.

<BoundingBox initialization>=
$notdef=pack("ai4","u",0,0,0,0);
for($i=0; $i<=255; $i++){ $BoundingBox[$i]=$notdef }

Used above.
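
As an illustration of the record layout described above, the following sketch packs and unpacks one BoundingBox record with the same "ai4" template. The helper names make_bbox and read_bbox are ours and do not appear in mf2pt3; the sample numbers are made up.

```perl
use strict;
use warnings;

# Hypothetical helpers; only the template "ai4" comes from mf2pt3.
# "a"  = one ASCII character (the defined/undefined flag)
# "i4" = four signed integers (llx, lly, urx, ury)
sub make_bbox {
    my ($flag, @box) = @_;
    return pack("ai4", $flag, @box);
}

sub read_bbox {
    my ($record) = @_;
    return unpack("ai4", $record);
}

my $undefined = make_bbox("u", 0, 0, 0, 0);    # an empty character slot
my $defined   = make_bbox("d", 0, -1, 6, 6);   # a made-up defined character

my ($flag, $llx, $lly, $urx, $ury) = read_bbox($defined);
print "$flag: $llx $lly $urx $ury\n";
```

Calling unpack in scalar context, as some later chunks do, yields only the first field, i.e., the flag.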

The encoding vector is a vital part of a PostScript font. The internal name of each character is completely irrelevant to the final output, so one may choose any names one pleases.

<Encoding array initialization>=
# The 256 internal character names "/_a0", "/_a1", ..., "/_a255".
@Encoding = map { "/_a$_" } 0 .. 255;

Used above.

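The program accepts at most five command line arguments:

   -dsize       the design size of the font
   -nodel       do not delete the intermediate files
   -eofill      end each character description with eofill
   -IUniqueID   the UniqueID of the generated font
   the name of the METAFONT file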

If the program is invoked without any command line arguments, it prints usage information. Only the font name is required. Command line processing is done in a relatively standard way: a while loop goes through each command line argument, checks its form, and sets the corresponding global variables. In the case of the UniqueID argument we make sure it lies within the valid range, i.e., it is a number greater than or equal to 4,000,000 and less than 5,000,000. Once we have finished scanning the command line arguments, we must further process certain pieces of information.
<Command line argument handling>=
$argc = @ARGV;
$design_size = -1;
$nodel = 0;
$eofill = 0;
$noID = 1;
SWITCHES: while($_ = $ARGV[0], /^-/)
{
    shift;
    if(/^-d(\d+)/)
    {
        $design_size = $1;
    }
    elsif(/^-nodel$/)
    {
        $nodel = 1;
    }
    elsif(/^-eofill$/)
    {
        $eofill = 1;
    }
    elsif(/^-I(\d+)$/)
    {
      die "UniqueID must lie in the range 4,000,000...4,999,999\n"
      if ($1 > 4999999 || $1 < 4000000);
      $UniqueID = $1;
      $noID = 0;
    }
    elsif (!@ARGV)
    {
        last SWITCHES;
    }
}    
if (!@ARGV)
{
   print <<Usage;
This is ``mf2pt3'' version 1.1
Usage: mf2pt3 [-dsize] [-nodel] [-eofill] [-IUniqueID] <METAFONT file name>
Usage
exit(0);
}
else
{
   $MFfile = $ARGV[0];
}
<Further command line argument processing>

Used above.

If the user has not specified a Unique Font Identity number, the program must generate one. Next, we remove the file name extension in order to get the name of the new PostScript font; this must happen before the design size is examined, since the design size is looked for at the very end of the font name. Finally, if the user hasn't explicitly specified a design size, the program extracts it from the font name.

<Further command line argument processing>=
if ($noID)
{
   <Generate UniqueID>
}
<Remove file extension>
<Get design size>

Used above.

In case the user hasn't specified a UniqueID we must generate one. For this purpose we use the function rand, after seeding the random number generator with srand, in order to ensure some sort of... randomness. Since rand produces a number in the range 0...1 (1 excluded), we multiply its output by 999999 and truncate the product, so that we have an integer below 999999; we then add 4000000 so that the final random number is in the expected range.

<Generate UniqueID>=
   srand();
   $UniqueID = int(999999*rand())+4000000;

Used above.
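
The range claim above can be checked with a small sketch. The function name random_unique_id is ours; its body repeats the expression of the chunk above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the expression used by mf2pt3.
sub random_unique_id {
    # rand() is always below 1, so the truncated product is at most 999998.
    return int(999999 * rand()) + 4000000;
}

srand();    # seed the generator, as the program does
my $id = random_unique_id();
print "UniqueID: $id\n";
```

Every value produced this way passes the range check applied to the -I switch.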

As is well known, a METAFONT font name consists of two parts: a symbolic acronym specifying its characteristics, and a number specifying its design size. The number can be either a two or a four digit number (compare cmr17 with ecsx1095). In both cases we extract the number from the font name; in the first case we divide it by 10 and in the second case by 1000 to get the magnification factor. Finally, we set the magnification to 100 divided by this factor. The number 100 is chosen because font data must be integers greater than 100. Of course, in case the user has already given the design size, there is no reason to extract it from the font name. Still, we must process it in order to ensure the generation of valid output.

<Get design size>=

if ($design_size == -1)
{
   if ($MFfile =~ /\D+(\d+)$/)
   {
      $design_size=$1;
   }
   else
   {
      die "$MFfile must be a METAFONT font name: there is no design size.\n";
   }
}
if($design_size >100)
{
   $mag_factor=$design_size/1000;
}
else
{
   $mag_factor=$design_size/10;
}
$mag = 100 /$mag_factor;

Used above.
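
To make the arithmetic concrete, here is a small sketch that applies the same rule to a few font names. The function name magnification is ours; the threshold of 100 and both divisors are taken from the chunk above.

```perl
use strict;
use warnings;

# Hypothetical helper that repeats the computation of the chunk above.
sub magnification {
    my ($design_size) = @_;
    # Numbers above 100 are read as thousandths (1095 -> 1.095),
    # smaller ones as tenths (17 -> 1.7).
    my $mag_factor = $design_size > 100 ? $design_size / 1000
                                        : $design_size / 10;
    return 100 / $mag_factor;
}

for my $font (qw(cmr10 cmr17 ecsx1095)) {
    my ($size) = $font =~ /\D+(\d+)$/;    # same pattern as in the program
    printf "%-9s design size %4d  mag %.2f\n", $font, $size, magnification($size);
}
```

For cmr10 the factor is 1 and the magnification is exactly 100, which agrees with the remark that 10 pt is the usual design size.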

If the user supplies the METAFONT file name with an extension, we simply chop it off. This is done quite simply by employing Perl's fantastic regular-expression mechanism.

<Remove file extension>=
   $MFfile = $1 if $MFfile =~ /(\w+)\.\w*/;

Used above.

We now proceed to the generation of the EPSF files. This task is performed by METAPOST. First, the $mfplain command is augmented with the magnification and the input part; then we create the EPSF files by executing it. If for some reason there is no TFM file, the program stops and prints an error message. (Most likely the user has typed the name of a non-existent METAFONT font.)

<Generation of EPSF files>=
   $mfplain .= "mag=$mag; input $MFfile \'";
   system($mfplain);
   if (!(-e "$MFfile.tfm"))
   {
      $nodel || unlink "mpout.log";  
      die "$MFfile: no such font in system\n";
   }

Used above.

Now that the various EPSF files have been generated successfully, we can start collecting the BoundingBox information. First, we get the names of all EPSF files. Next, we open each EPSF file and find the line that contains the BoundingBox information. This is easy, since in an EPSF file the line that contains this piece of information looks like the following one: %%BoundingBox: 0 -1 6 6. While doing this we also compute the FontBBox information, for which we use four variables. The next step is to produce the first part of the font, i.e., the metrics section and the encoding information.

<Construction of Type 3 font>=
     <Get the file names of the EPSF files>
     $Min_llx = $Min_lly = $Max_urx = $Max_ury = 0; 
     <get bounding boxes>
<Generate Type 3 Font>
     
  
Used above.

It would be easy to get the file names by a simple pipe, but since this program may be used on operating systems other than Unix, we prefer to do it in a more portable way: we simply open the directory and store all the file names that consist of $MFfile followed by a dot and a number.

<Get the file names of the EPSF files>=
   opendir(Dir, ".") || die "Can't open the current directory\n";
   # Anchor the pattern so that only names such as cmr10.65 are kept.
   @EPSFs = grep(/^\Q$MFfile\E\.\d+$/, readdir(Dir));
   closedir Dir;

Used above.

In order to get the BoundingBox information we open each EPSF file and find the line that contains it. For this we simply employ the pattern matching capabilities of Perl. Then we store this information in a compact way at the appropriate index of the BoundingBox array. (Readers not familiar with regular expressions should consult the Perl manual.) As a side effect we calculate the total number of characters that the font will provide in variable $total_chars (plus one, as there is always the /.notdef character).

<get bounding boxes>=

   $total_chars = @EPSFs+1;
   foreach $file (@EPSFs)
   {
      open(EPSF_FILE,"$file")||die "Can't open file $file\n";
      while (<EPSF_FILE>)
      {
         $BBox = pack("ai4","d",$1,$2,$3,$4)
         if /%%BoundingBox: (-?\d+) (-?\d+) (-?\d+) (-?\d+)/;
      }
      close EPSF_FILE;
      $_=$file;
      /$MFfile\.(\d+)/;
      $BoundingBox[$1] = $BBox;
      ($_, $llx, $lly, $urx, $ury) = unpack("ai4", $BBox);
      $Min_llx = $llx if $llx < $Min_llx;
      $Min_lly = $lly if $lly < $Min_lly;
      $Max_urx = $urx if $urx > $Max_urx;
      $Max_ury = $ury if $ury > $Max_ury;
   }

Used above.
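
The following sketch isolates the two steps of the chunk above: matching a %%BoundingBox line and widening the font-wide box. The function name parse_bbox and the two sample lines are ours; the regular expression is the one used by the program.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (llx, lly, urx, ury), or an empty list
# when the line is not a BoundingBox comment.
sub parse_bbox {
    my ($line) = @_;
    return $line =~ /%%BoundingBox: (-?\d+) (-?\d+) (-?\d+) (-?\d+)/;
}

# Made-up lines standing in for the EPSF files of two characters.
my @lines = ("%%BoundingBox: 0 -1 6 6", "%%BoundingBox: -2 0 9 4");

my ($min_llx, $min_lly, $max_urx, $max_ury) = (0, 0, 0, 0);
for my $line (@lines) {
    my ($llx, $lly, $urx, $ury) = parse_bbox($line) or next;
    $min_llx = $llx if $llx < $min_llx;
    $min_lly = $lly if $lly < $min_lly;
    $max_urx = $urx if $urx > $max_urx;
    $max_ury = $ury if $ury > $max_ury;
}
print "/FontBBox [ $min_llx $min_lly $max_urx $max_ury ] def\n";
```

Since all four extremes start at zero, the resulting FontBBox always contains the origin.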

We now have all the information we need in order to generate the complete Type 3 font. We first create the font file and then print to it some pretty standard information, such as the /FontType, etc. Next, we let PostScript know which characters the font will provide, and then we generate the BoundingBox dictionary, the Metrics dictionary, and the CharProcs dictionary. Finally, we generate the BuildGlyph procedure and we define the font.

<Generate Type 3 Font>=
      open(TYPE3, ">$MFfile.pt3")||die "Can't create file $MFfile.pt3\n";
      $date = localtime;
      print TYPE3 <<DATA;
%!PS-Adobe-2.0
%%Creator: mf2pt3 1.1 Copyright 1998 Apostolos Syropoulos
%%CreationDate: $date
%%EndComments
11 dict % 11 entries
begin
    /FontType 3 def
    /UniqueID $UniqueID def
    /FontName /$MFfile def
    /FontBBox [ $Min_llx $Min_lly $Max_urx $Max_ury] def 
    /FontMatrix [ 0.001 0 0 0.001 0 0 ] def
    /Encoding 256 array def
    0 1 255 { Encoding exch /.notdef put } for
DATA
    <make known all the characters of the font>    
    <generate bounding box dictionary>
    <generate metrics dictionary>   
    <generate CharProcs dictionary>
<generate BuildGlyph procedure>
 
Used above.

After initializing the Encoding vector, we must make all those assignments so that PostScript will know which characters are in this font. This is trivial: we simply scan the BoundingBox array and for each defined character we print a line of the form Encoding N Name put, where N is the number of the character and Name the Nth entry in the Encoding array.

<make known all the characters of the font>=
   for($i=0; $i<=255; $i++)   # character slots are numbered 0 to 255
   {
      $_ = unpack("ai4", $BoundingBox[$i]);
      print TYPE3 "Encoding $i $Encoding[$i] put\n" if $_ eq "d";
   } 

Used above.

In general the bounding boxes dictionary consists of entries like the following one: /A [ 0 -100 600 700 ] def. Since the numbers are stored in the BoundingBox array, our task is simple. However, we must first print some data that let PostScript know the size of the dictionary.

<generate bounding box dictionary>=
   print TYPE3 "/BoundingBoxes $total_chars dict def\n";
   print TYPE3 "BoundingBoxes begin\n";
   print TYPE3 "/.notdef { 0 0 0 0 } def\n";
   for($i=0; $i<=255; $i++)   # character slots are numbered 0 to 255
   {
      ($_, $llx, $lly, $urx, $ury) = unpack("ai4",$BoundingBox[$i]);
      print TYPE3 "$Encoding[$i] [ $llx $lly $urx $ury ] def\n" 
                                                 if $_ eq "d";
   }
   print TYPE3 "end %BoundingBoxes\n"; 
  

Used above.

The metrics dictionary is created in a way similar to the bounding box dictionary. Generally it consists of lines of the form /A 600 def, where the number is the difference urx-llx. As usual we must first output some book-keeping information.

<generate metrics dictionary>=
   print TYPE3 "/Metrics $total_chars dict def\n";
   print TYPE3 "Metrics begin\n";
   print TYPE3 "/.notdef 0 def\n";
   for($i=0; $i<=255; $i++)   # character slots are numbered 0 to 255
   {
      ($_, $llx, $lly, $urx, $ury) = unpack("ai4",$BoundingBox[$i]);
      $diff = $urx - $llx;
      print TYPE3 "$Encoding[$i] $diff  def\n" if $_ eq "d";
   }
   print TYPE3 "end %Metrics\n";


Used above.

Generating the CharProcs dictionary involves the extraction of the PostScript code from the various EPSF files. Moreover, we must be careful to avoid extracting comments and to delete the keyword showpage from the source code. Both operations are done by making use of the regular expression facilities and operators that Perl provides. Apart from that, the generation process is similar to the previous ones. First, we generate some standard code and then the PostScript code for each character. Please note that after initial experimentation I have found that METAPOST generates a lot of strange setgray commands which actually confuse a PostScript interpreter, so the program eliminates all such commands. In addition, the decimal digits of all numbers are chopped off.

<generate CharProcs dictionary>=
    print TYPE3 "/CharProcs $total_chars dict def\n";
    print TYPE3 "CharProcs begin\n";
    print TYPE3 "/.notdef { } def\n";
    for($i=0; $i<=255; $i++)   # character slots are numbered 0 to 255
    {
      $_ = unpack("ai4", $BoundingBox[$i]);
      if ($_ eq "d")
      {
         open(CHAR, "$MFfile.$i")||
         die "Can't open file $MFfile.$i\n"; 
         $code = ""; #"100 100 scale\n";
         while (<CHAR>)
         {
            $code .= $_ if $_ !~ /^%/;
         }
         close CHAR;
         $code =~ s/showpage\n*//mg; #eliminate showpage
         $code =~ s/\d setgray//mg;  #eliminate setgray
         $code =~ s/(\d*)\.\d+/$1/mg; #chop decimal digits
         print TYPE3 $Encoding[$i], " { %\n";
         if ($eofill)
         {
            <Manipulate fills and newpaths> 
         }
         else
         {
            print TYPE3 $code;
         }
         print TYPE3 "} bind def\n";
      }
    }  
    print TYPE3 "end %CharProcs\n"; 

Used above.
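
The clean-up steps can be tried in isolation with the sketch below. The function name clean_charproc and the sample EPSF text are ours; the three substitutions are copied from the chunk above, and the comment filter mimics its line-by-line test.

```perl
use strict;
use warnings;

# Hypothetical helper that applies the same filters to a whole EPSF text.
sub clean_charproc {
    my ($eps) = @_;
    # Keep only the lines that do not start with a comment character.
    my $code = join "", grep { !/^%/ } split /^/m, $eps;
    $code =~ s/showpage\n*//mg;      # eliminate showpage
    $code =~ s/\d setgray//mg;       # eliminate setgray
    $code =~ s/(\d*)\.\d+/$1/mg;     # chop decimal digits
    return $code;
}

# Made-up input in the style of a METAPOST EPSF file.
my $sample = <<'EPS';
%!PS
%%BoundingBox: 0 -1 6 6
 0 setgray
newpath 1.5 2.25 moveto
5.75 2.25 lineto
closepath fill
showpage
%%EOF
EPS

print clean_charproc($sample);
```

Note that the decimal digits are truncated rather than rounded, so 5.75 becomes 5.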

In the eofill mode, we eliminate all newpath calls, as well as the fill calls that do not follow gsave. At the end we call eofill.

<Manipulate fills and newpaths>=
$code =~ s/gsave fill/GSAVE FILL/mg;  # save for soon restore
$code =~ s/fill//mg;                  # eliminate fill
$code =~ s/newpath//mg;               # eliminate newpath
$code =~ s/GSAVE FILL/gsave fill/mg;  # restore 
print TYPE3 $code, "eofill\n";

Used above.
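
The protect-and-restore trick used above is easier to follow on a concrete string. In this sketch the function name to_eofill and the sample path are ours; the four substitutions are those of the chunk.

```perl
use strict;
use warnings;

# Hypothetical helper that repeats the substitutions of the chunk above.
sub to_eofill {
    my ($code) = @_;
    $code =~ s/gsave fill/GSAVE FILL/g;    # protect fills that follow gsave
    $code =~ s/fill//g;                    # eliminate the remaining fills
    $code =~ s/newpath//g;                 # eliminate newpath
    $code =~ s/GSAVE FILL/gsave fill/g;    # restore the protected fills
    return $code . "eofill\n";
}

# Made-up character description: an outer square and an inner square.
my $sample = "newpath 0 0 moveto 6 0 lineto 6 6 lineto closepath gsave fill grestore\n"
           . "newpath 2 2 moveto 4 2 lineto 4 4 lineto closepath fill\n";

print to_eofill($sample);
```

Without the newpath calls all subpaths accumulate into a single path, which the final eofill paints with the even-odd rule.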

Procedure BuildGlyph is very important as it defines the way PostScript will be able to use the font. The generated code is pretty standard.

<generate BuildGlyph procedure>=
 print TYPE3 <<BUILDGLYPH;
    /BuildGlyph {
         exch
         begin
             dup
             Metrics exch get % get x displacement
             0                % set y displacement
             2 index
             BoundingBoxes
             exch get aload pop
             setcachedevice
             CharProcs exch get
             exec
             fill
         end
    } def
    currentdict
end
/$MFfile exch definefont
pop
BUILDGLYPH
close TYPE3;

Used above.

Now that we are done we must free some space on our hard disks, unless of course the user has decided the opposite. In case we are allowed to free space, we remove all the EPSF files, the log file and the TFM file, leaving only the Type 3 font file.

<Delete unnecessary files>=
   if (!$nodel)
   {
      unlink @EPSFs;
      unlink "$MFfile.log", "$MFfile.tfm";
   }

Used above.


Acknowledgments


John Hobby (the creator of METAPOST) and Yotam Medini helped me in the development phase and the debugging phase, respectively. Many thanks to both of you!