         The PSF file-format for big charset

     (C) 2001 Hu Yong <ccpaging@online.sh.cn>
      
This file documents the PSF file format for big character sets (called BPSF
below), as understood by version 0.15 and above of zhcon. zhcon is an East
Asian language environment for the Linux console. This file is based on the PSF
format used in the Linux console utilities by Yann Dirson <dirson@debian.org>.

This file has revision number 0.1, and is dated 2001/08/08.
Any useful additional information on BPSF files would be great.


0. Changes:

1. Summary

   BPSF stands for PC Screen Font in an East Asian language environment, and is
based on the PSF file format used in lct. We first need to understand PSF, and
why we extend it.

   The PSF file basically contains one character-font, whose width is 8 
pixels, i.e. each scan line in a character occupies 1 byte.

   It may contain characters of any height between 0 and 255, though character
heights lower than 8 or greater than 32 are not attested to exist or even be
useful [more info needed on this].

   Fonts can contain either 256 or 512 characters.

   The file can optionally contain a unicode mapping-table, telling, for each
character in the font, which UCS2 characters it can be used to display.

   The "file mode" byte controls the font size (256/512) and whether the file
contains a unicode mapping table.

   While developing zhcon, we ran into two main problems.

   The first is font width. We need more choices in order to display
good-looking glyphs and more characters on the console; the width may be
anything from 12 to 24, or even 48.

   The second is the number of characters in a font. A Chinese character set
has more than 8,000 characters, but PSF can hold only 512.

   According to Yann Dirson's document, XPSF (the extension of PSF) can hold a
big character set. We may change from BPSF to XPSF in the future, but for now
we cannot wait. Do not expect BPSF to solve every console font problem; it is
meant for big character sets only.

   BPSF extends the "file mode" byte. BPSF is the same as PSF when "file mode"
is between 0 and 3: only one byte follows, the char height. If "file mode" is
4 or 5, six bytes follow: char width (1 byte), char height (1 byte), and number
of chars (4 bytes).
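
   As an illustration, the header rules above can be sketched in Python. This
is only a sketch, not zhcon's own code: the name parse_header is made up for
this example, and reading the 4-byte char count as little-endian is an
assumption (this document states the byte order only for U_SHORT data).

```python
import struct

PSF_MAGIC = b"\x36\x04"  # the two magic bytes that open every PSF/BPSF file


def parse_header(data):
    """Return (filemode, charwidth, charheight, fontsize, data_offset).

    parse_header is a hypothetical helper written for this document.
    """
    if len(data) < 4 or data[:2] != PSF_MAGIC:
        raise ValueError("not a PSF/BPSF file")
    filemode = data[2]
    if filemode <= 3:
        # Plain PSF: one byte follows (char height); the width is always 8.
        charheight = data[3]
        fontsize = 512 if filemode & 1 else 256
        return filemode, 8, charheight, fontsize, 4
    if filemode in (4, 5):
        if len(data) < 9:
            raise ValueError("truncated BPSF header")
        # BPSF: width (u8), height (u8), char count (u32).
        # ASSUMPTION: the u32 is little-endian, like the U_SHORT data.
        charwidth, charheight, fontsize = struct.unpack_from("<BBI", data, 3)
        return filemode, charwidth, charheight, fontsize, 9
    raise ValueError("unknown file mode %d" % filemode)
```

   The last value returned is the offset at which the font data starts: 4 for
plain PSF, 9 for BPSF.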
  

2. History

   The PSF file format was designed by H. Peter Anvin <hpa@transmeta.com> in
1989 or so for his DOS screen font editor, FONTEDIT.EXE. When he became 
involved with Linux, he used it for the Linux font stuff he worked with, 
released a binary of FONTEDIT.EXE for free distribution, and added the Unicode
table to the spec.


3. Known programs understanding this file-format.

   Only zhcon can read and/or write BPSF files. 
   
   The programs in the Linux console utilities, such as:
      setfont (R/W)
      psfaddtable (R/W)
      psfstriptable (R/W)
      psfgettable (R)
   can read and/or write PSF files. These programs report an error when given
a BPSF font file.

4. Technical data

   The file format is described here in sort-of EBNF notation. Upper-case
WORDS represent terminal symbols, i.e. C types; lower-case words represent
non-terminal symbols, i.e. symbols defined in terms of other symbols.
   [sym]        is an optional symbol
   sym1 sym2    is sym1 followed by sym2
   sym1 | sym2  is either sym1 or sym2
   {sym}        is a symbol that can be repeated 0 or more times
   {sym}*N      is a symbol that must be repeated N times
   Comments are introduced with a # sign.


 # The data (U_SHORT's) are stored in LITTLE_ENDIAN byte order.

   psf_file =  psf_header
               raw_fontdata
               [unicode_data]
      

   psf_header =   CHAR = 0x36  CHAR = 0x04   # magic number
                  filemode
                  font_info

   filemode =  CHAR     # 0 : 256 characters, no unicode_data
                        # 1 : 512 characters, no unicode_data
                        # 2 : 256 characters, with unicode_data
                        # 3 : 512 characters, with unicode_data
                        # 4 : big charset, no unicode_data
                        # 5 : big charset, with unicode_data

   # file modes 4 and 5 are the BPSF extension of PSF
   font_info =
      fontheight              # filemode is between 0 and 3
      | big_charset_info      # filemode is 4 or 5

   fontheight =   CHAR  # measured in scan lines

   # u8 and u32 are terminal symbols: unsigned 8-bit and 32-bit integers
   big_charset_info =
      charwidth
      charheight
      fontsize

   charwidth =
      u8       # char width in pixels (bits per scan line)

   charheight =
      u8       # char height in scan lines

   fontsize =
      u32      # number of chars in font

   raw_fontdata = {char_data}*<fontsize>

   char_data = {row_data}*<charheight>
   
   row_data = scan line bits padded out to whole bytes, i.e. 1 byte when
charwidth is 8 or less, 2 bytes when charwidth is between 9 and 16, and so on.
The padding bits are 0. This may cost more memory, but it is really efficient
for display.
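
   The row layout can be sketched in Python as follows. This is a sketch under
stated assumptions, not zhcon's code: the helper names are made up for this
example, and it assumes that the most significant bit of the first byte is the
leftmost pixel and that the padding bits sit at the end of each row. Neither
point is stated in this document.

```python
def bytes_per_row(charwidth):
    """Number of bytes one scan line occupies: charwidth rounded up to bytes."""
    return (charwidth + 7) // 8


def glyph_rows(raw_fontdata, index, charwidth, charheight):
    """Return glyph number `index` as a list of '0'/'1' strings, one per row.

    Hypothetical helper. ASSUMPTION: the most significant bit is the leftmost
    pixel and the zero padding bits come last in each row.
    """
    stride = bytes_per_row(charwidth)
    size = stride * charheight          # bytes occupied by one char_data
    glyph = raw_fontdata[index * size:(index + 1) * size]
    if len(glyph) != size:
        raise ValueError("glyph index out of range")
    rows = []
    for y in range(charheight):
        row = glyph[y * stride:(y + 1) * stride]
        bits = "".join(f"{b:08b}" for b in row)
        rows.append(bits[:charwidth])   # drop the padding bits
    return rows
```

   Under these rules a 12-pixel-wide, 16-line glyph takes 2 * 16 = 32 bytes,
and the whole raw_fontdata takes fontsize times that.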

   # unicode_data immediately follows raw_fontdata

   unicode_data = { unicode_array psf_separator }*<fontsize>

   unicode_array =   { unicode }          # any necessary number of times

   unicode =   U_SHORT              # UCS2 code
   psf_separator =   unicode = 0xFFFF
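
   Reading the unicode_data described above can be sketched in Python like
this. The function name parse_unicode_data is made up for this example; it
expects exactly the bytes that follow raw_fontdata and relies on the
little-endian U_SHORT order stated earlier.

```python
import struct


def parse_unicode_data(data, fontsize):
    """Return a list with one list of UCS2 codes per character in the font.

    parse_unicode_data is a hypothetical helper written for this document.
    `data` must have an even length, since it is read as 16-bit values.
    """
    table = []
    current = []
    for (code,) in struct.iter_unpack("<H", data):
        if code == 0xFFFF:
            # psf_separator: closes the unicode_array of the current char
            table.append(current)
            current = []
            if len(table) == fontsize:
                break
        else:
            current.append(code)
    if len(table) != fontsize:
        raise ValueError("unicode_data ends before all chars are described")
    return table
```

   A character with no mapping simply yields an empty list, because its
unicode_array is empty and only the separator is present.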