de.dante.extex.language.hyphenation.liang
Class LiangsHyphenationTable

java.lang.Object
  extended by de.dante.extex.language.hyphenation.base.BaseHyphenationTable
      extended by de.dante.extex.language.hyphenation.liang.LiangsHyphenationTable
All Implemented Interfaces:
Hyphenator, Language, LigatureBuilder, ModifiableLanguage, java.io.Serializable, WordTokenizer
Direct Known Subclasses:
CompressedLiangsHyphenationTable

public class LiangsHyphenationTable
extends BaseHyphenationTable

This class stores the values for hyphenations and hyphenates words. It uses Liang's algorithm as described in the TeXbook.

Liang's Algorithm

Hyphenation in TeX is based on Liang's thesis. The algorithm works with patterns, each consisting of characters and, optionally, a special marker for the beginning or the end of the word. Each pattern records how desirable or undesirable it is to hyphenate before, between, or after its characters.

These weighted hyphenation codes can be represented by integers. Even integers denote undesirable positions and odd integers denote permissible hyphenation points.

Consider the pattern hyph, which has the associated code 00300. The first number corresponds to the position before the letter h, the second to the position before the letter y, the third to the position before the letter p, and so on. Thus this pattern indicates that a hyphenation point can be inserted between y and p. Written explicitly in TeX, this gives hy\-ph.

The following table shows some more examples taken from the original hyphenation patterns of TeX for English. The character . denotes the beginning or the end of a word. In the TeX patterns the word pattern and the hyphenation codes are intermixed and the hyphenation codes 0 are left out.

Word pattern   Codes    TeX pattern
ader.          005000   ad5er.
.ach           00004    .ach4
sub            0043     su4b3
ty             100      1ty
type           00003    type3
pe.            4000     4pe.
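
The relationship between the three columns can be sketched in code. The following is a minimal illustration, not part of ExTeX: the class PatternParser and its methods letters and codes are hypothetical names, and the input is assumed to be a plain string with single-digit codes rather than the Tokens sequence that addPattern expects.

```java
import java.util.Arrays;

/** Hypothetical helper (not an ExTeX class): splits a TeX pattern into word pattern and codes. */
final class PatternParser {

    /** Returns the word pattern, i.e. the TeX pattern with all digits removed. */
    static String letters(String texPattern) {
        StringBuilder sb = new StringBuilder();
        for (char c : texPattern.toCharArray()) {
            if (c < '0' || c > '9') {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns one code per position (before, between, after); omitted codes become 0. */
    static int[] codes(String texPattern) {
        int[] codes = new int[letters(texPattern).length() + 1];
        int pos = 0; // index of the position before the next character
        for (char c : texPattern.toCharArray()) {
            if (c >= '0' && c <= '9') {
                codes[pos] = c - '0';
            } else {
                pos++;
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        for (String p : new String[] {"ad5er.", ".ach4", "su4b3", "1ty", "type3"}) {
            System.out.println(letters(p) + " " + Arrays.toString(codes(p)));
        }
    }
}
```

For example, codes("su4b3") yields the array 0, 0, 4, 3, which corresponds to the entry 0043 in the table.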

To find all hyphenation points in a word all matching patterns have to be superimposed. During this superposition the higher hyphenation codes overrule the lower ones.

In the following figure the patterns for the word "subtype" are shown.

   s u b t y p e
  0s0u4b3
        1t0y0
        0t0y0p0e3
            4p0e0.
  ---------------
  0s0u4b3t0y4p0e3
 

The superposition of all patterns leads to the result sub\-type\-. Here two additional parameters come into play: \lefthyphenmin denotes the minimal number of characters before a hyphenation point at the beginning of a word, and \righthyphenmin the corresponding number at the end of a word. For English in TeX, \lefthyphenmin is set to 2 and \righthyphenmin to 3. Thus the final hyphen is not considered, which leaves sub\-type.
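
The superposition and the effect of the two parameters can be reproduced with a short sketch. This is an illustration under stated assumptions and not the ExTeX implementation, which stores its patterns in a HyphenTree: the class LiangSketch and its methods are hypothetical, the patterns are given as plain strings in TeX notation, and they are matched by a simple substring search. The pattern for pe. is written as 4pe., following its codes 4000.

```java
import java.util.Arrays;

/** Illustrative sketch only (hypothetical class, not the ExTeX implementation). */
final class LiangSketch {

    /** Superimposes all matching patterns; returns one code per position of the word. */
    static int[] superimpose(String word, String[] texPatterns) {
        String dotted = "." + word + "."; // '.' marks the beginning and the end of the word
        int[] merged = new int[dotted.length() + 1];
        for (String tex : texPatterns) {
            String letters = tex.replaceAll("[0-9]", "");
            int[] codes = new int[letters.length() + 1];
            int pos = 0;
            for (char c : tex.toCharArray()) {
                if (c >= '0' && c <= '9') {
                    codes[pos] = c - '0';
                } else {
                    pos++;
                }
            }
            // Apply the pattern at every place where it occurs; the higher code wins.
            for (int at = dotted.indexOf(letters); at >= 0; at = dotted.indexOf(letters, at + 1)) {
                for (int i = 0; i < codes.length; i++) {
                    merged[at + i] = Math.max(merged[at + i], codes[i]);
                }
            }
        }
        // Drop the positions outside the word (before the leading and after the trailing dot).
        return Arrays.copyOfRange(merged, 1, word.length() + 2);
    }

    /** Inserts \- at odd positions that respect the two minimum lengths. */
    static String hyphenate(String word, int[] codes, int leftMin, int rightMin) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i <= word.length(); i++) {
            // i characters precede this position, word.length() - i follow it
            boolean allowed = i >= leftMin && word.length() - i >= rightMin;
            if (allowed && codes[i] % 2 == 1) {
                sb.append("\\-");
            }
            if (i < word.length()) {
                sb.append(word.charAt(i));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] patterns = {"su4b3", "1ty", "type3", "4pe."};
        int[] codes = superimpose("subtype", patterns);
        System.out.println(hyphenate("subtype", codes, 0, 0)); // prints sub\-type\-
        System.out.println(hyphenate("subtype", codes, 2, 3)); // prints sub\-type
    }
}
```

With both minimum lengths set to 0 the raw result of the superposition is shown; with the English values 2 and 3 the hyphenation point after the final e is suppressed.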

Version:
$Revision: 1.13 $
Author:
Gerd Neugebauer
See Also:
Serialized Form

Field Summary
protected static long serialVersionUID
          The constant serialVersionUID contains the id for serialization.
 
Constructor Summary
LiangsHyphenationTable()
          Creates a new object.
 
Method Summary
 void addPattern(Tokens pattern)
          This method allows the caller to add another pattern.
 void dump(java.util.logging.Logger logger)
          Write the tree to a logger.
protected  de.dante.extex.language.hyphenation.liang.HyphenTree getPatterns()
          Getter for patterns.
 boolean hyphenate(NodeList nodelist, TypesetterOptions context, UnicodeChar hyphen, int start, boolean forall, NodeFactory nodeFactory)
          Insert the hyphenation marks for a horizontal list of nodes.
 boolean hyphenateOne(NodeList nodelist, TypesetterOptions context, int start, UnicodeCharList word, CharNode hyphenNode)
          Hyphenate a single word.
protected  boolean isCompressed()
          Getter for compressed.
protected  void setCompressed()
          Setter for compressed.
 
Methods inherited from class de.dante.extex.language.hyphenation.base.BaseHyphenationTable
addHyphenation, createHyphenation, findWord, getLeftHyphenmin, getLigature, getName, getRightHyphenmin, insertLigatures, insertShy, isHyphenActive, normalize, readResolve, setHyphenActive, setLeftHyphenmin, setLigatureBuilder, setName, setRightHyphenmin, setWordTokenizer
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

serialVersionUID

protected static final long serialVersionUID
The constant serialVersionUID contains the id for serialization.

See Also:
Constant Field Values
Constructor Detail

LiangsHyphenationTable

public LiangsHyphenationTable()
Creates a new object.

Method Detail

addPattern

public void addPattern(Tokens pattern)
                throws IllegalValueHyphenationException,
                       IllegalTokenHyphenationException,
                       DuplicateHyphenationException,
                       ImmutableHyphenationException
This method allows the caller to add another pattern.

Specified by:
addPattern in interface Hyphenator
Overrides:
addPattern in class BaseHyphenationTable
Parameters:
pattern - a sequence of tokens alternating between type other and type letter. The other tokens must be digits. A letter token carrying a period (.) is interpreted as a beginning-of-word or end-of-word marker.
Throws:
IllegalValueHyphenationException - if a token of type other does not carry a digit
IllegalTokenHyphenationException - if an illegal token has been detected in the pattern
DuplicateHyphenationException - if an attempt is made to add a hyphenation pattern a second time
ImmutableHyphenationException - if the hyphenation table is immutable, i.e. the compressed flag is set
See Also:
Hyphenator.addPattern(Tokens)

dump

public void dump(java.util.logging.Logger logger)
Write the tree to a logger.

Parameters:
logger - the target logger

getPatterns

protected de.dante.extex.language.hyphenation.liang.HyphenTree getPatterns()
Getter for patterns. This method is meant for testing purposes only.

Returns:
the patterns

hyphenate

public boolean hyphenate(NodeList nodelist,
                         TypesetterOptions context,
                         UnicodeChar hyphen,
                         int start,
                         boolean forall,
                         NodeFactory nodeFactory)
                  throws HyphenationException
Description copied from interface: Hyphenator
Insert the hyphenation marks for a horizontal list of nodes. The hyphenation marks are made up of discretionary nodes.

Specified by:
hyphenate in interface Hyphenator
Overrides:
hyphenate in class BaseHyphenationTable
Throws:
HyphenationException
See Also:
Hyphenator.hyphenate(de.dante.extex.typesetter.type.NodeList, de.dante.extex.typesetter.TypesetterOptions, de.dante.util.UnicodeChar, int, boolean, de.dante.extex.typesetter.type.node.factory.NodeFactory)

hyphenateOne

public boolean hyphenateOne(NodeList nodelist,
                            TypesetterOptions context,
                            int start,
                            UnicodeCharList word,
                            CharNode hyphenNode)
                     throws HyphenationException
Description copied from class: BaseHyphenationTable
Hyphenate a single word.

Overrides:
hyphenateOne in class BaseHyphenationTable
Parameters:
nodelist - the node list to consider
context - the options to use
start - the start index in the nodes
word - the word to hyphenate
hyphenNode - the node to use as hyphen
Returns:
true iff the word has been found
Throws:
HyphenationException - in case of an error
See Also:
BaseHyphenationTable.hyphenateOne(de.dante.extex.typesetter.type.NodeList, de.dante.extex.typesetter.TypesetterOptions, int, de.dante.util.UnicodeCharList, de.dante.extex.typesetter.type.node.CharNode)

isCompressed

protected boolean isCompressed()
Getter for compressed.

Returns:
the compressed flag

setCompressed

protected void setCompressed()
Setter for compressed.