de.dante.extex.language.word
Interface WordTokenizer

All Superinterfaces:
java.io.Serializable
All Known Subinterfaces:
Language, ManagedLanguage, ModifiableLanguage
All Known Implementing Classes:
BaseHyphenationTable, ExTeXWords, FutureLanguage, TeXWords

public interface WordTokenizer
extends java.io.Serializable

This interface describes the contract for a tokenizer that splits a list of nodes into words. Implementations may be language-specific.

Version:
$Revision: 1.4 $
Author:
Gerd Neugebauer

Method Summary
 int findWord(NodeList nodes, int start, UnicodeCharList word)
          Extract a word from a node list.
 void insertShy(NodeList nodes, int insertionPoint, boolean[] spec, CharNode hyphenNode)
          Insert hyphenation points into a list of nodes.
 UnicodeCharList normalize(UnicodeCharList word, TypesetterOptions options)
          Normalize a word for the lookup.
 

Method Detail

findWord

public int findWord(NodeList nodes,
                    int start,
                    UnicodeCharList word)
             throws HyphenationException
Extract a word from a node list.

Parameters:
nodes - the nodes to extract the word from
start - the start index
word - the target list for the letters of the word
Returns:
the index of the first node beyond the word
Throws:
HyphenationException - in case of an error
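
The following is a minimal sketch of how the return value of findWord can drive a scan over a whole node list. It does not use the ExTeX classes: a String stands in for the NodeList (one character per node), a StringBuilder stands in for the UnicodeCharList, and the rule "a word is a run of letters beginning at start" is an assumption made for illustration only. Real implementations may decide word boundaries differently.

```java
import java.util.ArrayList;
import java.util.List;

class FindWordSketch {

    /**
     * Hypothetical stand-in for findWord. Each character of {@code nodes}
     * plays the role of one node. Letters starting at {@code start} are
     * appended to {@code word}.
     *
     * @return the index of the first node beyond the word
     */
    public static int findWord(String nodes, int start, StringBuilder word) {
        int i = start;
        while (i < nodes.length() && Character.isLetter(nodes.charAt(i))) {
            word.append(nodes.charAt(i));
            i++;
        }
        return i;
    }

    /** Collects every word by repeatedly resuming at the returned index. */
    public static List<String> allWords(String nodes) {
        List<String> words = new ArrayList<>();
        int i = 0;
        while (i < nodes.length()) {
            StringBuilder word = new StringBuilder();
            int next = findWord(nodes, i, word);
            if (word.length() > 0) {
                words.add(word.toString());
            }
            // If no letter was found at i, step over that node to make progress.
            i = (next > i) ? next : i + 1;
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(allWords("hyphen ation-points"));
    }
}
```

Returning the index beyond the word, rather than the word itself, lets the caller resume the scan without re-examining nodes that were already consumed.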

insertShy

public void insertShy(NodeList nodes,
                      int insertionPoint,
                      boolean[] spec,
                      CharNode hyphenNode)
               throws HyphenationException
Insert hyphenation points into a list of nodes.

Parameters:
nodes - the node list to modify
insertionPoint - the index in nodes relative to which the hyphens are inserted
spec - the specification of where to insert hyphenation marks. If spec[i] is true, a hyphen is to be inserted before the i-th character at or after insertionPoint in nodes
hyphenNode - the node to insert as the hyphen
Throws:
HyphenationException - in case of an error
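
The meaning of spec can be illustrated with a small sketch. It is not the ExTeX implementation: a List of String stands in for the NodeList, a String stands in for the CharNode, the index i is assumed to be zero-based, and every node is assumed to be a character. A real node list may contain non-character nodes that have to be skipped while counting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InsertShySketch {

    /**
     * Hypothetical stand-in for insertShy. Inserts {@code hyphenNode} before
     * the i-th node counted from {@code insertionPoint} wherever
     * {@code spec[i]} is true. Assumes insertionPoint + spec.length does not
     * exceed nodes.size() + 1.
     */
    public static void insertShy(List<String> nodes, int insertionPoint,
                                 boolean[] spec, String hyphenNode) {
        // Walk backwards so that an insertion never shifts a position
        // that still has to be processed.
        for (int i = spec.length - 1; i >= 0; i--) {
            if (spec[i]) {
                nodes.add(insertionPoint + i, hyphenNode);
            }
        }
    }

    public static void main(String[] args) {
        List<String> nodes =
            new ArrayList<>(Arrays.asList("t", "a", "b", "l", "e"));
        boolean[] spec = {false, false, true, false, false};
        insertShy(nodes, 0, spec, "-");
        System.out.println(nodes);
    }
}
```

Iterating from the end of spec is a design choice of this sketch. A forward pass would work as well, but it would have to track how many hyphens have already been inserted.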

normalize

public UnicodeCharList normalize(UnicodeCharList word,
                                 TypesetterOptions options)
                          throws HyphenationException
Normalize a word for the lookup.

Parameters:
word - the word to normalize
options - the options to use
Returns:
the normalized word
Throws:
HyphenationException - in case of an error
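
What normalization involves is left to the implementation. As an illustration only, the sketch below assumes that it amounts to mapping every character to a lower-case form before the hyphenation lookup. A String stands in for the UnicodeCharList, and the IntUnaryOperator parameter is a hypothetical stand-in for whatever case mapping the TypesetterOptions provide.

```java
import java.util.function.IntUnaryOperator;

class NormalizeSketch {

    /**
     * Hypothetical stand-in for normalize. Applies {@code lowerCaseOf} to
     * every code point of {@code word} and returns the result as a new
     * string; the input is left unchanged.
     */
    public static String normalize(String word, IntUnaryOperator lowerCaseOf) {
        StringBuilder result = new StringBuilder(word.length());
        word.codePoints()
            .map(lowerCaseOf)
            .forEach(result::appendCodePoint);
        return result.toString();
    }

    public static void main(String[] args) {
        // Character::toLowerCase is used here as the assumed case mapping.
        System.out.println(normalize("HyPhen", Character::toLowerCase));
    }
}
```

Passing the mapping in as a parameter mirrors the options argument of the interface method: the caller, not the tokenizer, decides which case mapping applies.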