things.data.processing
Class PhraseMatcher

java.lang.Object
  extended by things.data.processing.PhraseMatcher

public abstract class PhraseMatcher
extends java.lang.Object

General phrase matcher. This implementation aims to be Unicode compatible; correct results depend on the supplied reader yielding correctly decoded characters.

Version:
1.0

Version History

EPG - Initial - 6 JUL 09
 
Author:
Erich P. Gatejen

Field Summary
static int MAX_PHRASE_SIZE_IN_BYTES
           
protected  char[] phraseBuffer
           
protected  int phraseBufferLength
           
 
Constructor Summary
PhraseMatcher()
           
 
Method Summary
protected abstract  void declarations()
          Put all phrase declarations here so that they are made as part of any initialization.
 void declare(java.lang.String phrase, int id, boolean caseSensitive)
          Declare a phrase.
 void init()
          Reinitialize the processor.
protected abstract  void match(int id, char[] phrase, int len, java.io.Writer out)
          This method will be called when a phrase is matched.
 void process(java.lang.String docId, java.io.Reader input, java.io.Writer output)
          Process the reader.
protected abstract  void start(java.lang.String docId)
          Start on a specific document.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MAX_PHRASE_SIZE_IN_BYTES

public static final int MAX_PHRASE_SIZE_IN_BYTES
See Also:
Constant Field Values

phraseBuffer

protected char[] phraseBuffer

phraseBufferLength

protected int phraseBufferLength

Constructor Detail

PhraseMatcher

public PhraseMatcher()
              throws java.lang.Throwable
Throws:
java.lang.Throwable

Method Detail

declarations

protected abstract void declarations()
                              throws java.lang.Throwable
Put all phrase declarations here so that they are made as part of any initialization.

Throws:
java.lang.Throwable

start

protected abstract void start(java.lang.String docId)
                       throws java.lang.Throwable
Start on a specific document. This gives the implementation a chance to initialize.

Parameters:
docId - The id for the document, data, or whatever. The implementation may choose to ignore it.
Throws:
java.lang.Throwable

match

protected abstract void match(int id,
                              char[] phrase,
                              int len,
                              java.io.Writer out)
                       throws java.lang.Throwable
This method will be called when a phrase is matched. Be sure to write to out if you want anything preserved! The read() method will supply the read of the header line.

Parameters:
id - The defined id.
phrase - The phrase data as it exactly appears in the stream.
len - The number of valid characters in the phraseBuffer. The offset is always 0.
out - Writer for the processed data. If null, the caller asked that nothing be written; whether to honor that is up to the implementation.
Throws:
java.lang.Throwable
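
The parameter contract above (valid data in phrase[0..len), out possibly null) can be illustrated with a minimal sketch of what the body of a match() override might do. MatchBodySketch, handle, and the "id:text" output format are hypothetical names and choices made for this illustration; they are not part of things.data.processing.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper, not part of things.data.processing.
class MatchBodySketch {

    // Takes the same parameters as PhraseMatcher.match(); the output format is an assumption.
    static String handle(int id, char[] phrase, int len, Writer out) throws IOException {
        // Only the first len characters are valid; the buffer itself may be longer.
        String text = new String(phrase, 0, len);
        if (out != null) {
            // The caller supplied a writer, so preserve the match.
            out.write(id + ":" + text + "\n");
        }
        return text;
    }

    public static void main(String[] args) throws IOException {
        char[] buffer = new char[16];
        "New York".getChars(0, 8, buffer, 0);

        StringWriter out = new StringWriter();
        handle(7, buffer, 8, out);
        System.out.print(out);      // 7:New York

        handle(7, buffer, 8, null); // read-only call: nothing is written
    }
}
```

Building the String from offset 0 and len, rather than from the whole array, avoids picking up stale characters left in the buffer by an earlier, longer match.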

declare

public void declare(java.lang.String phrase,
                    int id,
                    boolean caseSensitive)
             throws java.lang.Throwable
Declare a phrase. White space is not significant other than to break tokens. Punctuation is significant only if declared.

Parameters:
phrase - The phrase.
id - The phrase id. This can be a duplicate. It must be MINIMUM_ID or higher.
caseSensitive - if true, the phrase will be case sensitive.
Throws:
java.lang.Throwable
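
To illustrate the statement that white space only breaks tokens, here is a minimal sketch that splits a phrase on runs of white space. It is not PhraseMatcher's actual tokenizer, which this page does not describe; PhraseTokensSketch and tokens are hypothetical names, and leaving punctuation attached to its token is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; PhraseMatcher's own tokenizer may differ.
class PhraseTokensSketch {

    // Split on runs of white space. Punctuation is left untouched (an assumption).
    static List<String> tokens(String phrase) {
        String trimmed = phrase.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(tokens("New York"));          // [New, York]
        System.out.println(tokens("  New \t  York  "));  // [New, York]
        System.out.println(tokens("New York."));         // [New, York.]
    }
}
```

Under this reading, the first two declarations are equivalent, while the third differs because the period was declared as part of the phrase.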

process

public void process(java.lang.String docId,
                    java.io.Reader input,
                    java.io.Writer output)
             throws java.lang.Throwable
Process the reader. This is designed to take a stream reader, where each read yields a 16-bit character.

Parameters:
docId - The id for the document, data, or whatever. This may be echoed into the entries, depending on the specific implementation.
input - The input reader.
output - The output writer. Set to null if this is a read-only process.
Throws:
java.lang.Throwable
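
Since process() consumes 16-bit characters, the reader passed in should decode the underlying bytes with the right charset. The sketch below uses only java.io and java.nio classes to show one way to build such a reader and what it yields; ReaderSetupSketch, utf8Reader, and countChars are hypothetical names, and UTF-8 input is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of things.data.processing.
class ReaderSetupSketch {

    // Wrap raw bytes in a Reader with an explicit charset (UTF-8 is assumed here).
    static Reader utf8Reader(byte[] data) {
        return new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
    }

    // Count the 16-bit chars the reader yields, one read() at a time.
    static int countChars(Reader in) throws IOException {
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // "cafe" with an accented e, a space, and U+1F600 (outside the Basic Multilingual Plane).
        byte[] data = "caf\u00e9 \uD83D\uDE00".getBytes(StandardCharsets.UTF_8);
        System.out.println(data.length);                  // 10
        System.out.println(countChars(utf8Reader(data))); // 7
    }
}
```

A character outside the Basic Multilingual Plane arrives as two reads (a surrogate pair), so a declared phrase containing such characters would be seen as two chars each.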

init

public void init()
          throws java.lang.Throwable
Reinitialize the processor.

Throws:
java.lang.Throwable

