java.lang.Object
  things.data.processing.LexicalTool
    things.data.processing.rfc822.AddressParser
public class AddressParser
An RFC 822 address parser.
The submitted addresses may have whitespace at either end of the string; trim them first if you wish. Note that any CR or LF characters will be converted to spaces.
It isn't as much work as it appears. It took me about 30 minutes to map it in a spreadsheet. After another hour, I had the parse language (as seen in the comments below). And another hour after that it was coded and done. I've found only one bug since, which traced back to the original spreadsheet.
Version History
EPG - Initial - 12 FEB 05
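The whitespace note above can be illustrated with a small pre-processing helper. This is only a sketch of equivalent behaviour, not the parser's own code: the class name AddressInputNormalizer and the method normalize are hypothetical, and it assumes each CR or LF becomes exactly one space.

```java
final class AddressInputNormalizer {
    private AddressInputNormalizer() {}

    /** Replaces every CR or LF with one space, then trims both ends. */
    static String normalize(String raw) {
        // Assumption: one space per line-break character, as the class note suggests.
        String flattened = raw.replace('\r', ' ').replace('\n', ' ');
        return flattened.trim();
    }

    public static void main(String[] args) {
        String header = "  alice@example.com,\r\n bob@example.com \n";
        // Brackets make the trimmed ends visible.
        System.out.println("[" + normalize(header) + "]");
    }
}
```

Trimming here is optional; the parser states burn leading whitespace and commas on their own.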
Field Summary |
---|
Constructor Summary | |
---|---|
AddressParser()
|
Method Summary | |
---|---|
static void |
parseAndSave(StreamSource source,
AddressListener addressListener)
Parse the source for addresses. |
void |
parser(java.io.InputStream ins,
AddressListener addressListener)
Call with an InputStream. |
void |
parser(StreamSource source,
AddressListener addressListener)
Parse engine grammar.
Lexical elements: ASCII (0->127),
CHAR (32->127 minus WS, SPECIAL),
QUOTE, AT, COLON, SEMICOLON, DOT, OPENBRACK, CLOSEBRACK,
GT, LT, BACKSLASH, COMMA, OPENPAREN, CLOSEPAREN,
WS (space or tab),
CR, LF,
|SPECIAL| (includes QUOTE, AT, COLON, SEMICOLON, DOT, OPENBRACK, CLOSEBRACK, GT, LT, BACKSLASH, COMMA, OPENPAREN, CLOSEPAREN),
!OTHER! (meaning anything not listed). |
void |
parser(java.lang.String data,
AddressListener addressListener)
Call with a String. |
Methods inherited from class things.data.processing.LexicalTool |
---|
get822HeadernameType, get822HeadernameTypeWithDollar, get822Type, getClassification, getDNSType, getHexValue, getLower, getName, getUpper, getURIType, getURLFType |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public AddressParser()
Method Detail |
---|
public static void parseAndSave(StreamSource source, AddressListener addressListener) throws java.lang.Throwable
Parameters:
source - the source data.
addressListener - an address listener for found addresses.
Throws:
java.lang.Throwable
public void parser(StreamSource source, AddressListener addressListener) throws java.lang.Throwable
Lexical elements: ASCII (0->127),
CHAR (32->127 minus WS, SPECIAL),
QUOTE, AT, COLON, SEMICOLON, DOT, OPENBRACK, CLOSEBRACK,
GT, LT, BACKSLASH, COMMA, OPENPAREN, CLOSEPAREN,
WS (space or tab),
CR, LF,
|SPECIAL| (includes QUOTE, AT, COLON, SEMICOLON, DOT, OPENBRACK, CLOSEBRACK, GT, LT, BACKSLASH, COMMA, OPENPAREN, CLOSEPAREN),
!OTHER! (meaning anything not listed).
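The lexical elements listed above can be read as a mapping from a character code to a class. The sketch below is one possible reading, not the code inherited from LexicalTool: the names LexSketch, Lex, classify and SPECIAL are hypothetical, the concrete characters chosen for each name (for example '[' for OPENBRACK) are assumptions, and 127 is counted as CHAR only because the list says "32->127".

```java
import java.util.EnumSet;
import java.util.Set;

final class LexSketch {
    enum Lex {
        CHAR, QUOTE, AT, COLON, SEMICOLON, DOT, OPENBRACK, CLOSEBRACK,
        GT, LT, BACKSLASH, COMMA, OPENPAREN, CLOSEPAREN, WS, CR, LF, OTHER
    }

    /** The |SPECIAL| group exactly as enumerated in the grammar. */
    static final Set<Lex> SPECIAL = EnumSet.of(
        Lex.QUOTE, Lex.AT, Lex.COLON, Lex.SEMICOLON, Lex.DOT, Lex.OPENBRACK,
        Lex.CLOSEBRACK, Lex.GT, Lex.LT, Lex.BACKSLASH, Lex.COMMA,
        Lex.OPENPAREN, Lex.CLOSEPAREN);

    /** ASCII (0->127) is a superset class, so it is a predicate rather than a Lex value. */
    static boolean isAscii(int c) {
        return c >= 0 && c <= 127;
    }

    static Lex classify(int c) {
        switch (c) {
            case '"':  return Lex.QUOTE;
            case '@':  return Lex.AT;
            case ':':  return Lex.COLON;
            case ';':  return Lex.SEMICOLON;
            case '.':  return Lex.DOT;
            case '[':  return Lex.OPENBRACK;
            case ']':  return Lex.CLOSEBRACK;
            case '>':  return Lex.GT;
            case '<':  return Lex.LT;
            case '\\': return Lex.BACKSLASH;
            case ',':  return Lex.COMMA;
            case '(':  return Lex.OPENPAREN;
            case ')':  return Lex.CLOSEPAREN;
            case ' ':
            case '\t': return Lex.WS;
            case '\r': return Lex.CR;
            case '\n': return Lex.LF;
            default:
                // CHAR: 32->127 minus WS and SPECIAL, which were handled above.
                return (c >= 32 && c <= 127) ? Lex.CHAR : Lex.OTHER;
        }
    }

    public static void main(String[] args) {
        for (char c : "Bob <b@x.org>".toCharArray()) {
            System.out.println("'" + c + "' -> " + classify(c));
        }
    }
}
```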
REG: $GROUP, $FRIENDLY, $ADDRESS, $BUSTED, $FLAG_GROUP, $FLAG_GROUP_ENDED
[START]
-> NULL->$FRIENDLY
-> NULL->$ADDRESS
-> NULL->$GROUP
-> NULL->$BUSTED
-> false -> $FLAG_GROUP
-> false -> $FLAG_GROUP_ENDED
-> [OPEN]
-> if (EOF) ^RETURN^
[OPEN]
- WS, COMMA - burn
- CHAR - push, [ACCUMULATE], ^RETURN^
- OPENPAREN - push, [GATHERCOMMENT]
- QUOTE - push, [GATHERQUOTE], [ACCUMULATE], ^RETURN^
- LT - [LTADDRESS], ^RETURN^
- SEMICOLON - if (true = $FLAG_GROUP) then
    pop->$ADDRESS, [SUBMIT], false->$FLAG_GROUP_ENDED, ^RETURN^
  else
    error(Character not allowed in DN.)
- |SPECIAL| - error(meaningless and unquoted special)
- !OTHER! - error(character not allowed in open)
- EOF - ^EXIT^
-> ($FLAG_GROUP_ENDED = true) - false->$FLAG_GROUP_ENDED, ^RETURN^
[ACCUMULATE]
- CHAR - push
- COMMA - pop->$ADDRESS, [SUBMIT], ^RETURN^
- OPENPAREN - push, [GATHERCOMMENT]
- QUOTE - push, [GATHERQUOTE]
- AT - push, [NAKED_ADDRESS_DN_ONLY], ^RETURN^
- WS - push(SPACE), [ACCUMULATE_WITH_WS], ^RETURN^
- LT - pop->$FRIENDLY, [LTADDRESS], ^RETURN^
- COLON - [GROUP], ^RETURN^
- |SPECIAL| - error(unquoted special)
- !OTHER! - error(character not allowed in open)
- EOF - push, pop->$ADDRESS, [SUBMIT], ^RETURN^
[ACCUMULATE_WITH_WS]
- CHAR - push, [FRIENDLY], ^RETURN^
- COLON - [GROUP], ^RETURN^
- COMMA - pop->$ADDRESS, [SUBMIT], ^RETURN^
- OPENPAREN - push, [GATHERCOMMENT]
- AT - error(bad address with unquoted whitespace)
- WS - push
- LT - pop->$FRIENDLY, [LTADDRESS], ^RETURN^
- |SPECIAL| - error(unquoted special)
- !OTHER! - error(character not allowed in open)
- EOF - pop->$ADDRESS, [SUBMIT], ^RETURN^
[FRIENDLY]
- CHAR - push
- AT - push
- QUOTE - push, [GATHERQUOTE]
- COLON - [GROUP], ^RETURN^
- COMMA - error(no address present)
- OPENPAREN - push, [GATHERCOMMENT]
- WS - push(SPACE)
- LT - pop->$FRIENDLY, [LTADDRESS], ^RETURN^
- |SPECIAL| - error(unquoted special)
- !OTHER! - error(character not allowed in open)
- EOF - error(no address)
[LTADDRESS]
->[LT_FRONT_ADDRESS_OPEN]
-> NULL->$FRIENDLY
->^RETURN^
[LT_FRONT_ADDRESS_OPEN]
- WS - burn
- OPENPAREN - push, [GATHERCOMMENT]
- CHAR - push, [LT_FRONT_ADDRESS_NORMAL], ^RETURN^
- QUOTE - push, [LT_FRONT_ADDRESS_QUOTED], ^RETURN^
- !OTHER! - error(Not allowed character)
- EOF - error(no address)
[LT_FRONT_ADDRESS_NORMAL]
- OPENPAREN - push, [GATHERCOMMENT]
- CHAR - push
- AT - push, [LT_ADDRESS_DN_ONLY], ^RETURN^
- WS - [LT_CLOSE_ONLY], ^RETURN^
- GT - pop->$ADDRESS, [SUBMIT], [EXPECT_SEPERATOR_OR_EOF], ^RETURN^
- !OTHER! - error(Character not allowed in name.)
- EOF - error(no address)
[LT_CLOSE_ONLY]
- GT - pop->$ADDRESS, [SUBMIT], [EXPECT_SEPERATOR_OR_EOF], ^RETURN^
- CHAR - push, [BUSTEDBRACKETADDRESS], ^RETURN^
- WS - burn
- !OTHER! - error(Cannot put friendly name in address closure)
- EOF - error(must close a non-DN address)
[LT_FRONT_ADDRESS_QUOTED]
-> [GATHERQUOTE]
- OPENPAREN - push, [GATHERCOMMENT]
- AT - push, [LT_ADDRESS_DN_ONLY], ^RETURN^
- !OTHER! - error(broken quoted against @ in address)
- EOF - error(no address)
[LT_ADDRESS_DN_ONLY]
-> [REQUIRE_DN]
- OPENPAREN - push, [GATHERCOMMENT]
- DNSCHAR - push
- WS - [SEEK_GT], pop->$ADDRESS, [SUBMIT], [EXPECT_SEPERATOR_OR_EOF], ^RETURN^
- GT - pop->$ADDRESS, [SUBMIT], [EXPECT_SEPERATOR_OR_EOF], ^RETURN^
- !OTHER! - error(Character not allowed in DN.)
- EOF - error(no address)
[NAKED_ADDRESS_DN_ONLY]
-> [REQUIRE_DN]
- OPENPAREN - push, [GATHERCOMMENT]
- DNSCHAR - push
- SEMICOLON - if (true = $FLAG_GROUP) then
    pop->$ADDRESS, [SUBMIT], true->$FLAG_GROUP_ENDED, ^RETURN^
  else
    error(Character not allowed in DN.)
- COMMA - pop->$ADDRESS, [SUBMIT], ^RETURN^
- WS - pop->$ADDRESS, [MAYBE_NOT_AN_ADDRESS], ^RETURN^
- !OTHER! - error(Character not allowed in DN.)
- EOF - pop->$ADDRESS, [SUBMIT], ^RETURN^
[MAYBE_NOT_AN_ADDRESS]
- OPENPAREN - push, [GATHERCOMMENT], [FRIENDLY], ^RETURN^
- WS - burn
- SEMICOLON - if (true = $FLAG_GROUP) then
    pop->$ADDRESS, [SUBMIT], true->$FLAG_GROUP_ENDED, ^RETURN^
  else
    error(Group terminator when group not defined.)
- LT - pop->$FRIENDLY, [LTADDRESS], [EXPECT_SEPERATOR_OR_EOF], ^RETURN^
- COMMA - pop->$ADDRESS, [SUBMIT], ^RETURN^
- EOF - pop->$ADDRESS, [SUBMIT], ^RETURN^
- QUOTE - push, [GATHERQUOTE], [FRIENDLY], ^RETURN^
- CHAR - push, [FRIENDLY], ^RETURN^
- !OTHER! - error(addresses not delimited.)
[EXPECT_SEPERATOR_OR_EOF]
- WS - burn
- OPENPAREN - burn, [BURNCOMMENT]
- SEMICOLON - if (true = $FLAG_GROUP) then
    true->$FLAG_GROUP_ENDED, ^RETURN^
  else
    error(Group terminator when group not defined.)
- COMMA - ^RETURN^
- EOF - ^RETURN^
- !OTHER! - error(addresses not delimited.)
[SEEK_GT]
- OPENPAREN - push, [GATHERCOMMENT]
- WS - burn
- GT - ^RETURN^
- !OTHER! - error(Character not allowed after whitespace, before '>')
- EOF - error(address not closed with a '>')
[BUSTEDBRACKETADDRESS]
- OPENPAREN - push, [GATHERCOMMENT]
- WS - push(SPACE)
- GT - pop->$BUSTED, [SUBMIT], ^RETURN^
- !OTHER! - push
- EOF - error(address not closed with a '>')
[REQUIRE_DN]
- OPENPAREN - push, [GATHERCOMMENT]
- DNSCHAR - push, ^RETURN^
- !OTHER! - error(bad domain name)
- EOF - error(no address)
[GATHERQUOTE]
- BACKSLASH - burn, [ESCAPE]
- QUOTE - push, ^RETURN^
- !OTHER! - push
- EOF - error(quote left open)
[ESCAPE]
- ASCII - push, ^RETURN^
- EOF - error(escape left open)
[GATHERCOMMENT]
- CLOSEPAREN - push, ^RETURN^
- BACKSLASH - burn, [ESCAPE]
- !OTHER! - push
- EOF - error(comment left dangling.)
[BURNCOMMENT]
- CLOSEPAREN - ^RETURN^
- !OTHER! - burn
- EOF - error(comment left dangling.)
[GROUP]
-> if ($FLAG_GROUP = true) error(cannot embed groups)
-> pop->$GROUP
-> true->$FLAG_GROUP
-> [OPEN]
-> NULL->$GROUP
-> false->$FLAG_GROUP
[SUBMIT]
-> submit($GROUP, $FRIENDLY, $ADDRESS)
-> NULL->$FRIENDLY
-> NULL->$ADDRESS
Throws:
java.lang.Throwable
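To make the grammar notation concrete, here is a minimal sketch of the [GATHERQUOTE] and [ESCAPE] states. It assumes that "push" appends the character to an accumulator, "burn" discards it, a bracketed name calls that state as a subroutine, and ^RETURN^ returns to the caller. The class QuoteGatherSketch and its methods are hypothetical, and rejecting a non-ASCII character after a backslash is an assumption, since the grammar lists only ASCII and EOF for [ESCAPE].

```java
final class QuoteGatherSketch {
    private final String input;
    private int pos = 0;
    private final StringBuilder stack = new StringBuilder();

    QuoteGatherSketch(String input) {
        this.input = input;
    }

    /** Returns the next character, or -1 at EOF. */
    private int read() {
        return pos < input.length() ? input.charAt(pos++) : -1;
    }

    /** [GATHERQUOTE]: entered after the caller has pushed the opening quote. */
    void gatherQuote() {
        while (true) {
            int c = read();
            if (c < 0) {
                throw new IllegalStateException("quote left open");
            }
            if (c == '\\') {
                escape();                 // BACKSLASH: burn it, then [ESCAPE]
            } else if (c == '"') {
                stack.append('"');        // QUOTE: push, ^RETURN^
                return;
            } else {
                stack.append((char) c);   // !OTHER!: push
            }
        }
    }

    /** [ESCAPE]: pushes exactly one ASCII character, then returns. */
    void escape() {
        int c = read();
        if (c < 0) {
            throw new IllegalStateException("escape left open");
        }
        if (c > 127) {
            // Assumption: the grammar names no transition for non-ASCII here.
            throw new IllegalStateException("escaped character is not ASCII");
        }
        stack.append((char) c);
    }

    /** Mimics a caller's "QUOTE - push, [GATHERQUOTE]" transition. */
    static String gather(String text) {
        QuoteGatherSketch s = new QuoteGatherSketch(text);
        if (s.read() != '"') {
            throw new IllegalArgumentException("text must start with a quote");
        }
        s.stack.append('"');
        s.gatherQuote();
        return s.stack.toString();
    }

    public static void main(String[] args) {
        System.out.println(gather("\"Smith, \\\"Bob\\\"\" <bob@example.com>"));
    }
}
```

Note that under this reading the backslash itself is dropped while the escaped character is kept, so an escaped quote inside the string does not end the quoted run.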
public void parser(java.io.InputStream ins, AddressListener addressListener) throws java.lang.Throwable
Parameters:
ins - the source stream.
addressListener - an address listener for found addresses.
Throws:
java.lang.Throwable
public void parser(java.lang.String data, AddressListener addressListener) throws java.lang.Throwable
Parameters:
data - the String.
addressListener - an address listener for found addresses.
Throws:
java.lang.Throwable
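All three parser overloads report results through the addressListener, and the [SUBMIT] step in the grammar passes it the group, friendly name and address registers. The methods of AddressListener are not shown on this page, so the sketch below uses a stand-in interface: SketchAddressListener, its submit signature, CollectingListener and SubmitSketch are all hypothetical names chosen only to illustrate that hand-off and the register reset that follows it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in; the real AddressListener may look different. */
interface SketchAddressListener {
    void submit(String group, String friendly, String address);
}

/** Collects every submitted address as a display string. */
final class CollectingListener implements SketchAddressListener {
    final List<String> found = new ArrayList<>();

    @Override
    public void submit(String group, String friendly, String address) {
        StringBuilder sb = new StringBuilder();
        if (group != null) {
            sb.append(group).append(": ");
        }
        if (friendly != null) {
            sb.append(friendly).append(' ');
        }
        sb.append('<').append(address).append('>');
        found.add(sb.toString());
    }
}

/** Mirrors [SUBMIT]: notify the listener, then clear $FRIENDLY and $ADDRESS. */
final class SubmitSketch {
    String group;
    String friendly;
    String address;

    void submit(SketchAddressListener listener) {
        listener.submit(group, friendly, address);
        friendly = null;
        address = null;
        // $GROUP is left alone; in the grammar only [GROUP] resets it.
    }

    public static void main(String[] args) {
        CollectingListener listener = new CollectingListener();
        SubmitSketch registers = new SubmitSketch();
        registers.friendly = "Alice";
        registers.address = "alice@example.com";
        registers.submit(listener);
        System.out.println(listener.found);
    }
}
```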