things.data.processing.http
Class BodyProcessor_FormURLEncodedSTRICT

java.lang.Object
  extended by things.data.processing.LexicalTool
      extended by things.data.processing.http.BodyProcessor_FormURLEncodedSTRICT

public class BodyProcessor_FormURLEncodedSTRICT
extends LexicalTool

Body processor for application/x-www-form-urlencoded.

I know it looks like a lot of work, but it wasn't.

I have since learned that a lot of Ajax clients use naked LFs to terminate values, so this version will choke on a lot of their POSTs.

Version:
1.0
Version History
EPG - Adapted from another project - 12 FEB 2007
Author:
Erich P. Gatejen

Field Summary
 
Fields inherited from class things.data.processing.LexicalTool
ALLOWED, ASCII_HIGH, BAD, BREAKING, CHAR, CHAR_DNSCHAR, CHAR_DNSCHAR_NUMERIC, CHAR_DNSCHAR_POUND, CLASS_ALPHA, CLASS_CONTROL, CLASS_NONE, CLASS_NUMERIC, CLASS_PUNCTUATION, COLONVALUE, CONTROL, CRBYTEVALUE, DASHVALUE, DNSCHAR, DOLLARBYTEVALUE, HEADER_READ_STATE_CHART, HEADER_READ_STATE_CHARTV2, HP____SPECIAL_DEAD, HP____SPECIAL_PAUSE, HP____SPECIAL_WALKING_DEAD, HP_BROKEN, HP_CLEAR_PAUSE, HP_CLEAR_PAUSE_CRLF, HP_CLOSURE, HP_CR, HP_HEAD_CR, HP_HEAD_CRLF, HP_HEAD_LF, HP_LF, HP_LFCR, HP_NOT_USED, HP_PAUSE, HP_PAUSE_CRLF, HP_PAUSE_CRLFCR, HP_READ, HP_START, LEXICAL_HEADER_TERMINATION, LEXICAL_MAP, LEXICAL_MAP_822_HEADERNAME, LEXICAL_MAP_822_TYPE, LEXICAL_MAP_CLASSIFICATION, LEXICAL_MAP_DNS_TYPE, LEXICAL_MAP_HEXVALUE, LEXICAL_MAP_NAME, LEXICAL_MAP_URI_TYPE, LEXICAL_MAP_URLF_TYPE, LFBYTEVALUE, NO_CHARACTER, NOT_ALLOWED, OPENBBYTEVALUE, OTHER, PIPEBYTEVALUE, SLASHBYTEVALUE, SPACEVALUE, SPECIAL, SPECIAL_AMP, SPECIAL_AT, SPECIAL_BACKSLASH, SPECIAL_CHAR_DNSCHAR_DOT, SPECIAL_CLOSEBRACK, SPECIAL_CLOSEPAREN, SPECIAL_COLON, SPECIAL_COMMA, SPECIAL_DOLLAR, SPECIAL_EQ, SPECIAL_GT, SPECIAL_LT, SPECIAL_OPENBRACK, SPECIAL_OPENPAREN, SPECIAL_PERCENT, SPECIAL_PLUS, SPECIAL_QUEST, SPECIAL_QUOTE, SPECIAL_SEMICOLON, SPECIAL_SLASH, SPECIAL_SPLAT, STRING_CRLF, TABVALUE, URLCHAR, URLFCHAR, VALUE_ASCII_BOTTOM, VALUE_ASCII_HIGH_BOTTOM, VALUE_ASCII_HIGH_TOP, VALUE_ASCII_LOW_BOTTOM, VALUE_ASCII_LOW_TOP, VALUE_ASCII_TOP, WS, WS_CR_CONTROL, WS_LF_CONTROL, WS_SPACE, WS_TAB_CONTROL
 
Constructor Summary
BodyProcessor_FormURLEncodedSTRICT()
           
 
Method Summary
 void parser(java.io.InputStream source, HttpRequest request)
          Parse engine grammar.
 
Methods inherited from class things.data.processing.LexicalTool
get822HeadernameType, get822HeadernameTypeWithDollar, get822Type, getClassification, getDNSType, getHexValue, getLower, getName, getUpper, getURIType, getURLFType
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BodyProcessor_FormURLEncodedSTRICT

public BodyProcessor_FormURLEncodedSTRICT()
Method Detail

parser

public void parser(java.io.InputStream source,
                   HttpRequest request)
            throws ThingsException
Parse engine grammar.
 
POST /index.html HTTP/1.0
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
UA-CPU: x86
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Host: 192.168.1.160
Content-Length: 357
Pragma: no-cache
Connection: keep-alive
Browser reload detected...
Posting 357 bytes...
Item=Value
Item2=Value+SecondToken+++
 FoldedInfo%0D%0A++MORE
Item+3=HelloHelloHello%0D%0A%0D%0A++++
    ++++

There really is no good spec on this.

URLFCHAR = Let's be forgiving.
A       B       C       D       E       F       G       H       I       J       K       L       M       N       O       P       Q       R       S       T       U       V       W       X       Y       Z
a       b       c       d       e       f       g       h       i       j       k       l       m       n       o       p       q       r       s       t       u       v       w       x       y       z
0       1       2       3       4       5       6       7       8       9       -       _       .       ~
!       *       '       (       )       ;       :       @       $       ,       /       ?       #       [       ]

PERCENT = '%' for escape.
EQU = '=' for name/value separation.
PLUS = '+' for space replacement.
AMP = '&' for item separation.
CR | LF = For item termination
WS = All other whitespace
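
The token classes above can be sketched as a small classifier. This is only an illustration: the TokenSketch class and its Token enum are hypothetical names, not part of LexicalTool, and treating WS as just space and tab is an assumption.

```java
class TokenSketch {
    // Hypothetical token names mirroring the definitions above.
    enum Token { URLFCHAR, PERCENT, EQU, PLUS, AMP, CR, LF, WS, OTHER }

    // Punctuation listed as URLFCHAR in the table above.
    private static final String URLF_PUNCTUATION = "-_.~!*'();:@$,/?#[]";

    static Token classify(int c) {
        switch (c) {
            case '%':  return Token.PERCENT;
            case '=':  return Token.EQU;
            case '+':  return Token.PLUS;
            case '&':  return Token.AMP;
            case '\r': return Token.CR;
            case '\n': return Token.LF;
            case ' ':
            case '\t': return Token.WS;   // assumption: WS means space and tab
            default:   break;
        }
        boolean alphanumeric = (c >= 'A' && c <= 'Z')
                || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9');
        if (alphanumeric || (c >= 0 && URLF_PUNCTUATION.indexOf(c) >= 0)) {
            return Token.URLFCHAR;
        }
        return Token.OTHER;               // everything else, including EOF (-1)
    }
}
```

The real class presumably gets this from the inherited lexical maps (getURLFType); a switch is used here only to keep the sketch self-contained.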

Flags:
        !Done = if true, we are done.  Set after a terminal closure.

[START]
-> NULL->$Name
-> NULL->$Value
-> FALSE->!Done
-> NULL->$(Hex)Sixteens
-> [OPEN]
-> ^RETURN^

[OPEN]
        - URLFCHAR                                      - push, [NAME]
        - AMP                                           - burn
        - !OTHER!                                       - [DEPLETE], error(Query line started bad.  Must be an allowed character.)
        - CR                                            - [PENDING_LF]
        - !EOF!                                         - ^RETURN^              // Done.  Nothing to do.

[NAME]
        - %                     - [ESCAPE]
        - WS            - [DEPLETE], error(broken name in query)
        - URLFCHAR      - push
        - +                     - push(" ")     
        - EQU           - pop->$Name, [START_VALUE], if (!Done==TRUE) ^RETURN^
        - CR            - [PENDING_LF], [FOLDNAME_OPEN], if (!Done==TRUE) ^RETURN^                      // We're coming out of a fold or line, so start a new name
        - !OTHER!       - [DEPLETE], error(bad characters in query name)
        - AMP           - error(Truncated query.  Name only.)   
        - !EOF!         - error(Truncated query.  Name only.)           

[FOLDNAME_OPEN] 
        - CR | LF       - error(Name broken and without a value.)
        - WS            - [FOLDED_NAME], ^RETURN^
        - !OTHER!       - [DEPLETE], error(bad folding on name, lines aborted)
        - !EOF!         - error(Truncated query while folding name.)    

[FOLDED_NAME]
        - %                     - [ESCAPE], [NAME], ^RETURN^                            // return back to [NAME]
        - URLFCHAR      - push, [NAME],^RETURN^ 
        - +                     - push(" "), [NAME],^RETURN^    
        - EQU           - pop->$Name, [VALUE], ^RETURN^ 
        - CR            - [PENDING_LF], [FOLDNAME_OPEN], ^RETURN^       // Recursion danger!
        - WS            - burn  
        - !OTHER!       - [DEPLETE], error(Bad folding on name, lines aborted)  
        - AMP           - error(Truncated query.  Name only.)   
        - !EOF!         - error(Truncated query while folding name.)            

[VALUE]
        - %                     - [ESCAPE]      
        - URLFCHAR      - push
        - WS            - [DEPLETE], error(broken value in query)       
        - +                     - push(" ")             
        - EQU           - [DEPLETE], error(Second unencoded '=' found in query.)        
        - AMP           - [SAVE], ^RETURN^ 
        - CR            - [PENDING_LF], [FOLDVALUE_OPEN], ^RETURN^                      // Done, so unwind back to OPEN.
        - LF            - [DEPLETE], error(bad character in value-naked LF)
        - !OTHER!       - [DEPLETE], error(bad character in value)      
        - !EOF!         - [SAVE], ^RETURN^                                                                      // Done, so unwind back to OPEN.        

[FOLDVALUE_OPEN]
        - %                     - [SAVE], [ESCAPE], ^RETURN^                                                    // Closure.      Push the char for the NEXT name.
        - +                     - [SAVE], push(" "), ^RETURN^                                                   // Closure.      Push the char for the NEXT name.               
        - URLFCHAR      - [SAVE], push,  ^RETURN^                                                               // Closure.  Push the char for the NEXT name.
        - AMP           - [SAVE], ^RETURN^                                                                              // Closure.  Back to open.
        - EQU           - [SAVE], error(Query entry started with a '='.)                // Closure but an error for the next line.      
        - CR            - [PENDING_LF], [SAVE], [SEEK_MORE], ^RETURN^                   // Closure.  Eat until we get characters.
        - !EOF!         - [SAVE], !Done=TRUE, ^RETURN^                                                  // Absolute closure     
        - WS            - [FOLDED_VALUE], ^RETURN^                                                      // unwind back to NAME  
        - !OTHER!       - [DEPLETE], error(bad folding on value, lines aborted)
        - error(Truncated query while folding name.)    

[FOLDED_VALUE]
        - %                     - [ESCAPE], [VALUE], ^RETURN^                                                   // return back to [VALUE]
        - URLFCHAR      - push, [VALUE], ^RETURN^       
        - AMP           - [SAVE], ^RETURN^                                                                              // Closure.  Back to open.
        - +                     - push(" "), [VALUE], ^RETURN^  
        - EQU           - [DEPLETE], error(Second unencoded '=' found in query.)        
        - CR            - [PENDING_LF], [FOLDVALUE_OPEN], ^RETURN^                              // Recursion danger!
        - WS            - burn  
        - !OTHER!       - [DEPLETE], error(Bad folding on value, lines aborted) 
        - !EOF!         - [SAVE], !Done=TRUE, ^RETURN^                                                  // Absolute Closure     
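
The folding states boil down to this: a CR LF followed by whitespace continues the current entry, and a CR LF followed by anything else closes it. A rough approximation with a regex, not the state machine itself, might look like the following. FoldSketch and logicalLines are hypothetical names, and the sketch does none of the error checking the chart does.

```java
import java.util.ArrayList;
import java.util.List;

class FoldSketch {
    // Splits a raw body into logical entries, joining folded lines.
    static List<String> logicalLines(String body) {
        // A CR LF followed by at least one space or tab is a fold: drop all of it.
        String unfolded = body.replaceAll("\r\n[ \t]+", "");
        List<String> lines = new ArrayList<>();
        // Any remaining CR LF terminates an entry; empty lines are eaten.
        for (String line : unfolded.split("\r\n")) {
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Fed the Item2 entry from the example POST, this joins "Value+SecondToken+++" and "FoldedInfo%0D%0A++MORE" into one entry, with the leading space of the folded line burned.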

[SEEK_MORE]
        - %                     - [ESCAPE], ^RETURN^                            // Push the char for the NEXT name.
        - AMP           - burn, ^RETURN^                                        // Next item.
        - URLFCHAR      - push, ^RETURN^                                        // Push the char for the NEXT name.
        - +                     - push(" "), ^RETURN^                           // Push the char for the NEXT name.             
        - CR            - [PENDING_LF]                                          // Eat them
        - !EOF!         - !Done=TRUE, ^RETURN^                          // We are already closed.  And now we are done. 
        - !OTHER!       - [DEPLETE], error(Bad next item in query or a broken fold.)            

[PENDING_LF]
        - LF            - ^RETURN^
        - !EOF!         - ^RETURN^              // Let this one slide.
        - !OTHER!       - [DEPLETE], error(broken CR/LF--missing LF)            

[SAVE]
-> pop->$Value
-> Set request NV to $Name/$Value
-> ^RETURN^
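
Leaving folding aside, the [NAME], [VALUE] and [SAVE] states amount to a push/pop accumulator that is flushed on '=' and '&'. The sketch below shows that core idea under several assumptions: PairSketch is a hypothetical class, it works on a String rather than an InputStream, it stores pairs in a Map instead of an HttpRequest, it throws IllegalArgumentException instead of ThingsException, and it rejects CR/LF outright rather than folding.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PairSketch {
    static Map<String, String> parse(String body) {
        Map<String, String> pairs = new LinkedHashMap<>();
        StringBuilder accumulator = new StringBuilder(); // the push/pop buffer
        String name = null;                              // $Name, null until '=' is seen
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '=') {
                if (name != null) throw new IllegalArgumentException("Second unencoded '=' found in query.");
                if (accumulator.length() == 0) throw new IllegalArgumentException("Query entry started with a '='.");
                name = accumulator.toString();           // pop->$Name
                accumulator.setLength(0);
            } else if (c == '&') {
                if (name != null) {
                    pairs.put(name, accumulator.toString()); // [SAVE]
                    name = null;
                    accumulator.setLength(0);
                } else if (accumulator.length() > 0) {
                    throw new IllegalArgumentException("Truncated query.  Name only.");
                }
                // otherwise this is a stray '&' at [OPEN] and is burned
            } else if (c == '+') {
                accumulator.append(' ');                 // push(" ")
            } else if (c == '%') {
                if (i + 2 >= body.length()) throw new IllegalArgumentException("Truncated line with dangling escape.");
                // Character.digit also accepts non-ASCII digits; a strict parser would not.
                int high = Character.digit(body.charAt(i + 1), 16);
                int low = Character.digit(body.charAt(i + 2), 16);
                if (high < 0 || low < 0) throw new IllegalArgumentException("broken escape");
                accumulator.append((char) (high * 16 + low));
                i += 2;
            } else if (c <= ' ') {
                // Simplification: no folding here, so whitespace and control characters are errors.
                throw new IllegalArgumentException("bad character in query");
            } else {
                accumulator.append(c);                   // push
            }
        }
        if (name != null) {
            pairs.put(name, accumulator.toString());     // EOF inside a value still saves
        } else if (accumulator.length() > 0) {
            throw new IllegalArgumentException("Truncated query.  Name only.");
        }
        return pairs;
    }
}
```

For instance, PairSketch.parse("Item=Value&Item2=Value+Second%21") yields Item mapped to "Value" and Item2 mapped to "Value Second!".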

[ESCAPE]
        - HEX           - ->$Sixteens, [ESCAPEONES], ^RETURN^
        - !OTHER!       - error(broken escape)
        - !EOF!         - error(Truncated line with dangling escape.)                   

[ESCAPEONES]
        - HEX           - push( ($Sixteens*16)+HEX ), ^RETURN^
        - !OTHER!       - error(broken escape)  
        - !EOF!         - error(Truncated line with dangling escape.)                   
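
The two escape states combine a sixteens digit and a ones digit into one byte value: sixteens times 16, plus ones. A minimal sketch of that arithmetic follows; EscapeSketch, hexValue and decode are hypothetical names, and IllegalArgumentException stands in for whatever error the real parser raises.

```java
class EscapeSketch {
    // Returns 0..15 for an ASCII hex digit, or -1 for anything else (including EOF).
    static int hexValue(int c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Decodes the two characters that follow a '%'.
    static int decode(int sixteens, int ones) {
        int high = hexValue(sixteens);
        int low = hexValue(ones);
        if (high < 0 || low < 0) {
            throw new IllegalArgumentException("broken escape");
        }
        return high * 16 + low;
    }
}
```

So the %0D%0A in the example POST decodes to 13 and 10, an encoded CR and LF that end up inside the value rather than terminating it.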

[DEPLETE]
        - AMP           - burn, ^RETURN^
        - CR            - [DEPLETE_CR], ^RETURN^
        - !OTHER!       - burn
        - !EOF!         - ^RETURN^  // So what.  Some browsers end abruptly.

[DEPLETE_CR]
        - LF            - ^RETURN^
        - !EOF!         - ^RETURN^  // So what.  Some browsers end abruptly.
        - !OTHER!       - fault(missing LF after CR at end of line: odd characters found, so stream is unreliable.)
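
[DEPLETE] and [DEPLETE_CR] are the error recovery path: throw away the rest of the bad entry so parsing can resume at the next one. A sketch of that loop over an InputStream is below. DepleteSketch is a hypothetical name, and IllegalStateException is an assumption standing in for the fault the chart describes.

```java
import java.io.IOException;
import java.io.InputStream;

class DepleteSketch {
    // Consumes bytes up to and including the end of the current entry.
    static void deplete(InputStream in) throws IOException {
        int c;
        while ((c = in.read()) != -1) {
            if (c == '&') {
                return;                       // AMP: burn it, the entry is over
            }
            if (c == '\r') {                  // CR: this is [DEPLETE_CR]
                int next = in.read();
                if (next == '\n' || next == -1) {
                    return;                   // proper CR LF, or an abrupt end
                }
                // Odd character after CR, so the stream is unreliable.
                throw new IllegalStateException("missing LF after CR at end of line");
            }
            // anything else is burned
        }
        // EOF: some browsers end abruptly, so just return
    }
}
```

After deplete returns, the next read on the stream is the first character of the following entry, if there is one.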

[DRAIN]
        - LF            - burn, ^RETURN^
        - CR            - burn, ^RETURN^
        - !EOF!         - error(bad CR/LF line termination: truncated.) 
        - !OTHER!       - fault(bad CR/LF line termination: odd characters found, so stream is unreliable.)             

        

Parameters:
source - the stream source.
request - the request object to fill.
Throws:
ThingsException - If it is a fault, the request should be considered completely invalid. If it is an error, whatever was set in the request might be useful.

