Property expression parsing

Note: The following discussion of the experiments with alternate property expression parsing is very much a work in progress, and subject to sudden changes.

The parsing of property value expressions is handled by two closely related classes: PropertyTokenizer and its subclass, PropertyParser. PropertyTokenizer, as the name suggests, handles the tokenizing of the expression, handing tokens back to its subclass, PropertyParser. PropertyParser, in turn, returns a PropertyValueList, a list of PropertyValues.
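
The relationship between the two classes can be pictured with a small sketch. It is not the FOP code: SketchTokenizer and SketchParser are hypothetical stand-ins, the token classification is reduced to "all digits" versus "anything else", and plain Strings take the place of PropertyValue objects. Only the token values EOF, NCNAME and INTEGER are taken from the token list given later in this document.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified stand-in for PropertyTokenizer.
class SketchTokenizer {
    static final int EOF = 0, NCNAME = 1, INTEGER = 15;

    protected final String expr;
    protected int pos = 0;
    protected int currentToken = EOF;
    protected String currentTokenValue = null;

    SketchTokenizer(String expr) {
        this.expr = expr;
    }

    // Advance to the next whitespace-delimited token and classify it.
    protected void next() {
        while (pos < expr.length() && Character.isWhitespace(expr.charAt(pos))) {
            pos++;
        }
        if (pos >= expr.length()) {
            currentToken = EOF;
            currentTokenValue = null;
            return;
        }
        int start = pos;
        while (pos < expr.length() && !Character.isWhitespace(expr.charAt(pos))) {
            pos++;
        }
        currentTokenValue = expr.substring(start, pos);
        currentToken = currentTokenValue.matches("[0-9]+") ? INTEGER : NCNAME;
    }
}

// Hypothetical stand-in for PropertyParser: it inherits the tokenizer
// state and turns the token stream into a list of values.
class SketchParser extends SketchTokenizer {
    SketchParser(String expr) {
        super(expr);
    }

    List<String> parse() {
        List<String> values = new ArrayList<>();
        for (next(); currentToken != EOF; next()) {
            String label = (currentToken == INTEGER) ? "INTEGER:" : "NCNAME:";
            values.add(label + currentTokenValue);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(new SketchParser("thin 2 solid").parse());
    }
}
```

Making the parser a subclass lets it read the tokenizer's current token and value directly as inherited fields, which is the arrangement the paragraph above describes.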

The tokenizer and parser rely in turn on the datatype definitions from the org.apache.fop.datatypes package and the datatype static final int constants from PropertyConsts.

Data types

The data types currently defined in org.apache.fop.datatypes include:

Numbers and lengths 
Numeric  The fundamental numeric data type. Numerics of various types are constructed by the classes listed below.  
  Constructor classes for Numeric 
  Angle  In degrees(deg), gradians(grad) or radians(rad) 
  Ems  Relative length in ems 
  Frequency  In hertz(Hz) or kilohertz(kHz) 
  IntegerType   
  Length  In centimetres(cm), millimetres(mm), inches(in), points(pt), picas(pc) or pixels(px) 
  Percentage   
  Time  In seconds(s) or milliseconds(ms) 
Strings 
StringType  Base class for data types which result in a String.  
  Literal  A subclass of StringType for literals which exceed the constraints of an NCName.  
  MimeType  A subclass of StringType for literals which represent a mime type.  
  UriType  A subclass of StringType for literals which represent a URI, as specified by the argument to url().  
  NCName  A subclass of StringType for literals which meet the constraints of an NCName.  
    Country  An RFC 3066/ISO 3166 country code. 
    Language  An RFC 3066/ISO 639 language code. 
    Script  An ISO 15924 script code. 
Enumerated types 
EnumType  An integer representing one of the tokens in a set of enumeration values.  
  MappedEnumType  A subclass of EnumType. Maintains a String with the value to which the associated "raw" enumeration token maps. E.g., the font-size enumeration value "medium" maps to the String "12pt".  
Colors 
ColorType  Maintains a four-element array of float, derived from the name of a standard colour, the name returned by a call to system-color(), or an RGB specification.  
Fonts 
FontFamilySet  Maintains an array of Strings containing a prioritized list of possibly generic font family names.  
Pseudo-types 
A variety of pseudo-types have been defined as convenience types for frequently appearing enumeration token values, or for other special purposes.  
Inherit  For values of inherit.  
Auto  For values of auto.  
None  For values of none.  
Bool  For values of true/false.  
FromNearestSpecified  Created to ensure that, when associated with a shorthand, the from-nearest-specified-value() core function is the sole component of the expression.  
FromParent  Created to ensure that, when associated with a shorthand, the from-parent() core function is the sole component of the expression.  
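
As an illustration of how EnumType and MappedEnumType relate, the following sketch models an enumeration token as an index into a token list, with the mapped subclass carrying the String the token maps to. The class names, constructors and accessors are assumptions made for this example and are not taken from org.apache.fop.datatypes. Only the mapping of "medium" to "12pt" comes from the table above; the other tokens and mappings are placeholders.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of EnumType: an integer identifying one token
// in a set of enumeration values.
class SketchEnumType {
    protected final int index;

    SketchEnumType(List<String> tokens, String token) {
        int i = tokens.indexOf(token);
        if (i < 0) {
            throw new IllegalArgumentException("Unknown enumeration token: " + token);
        }
        this.index = i;
    }

    int getEnumValue() {
        return index;
    }
}

// Hypothetical sketch of MappedEnumType: also keeps the String to which
// the "raw" enumeration token maps.
class SketchMappedEnumType extends SketchEnumType {
    private final String mappedValue;

    SketchMappedEnumType(List<String> tokens, List<String> mappings, String token) {
        super(tokens, token);
        this.mappedValue = mappings.get(index);
    }

    String getMappedValue() {
        return mappedValue;
    }

    public static void main(String[] args) {
        // Only "medium" -> "12pt" is from the text; the rest are placeholders.
        List<String> tokens = Arrays.asList("small", "medium", "large");
        List<String> mappings = Arrays.asList("10pt", "12pt", "14pt");
        SketchMappedEnumType size = new SketchMappedEnumType(tokens, mappings, "medium");
        System.out.println(size.getEnumValue() + " maps to " + size.getMappedValue());
    }
}
```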

Tokenizer

The tokenizer returns one of the following token values:

    static final int
                 EOF = 0
             ,NCNAME = 1
           ,MULTIPLY = 2
               ,LPAR = 3
               ,RPAR = 4
            ,LITERAL = 5
      ,FUNCTION_LPAR = 6
               ,PLUS = 7
              ,MINUS = 8
                ,MOD = 9
                ,DIV = 10
              ,COMMA = 11
            ,PERCENT = 12
          ,COLORSPEC = 13
              ,FLOAT = 14
            ,INTEGER = 15
    ,ABSOLUTE_LENGTH = 16
    ,RELATIVE_LENGTH = 17
               ,TIME = 18
               ,FREQ = 19
              ,ANGLE = 20
            ,INHERIT = 21
               ,AUTO = 22
               ,NONE = 23
               ,BOOL = 24
                ,URI = 25
           ,MIMETYPE = 26
            // NO_UNIT is a transient token for internal use only.  It is
            // never set as the end result of parsing a token.
            ,NO_UNIT = 27
                     ;
	

Most of these tokens are self-explanatory, but a few need further comment.

  • AUTO - Because of its frequency of occurrence, and the fact that it is always the initial value for any property which supports it, AUTO has been promoted into a pseudo-type with its own datatype class. Therefore, it is also reported as a token.
  • NONE - Like AUTO, NONE has been promoted to a pseudo-type because of its frequency of occurrence.
  • BOOL - There is a de facto boolean type buried in the enumeration types for many of the properties. It has been specified as a type in its own right in this code.
  • MIMETYPE - The property content-type introduces this complication. It can have two forms of value: content-type:mime-type or namespace-prefix:prefix (e.g. content-type="namespace-prefix:svg"). The experimental code reduces these options to the payload in each case: an NCName in the case of a namespace prefix, and a MIMETYPE in the case of a content-type specification. The two payloads cannot be confused, because NCNames cannot contain a "/".
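
The reduction of a content-type value to its payload might look something like the following. This is a sketch under stated assumptions, not the experimental code itself: ContentTypeSketch, its Token holder and the classify method are hypothetical, and the MIME type string used in main is only an example value. The NCNAME and MIMETYPE constants carry the values from the token list above.

```java
// Hypothetical sketch: strip the leading keyword from a content-type value
// and report the payload as either an NCNAME or a MIMETYPE token.
class ContentTypeSketch {
    static final int NCNAME = 1, MIMETYPE = 26;

    // Simple holder for a token type and the payload text.
    static final class Token {
        final int type;
        final String value;

        Token(int type, String value) {
            this.type = type;
            this.value = value;
        }
    }

    static Token classify(String raw) {
        final String contentPrefix = "content-type:";
        final String namespacePrefix = "namespace-prefix:";
        if (raw.startsWith(contentPrefix)) {
            return new Token(MIMETYPE, raw.substring(contentPrefix.length()));
        }
        if (raw.startsWith(namespacePrefix)) {
            return new Token(NCNAME, raw.substring(namespacePrefix.length()));
        }
        throw new IllegalArgumentException("Unrecognised content-type value: " + raw);
    }

    public static void main(String[] args) {
        Token prefix = classify("namespace-prefix:svg");
        // "image/svg+xml" is an assumed example payload, not taken from the text.
        Token mime = classify("content-type:image/svg+xml");
        System.out.println(prefix.type + " " + prefix.value);
        System.out.println(mime.type + " " + mime.value);
    }
}
```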

Parser

The parser returns a PropertyValueList. A list is necessary because the expressions of some properties may yield more than one PropertyValue element.

PropertyValueLists may contain PropertyValues or other PropertyValueLists. This latter provision is needed for the peculiar case of text-shadow, which may contain whitespace separated sublists of either two or three elements, separated from one another by commas. To accommodate this peculiarity, comma separated elements are added to the top-level list, while whitespace separated values are always collected into sublists to be added to the top-level list.

Other special cases include the processing of the core functions from-parent() and from-nearest-specified-value() when these function calls are assigned to a shorthand property, or used with a shorthand property name as an argument. In these cases, the function call must be the sole component of the expression. The pseudo-type classes FromParent and FromNearestSpecified are generated in these circumstances so that an exception will be thrown if they are involved in expression evaluation with other components. (See Rec. Section 5.10.4 Property Value Functions.)

The experimental code is a simple extension of the existing parser code, which itself borrowed heavily from James Clark's XT processor.
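
The nesting rule for text-shadow can be sketched as follows. This is an illustration only: TextShadowListSketch is a hypothetical class, nested Lists of Strings stand in for PropertyValueList and PropertyValue, and the splitting is done naively on the raw string rather than on tokens, so it would mishandle commas inside function calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the list-building rule: each comma separated
// element goes into the top-level list, and the whitespace separated
// values within an element are collected into a sublist.
class TextShadowListSketch {
    static List<List<String>> parse(String expr) {
        List<List<String>> topLevel = new ArrayList<>();
        for (String element : expr.split(",")) {
            String trimmed = element.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty elements such as a trailing comma
            }
            topLevel.add(new ArrayList<>(Arrays.asList(trimmed.split("\\s+"))));
        }
        return topLevel;
    }

    public static void main(String[] args) {
        // One sublist of two elements and one of three.
        System.out.println(parse("3pt 3pt, 1pt 1pt 2pt"));
    }
}
```

Because whitespace separated values are always wrapped in a sublist, a caller sees the same shape whether the expression held one shadow or several.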
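
One way to picture this guard is a value type whose numeric accessor always throws, so that any attempt to combine it with another operand fails. The sketch below is an assumption about the general approach, not the FOP implementation: SoleComponentSketch, its Value interface, Num, FromParentMarker and add are all hypothetical names, and the exception type is chosen for the example.

```java
// Hypothetical sketch: a marker value that refuses to take part in
// expression evaluation, standing in for the FromParent pseudo-type.
class SoleComponentSketch {
    interface Value {
        double numericValue();
    }

    // An ordinary numeric operand.
    static final class Num implements Value {
        private final double v;

        Num(double v) {
            this.v = v;
        }

        public double numericValue() {
            return v;
        }
    }

    // Marker for a from-parent() call on a shorthand property.
    static final class FromParentMarker implements Value {
        public double numericValue() {
            throw new UnsupportedOperationException(
                "from-parent() must be the sole component of the expression");
        }
    }

    // A stand-in for one step of expression evaluation.
    static double add(Value a, Value b) {
        return a.numericValue() + b.numericValue();
    }

    public static void main(String[] args) {
        System.out.println(add(new Num(1), new Num(2)));
        try {
            add(new FromParentMarker(), new Num(2));
        } catch (UnsupportedOperationException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```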




    Copyright © 2001-2002 The Apache Software Foundation. All Rights Reserved.