
XmHTMLParser provides an Object capable of parsing HTML 3.2 text. It can verify HTML 3.2 documents, repair non-conforming ones, and parse documents incrementally. Through its callback resources, an XmHTMLParser object can serve as the basis for a fully interactive HTML 3.2 parser application.


Class Information

Include File: <Xm/Parser.h>
Class Name: XmHTMLParser
Class Hierarchy: Object->XmHTMLParser
Class Pointer: xmHTMLParserObjectClass
Functions/Macros: XmCreateHTMLParser, XmHTMLParser... routines, XmIsHTMLParser.


New Resources

XmHTMLParser defines the following resources.

Name                    Type     Default    Access
XmNmimeType             String   text/html  CSG
XmNparserIsProgressive  Boolean  False      CSG
XmNretainSource         Boolean  False      CSG
XmNstrictHTMLChecking   Boolean  False      CSG
XmNuserData             Pointer  NULL       CSG

XmNmimeType
This resource tells XmHTMLParser how to parse the text it is given. XmHTMLParser can handle the following MIME types: text/html, text/plain, and every image MIME type, that is, every type specification that starts with image/.

XmNparserIsProgressive
Setting this resource to

XmNretainSource
When set to

XmNstrictHTMLChecking
This resource enables

XmNuserData
A pointer to data that the application can attach to the Object. This resource is unused internally.
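The grouping of MIME types given for XmNmimeType can be sketched as a small classifier. This is only an illustration of the description above, not the widget's own code: the MimeKind enum and classify_mime function are hypothetical names, the image prefix is assumed to be image/, and the comparison is deliberately exact (a real implementation may well compare case-insensitively).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical categories; these names are not part of XmHTMLParser. */
typedef enum {
    MIME_HTML,        /* text/html */
    MIME_PLAIN,       /* text/plain */
    MIME_IMAGE,       /* any type that starts with image/ */
    MIME_UNSUPPORTED  /* everything else */
} MimeKind;

/* Classify a MIME type string into the groups named for XmNmimeType.
 * Assumption: exact, case-sensitive comparison. */
static MimeKind classify_mime(const char *mime)
{
    assert(mime != NULL);

    if (strcmp(mime, "text/html") == 0)
        return MIME_HTML;
    if (strcmp(mime, "text/plain") == 0)
        return MIME_PLAIN;
    if (strncmp(mime, "image/", 6) == 0)
        return MIME_IMAGE;
    return MIME_UNSUPPORTED;
}
```

An application could run such a check before setting XmNmimeType, so that unsupported types are rejected early.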


Callback Resources

XmHTMLParser defines the following callback resources:

Callback Resource        Reason                          Callback Structure
XmNdocumentCallback      XmCR_HTML_DOCUMENT              XmHTMLDocumentCallbackStruct
XmNmodifyVerifyCallback  XmCR_HTML_MODIFYING_TEXT_VALUE  XmHTMLVerifyCallbackStruct
XmNparserCallback        XmCR_HTML_PARSER                XmHTMLParserCallbackStruct

All callback resources also reference XmAnyCallbackStruct.

XmNdocumentCallback is activated when XmHTMLParser has finished parsing a document and before XmHTMLParserSetString or XmHTMLParserUpdateSource returns.

XmNmodifyVerifyCallback is activated when XmHTMLParser is about to insert text into, or remove text from, the current source text.

XmNparserCallback is activated when XmHTMLParser encounters an HTML element that is in error. XmHTMLParser detects unknown, unbalanced, badly placed, and unterminated HTML elements, as well as HTML 3.2 violations.


XmHTMLDocumentCallbackStruct

The XmNdocumentCallback callback resource references the following structure:
typedef struct
{
	int		reason;		/* the reason the callback was called */
	XEvent		*event;		/* always NULL */
	Boolean		html32;		/* True when document was HTML 3.2 conforming */
	Boolean		verified;	/* True when document has been verified */
	Boolean		balanced;	/* True when parser tree is balanced */
	Boolean		terminated;	/* True if parser is terminated prematurely */
	int		pass_level;	/* current parser level count. */
	Boolean		redo;		/* See below */
}XmHTMLDocumentCallbackStruct;
The The

The

The

The

The

Setting the

When no XmNdocumentCallback callback resource is installed, XmHTML will make at most two passes on the current document. See the Parser Description document for more information.
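As a rough illustration of how an XmNdocumentCallback handler might read the status flags, the sketch below checks whether a finished pass left the document in good shape. The Boolean and XEvent declarations are stand-ins so the fragment compiles without the X headers (real code would include <Xm/Parser.h> instead of redeclaring the structure), and document_is_clean is a hypothetical helper. It does not touch the redo field.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the Xt/Xlib types; assumptions made for this sketch only. */
typedef unsigned char Boolean;
typedef union _XEvent XEvent;

/* Copy of the structure documented above. */
typedef struct {
    int      reason;      /* the reason the callback was called */
    XEvent  *event;       /* always NULL */
    Boolean  html32;      /* True when document was HTML 3.2 conforming */
    Boolean  verified;    /* True when document has been verified */
    Boolean  balanced;    /* True when parser tree is balanced */
    Boolean  terminated;  /* True if parser is terminated prematurely */
    int      pass_level;  /* current parser level count */
    Boolean  redo;        /* not used by this sketch */
} XmHTMLDocumentCallbackStruct;

/* Hypothetical helper: nonzero when the pass ran to completion and the
 * document is conforming, verified and balanced. */
static int document_is_clean(const XmHTMLDocumentCallbackStruct *cbs)
{
    assert(cbs != NULL);
    return cbs->html32 && cbs->verified && cbs->balanced && !cbs->terminated;
}
```

A callback could use the result to decide whether to report the document as valid or to show the user the errors collected so far.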


XmHTMLVerifyCallbackStruct

The XmNmodifyVerifyCallback callback resource references the following structure:
typedef struct{
	int 		reason;		/* the reason the callback was called */
	XEvent		*event;		/* always NULL */
	Boolean		doit;		/* unused */
	int		action;		/* type of modification */
	int		line_no;	/* current line number in input text */
	int		start_pos;	/* start of text to change */
	int		end_pos;	/* end of text to change */
	XmHTMLTextBlock	text;		/* describes text to remove or insert */
}XmHTMLVerifyCallbackStruct, *XmHTMLVerifyPtr;
The The

The

The

The

typedef struct{
	String		ptr;		/* pointer to text to remove/insert */
	int		len;		/* length of this text */
}XmHTMLTextBlockRec, *XmHTMLTextBlock;

The
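Because XmHTMLTextBlock carries an explicit length, a cautious handler can avoid relying on ptr being NUL-terminated. The sketch below copies a block into a caller-supplied buffer under that assumption. The String typedef is a stand-in for the Xt type, and text_block_copy is a hypothetical helper, not part of the XmHTMLParser API.

```c
#include <assert.h>
#include <string.h>

typedef char *String;  /* stand-in for the Xt String type */

/* Copy of the structure documented above. */
typedef struct {
    String ptr;  /* pointer to text to remove/insert */
    int    len;  /* length of this text */
} XmHTMLTextBlockRec, *XmHTMLTextBlock;

/* Hypothetical helper: copy at most size-1 bytes of the block into buf and
 * NUL-terminate the result. Returns the number of bytes copied; an empty or
 * missing block yields an empty string. */
static int text_block_copy(const XmHTMLTextBlockRec *block, char *buf, int size)
{
    int n;

    assert(buf != NULL && size > 0);

    if (block == NULL || block->ptr == NULL || block->len <= 0) {
        buf[0] = '\0';
        return 0;
    }
    n = (block->len < size - 1) ? block->len : size - 1;
    memcpy(buf, block->ptr, (size_t)n);
    buf[n] = '\0';
    return n;
}
```

The same helper would work for the text member of XmHTMLVerifyCallbackStruct and for the repair, current and offender members of the parser callback structure, since all of them are XmHTMLTextBlock values.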


XmHTMLParserCallbackStruct

The XmNparserCallback resource references the following structure:
typedef struct{
	int		reason;		/* the reason the callback was called */
	XEvent		*event;		/* always NULL */
	int		errno;		/* total error count so far */
	int		line_no;	/* current line number in input text */
	int		start_pos;	/* start of text in error */
	int		end_pos;	/* end of text in error */
	parserError	error;		/* type of error */
	unsigned char	action;		/* suggested correction action */
	XmHTMLTextBlock	repair;		/* proposed element to insert */
	XmHTMLTextBlock	current;	/* current element */
	XmHTMLTextBlock	offender;	/* offending element */
}XmHTMLParserCallbackStruct, *XmHTMLParserPtr;
The The

The

The

The

Each error type is listed below with its description, its default action, and its current/repair values.

HTML_BAD
An element is completely out of order and the internal autocorrection routines cannot find a proper place for this element. Default action: HTML_REMOVE. current/repair: Yes/No.

HTML_CLOSE_BLOCK
A closing block-level element is encountered while it was never opened. Default action: HTML_REMOVE. current/repair: No/Yes.

HTML_INTERNAL
An internal error was encountered. Default action: HTML_TERMINATE. current/repair: No/No.

HTML_NOTIFY
Notification of insertion of an optional opening/closing element. Default action: HTML_INSERT. current/repair: No/Yes.

HTML_OPEN_BLOCK
A new block-level element is encountered while a previous block element is still open. Default action: HTML_INSERT. current/repair: Yes/Yes.

HTML_OPEN_ELEMENT
An unbalanced terminator is encountered. Default action: HTML_SWITCH. current/repair: Yes/Yes.

HTML_VIOLATION
An HTML 3.2 violation was encountered. Default action: HTML_INSERT, HTML_KEEP or HTML_REMOVE. current/repair: Yes/Dynamic.

HTML_UNKNOWN_ELEMENT
An unknown element was encountered. Default action: HTML_REMOVE. current/repair: No/No.

The HTML_VIOLATION error is a special case. When XmHTMLParser can find a suitable element that will cause the offending element to be no longer in violation of the HTML 3.2 standard, it will propose to insert this new element. When it can't find one, the default action depends on the value of the XmNstrictHTMLChecking resource. When this resource is set to

Action          Description                                        Applies to
HTML_ALIAS      Replace                                            HTML_UNKNOWN_ELEMENT
HTML_IGNORE     Ignore this error, proceed as if nothing happened  HTML_BAD, HTML_INTERNAL
HTML_INSERT     Insert                                             HTML_CLOSE_BLOCK, HTML_NOTIFY, HTML_OPEN_BLOCK, HTML_VIOLATION
HTML_KEEP       Keep                                               HTML_CLOSE_BLOCK, HTML_OPEN_BLOCK, HTML_VIOLATION
HTML_REMOVE     Remove                                             All errors
HTML_SWITCH     Switch                                             HTML_OPEN_ELEMENT
HTML_TERMINATE  Terminate parser                                   All errors
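The pairing of actions and error types listed above can be written down as a lookup, which an XmNparserCallback handler might use before overriding the suggested action. This is a sketch under assumptions: the enum declarations are stand-ins whose numeric values are invented here (the real ones come from the XmHTML headers), and action_allowed and pick_action are hypothetical helpers.

```c
#include <assert.h>

/* Stand-in declarations: the names come from the lists above, the numeric
 * values are assumptions made so the sketch compiles on its own. */
typedef enum {
    HTML_UNKNOWN_ELEMENT, HTML_BAD, HTML_OPEN_BLOCK, HTML_CLOSE_BLOCK,
    HTML_OPEN_ELEMENT, HTML_VIOLATION, HTML_NOTIFY, HTML_INTERNAL
} parserError;

enum {
    HTML_REMOVE, HTML_INSERT, HTML_SWITCH, HTML_KEEP,
    HTML_IGNORE, HTML_ALIAS, HTML_TERMINATE
};

/* Nonzero when the action is listed for the given error type. */
static int action_allowed(parserError error, unsigned char action)
{
    switch (action) {
    case HTML_REMOVE:
    case HTML_TERMINATE:
        return 1;  /* listed for all errors */
    case HTML_ALIAS:
        return error == HTML_UNKNOWN_ELEMENT;
    case HTML_IGNORE:
        return error == HTML_BAD || error == HTML_INTERNAL;
    case HTML_INSERT:
        return error == HTML_CLOSE_BLOCK || error == HTML_NOTIFY ||
               error == HTML_OPEN_BLOCK || error == HTML_VIOLATION;
    case HTML_KEEP:
        return error == HTML_CLOSE_BLOCK || error == HTML_OPEN_BLOCK ||
               error == HTML_VIOLATION;
    case HTML_SWITCH:
        return error == HTML_OPEN_ELEMENT;
    default:
        return 0;  /* unknown action value */
    }
}

/* Return the wanted action when it is listed for this error, otherwise fall
 * back to the action the parser proposed. */
static unsigned char pick_action(parserError error, unsigned char wanted,
                                 unsigned char proposed)
{
    assert(action_allowed(error, proposed));
    return action_allowed(error, wanted) ? wanted : proposed;
}
```

In a real callback the result of pick_action would be stored in the action member of the callback structure; whether XmHTMLParser itself rejects unlisted combinations is not stated here, so the check is purely defensive.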


Inherited Resources

XmHTMLParser inherits the following resources. The resources are listed alphabetically, along with the superclass that defines them.

XmNdestroyCallback Object


Translations

XmHTMLParser does not define any translations.


Action Routines

XmHTMLParser does not define any actions.




©Copyright 1996-1997 by Ripley Software Development
Last update: September 19, 1997 by Koen