Class HTMLPurifier_Lexer_DirectLex

Description

Our in-house implementation of a parser.

A pure PHP parser, DirectLex has absolutely no dependencies, making it a reasonably good default for PHP4. Written with efficiency in mind, it can be four times faster than HTMLPurifier_Lexer_PEARSax3, although it pales in comparison to HTMLPurifier_Lexer_DOMLex.

  • todo: Reread XML spec and document differences.

Located in /lib/core/Parsers/htmlpurifier/HTMLPurifier.standalone.php (line 13764)

HTMLPurifier_Lexer
   |
   --HTMLPurifier_Lexer_DirectLex
Variable Summary
Method Summary
array parseAttributeString (string $string, $config, $context)
void scriptCallback ($matches)
void substrCount ($haystack, $needle, $offset, $length)
void tokenizeHTML ($html, $config, $context)
Variables
mixed $tracksLineNumbers = true (line 13767)
  • access: public

Redefinition of:
HTMLPurifier_Lexer::$tracksLineNumbers
Whether or not this lexer implements line-number/column-number tracking.
mixed $_whitespace = "\x20\x09\x0D\x0A" (line 13772)

Whitespace characters for str(c)spn.

  • access: protected
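
The role of this character set can be illustrated with PHP's built-in strspn() and strcspn(). The following is a minimal sketch, not taken from the lexer itself; the variable names and the sample input are assumptions made for illustration.

```php
<?php
// Same character set as the documented $_whitespace default:
// space, tab, carriage return, line feed.
$whitespace = "\x20\x09\x0D\x0A";

// Hypothetical input: the inside of a tag after the element name.
$input = "  \t href=\"a.html\"\n";

// strspn(): length of the leading run that consists only of whitespace.
$skip = strspn($input, $whitespace);

// strcspn(): length of the run, starting at $skip, that contains no whitespace.
$wordLength = strcspn($input, $whitespace, $skip);

// The first whitespace-delimited chunk of the input.
$word = substr($input, $skip, $wordLength);
echo $word, "\n";
```

Scanning with a fixed character mask in this way avoids a regular expression for simple skip-whitespace and read-until-whitespace steps.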

Inherited Variables

Inherited from HTMLPurifier_Lexer

HTMLPurifier_Lexer::$_special_entity2str
Methods
parseAttributeString (line 14093)

Takes the inside of an HTML tag and makes an assoc array of attributes.

  • return: Associative array of attributes.
  • access: public
array parseAttributeString (string $string, $config, $context)
  • string $string: Inside of tag excluding name.
  • $config
  • $context
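
To show the shape of the input and the returned array, here is a simplified stand-in. It is a sketch only: sketchParseAttributes() is a hypothetical helper, it uses a regular expression rather than DirectLex's own algorithm, and mapping a bare attribute name to itself is an assumption of the sketch.

```php
<?php
/**
 * Hypothetical helper, not part of HTML Purifier.
 * Turns the inside of a tag (excluding the element name) into an
 * associative array of attribute name => value.
 */
function sketchParseAttributes($string)
{
    $attributes = array();
    // Matches name="value", name='value', name=value, or a bare name.
    $pattern = '/([^\s=\/]+)(?:\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s"\']+)))?/';
    preg_match_all($pattern, (string) $string, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $name = $m[1];
        if (isset($m[4]) && $m[4] !== '') {
            $value = $m[4];   // unquoted value
        } elseif (isset($m[3]) && $m[3] !== '') {
            $value = $m[3];   // single-quoted value
        } elseif (isset($m[2])) {
            $value = $m[2];   // double-quoted value, or an empty quoted value
        } else {
            $value = $name;   // bare attribute (assumption of this sketch)
        }
        $attributes[$name] = $value;
    }
    return $attributes;
}

$result = sketchParseAttributes('href="a.html" title=\'x\' width=10 disabled');
print_r($result);
```

The real method also receives $config and $context, which this sketch leaves out.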
scriptCallback (line 13778)

Callback function for the script CDATA fudge.

  • access: protected
void scriptCallback ($matches)
  • $matches: in form of array(opening tag, contents, closing tag)
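
A callback of this kind is typically passed to preg_replace_callback() so that the contents of a script element are escaped while the tags are kept. The sketch below is illustrative only: the function name, the pattern, and the assumption that the three parts arrive at indexes 1 to 3 of $matches (index 0 holding the full match) are not taken from this page.

```php
<?php
/**
 * Hypothetical callback, written for illustration only.
 * $matches[1] = opening tag, $matches[2] = contents, $matches[3] = closing tag.
 */
function sketchScriptCallback($matches)
{
    // Escape the script body so later tokenizing treats it as plain text.
    return $matches[1]
        . htmlspecialchars($matches[2], ENT_COMPAT, 'UTF-8')
        . $matches[3];
}

$html = '<script>if (a < b) { run(); }</script>';
$escaped = preg_replace_callback(
    '#(<script[^>]*>)(.*?)(</script>)#si',
    'sketchScriptCallback',
    $html
);
echo $escaped, "\n";
```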
substrCount (line 14074)

PHP 5.0.x-compatible substr_count() that implements the offset and length parameters.

  • access: protected
void substrCount ($haystack, $needle, $offset, $length)
  • $haystack
  • $needle
  • $offset
  • $length
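
One way to get offset and length behaviour without relying on the extra arguments of the built-in substr_count() is to cut out the window first and count inside it. This is a minimal sketch under that assumption; sketchSubstrCount() is a hypothetical name and the actual method body may differ.

```php
<?php
/**
 * Hypothetical helper: counts occurrences of $needle inside the
 * $length-byte window of $haystack that starts at $offset.
 */
function sketchSubstrCount($haystack, $needle, $offset, $length)
{
    // Cut out the window first, then count inside it.
    $window = substr($haystack, $offset, $length);
    return substr_count($window, $needle);
}

$text = "line one\nline two\nline three";

// Newlines within the first 10 bytes only.
$inWindow = sketchSubstrCount($text, "\n", 0, 10);

// Newlines in the whole string.
$inAll = sketchSubstrCount($text, "\n", 0, strlen($text));

echo $inWindow, ' ', $inAll, "\n";
```

Counting newlines in a bounded segment like this is the kind of operation a lexer that tracks line numbers would need.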
tokenizeHTML (line 13782)
  • access: public
void tokenizeHTML ($html, $config, $context)
  • $html
  • $config
  • $context

Redefinition of:
HTMLPurifier_Lexer::tokenizeHTML()
Lexes an HTML string into tokens.
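
A possible way to call the lexer directly is sketched below. It assumes that the standalone distribution can be loaded from the include path as HTMLPurifier.standalone.php and that the method returns the list of tokens, as "Lexes an HTML string into tokens" suggests, even though the signature above shows void. The sample HTML is made up.

```php
<?php
// Assumption: the standalone distribution is on the include path.
require_once 'HTMLPurifier.standalone.php';

$config  = HTMLPurifier_Config::createDefault();
$context = new HTMLPurifier_Context();

$lexer  = new HTMLPurifier_Lexer_DirectLex();
$tokens = $lexer->tokenizeHTML(
    '<p class="intro">Hello <b>world</b></p>',
    $config,
    $context
);

// Print the class of each token object produced by the lexer.
foreach ($tokens as $token) {
    echo get_class($token), "\n";
}
```

In normal use the lexer is not called by hand; HTML Purifier picks one through HTMLPurifier_Lexer::create(), listed under Inherited Methods below.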

Inherited Methods

Inherited From HTMLPurifier_Lexer

HTMLPurifier_Lexer::__construct()
HTMLPurifier_Lexer::CDATACallback()
HTMLPurifier_Lexer::create()
HTMLPurifier_Lexer::escapeCDATA()
HTMLPurifier_Lexer::escapeCommentedCDATA()
HTMLPurifier_Lexer::extractBody()
HTMLPurifier_Lexer::normalize()
HTMLPurifier_Lexer::parseData()
HTMLPurifier_Lexer::removeIEConditional()
HTMLPurifier_Lexer::tokenizeHTML()

Documentation generated on Sun, 06 Mar 2011 00:24:10 -0500 by phpDocumentor 1.4.3