
PHP-Based HTML Validator

I need to find a PHP-based HTML validator (W3C-like) that can look for invalid HTML or XHTML. I've searched Google a little, but was curious if anyone has used one they parti

Solution 1:

There's no need to reinvent the wheel on this one. There's already a PEAR library that interfaces with the W3C HTML Validator API. They're willing to do the work for you, so why not let them? :)

Solution 2:

While it isn't strictly PHP (it is an executable), one I really like is W3C's HTML Tidy. It shows what is wrong with the HTML and fixes it if you want it to. It also beautifies the HTML so it doesn't look like a mess. It runs from the command line and is easy to integrate into PHP.

Check it out: http://www.w3.org/People/Raggett/tidy/
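
To illustrate the PHP integration, here is a minimal sketch that uses PHP's bundled tidy extension rather than the command-line executable. The helper name `tidy_report` and the sample markup are made up for illustration, and the function returns null when the extension is not loaded, which can happen on some hosts.

```php
<?php
// Sketch: collect Tidy's diagnostics and repaired markup for an HTML string.
// Assumes the PHP "tidy" extension; tidy_report() is a hypothetical helper name.
function tidy_report(string $html): ?array
{
    if (!extension_loaded('tidy')) {
        return null; // extension not available on this host
    }

    $tidy = new tidy();
    // 'indent' and 'wrap' are Tidy options that shape the beautified output.
    $tidy->parseString($html, ['indent' => true, 'wrap' => 0], 'utf8');

    // Diagnostics gathered while parsing (may be empty for clean input).
    $messages = trim((string) $tidy->errorBuffer);

    // Apply Tidy's fixes, then read back the repaired document.
    $tidy->cleanRepair();

    return [
        'messages' => $messages === '' ? [] : explode("\n", $messages),
        'fixed'    => tidy_get_output($tidy),
    ];
}

$report = tidy_report('<p>Unclosed <b>bold<p>Second paragraph');
if ($report === null) {
    echo "tidy extension not available\n";
} else {
    print_r($report['messages']);
    echo $report['fixed'], "\n";
}
```

Each entry in `messages` is one line of Tidy's report, so a caller can treat an empty list as "nothing to complain about" or log the lines for review.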

Solution 3:

If you can't use Tidy (some hosting services do not enable this PHP module), you can use this PHP class: http://www.barattalo.it/html-fixer/

Solution 4:

I had a case where I needed to check partial HTML code, mostly for unmatched tags, and the various heavy-duty validators were too much for that. So I ended up writing my own validation routine in PHP, pasted below. Two notes: you may need to use mb_substr instead of index-based character retrieval if you have text in different languages, and the routine does not parse CDATA, though it can be extended easily:

function check_html($html)
{
    $stack = array();
    $autoclosed = array('br', 'hr', 'input', 'embed', 'img', 'meta', 'link', 'param', 'source', 'track', 'area', 'base', 'col', 'wbr');
    $l = strlen($html); $i = 0;
    $incomment = false; $intag = false; $instring = false;
    $closetag = false; $tag = '';
    while($i<$l)
    {
        while($i<$l && preg_match('#\\s#', $c=$html[$i])) $i++;
        if ( $i >= $l ) break;
        if ( $incomment && ('-->' === substr($html, $i, 3)) )
        {
                // close comment
                $incomment = false;
                $i += 3;
                continue;
        }
        $c = $html[$i++];
        if ( '<' === $c )
        {
            if ( $incomment ) continue;
            if ( $intag ) return false;
            if ( '!--' === substr($html, $i, 3) )
            {
                // open comment
                $incomment = true;
                $i += 3;
                continue;
            }

            // open tag
            $intag = true;
            // bounds check avoids reading past the end when '<' is the last character
            if ( $i < $l && '/' === $html[$i] )
            {
                $i++;
                $closetag = true;
            }
            else
            {
                $closetag = false;
            }
            $tag = '';
            while($i<$l && preg_match('#[a-z0-9\\-]#i', $c=$html[$i]) )
            {
                $tag .= $c;
                $i++;
            }
            if ( !strlen($tag) ) return false;
            $tag = strtolower($tag);
            if ( $i<$l && !preg_match('#[\\s/>]#', $html[$i]) ) return false;
            if ( $i<$l && $closetag && preg_match('#^\\s*/>#sim', substr($html, $i)) ) return false;
            if ( $closetag )
            {
                if ( in_array($tag, $autoclosed) || (array_pop($stack) !== $tag) )
                    return false;
            }
            elseif ( !in_array($tag, $autoclosed) )
            {
                $stack[] = $tag;
            }
        }
        elseif ( '>' === $c )
        {
            if ( $incomment ) continue;
            
            // close tag
            if ( !$intag ) return false;
            $intag = false;
        }
    }
    return !$incomment && !$intag && empty($stack);
}
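
On the mb_substr remark above: the short sketch below shows how byte indexing and character indexing differ on a UTF-8 string. It assumes the mbstring extension is enabled, and the sample string is made up for illustration.

```php
<?php
// Byte indexing vs. character indexing on a UTF-8 string.
// "\u{00E9}" is "é", which occupies two bytes in UTF-8.
$html = "\u{00E9}<b>";

$bytes = strlen($html);             // counts bytes
$chars = mb_strlen($html, 'UTF-8'); // counts characters

$firstByte = $html[0];                        // only the first byte of "é"
$firstChar = mb_substr($html, 0, 1, 'UTF-8'); // the complete "é"

var_dump($bytes, $chars, $firstByte === "\u{00E9}", $firstChar === "\u{00E9}");
```

In UTF-8, the bytes of a multibyte character never coincide with ASCII characters such as `<`, `>`, or `/`, so a byte-wise scan should still locate tags correctly. The mb_ functions matter mainly when the input uses another multibyte encoding or when whole characters have to be extracted.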
