GM_safeHTMLParser

Description

This function safely parses a string of HTML and returns an XMLDocument. It cleans the provided HTML by removing elements such as <script>, <style>, <head>, <body>, <title> and <iframe>, and it also removes all JavaScript, including element attributes that contain JavaScript.

Arguments

`String` HTMLString

A string of HTML.

`String` BaseURL

Optional. If specified (and valid), this value is used to resolve partial URLs (e.g. /images/foo.png) found in the HTML. If omitted, elements with partial URLs might be excluded from the returned XMLDocument.
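
To illustrate what resolving a partial URL against a base means, here is a minimal sketch that uses the standard `URL` constructor. The helper name `resolvePartialURL` and the example.com addresses are hypothetical, and it is only an assumption that GM_safeHTMLParser resolves URLs in the same way.

```javascript
// Hypothetical helper (not part of the GM API): resolve a partial URL
// against a base URL using the standard URL constructor.
function resolvePartialURL(partialURL, baseURL) {
  try {
    return new URL(partialURL, baseURL).href;
  } catch (e) {
    // No base, or an invalid base: the partial URL cannot be resolved
    return null;
  }
}

// With a valid base, the partial URL becomes absolute
console.log(resolvePartialURL("/images/foo.png", "http://example.com/blog/post.html"));

// Without a base there is nothing to resolve against
console.log(resolvePartialURL("/images/foo.png"));
```

The second call returns `null`, which mirrors why content with partial URLs may be dropped when BaseURL is omitted.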

Returns

`XMLDocument` xmlDoc

An XML document (in the XHTML namespace) representing the parsed HTML.

Note: Certain uses of the returned XML document (e.g. as a context node for an XPath query) will require the use of a namespace resolver. An example of this situation has been provided below.
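
For scripts that call the standard DOM `evaluate` method directly, a namespace resolver can be written as a small function. The following is a sketch only: the names `xhtmlResolver` and `findFirst` are hypothetical, and the prefix `x` is an arbitrary choice that must match the prefix used in the XPath expression.

```javascript
var XHTML_NS = "http://www.w3.org/1999/xhtml";

// Hypothetical resolver: maps the prefix "x" to the XHTML namespace
function xhtmlResolver(prefix) {
  return prefix === "x" ? XHTML_NS : null;
}

// Hypothetical helper: returns the first node matching `path`, or null.
// 9 is the value of XPathResult.FIRST_ORDERED_NODE_TYPE.
function findFirst(doc, path) {
  return doc.evaluate(path, doc, xhtmlResolver, 9, null).singleNodeValue;
}

// Usage, assuming `doc` is a document returned by GM_safeHTMLParser:
// var htmlEle = findFirst(doc, "//x:html");
```

Without the resolver, a path such as `//html` is expected to match nothing, because the elements of the returned document live in the XHTML namespace.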

Example

// GET erikvold.com
GM_xmlhttpRequest({
  method: "GET",
  url: "http://erikvold.com/",
  onload: function(response) {
    // Parse the response to an XML document
    var doc = GM_safeHTMLParser(response.responseText);
    // Query the document for an element and, if it exists, display its content in an alert
    var hcard = doc.getElementById("hcard-Erik-Vergobbi-Vold");
    if (hcard) {
      alert(hcard.innerHTML);
    }
  }
});

// GET google.com
GM_xmlhttpRequest({
  method: "GET",
  url: "http://google.com/",
  onload: function(response) {
    // Parse the response to an XML document
    var doc = GM_safeHTMLParser(response.responseText);
    // Get the HTML element through the use of `GM_xpath` and the namespace resolver
    var htmlEle = GM_xpath({
      path: "//x:html",
      node: doc,
      resolver: "http://www.w3.org/1999/xhtml"
    });
    // Log the element's String representation
    if (htmlEle) {
      GM_log("The HTML element was found! " + htmlEle);
    } else {
      GM_log("The HTML element was not found...");
    }
  }
});

Related Pages

Manual: API


