GM_safeHTMLParser
This function will safely parse a string of HTML and return an XMLDocument. It cleans the provided HTML by removing tags such as <script>, <style>, <head>, <body>, <title> and <iframe>, and will also remove all JavaScript (including element attributes containing JavaScript).
First argument: A string of HTML.
Second argument: Optional. A base URI. If specified (and valid), this value is used to resolve partial URLs (e.g. /images/foo.png). If omitted, elements with partial URLs may be excluded from the returned XMLDocument.
Returns: An XML document (in the XHTML namespace) representing the parsed HTML.
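To make "resolving partial URLs" concrete, here is a minimal sketch that uses the standard URL constructor to resolve a partial URL against a base. It is only an illustration of the concept: how GM_safeHTMLParser resolves URLs internally is not documented here, and the helper name resolveAgainstBase is hypothetical.

```javascript
// Illustration only: resolve a partial URL against a base URI with the
// standard URL constructor. `resolveAgainstBase` is a hypothetical helper,
// not part of the GM_ API.
function resolveAgainstBase(partialUrl, baseUri) {
  return new URL(partialUrl, baseUri).href;
}

const resolved = resolveAgainstBase("/images/foo.png", "http://erikvold.com/");
console.log(resolved); // http://erikvold.com/images/foo.png
```

In a user script, assuming the base URI is the second argument as the order above suggests, the call would look like GM_safeHTMLParser(response.responseText, "http://erikvold.com/").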
Note: Certain uses of the returned XML document (e.g. as a context node for an XPath query) will require the use of a namespace resolver. An example of this situation has been provided below.
// GET erikvold.com
GM_xmlhttpRequest({
  method: "GET",
  url: "http://erikvold.com/",
  onload: function(response) {
    // Parse the response to an XML document
    var doc = GM_safeHTMLParser(response.responseText);
    // Query the document to get certain content, and then display it in an alert
    alert(doc.getElementById("hcard-Erik-Vergobbi-Vold").innerHTML);
  }
});

// GET google.com
GM_xmlhttpRequest({
  method: "GET",
  url: "http://google.com/",
  onload: function(response) {
    // Parse the response to an XML document
    var doc = GM_safeHTMLParser(response.responseText);
    // Get the HTML element through the use of `GM_xpath` and the namespace resolver
    var htmlEle = GM_xpath({
      path: "//x:html",
      node: doc,
      resolver: "http://www.w3.org/1999/xhtml"
    });
    // Log the element's String representation
    if (htmlEle) {
      GM_log("The HTML element was found! " + htmlEle);
    } else {
      GM_log("The HTML element was not found...");
    }
  }
});
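
For readers who prefer the standard DOM API over GM_xpath, the same namespace-aware query can be sketched with document.evaluate and a resolver function. This is a sketch under assumptions: it presumes the returned XML document exposes the standard evaluate method inside the script's sandbox, and the names XHTML_NS, xhtmlResolver and selectFirst are hypothetical helpers, not part of the GM_ API.

```javascript
// Hypothetical helpers; only document.evaluate and XPathResult are standard DOM.
var XHTML_NS = "http://www.w3.org/1999/xhtml";

// Map the "x" prefix used in the XPath expression to the XHTML namespace.
function xhtmlResolver(prefix) {
  return prefix === "x" ? XHTML_NS : null;
}

// Return the first node matching `path` in `doc`, or null if nothing matches.
function selectFirst(doc, path) {
  var result = doc.evaluate(path, doc, xhtmlResolver,
                            XPathResult.FIRST_ORDERED_NODE_TYPE, null);
  return result.singleNodeValue;
}
```

Inside an onload handler, selectFirst(doc, "//x:html") would then play the role of the GM_xpath call above. Without a resolver, the x: prefix cannot be mapped to the XHTML namespace and the query cannot match the namespaced elements.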