How saferInnerHTML() works
On Friday, I wrote about saferInnerHTML(), a new helper function I wrote to more safely inject markup templates into the DOM (it’s available on GitHub).
Today, I wanted to break down how the script actually works.
How it works
Before digging into the code, I think it’s helpful to understand how saferInnerHTML() actually works.
You pass in an element to inject content into, and your HTML as a string.
var app = document.querySelector('#app');
saferInnerHTML(app, '<h1>Hello, world!</h1>');
If you want to append your markup to what’s already there instead of completely replacing it, pass in true as a third argument.
saferInnerHTML(app, '<h1>Hello, world!</h1>', true);
Behind the scenes, saferInnerHTML() does the following:
- Converts your string to actual HTML without rendering it.
- Crawls the HTML and creates an array-based map of every element, its content, and its properties and attributes (element type, classes, data attributes, etc.).
- Loops through each element in the map and creates a new element, adding the attributes and properties. This is where the sanitization happens.
- Appends the new elements into the DOM.
Let’s look at each of these steps.
1. Convert a string into HTML
The easiest way to do this is by creating a temporary element and injecting the string as innerHTML.
var temp = document.createElement('div');
temp.innerHTML = template;
This still exposes you to XSS attacks.
Even if you never inject the temporary element into the DOM, onerror handlers (on an image with a broken src, for example) will still run and can trigger cross-site scripting attacks. I had to find another way.
After some Googling, I discovered the DOMParser API.
Its parseFromString() method creates an HTML document from whatever string you provide. We can pass our template into it and return the document body with real HTML elements.
/**
 * Convert a template string into HTML DOM nodes
 * @param  {String} str The template string
 * @return {Node}       The template HTML
 */
var stringToHTML = function (str) {
    var parser = new DOMParser();
    var doc = parser.parseFromString(str, 'text/html');
    return doc.body;
};
DOMParser works back to IE9, but the text/html format that we need is only supported in IE10 and up. Any polyfill I’ve found for it uses innerHTML, which we don’t want, so this is a hard limit on browser support.
2. Map our HTML
To map our HTML into an array, I use element.childNodes to get each child node for our element (this is all the stuff inside our template).
I pass it into Array.from() to create an array, then use the Array.prototype.forEach() method to loop through each item.
In the loop, I set up keys for the element’s content, attributes, and node type. I recursively pass the element back into createDOMMap() to get any child elements of the element.
/**
 * Create a DOM Tree Map for an element
 * @param  {Node}  element The element to map
 * @return {Array}         A DOM tree map
 */
var createDOMMap = function (element) {
    var map = [];
    Array.from(element.childNodes).forEach(function (node) {
        map.push({
            content: node.childNodes && node.childNodes.length > 0 ? null : node.textContent,
            atts: node.nodeType === 3 ? [] : getAttributes(node.attributes),
            type: node.nodeType === 3 ? 'text' : node.tagName.toLowerCase(),
            children: createDOMMap(node)
        });
    });
    return map;
};
I’m using a helper function, getAttributes(), to get the element’s attributes.
It takes the element.attributes property as its argument, converts it to an array, and uses the Array.prototype.map() method to strip out everything but the name and value of each attribute.
/**
 * Create an array of the attributes on an element
 * @param  {NamedNodeMap} attributes The attributes on an element
 * @return {Array}                   The attributes on an element as an array of key/value pairs
 */
var getAttributes = function (attributes) {
    return Array.from(attributes).map(function (attribute) {
        return {
            att: attribute.name,
            value: attribute.value
        };
    });
};
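To make the shape of the map concrete, here's a rough sketch that runs the two helpers above against plain objects standing in for DOM nodes. The text() and el() factories are hypothetical and exist only so the example can run outside a browser. In the real script, the nodes come from stringToHTML().

```javascript
// Helpers copied from above so the sketch is self-contained
var getAttributes = function (attributes) {
    return Array.from(attributes).map(function (attribute) {
        return { att: attribute.name, value: attribute.value };
    });
};

var createDOMMap = function (element) {
    var map = [];
    Array.from(element.childNodes).forEach(function (node) {
        map.push({
            content: node.childNodes && node.childNodes.length > 0 ? null : node.textContent,
            atts: node.nodeType === 3 ? [] : getAttributes(node.attributes),
            type: node.nodeType === 3 ? 'text' : node.tagName.toLowerCase(),
            children: createDOMMap(node)
        });
    });
    return map;
};

// Hypothetical stand-ins for DOM nodes (assumption: not real DOM objects)
var text = function (str) {
    return { nodeType: 3, textContent: str, childNodes: [] };
};
var el = function (tag, attributes, childNodes) {
    return {
        nodeType: 1,
        tagName: tag.toUpperCase(),
        attributes: attributes,
        childNodes: childNodes,
        textContent: ''
    };
};

// Stand-in for the parsed body of: <h1 class="title">Hello, <em>world</em>!</h1>
var body = el('body', [], [
    el('h1', [{ name: 'class', value: 'title' }], [
        text('Hello, '),
        el('em', [], [text('world')]),
        text('!')
    ])
]);

var map = createDOMMap(body);
console.log(map[0].type, map[0].children.length); // h1 3
```

Elements with children get a content of null and carry their text in child entries of type text, which is why makeElem() later only falls back to textContent for elements with no children.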
3 & 4. Create new elements and inject them into the DOM
When the script runs, it passes the template into the stringToHTML() helper, then passes the result into createDOMMap().
The resulting map is passed into a renderToDOM() helper function.
renderToDOM(createDOMMap(stringToHTML(template)));
If the content isn’t getting appended, the renderToDOM() helper wipes any existing content out using innerHTML = ''.
Then it loops through the map of elements, passing each element into a makeElem() helper function, and appends the resulting element into the target element.
/**
 * Render the template items to the DOM
 * @param {Array} map A map of the items to inject into the DOM
 */
var renderToDOM = function (map) {
    if (!append) { app.innerHTML = ''; }
    map.forEach(function (node, index) {
        app.appendChild(makeElem(node));
    });
};
Making elements
The makeElem() helper function creates the actual content.
If the element type is text, it uses createTextNode(). Otherwise, it uses createElement().
It adds any attributes with an addAttributes() helper. Then, if the element has children, it recursively passes them into makeElem() and appends the results to the new node. Otherwise, it adds any text content using the textContent property.
Finally, it returns the newly created node.
/**
 * Make an HTML element
 * @param  {Object} elem The element details
 * @return {Node}        The HTML element
 */
var makeElem = function (elem) {
    // Create the element
    var node = elem.type === 'text' ? document.createTextNode(elem.content) : document.createElement(elem.type);
    // Add attributes
    addAttributes(node, elem.atts);
    // If the element has child nodes, create them
    // Otherwise, add textContent
    if (elem.children.length > 0) {
        elem.children.forEach(function (childElem) {
            node.appendChild(makeElem(childElem));
        });
    } else if (elem.type !== 'text') {
        node.textContent = elem.content;
    }
    return node;
};
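The recursion in makeElem() can be easier to follow when traced on a tiny example. The sketch below is a simplified variant, not the function from the script: it skips attributes and takes the document as a parameter. Both makeElemWith() and fakeDocument are hypothetical names, introduced only so the example runs without a browser.

```javascript
// Hypothetical stand-in for `document` with only the two methods makeElem() needs
var fakeDocument = {
    createTextNode: function (str) {
        return { nodeType: 3, textContent: str };
    },
    createElement: function (tag) {
        return {
            nodeType: 1,
            tagName: tag.toUpperCase(),
            childNodes: [],
            textContent: '',
            appendChild: function (child) {
                this.childNodes.push(child);
                return child;
            }
        };
    }
};

// Same branching as makeElem(), minus attributes, with the document passed in
var makeElemWith = function (doc, elem) {
    var node = elem.type === 'text' ? doc.createTextNode(elem.content) : doc.createElement(elem.type);
    if (elem.children.length > 0) {
        // Elements with children: build each child and append it
        elem.children.forEach(function (childElem) {
            node.appendChild(makeElemWith(doc, childElem));
        });
    } else if (elem.type !== 'text') {
        // Childless elements: fall back to their text content
        node.textContent = elem.content;
    }
    return node;
};

// Map entry for: <p>Hi <strong>there</strong></p>
var entry = {
    content: null, atts: [], type: 'p',
    children: [
        { content: 'Hi ', atts: [], type: 'text', children: [] },
        { content: null, atts: [], type: 'strong', children: [
            { content: 'there', atts: [], type: 'text', children: [] }
        ] }
    ]
};

var p = makeElemWith(fakeDocument, entry);
console.log(p.tagName, p.childNodes.length); // P 2
```

Text is always created with createTextNode() or assigned through textContent, so markup inside a text value ends up as literal characters rather than being parsed.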
Adding attributes
The addAttributes() helper is where some of the most important sanitization happens. It strips out things like onerror properties to prevent XSS attacks.
It loops through each attribute. If the attribute is a class, it uses className to set it. If it’s a data attribute, it uses setAttribute().
Otherwise, it sets it as a property on the element itself (elem[propertyName] = value).
/**
 * Add attributes to an element
 * @param {Node}  elem The element
 * @param {Array} atts The attributes to add
 */
var addAttributes = function (elem, atts) {
    atts.forEach(function (attribute) {
        // If the attribute is a class, use className
        // Else if it starts with `data-`, use setAttribute()
        // Otherwise, set it as a property of the element
        if (attribute.att === 'class') {
            elem.className = attribute.value;
        } else if (attribute.att.slice(0, 5) === 'data-') {
            elem.setAttribute(attribute.att, attribute.value || '');
        } else {
            elem[attribute.att] = attribute.value || '';
        }
    });
};
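Here's a small sketch of how the three branches route different attributes. The fakeElem object is a hypothetical stand-in that just records setAttribute() calls. It doesn't reproduce how a real browser treats event handler properties, so it only illustrates the routing, not the sanitization itself.

```javascript
// Copied from above so the sketch is self-contained
var addAttributes = function (elem, atts) {
    atts.forEach(function (attribute) {
        if (attribute.att === 'class') {
            elem.className = attribute.value;
        } else if (attribute.att.slice(0, 5) === 'data-') {
            elem.setAttribute(attribute.att, attribute.value || '');
        } else {
            elem[attribute.att] = attribute.value || '';
        }
    });
};

// Hypothetical stand-in element (assumption: not a real DOM node)
var fakeElem = {
    recorded: {},
    setAttribute: function (name, value) {
        this.recorded[name] = value;
    }
};

addAttributes(fakeElem, [
    { att: 'class', value: 'btn btn-primary' }, // goes to className
    { att: 'data-id', value: '42' },            // goes through setAttribute()
    { att: 'id', value: 'save' },               // set as a plain property
    { att: 'title', value: 'Save changes' }     // set as a plain property
]);

console.log(fakeElem.className);           // btn btn-primary
console.log(fakeElem.recorded['data-id']); // 42
console.log(fakeElem.id, fakeElem.title);  // save Save changes
```

One thing this makes visible: anything that isn't a class or a data attribute is assigned by name as a property, so an attribute whose property has a different name (for versus htmlFor, for example) wouldn't carry over as a real attribute.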
Checking for browser support
To check browser support, first I make sure Array.from and window.DOMParser exist. If not, I return false.
Checking browser support for this is a touch less straightforward than normal, though.
While DOMParser() works in IE9 and up, the features we need start with IE10. So, we can’t just check if window.DOMParser exists. We need to actually try to use it with text/html and see if it throws an error.
To do that, I use try...catch. If there’s an error, I return false. Otherwise, I return true.
var supports = function () {
    if (!Array.from || !window.DOMParser) return false;
    var parser = new DOMParser();
    try {
        parser.parseFromString('x', 'text/html');
    } catch(err) {
        return false;
    }
    return true;
};
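To see the check behave under different conditions without hunting down old browsers, here's a rough sketch of a variant that takes the global object as a parameter. The supportsIn() name and the stand-in objects are hypothetical. It also treats a null result as unsupported, on the assumption that some older engines reportedly returned null for text/html instead of throwing. That extra check isn't in the original function.

```javascript
// Hypothetical variant of supports(): the global object is passed in as `root`
var supportsIn = function (root) {
    if (!Array.from || !root.DOMParser) return false;
    try {
        var parser = new root.DOMParser();
        var doc = parser.parseFromString('x', 'text/html');
        // Assumption: a null result also means text/html is unsupported
        return !!doc;
    } catch (err) {
        return false;
    }
};

// Hypothetical stand-ins for different environments
var modern = {
    DOMParser: function () {
        this.parseFromString = function () { return { body: {} }; };
    }
};
var throwing = {
    DOMParser: function () {
        this.parseFromString = function () { throw new Error('unsupported type'); };
    }
};
var returnsNull = {
    DOMParser: function () {
        this.parseFromString = function () { return null; };
    }
};
var missing = {};

console.log(supportsIn(modern));      // true
console.log(supportsIn(throwing));    // false
console.log(supportsIn(returnsNull)); // false
console.log(supportsIn(missing));     // false
```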
Before I run the script, I make sure supports() returns true, and if not, throw an error.
// Check for browser support
if (!supports()) throw new Error('saferInnerHTML: Your browser is not supported.');
I also make sure an element to inject into was provided, and if not, throw an error for that.
// Don't run if there's no element to inject into
if (!app) throw new Error('saferInnerHTML: Please provide a valid element to inject content into');