Sanitize HTML Server Connect Module - 1.0.2

This is the first of a few extensions I plan on publishing over the next few months as I work on a rather large project. This extension is rather simple: it integrates the sanitize-html package into Wappler.

sanitize-html is tolerant. It is well suited for cleaning up HTML fragments such as those created by CKEditor and other rich text editors. It is especially handy for removing unwanted CSS when copying and pasting from Word.

sanitize-html allows you to specify the tags you want to permit, and the permitted attributes for each of those tags. If an attribute is a known non-boolean attribute and its value is empty, it will be removed. For example, checked can be empty, but href cannot.

If a tag is not permitted, the contents of the tag are not discarded. There are some exceptions to this, discussed below in the “Discarding the entire contents of a disallowed tag” section.

The syntax of poorly closed p and img elements is cleaned up.

href attributes are validated to ensure they only contain http, https, ftp and mailto URLs. Relative URLs are also allowed. Ditto for src attributes.
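To make the scheme rule a bit more concrete, here is a minimal sketch of that kind of check. It is not the code sanitize-html actually runs; the function name isAllowedUrl and the DEFAULT_SCHEMES constant are made up for illustration.

```typescript
// Illustrative only: a simplified check in the spirit of the rule above,
// not the actual implementation inside sanitize-html.
const DEFAULT_SCHEMES: string[] = ["http", "https", "ftp", "mailto"];

function isAllowedUrl(value: string, allowedSchemes: string[] = DEFAULT_SCHEMES): boolean {
  // Browsers ignore whitespace and control characters inside a scheme,
  // so strip them before looking for one.
  const cleaned = value.replace(/[\u0000-\u0020]+/g, "");
  const match = /^([a-zA-Z][a-zA-Z0-9.+-]*):/.exec(cleaned);
  if (match === null) {
    // No scheme found, so this is a relative (or protocol-relative) URL.
    return true;
  }
  return allowedSchemes.indexOf(match[1].toLowerCase()) !== -1;
}
```

With this sketch, "https://example.com" and "/about" pass, while "javascript:alert(1)" is rejected because its scheme is not in the list.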

You can also restrict which URLs are allowed as the src of an iframe tag by filtering hostnames.
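As a rough illustration of what hostname filtering for an iframe src can look like, here is a sketch. The IframeRules shape and the isAllowedIframeSrc name are hypothetical, and the logic is simplified; the real package may differ in details.

```typescript
// Hypothetical rule set, loosely modelled on the three iframe options
// the extension exposes. Not the package's own types.
interface IframeRules {
  hostnames: string[];    // exact matches, e.g. "www.youtube.com"
  domains: string[];      // a domain plus any subdomain, e.g. "vimeo.com"
  allowRelative: boolean; // whether relative src values are accepted
}

function isAllowedIframeSrc(src: string, rules: IframeRules): boolean {
  let parsed: URL;
  try {
    // Resolve against a placeholder base so relative values parse too.
    parsed = new URL(src, "relative://relative-site");
  } catch {
    return false;
  }
  const isRelative =
    parsed.protocol === "relative:" && parsed.hostname === "relative-site";
  if (isRelative) {
    return rules.allowRelative;
  }
  const host = parsed.hostname.toLowerCase();
  if (rules.hostnames.indexOf(host) !== -1) {
    return true;
  }
  // A domain rule matches the domain itself and anything ending in ".domain".
  return rules.domains.some(
    (d) => host === d || host.slice(-(d.length + 1)) === "." + d
  );
}
```

For example, with hostnames set to ["www.youtube.com"] and domains set to ["vimeo.com"], a YouTube embed URL and "https://player.vimeo.com/video/1" are accepted, while "https://notvimeo.com/" is not.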

HTML comments are not preserved. Additionally, sanitize-html escapes ALL text content - this means that ampersands, greater-than, and less-than signs are converted to their equivalent HTML character references (& → &amp;, < → &lt;, and so on). Additionally, in attribute values, quotation marks are escaped as well (" → &quot;).
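The escaping described here boils down to a few character replacements. The sketch below shows the idea; escapeText and escapeAttribute are illustrative names, not functions exported by sanitize-html.

```typescript
// Illustrative helpers, not part of the sanitize-html API.
// The ampersand must be replaced first so the references produced by
// the later replacements are not escaped a second time.
function escapeText(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Attribute values get the same treatment, plus double quotes.
function escapeAttribute(value: string): string {
  return escapeText(value).replace(/"/g, "&quot;");
}
```

Quotes are left alone in text content and only escaped inside attribute values, which matches the distinction made above.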


Why would I need this?

Here’s one example: When you’re building a project, you might want to use a WYSIWYG editor like Quill, Summernote, or CKEditor to handle text boxes with formatting. These editors let users create formatted text as HTML, which could potentially contain malicious code. Most of these editors only handle protection on the front end, and require you to implement it on the backend yourself.

While displaying content directly from the database using InnerHTML in Wappler should not directly execute any harmful code, it’s considered good practice to sanitize user input before storing it in the database in the first place, just in case.

It’s also useful for making sure users only use allowed tags. For example, if you disable headings in the Quill toolbar, you might also want to verify on the backend that the submitted HTML does not contain headings. Or you might simply want to check that an iframe embed only allows certain domains, or just clean up broken HTML.


Small Example:

If the HTML text set on the Sanitize HTML action is “<p>text</p> <h3>text 2</h3> <h4>text 3</h4>”, but you only have the allowed tags set as “<p><h4>”, the returned text would be “<p>text</p> text 2 <h4>text 3</h4>”, as “<h3>” is not an allowed tag.


Config:

This server connect extension currently has the following options in the Wappler UI:

disallowedTagsMode, allowedTags, nonBooleanAttributes, allowedIframeHostnames, allowedIframeDomains, allowIframeRelativeUrls
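For reference, these names match options of the sanitize-html package, so a configuration would look roughly like the sketch below. The values are made up for illustration, and exactly how the extension passes them along is an assumption on my part.

```typescript
// Hypothetical values for the six options listed above.
const options = {
  // "discard" drops a disallowed tag but keeps its text content.
  disallowedTagsMode: "discard",
  // Same allow-list as in the small example: only <p> and <h4> survive.
  allowedTags: ["p", "h4"],
  // Attributes that are removed when their value is empty.
  nonBooleanAttributes: ["href", "src"],
  // Exact hostnames accepted as an iframe src.
  allowedIframeHostnames: ["www.youtube.com"],
  // Domains accepted as an iframe src, including their subdomains.
  allowedIframeDomains: ["vimeo.com"],
  // Whether a relative iframe src is accepted.
  allowIframeRelativeUrls: false,
};

// When using the npm package directly, an object like this is the second
// argument of the call: sanitizeHtml(dirtyHtml, options)
```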

The current version (1.0.0) has been put together relatively quickly. As I use the extension more on my own projects, I might add more options or change some things around, but from brief testing, it all works fine. You can also contribute on GitHub.


Install:

You can install this extension automatically by following the steps here:

NPM:


Changelog:

  • Fixed data bindings not working (20/03/2024)

This module is an excellent addition to Wappler. Most projects have some form of text submission so being able to easily sanitise it is useful.


Thank you for not mentioning me.
Shows a real attitude toward the Wappler community.

Not sure I get you? No part of this uses anything you had provided. It was created going off the sanitize-html docs.

My earlier post was mainly to see if anyone had already created a module for it, since I hadn’t read the sanitize-html docs at that point. I assumed the npm package needed a lot more than a simple input/output setup, but after reading the docs (shortly before I saw your post, as I hadn’t checked back until then) it turned out to be pretty straightforward. I wasn’t asking about how to create Wappler extensions in general, since I’ve already done that on other projects. I do appreciate you providing a version on that post anyway, in case I hadn’t been able to create one myself.


Nice work, @Digo!


Thank you for this @Digo - great contribution.


Thanks a lot, will def use this.

Quick question : You mention CK5… Have you managed to do a “good” integration of it? I’d love to use this editor hehe.


I had fairly basic functionality working, but I ended up deciding to go with Quill V2 instead, so not really I’m afraid.

No problem, thanks a lot :slight_smile:


@Digo Nice one, very simple!

Just a question here.
Is the output expected to be on cleanedHtml?

Because that won't work.



Whoops, I must have messed something up with the hjson in the NPM version, since I remember changing something about that prior to publishing.

The action itself is what returns the output, not "cleanedHtml". I'll remove this since it's redundant (and a mistake).
