Cleaning Pasted Content in RadEditor

Sitefinity ASP.NET CMS is a web-based content management platform, and Telerik's RadEditor plays a key role in managing the web site's content.  Using RadEditor, content editors can create HTML without knowing HTML.  However, as you'll see below, simple abstractions can have unforeseen consequences.

In past blog posts I explained how to Disable HTML Editing in RadEditor and Add Custom Styles to RadEditor.  These techniques help ensure that content editors can do no wrong.

In this post I'll add to this theme by demonstrating how to automatically remove dangerous & ugly HTML from content pasted into RadEditor.

The Problem with Copy & Paste

RadEditor is designed to generate valid & clean HTML.  However, a lot of web site content isn't composed in RadEditor.  Often content is created in Microsoft Word, Adobe InDesign or various other programs.  This content is then copied to the clipboard and pasted into RadEditor.

Rich-text content pasted from other software comes with baggage.  This baggage then becomes part of your web site.  Here is an example Word document:

[Image: Microsoft Word Example - Pasting into Sitefinity]

This Word document can be pasted into Sitefinity's RadEditor.

[Image: Microsoft Word content pasted into Sitefinity CMS]

At first glance, copying & pasting this document into RadEditor looks okay.  However, here is the HTML that accompanied this rich-text content:

<h1 style="margin: 24pt 0in 0pt;">
    <v:shapetype id="_x0000_t75" stroked="f" filled="f" path="m@4@5l@4@11@9@11@9@5xe"
        o:preferrelative="t" o:spt="75" coordsize="21600,21600"><v:stroke joinstyle="miter"></v:stroke><v:formulas><v:f eqn="if lineDrawn pixelLineWidth 0"></v:f><v:f eqn="sum @0 1 0"></v:f><v:f eqn="sum 0 0 @1"></v:f><v:f eqn="prod @2 1 2"></v:f><v:f eqn="prod @3 21600 pixelWidth"></v:f><v:f eqn="prod @3 21600 pixelHeight"></v:f><v:f eqn="sum @0 0 1"></v:f><v:f eqn="prod @6 1 2"></v:f><v:f eqn="prod @7 21600 pixelWidth"></v:f><v:f eqn="sum @8 21600 0"></v:f><v:f eqn="prod @7 21600 pixelHeight"></v:f><v:f eqn="sum @10 21600 0"></v:f></v:formulas><v:path o:connecttype="rect" gradientshapeok="t" o:extrusionok="f"></v:path><o:lock aspectratio="t" v:ext="edit"></o:lock></v:shapetype>
    <v:shape id="Picture_x0020_0" style="z-index: 1; position: absolute; margin-top: 42.75pt;
        width: 75pt; height: 75pt; visibility: visible; margin-left: 369pt; mso-wrap-style: square;
        mso-wrap-distance-left: 9pt; mso-wrap-distance-top: 0; mso-wrap-distance-right: 9pt;
        mso-wrap-distance-bottom: 0; mso-position-horizontal: absolute; mso-position-horizontal-relative: margin;
        mso-position-vertical: absolute; mso-position-vertical-relative: margin;" alt="sitefinity-100-100.jpg"
        type="#_x0000_t75" o:spid="_x0000_s1026"><v:imagedata o:title="sitefinity-100-100" src="file:///C:\Users\Gabe\AppData\Local\Temp\msohtmlclip1\01\clip_image001.jpg"></v:imagedata><w:wrap type="square" anchory="margin" anchorx="margin"></w:wrap></v:shape>
    <span style="font-family: cambria; color: #365f91; font-size: 24px;">Title of the Page</span></h1>
<p class="MsoNormal" style="margin: 0in 0in 10pt;">
    <span style="font-family: calibri;"><span style="line-height: 115%; font-size: 14pt;">
        Sitefinity CMS</span> is a <i style="mso-bidi-font-style: normal;">flexible</i>
        ASP.NET <span style="color: #e36c0a; mso-themecolor: accent6; mso-themeshade: 191;">
            content management platform</span> for the </span><span style="font-family: 'arial narrow','sans-serif';">
                construction</span><span style="font-family: calibri;"> and management of <span style="background: #dbe5f1;
                    mso-shading-themecolor: accent1; mso-shading-themetint: 51;">commercial</span> websites,
                    <span style="background: yellow;">community</span> portals, </span>
    <span style="font-family: 'courier new';">intranets</span><span style="font-family: calibri;">,
        and personal blogs. </span>
</p>

There are many problems with this HTML:

  • Way too much HTML is being used.
  • Non-standard XML elements are being used.
  • Extra HTML will cause web pages to load slower.
  • Extra HTML increases the risk of browser incompatibilities.
  • Local image file is referenced and will not display on the public web site.
  • Embedded styles may not match web site theme.
  • Nonstandard web fonts are being used.
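
For contrast, here is a hand-written sketch of the markup this content actually needs.  It was written by hand for illustration and is not RadEditor output.  It assumes that fonts and colors come from the site's stylesheet and that the image is uploaded to the site separately:

```html
<!-- Hand-written equivalent of the pasted Word content (illustration only) -->
<!-- The image is omitted; it would be uploaded and referenced by a site URL -->
<h1>Title of the Page</h1>
<p>
    Sitefinity CMS is a <i>flexible</i> ASP.NET content management platform
    for the construction and management of commercial websites,
    community portals, intranets, and personal blogs.
</p>
```

Everything in the pasted version beyond this is baggage carried over from Word.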

Thankfully RadEditor has a feature that can be used to automatically filter this bad HTML.

Automatically Strip Extra Formatting on Paste

RadEditor can be configured to automatically filter bloated/messy HTML by adding a StripFormattingOnPaste property to the <RadEditor> web control tag.

<telerik:RadEditor ID="RadEditor1" StripFormattingOnPaste="All" runat="server" />

The following options can be used for the StripFormattingOnPaste property:

  • None: pastes the clipboard content as is.
  • NoneSupressCleanMessage: Doesn't strip anything on paste and does not ask questions.
  • MSWord: strips Word-specific tags on Paste, preserving fonts and text sizes.
  • MSWordNoFonts: strips Word-specific tags on Paste, preserving text sizes only.
  • MSWordRemoveAll: strips Word-specific tags on Paste, removing both fonts and text sizes.
  • Css: strips CSS styles on Paste.
  • Font: strips Font tags on Paste.
  • Span: strips Span tags on Paste.
  • All: strips all HTML formatting and pastes plain text.
  • AllExceptNewLines: Clears all tags except "br" and new lines (\n) on paste.

Additional information can be found on Telerik's Cleaning Word Formatting page.
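
The option names read like flags, so it may be possible to combine them.  The fragment below is a minimal sketch under that assumption: nothing in this post confirms that StripFormattingOnPaste accepts a comma-separated list, so check the behavior against the Telerik documentation for your RadEditor version.  The ID is a placeholder.

```html
<%-- Assumption: StripFormattingOnPaste accepts comma-separated values, --%>
<%-- as flag-style enum properties usually do in ASP.NET markup.        --%>
<telerik:RadEditor ID="RadEditor1" runat="server"
    StripFormattingOnPaste="MSWordNoFonts, Css" />
```

If the combination is honored, this would strip Word-specific tags and fonts as well as CSS styles, while leaving the rest of the markup in place.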

Modifying RadEditor's StripFormattingOnPaste property is very easy.  The bigger challenge, in Sitefinity, is accessing the RadEditor used by the Generic Content Control Designer.  Since Sitefinity 3.6, the Control Templates have been embedded directly in the assembly.  These Control Templates must be extracted and mapped before changes can be made.

Below I will briefly describe these steps.

Creating/Mapping an External Control Template

1.  Download the Embedded Templates zip file from the Sitefinity Downloads section.

2.  Unzip this file.

3.  Locate the following file in the unzipped files:
/Sitefinity/Admin/ControlTemplates/Generic_Content/GenericContentDesigner.sft

4.  Copy this file to the following location in your Sitefinity web site:
~/Custom/Admin/ControlTemplates/Generic_Content

5.  Rename this file from *.sft to *.ascx:
~/Custom/Admin/ControlTemplates/Generic_Content/GenericContentDesigner.ascx

6.  Create the following XML file in your Sitefinity web site:
~/App_Data/Configuration/Telerik.Sitefinity.Configuration.ControlsConfig.xml

7.  Add the following viewSettings mapping to this XML file:

<?xml version="1.0" encoding="utf-8" ?>
<controlsConfig>
    <viewMap>
        <viewSettings hostType="Telerik.Cms.Engine.WebControls.Design.GenericContentDesigner" layoutTemplatePath="~/Custom/Admin/ControlTemplates/Generic_Content/GenericContentDesigner.ascx" />
    </viewMap>
</controlsConfig>

View mappings are initialized during application startup.  The web application may need to be restarted before changes take effect.

Modifying RadEditor's StripFormattingOnPaste Property in Sitefinity

Sitefinity will now use an external template for the Generic Content Control Designer instead of the embedded template.  Customizations can be made to this external template.

Open the following file in Visual Studio (or your IDE of choice):

~/Custom/Admin/ControlTemplates/Generic_Content/GenericContentDesigner.ascx

Find the <telerik:RadEditor> tag and add a StripFormattingOnPaste property:

<telerik:RadEditor 
    runat="server" 
    ID="textEditor"
    StripFormattingOnPaste="MSWordRemoveAll"
    ContentAreaCssFile="~/Sitefinity/Admin/Themes/Default/AjaxControlsSkins/Sitefinity/EditorContentArea.css"
    ToolsFile="~/Sitefinity/Admin/ControlTemplates/EditorToolsFile.xml"    
    Skin="WebBlue"    
    NewLineBr="False"
    Height="360px" 
    Width="98%"> 
    <ImageManager ViewPaths="~/Images" UploadPaths="~/Images" DeletePaths="~/Images" />
    <MediaManager ViewPaths="~/Files" UploadPaths="~/Files" DeletePaths="~/Files" />
    <FlashManager ViewPaths="~/Files" UploadPaths="~/Files" DeletePaths="~/Files" />
    <DocumentManager ViewPaths="~/Files" UploadPaths="~/Files" DeletePaths="~/Files" />
    <CssFiles>
        <telerik:EditorCssFile Value="~/Sitefinity/Admin/Themes/Default/AjaxControlsSkins/Sitefinity/EditorCssFile.css" />
    </CssFiles>
</telerik:RadEditor>

Done!  With the StripFormattingOnPaste property in place, RadEditor will automatically apply the strip-formatting rules to any pasted content.

Some Parting Words...

It's really up to you which setting you use for the StripFormattingOnPaste property.  In the past, I've been very strict regarding the CSS styles applied to my web site content.  In these cases, I used the All setting.  This setting will remove all HTML; web site editors will be forced to reapply their formatting inside RadEditor.

Forcing editors to reapply their formatting inside Sitefinity's RadEditor is the surest method of controlling the HTML.  However, this technique might not be popular with content editors.  Experiment with the settings above and find the setting that works best for you.


Comments  4

  • Brook 09 Apr

    Thanks Gabe this topic just came up for me this AM.
  • Cormac 26 May

    Gabe, you're a superstar. This has been causing us all sort of issues.
  • Tom 19 Nov

    This doesn't seem to work well for 3.6:

    http://www.sitefinity.com/support/forums/sitefinity-3-x/developing-with-sitefinity/generic-content-radeditor-stripping-word-formatting.aspx#1008552
