Work With HTML and the Clipboard
Put HTML on the clipboard that other apps can use.
by Karl E. Peterson

February 2003 Issue  Download the original Classic VB code from this column.

Technology Toolbox: VB6, VB5

Q: Work With HTML and the Clipboard
My application needs to place HTML on the clipboard, but I can't figure out how to do this so that other applications understand that's what it is. I've seen references to the HTML Clipboard Format (CF_HTML), but I can't find the definition for that constant. How should I proceed?

A:
Using the CF_HTML clipboard format is a bit confusing, in part because it isn't a native clipboard format. It's a registered format, so it has no constant at all: its value can differ from system to system. You obtain the value of a registered clipboard format with a single API call, RegisterClipboardFormat. The first time this function is called with a given string, it returns a unique number in the range &HC000 to &HFFFF. Each subsequent call with that string, from any process running on the system, returns the same value. The magic string to use for this format is "HTML Format":

Private Declare Function _
   RegisterClipboardFormat _
   Lib "user32" _
   Alias "RegisterClipboardFormatA" _
   (ByVal lpString As String) As Long

' Inside a procedure: look up the ID
' the system assigned to the format.
Dim CF_HTML As Long
Const RegHtml As String = "HTML Format"
CF_HTML = _
   RegisterClipboardFormat(RegHtml)

 

Before you can place your HTML data onto the clipboard, you must construct a descriptive header and prepend it to the data. This header tells other applications which version of the format is in use, the offsets within the data where the HTML starts and stops, and the offsets where the actual selection begins and ends. To see why the selection matters, consider a user who selects a portion of an HTML document, or even part of an element (such as a few rows in a table). Other portions of the page (such as inline style definitions) might be required to render the selection fully, so you usually must supply more than the raw selection to put HTML on the clipboard in its full context. A sample header might look like this:

Version:1.0
StartHTML:000000258
EndHTML:000001491
StartFragment:000001172
EndFragment:000001411

 

Applications use the StartFragment and EndFragment offsets to determine which data to paste, and they might or might not use the remaining HTML to help format the selected portion. You must also inject HTML comments into the data to mark the selected area. Do this before you build the final header, because inserting the comments shifts the offsets. The opening and closing comment tags for the selected data are "<!--StartFragment-->" and "<!--EndFragment-->", respectively (see Listing 1).
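The offset arithmetic described above can be sketched as follows. This is a minimal illustration, not the column's Listing 1: BuildCfHtml and MakeHeader are hypothetical names, the nine-digit padding is just one workable choice, and the sketch assumes plain ASCII content so that a character count and a byte count coincide.

```vb
Option Explicit

' Hypothetical helper: formats the five header lines with
' nine-digit, zero-padded offsets so the header length is fixed.
Private Function MakeHeader(ByVal StartHtml As Long, _
   ByVal EndHtml As Long, ByVal StartFrag As Long, _
   ByVal EndFrag As Long) As String
   Const Pad As String = "000000000"
   MakeHeader = "Version:1.0" & vbCrLf & _
      "StartHTML:" & Format$(StartHtml, Pad) & vbCrLf & _
      "EndHTML:" & Format$(EndHtml, Pad) & vbCrLf & _
      "StartFragment:" & Format$(StartFrag, Pad) & vbCrLf & _
      "EndFragment:" & Format$(EndFrag, Pad) & vbCrLf
End Function

' Hypothetical helper: Prefix and Suffix are the HTML before and
' after the selection; Fragment is the selection itself.
' Assumes ASCII text (one byte per character).
Public Function BuildCfHtml(ByVal Prefix As String, _
   ByVal Fragment As String, ByVal Suffix As String) As String
   Const StartTag As String = "<!--StartFragment-->"
   Const EndTag As String = "<!--EndFragment-->"
   Dim StartHtml As Long, EndHtml As Long
   Dim StartFrag As Long, EndFrag As Long

   ' Offsets are zero-based, so the HTML starts at an offset
   ' equal to the header's length. Measure it with dummy values.
   StartHtml = Len(MakeHeader(0, 0, 0, 0))
   ' The fragment begins just past the opening comment...
   StartFrag = StartHtml + Len(Prefix) + Len(StartTag)
   ' ...and ends where the closing comment begins.
   EndFrag = StartFrag + Len(Fragment)
   EndHtml = EndFrag + Len(EndTag) + Len(Suffix)

   BuildCfHtml = _
      MakeHeader(StartHtml, EndHtml, StartFrag, EndFrag) & _
      Prefix & StartTag & Fragment & EndTag & Suffix
End Function
```

Padding every offset to the same width is what lets the header be measured before the real values are known; without it, writing the offsets could change the header's length and invalidate them.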

I don't have enough room here to detail all of this header's aspects, so I'll hit a few highlights and refer you to the sample code and further reading (see Additional Resources). You must keep several critical points in mind. The offsets listed in the header are zero-based, whereas VB string functions such as Mid$ and InStr are one-based, so you must adjust your string-manipulation routines accordingly. Also, if you're reading as well as writing these headers, you must assume that the number of digits is variable (for example, Internet Explorer [IE] uses 9, and Word uses 10).
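Both points can be illustrated with a short reading routine. This is a sketch under stated assumptions: HeaderValue and FragmentOf are hypothetical names, the data is assumed to be ASCII (one byte per character), and the key is assumed to appear in the header before any identical text in the HTML body.

```vb
Option Explicit

' Hypothetical helper: returns the number that follows "Key:"
' in a CF_HTML header, or -1 if the key isn't found. It reads
' digits until the first non-digit, so 9-digit and 10-digit
' fields are handled alike.
Public Function HeaderValue(ByVal Data As String, _
   ByVal Key As String) As Long
   Dim Pos As Long
   Dim Ch As String
   Dim Digits As String

   HeaderValue = -1
   Pos = InStr(1, Data, Key & ":", vbBinaryCompare)
   If Pos = 0 Then Exit Function

   Pos = Pos + Len(Key) + 1
   Do While Pos <= Len(Data)
      Ch = Mid$(Data, Pos, 1)
      If Not (Ch Like "#") Then Exit Do
      Digits = Digits & Ch
      Pos = Pos + 1
   Loop
   If Len(Digits) > 0 Then HeaderValue = CLng(Digits)
End Function

' Hypothetical helper: extracts the selected fragment.
Public Function FragmentOf(ByVal Data As String) As String
   Dim First As Long, Last As Long
   First = HeaderValue(Data, "StartFragment")
   Last = HeaderValue(Data, "EndFragment")
   If First >= 0 And Last >= First Then
      ' Header offsets are zero-based; Mid$ is one-based.
      FragmentOf = Mid$(Data, First + 1, Last - First)
   End If
End Function
```

The "+ 1" in the Mid$ call is the whole of the zero-based adjustment; the length needs no correction because it is a difference of two offsets.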

Finally, if you place only CF_HTML on the clipboard, applications such as Word and FrontPage don't know what to do with it. You must also place a plain-text rendition of the HTML on the clipboard for these apps to behave as expected. Scads of tools perform HTML-to-text conversions, and the extremely macho might prefer to roll their own parsers, but no Windows programmer should ever have to hand-parse HTML again. You can call upon the OS instead for this everyday task:

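Public Function Html2Text(ByVal Data _
   As String) As String
   ' Late-bound HTML document object;
   ' IE's parser does the work.
   Dim obj As Object
   ' On any failure, fall through and
   ' return an empty string.
   On Error Resume Next
   Set obj = _
      CreateObject("htmlfile")
   obj.Open
   obj.Write Data
   obj.Close
   Html2Text = obj.Body.InnerText
End Function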

Leveraging IE isn't necessarily the quickest method for parsing HTML, but the expediency it offers is a good tradeoff in this case. —K.E.P.
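To show how the two renditions might reach the clipboard together, here is a hedged sketch that uses the Win32 clipboard calls directly. It is not the column's downloadable code. PutHtmlOnClipboard and StringToGlobal are hypothetical names, hWndOwner is assumed to be a valid window handle (such as a form's hWnd), CfHtmlData is assumed to already carry its header, and the text is assumed to be ASCII so that the ANSI conversion is harmless.

```vb
Option Explicit

Private Declare Function RegisterClipboardFormat Lib "user32" _
   Alias "RegisterClipboardFormatA" (ByVal lpString As String) As Long
Private Declare Function OpenClipboard Lib "user32" _
   (ByVal hWnd As Long) As Long
Private Declare Function EmptyClipboard Lib "user32" () As Long
Private Declare Function CloseClipboard Lib "user32" () As Long
Private Declare Function SetClipboardData Lib "user32" _
   (ByVal wFormat As Long, ByVal hMem As Long) As Long
Private Declare Function GlobalAlloc Lib "kernel32" _
   (ByVal wFlags As Long, ByVal dwBytes As Long) As Long
Private Declare Function GlobalLock Lib "kernel32" _
   (ByVal hMem As Long) As Long
Private Declare Function GlobalUnlock Lib "kernel32" _
   (ByVal hMem As Long) As Long
Private Declare Function GlobalFree Lib "kernel32" _
   (ByVal hMem As Long) As Long
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
   (Dest As Any, Source As Any, ByVal Length As Long)

Private Const GMEM_MOVEABLE As Long = &H2
Private Const CF_TEXT As Long = 1

' Hypothetical helper: copies Text, as null-terminated ANSI
' bytes, into a moveable global block. Returns 0 on failure.
Private Function StringToGlobal(ByVal Text As String) As Long
   Dim Bytes() As Byte
   Dim hMem As Long, pMem As Long
   Bytes = StrConv(Text & vbNullChar, vbFromUnicode)
   hMem = GlobalAlloc(GMEM_MOVEABLE, UBound(Bytes) + 1)
   If hMem = 0 Then Exit Function
   pMem = GlobalLock(hMem)
   If pMem = 0 Then
      GlobalFree hMem
      Exit Function
   End If
   CopyMemory ByVal pMem, Bytes(0), UBound(Bytes) + 1
   GlobalUnlock hMem
   StringToGlobal = hMem
End Function

' Hypothetical helper: places plain text and CF_HTML data on
' the clipboard in one open/close cycle. Returns True if the
' HTML rendition was accepted.
Public Function PutHtmlOnClipboard(ByVal hWndOwner As Long, _
   ByVal CfHtmlData As String, ByVal PlainText As String) As Boolean
   Dim hText As Long, hHtml As Long
   If OpenClipboard(hWndOwner) = 0 Then Exit Function
   EmptyClipboard

   hText = StringToGlobal(PlainText)
   If hText <> 0 Then
      ' On success the system owns the block; free it otherwise.
      If SetClipboardData(CF_TEXT, hText) = 0 Then GlobalFree hText
   End If

   hHtml = StringToGlobal(CfHtmlData)
   If hHtml <> 0 Then
      If SetClipboardData( _
         RegisterClipboardFormat("HTML Format"), hHtml) <> 0 Then
         PutHtmlOnClipboard = True
      Else
         GlobalFree hHtml
      End If
   End If
   CloseClipboard
End Function
```

From a form, a call might look like PutHtmlOnClipboard(Me.hWnd, Payload, Html2Text(SourceHtml)), where Payload and SourceHtml are placeholder variables holding the header-prefixed data and the original HTML.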


Additional Resources
•  "HOWTO: Add HTML Code to the Clipboard by Using Visual Basic"
•  "HTML Clipboard Format"
 

About the Author
Karl E. Peterson is a GIS analyst with a regional transportation planning agency and serves as a member of the VSM Editorial Advisory Board. Online, he's a Microsoft MVP and a section leader on several DevX forums. Find more of Karl's VB samples at www.mvps.org/vb.