Work With HTML and the Clipboard
Put HTML on the clipboard that other apps can use.
by Karl E. Peterson
Technology Toolbox: VB6, VB5
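Q: Work With HTML and the Clipboard

A: Start by registering the "HTML Format" clipboard format and storing the ID that Windows returns:

    Private Declare Function RegisterClipboardFormat _
       Lib "user32" Alias "RegisterClipboardFormatA" _
       (ByVal lpString As String) As Long

    Dim CF_HTML As Long
    Const RegHtml As String = "HTML Format"

    ' Returns the same ID to every application that registers this name.
    CF_HTML = RegisterClipboardFormat(RegHtml)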
You must construct a descriptive header and prepend it to the data before you can place your HTML data onto the clipboard. This header provides other applications with the description's version information, with offsets within the data where the HTML starts and stops, and with information about where the actual selection begins and ends.

Conceptualize the selection by considering a user who might select a portion of an HTML document or even an element (such as a few rows in a table). Other portions of the page (such as inline style definitions) might be required to render the selection fully. You likely must supply more than the raw selection to put HTML on the clipboard in its full context. A sample header might look like this:

    Version:1.0
    StartHTML:000000258
    EndHTML:000001491
    StartFragment:000001172
    EndFragment:000001411
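Here is a minimal sketch of how such a header might be assembled in VB6. It is not the article's own listing: the names BuildCfHtml and MakeHeader are hypothetical, the nine-digit fields and Version:1.0 line simply mirror the sample header, and the code assumes plain ASCII text so that character counts equal byte counts (the format measures offsets in bytes of UTF-8 data). Because every numeric field has a fixed width, the header's length is known before its values are, which lets you compute the offsets in one pass. The fragment is wrapped in the <!--StartFragment--> and <!--EndFragment--> comments that mark the selection.

```vb
Private Const MARK_START As String = "<!--StartFragment-->"
Private Const MARK_END As String = "<!--EndFragment-->"

' Hypothetical helper: formats the five header lines with
' nine-digit, zero-padded offsets (an assumption that follows
' the sample header; other applications use other widths).
Private Function MakeHeader(ByVal StartHtml As Long, _
      ByVal EndHtml As Long, ByVal StartFrag As Long, _
      ByVal EndFrag As Long) As String
   Const Fmt As String = "000000000"
   MakeHeader = "Version:1.0" & vbCrLf & _
      "StartHTML:" & Format$(StartHtml, Fmt) & vbCrLf & _
      "EndHTML:" & Format$(EndHtml, Fmt) & vbCrLf & _
      "StartFragment:" & Format$(StartFrag, Fmt) & vbCrLf & _
      "EndFragment:" & Format$(EndFrag, Fmt) & vbCrLf
End Function

' Hypothetical helper: Prefix and Suffix are the HTML before and
' after the selection; Fragment is the selection itself.
' Assumes ASCII input, so Len() doubles as a byte count.
Public Function BuildCfHtml(ByVal Prefix As String, _
      ByVal Fragment As String, _
      ByVal Suffix As String) As String
   Dim Body As String
   Dim HeaderLen As Long
   Dim StartFrag As Long
   Dim EndFrag As Long

   ' Insert the marker comments before measuring anything.
   Body = Prefix & MARK_START & Fragment & MARK_END & Suffix

   ' A dummy header has the same length as the real one.
   HeaderLen = Len(MakeHeader(0, 0, 0, 0))

   ' Zero-based offsets, counted from the start of the header.
   StartFrag = HeaderLen + Len(Prefix) + Len(MARK_START)
   EndFrag = StartFrag + Len(Fragment)

   BuildCfHtml = MakeHeader(HeaderLen, HeaderLen + Len(Body), _
      StartFrag, EndFrag) & Body
End Function
```

You might call it as BuildCfHtml("<html><body>", "<b>Hi</b>", "</body></html>"). If your HTML contains non-ASCII characters, you would need to convert it to UTF-8 and measure the byte lengths instead of using Len.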
Applications use the StartFragment and EndFragment attributes to determine which data to paste, and they might or might not use the remaining HTML to help format the selected portion. You must also inject HTML comments into the data to mark the selected area. You must do this before you build the final header, because inserting the comments afterward would shift the offsets. The opening and closing comment tags for the selected data are "<!--StartFragment-->" and "<!--EndFragment-->", respectively (see Listing 1).

I don't have enough room here to detail all of this header's aspects, so I'll hit a few highlights and refer you to the sample code and further reading (see Additional Resources). You must keep several critical points in mind. The offsets listed in the header are zero-based, whereas VB string functions such as Mid$ and InStr are one-based, so you must adjust your string-manipulation routines accordingly. Also, if you're reading as well as writing these headers, you must assume that the number of digits is variable (for example, Internet Explorer [IE] uses 9, and Word uses 10).

Finally, if you place only CF_HTML on the clipboard, applications such as Word and FrontPage don't know what to do with it. You must also supply a plain-text rendition of the stylized HTML to the clipboard for these apps to behave as expected. Scads of tools perform HTML-to-text conversions, or the extremely macho might prefer to roll their own parsers. But no Windows programmer should ever have to hand-parse HTML again. You can call upon the OS instead for this everyday task:

    Public Function Html2Text(ByVal Data As String) As String
       Dim obj As Object
       ' On any failure, fall through and return an empty string.
       On Error Resume Next
       ' "htmlfile" creates IE's HTML document object.
       Set obj = CreateObject("htmlfile")
       obj.Open
       obj.Write Data
       obj.Close   ' Finish the write so the document is fully parsed.
       Html2Text = obj.Body.InnerText
    End Function

Leveraging IE isn't necessarily the quickest method for parsing HTML, but the expediency it offers is a good tradeoff in this case.

—K.E.P.
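For the reading side, here is a rough sketch of how you might pull an offset out of a header without assuming a fixed digit count. HeaderValue and ExtractFragment are hypothetical names, and the code again assumes ASCII data so that byte offsets and character positions coincide. It collects digits by hand rather than calling Val, which could misread the "E" that begins the next field name as an exponent.

```vb
' Hypothetical helper: returns the number that follows
' "Field:" in the header, or -1 if the field is missing.
' Works for any digit count (9 for IE, 10 for Word).
Public Function HeaderValue(ByVal Data As String, _
      ByVal Field As String) As Long
   Dim Pos As Long
   Dim Ch As String
   Dim Digits As String

   HeaderValue = -1
   Pos = InStr(1, Data, Field & ":", vbBinaryCompare)
   If Pos = 0 Then Exit Function

   ' Step past the field name and colon, then gather digits.
   Pos = Pos + Len(Field) + 1
   Do While Pos <= Len(Data)
      Ch = Mid$(Data, Pos, 1)
      If Not (Ch Like "#") Then Exit Do
      Digits = Digits & Ch
      Pos = Pos + 1
   Loop
   If Len(Digits) > 0 Then HeaderValue = CLng(Digits)
End Function

' Hypothetical helper: returns the selected fragment, or an
' empty string if the offsets are missing or inconsistent.
Public Function ExtractFragment(ByVal Data As String) As String
   Dim FragStart As Long
   Dim FragEnd As Long

   FragStart = HeaderValue(Data, "StartFragment")
   FragEnd = HeaderValue(Data, "EndFragment")
   If FragStart >= 0 And FragEnd > FragStart Then
      ' Header offsets are zero-based; Mid$ is one-based.
      ExtractFragment = Mid$(Data, FragStart + 1, _
         FragEnd - FragStart)
   End If
End Function
```

Searching from the start of the data finds the header's copy of each field name first, because the header always precedes the HTML.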