Proposed Changes to HTML Helper Mode

Darren Brierton
DZR Web Development
Last modified: Thu Feb 7 15:11:53 GMT 2002

Abstract

HTML Helper Mode is a nice, user-friendly Emacs major mode for HTML. It does not currently support many of the elements and attributes of HTML 4.01 / XHTML 1.0, and it does not support CSS Levels 1 and 2 at all. The changes proposed here would add that support.

It is proposed that the changes be made incrementally, in discrete steps, to ensure the least disruption to the existing code and so that the task can be broken down into tractable sub-tasks.

Introduction

HTML Helper Mode, originally written by Nelson Minar and now maintained by Gian Uberto "Saint" Lauri, is an Emacs major mode for HTML. However, HTML Helper Mode does not currently provide support for many of the elements and attributes of HTML 4.01 and XHTML 1.0.

Furthermore, whilst not strictly the job of an HTML/XHTML mode, it would be desirable if HTML Helper Mode provided some support for CSS. Given that one of the current features of HTML Helper Mode is that it provides mode-switching in order to support the inclusion of client-side and server-side scripts in HTML documents, it would be desirable to extend this to also include support for CSS.

Lastly, HTML Helper Mode makes it far too easy to create invalid documents. PSGML is a major mode for Emacs for SGML and XML, which is capable of parsing both internal and external DTDs and then providing context-sensitive menus for inclusion of only valid elements and attributes. However, PSGML mode is difficult to set up correctly, whereas HTML Helper Mode is comparatively much simpler to get up and running with, and frankly using PSGML mode just seems like overkill for writing HTML: there seems to be no real advantage to the overhead of a DTD parser when only six, not particularly long, DTDs are in question. Also, PSGML mode provides no facilities for including the kinds of non-SGML, non-XML content routinely included in HTML documents such as internal style sheets, style attributes, client and server side scripts and common scripting event attributes.

A Utopian Vision of the Perfect HTML Helper Mode

Let me begin as unrealistically as possible. If a genie from a magic lamp granted me a perfect HTML Helper Mode, what would it be like? There are two broad categories of "dream features":

Total support for HTML 4.01 and XHTML 1.0

HTML Helper Mode should fully support the six different "flavours" of HTML: HTML 4.01 strict, transitional and frameset, and XHTML 1.0 strict, transitional and frameset. (Support for older versions of HTML such as 3.2, 2 or 1 is redundant given support for HTML 4.01 transitional.)

  1. HTML Helper Mode's behaviour should be sensitive to which DTD we are using. To that end, HTML Helper Mode needs to know which document type any buffer is supposed to conform to.

    1. On creating a new HTML document, HTML Helper Mode should:

      1. Prompt for which document type we wish to use, offering six numbered choices for HTML 4.01 strict, HTML 4.01 transitional, HTML 4.01 frameset, XHTML 1.0 strict, XHTML 1.0 transitional, and XHTML 1.0 frameset. The user responds by typing a number, 1 to 6, and the following templates are inserted. (I would suggest that html-helper-new-buffer-template be used for this, and therefore that this ceases to be customisable by the user.)

      2. Once one of the above templates has been inserted, the user should be prompted for a title (that is, for the contents of the <title> element), and the contents of a new user-defined variable, html-helper-new-buffer-body-template, are then inserted within the <body> element.

    2. On opening an existing HTML document HTML Helper Mode should check that the document type declaration matches one of the six supported document types. If it does not, or if there is no document type declaration, HTML Helper Mode should prompt for one as when creating a new document, remove any existing document type declaration, and then insert the appropriate one:

      • For HTML 4.01 strict:

        <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
        
      • For HTML 4.01 transitional:

        <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">
        
      • For HTML 4.01 frameset:

        <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
        "http://www.w3.org/TR/html4/frameset.dtd">
        
      • For XHTML 1.0 strict:

        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
        
      • For XHTML 1.0 transitional:

        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
        
      • For XHTML 1.0 frameset:

        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
        
    3. On saving a document, HTML Helper Mode should check the encoding of the buffer, and

      1. Check for a
        <meta http-equiv="Content-Type" content="text/html; charset=foo">
        element in <head>, and if one exists check that "foo" matches the actual encoding, and if one does not exist insert one. (Clearly this function will also need to check that there is a <head> element, and if there is not insert one.)

      2. In XHTML documents it should also check for an encoding attribute in the XML declaration and either check that it is correct or insert one. (Again such a function would need to check for an XML declaration in the first place, and insert one if one is not found.)

    For example, simply creating a new XHTML 1.0 strict document with Latin-1 encoding and then saving it should result in:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <title>My Title</title>
        <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
      </head>
      <body>
        Content inserted by html-helper-new-buffer-body-template.
      </body>
    </html>
    
  2. Once HTML Helper Mode knows which document type a buffer is an instance of, it should only be possible to insert elements and attributes defined in the corresponding DTD.

    (Note that it should not be the purpose of HTML Helper Mode to actually attempt to validate HTML documents - there are specific tools for that purpose such as Dave Raggett's HTML Tidy that are better suited to that task and are easily called from within Emacs. HTML Helper Mode should simply make it difficult to write invalid documents.)

    1. Which elements can be inserted should be context-dependent. If you are within a <p> element you should not be able to insert, for example, a <table> element or an <ol> element.

      (Again, HTML Helper Mode should merely make it difficult to produce invalid documents. For example, if the current buffer were supposed to be (X)HTML strict, it needn't in any way attempt to prevent one from writing:

      <blockquote>
        This is a block quotation.
      </blockquote>
      

      even though it is invalid (<blockquote> can only contain other block-level elements in the strict document types). Again, there are other tools better suited to picking out this kind of mistake.)

    2. On inserting an element you should always be prompted for the values of required attributes.

    3. An insert attribute command should call up a list of possible attributes of the element the point is currently within à la PSGML mode. When an attribute is selected which has only a few possible values the user should be prompted for which value he wishes. (For example, in both transitional HTML and XHTML most block-level elements have an optional align attribute, whose possible values are one of "left", "right", "center", or "justify".)
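The document type handling described in item 1 above could be sketched roughly as follows. This is only an illustration under stated assumptions: every name beginning with my-html- is hypothetical and is not part of HTML Helper Mode, and the detection looks only at the public identifier of the first document type declaration in the buffer.

```elisp
;;; Hypothetical sketch: every `my-html-' name is invented for illustration.
(require 'cl-lib)

(defconst my-html-doctypes
  '((html4-strict "HTML" "-//W3C//DTD HTML 4.01//EN"
                  "http://www.w3.org/TR/html4/strict.dtd")
    (html4-transitional "HTML" "-//W3C//DTD HTML 4.01 Transitional//EN"
                        "http://www.w3.org/TR/html4/loose.dtd")
    (html4-frameset "HTML" "-//W3C//DTD HTML 4.01 Frameset//EN"
                    "http://www.w3.org/TR/html4/frameset.dtd")
    (xhtml1-strict "html" "-//W3C//DTD XHTML 1.0 Strict//EN"
                   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd")
    (xhtml1-transitional "html" "-//W3C//DTD XHTML 1.0 Transitional//EN"
                         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd")
    (xhtml1-frameset "html" "-//W3C//DTD XHTML 1.0 Frameset//EN"
                     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd"))
  "List of (FLAVOUR ROOT PUBLIC-ID SYSTEM-ID), in menu order 1 to 6.")

(defun my-html-flavour-for-choice (n)
  "Return the flavour symbol for menu choice N (1 to 6), or nil."
  (and (integerp n) (>= n 1) (car (nth (1- n) my-html-doctypes))))

(defun my-html-declaration (flavour)
  "Return the document type declaration string for FLAVOUR."
  (let ((entry (assq flavour my-html-doctypes)))
    (unless entry (error "Unknown flavour: %s" flavour))
    (format "<!DOCTYPE %s PUBLIC \"%s\"\n\"%s\">"
            (nth 1 entry) (nth 2 entry) (nth 3 entry))))

(defun my-html-buffer-flavour ()
  "Return the flavour declared in the current buffer.
Return nil if there is no declaration or it is not a supported one."
  (save-excursion
    (goto-char (point-min))
    (let ((case-fold-search t))
      ;; Capture the public identifier of the first declaration found.
      (when (re-search-forward
             "<!DOCTYPE[ \t\n]+[A-Za-z]+[ \t\n]+PUBLIC[ \t\n]+\"\\([^\"]+\\)\""
             nil t)
        (car (cl-find (match-string 1) my-html-doctypes
                      :key (lambda (entry) (nth 2 entry))
                      :test #'string=))))))

(defun my-html-insert-doctype (n)
  "Prompt for a document type number N and insert its declaration at point."
  (interactive "nDocument type (1-6): ")
  (let ((flavour (my-html-flavour-for-choice n)))
    (unless flavour (error "Choice must be between 1 and 6"))
    (insert (my-html-declaration flavour) "\n")))
```

A hook run on opening a file could call my-html-buffer-flavour and, when it returns nil, fall back to my-html-insert-doctype. Removing an existing unsupported declaration is left out of the sketch.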

Auxiliary support for scripting and style languages

HTML documents can also contain non-SGML/XML content which is supposed to be interpreted by either the server or the client. Being able to edit this material comfortably is also the job of a good HTML editing mode.

The ideal behaviour would be for the Emacs buffer to temporarily alter mode when the point is within a script or style element, attribute or processing instruction, and then change back once the point is no longer within one. There is an Emacs minor mode, Multiple Major Modes, specifically designed to provide this functionality. The contents of <style> and <script> elements, the values of style and onmouseover etc. attributes, and the insides of processing instructions such as <?php … ?> would be grayed out when the point was not within them. When the point was within one the rest of the document would become grayed out, and the buffer would drop into the appropriate mode, highlighting the code the point was within accordingly and replacing the HTML menu with, for example, a PHP, Javascript or CSS menu.
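Whatever mechanism performs the switch, it first has to decide what kind of content the point is in. A minimal sketch of that decision, covering only <style> and <script> elements, might look like the following. The function name is hypothetical, and style attributes, event attributes and processing instructions are deliberately ignored.

```elisp
;;; Hypothetical sketch: `my-html-submode-at-point' is not an existing function.
(defun my-html-submode-at-point ()
  "Return `css' or `script' if point follows an unclosed <style> or
<script> start tag, and nil otherwise."
  (save-excursion
    (let ((case-fold-search t))
      ;; Find the nearest style/script tag before point.  Group 1 is "/"
      ;; for an end tag and empty for a start tag.
      (when (re-search-backward
             "<\\(/?\\)\\(style\\|script\\)\\b[^>]*>" nil t)
        (when (string= (match-string 1) "")
          (if (string= (downcase (match-string 2)) "style")
              'css
            'script))))))
```

A command run from post-command-hook could compare the result with the previous one and change the highlighting and menu only when it differs.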

HTML Helper Mode should also offer the following functionality:

  1. On inserting a <script> or <style> element the user should be prompted for the type attribute. (This is already covered by the requirement that inserting an element always prompts for required attributes.)

  2. On inserting a style attribute or a common scripting event attribute such as onmouseover HTML Helper Mode should check for the existence of
    <meta http-equiv="Content-Style-Type" content="foo">
    or
    <meta http-equiv="Content-Script-Type" content="foo">
    in the <head> element.

    1. If none exists then the user should be prompted for the language and the <meta> element inserted with the appropriate value for the content attribute.

    2. If one does exist then the user should be immediately dropped into the appropriate language mode, thereby helping to avoid more than one type of client-side scripting language or style language being employed in element attributes (which is not permitted).
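The check for a default style or scripting language could be sketched as below. The function names are hypothetical, and the sketch assumes the attributes appear in the order shown above (http-equiv first, then content). It also searches the whole buffer rather than only the <head> element.

```elisp
;;; Hypothetical sketch: the `my-html-' names are invented for illustration.
(defun my-html-default-language (kind)
  "Return the default language declared for KIND, or nil if none is.
KIND is the string \"Style\" or \"Script\"."
  (save-excursion
    (goto-char (point-min))
    (let ((case-fold-search t))
      (when (re-search-forward
             (concat "<meta[ \t\n]+http-equiv=\"Content-" kind "-Type\""
                     "[ \t\n]+content=\"\\([^\"]*\\)\"")
             nil t)
        (match-string 1)))))

(defun my-html-language-meta (kind language)
  "Return a <meta> element declaring LANGUAGE as the default for KIND."
  (format "<meta http-equiv=\"Content-%s-Type\" content=\"%s\">"
          kind language))
```

When my-html-default-language returns nil, the user would be prompted for a language and the string from my-html-language-meta inserted into <head>. Otherwise the returned value selects the mode to drop into.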

Other features?

If support for scripts and styles is provided in this way, what about providing support for other kinds of data, such as images or Flash animations? (For instance, the height and width attributes could be automatically filled in for images, or the appropriate parameters added to <object> elements.) Well, this is my utopian vision, my genie, and my lamp, so no!

But seriously, if there were demand for such features these too could certainly be considered.

Breaking the Task down into Realistic Goals

Well, enough day-dreaming for now. What needs to be considered is how we could even begin to implement such features. It seems that the best possible route is to implement as many things as possible without disrupting the existing code of HTML Helper Mode. My recommendation is to break the task down into significantly smaller tasks whilst keeping an eye on long-term architectural changes that may eventually need to be made.

Supporting HTML 4.01 and XHTML 1.0

  1. The first step is to add all the elements and defined character entities of HTML 4.01 to HTML Helper Mode's HTML menu, grouped according to their logical category. (Note that this grouping will mean that some elements appear in more than one sub-menu. I believe that this is an advantage, and not a flaw.) The new menu structure, listing all the elements and entities according to their logical categories, is given here.

    • No attempt is made at this stage to distinguish between the strict, transitional and frameset vocabularies of HTML 4.01: all the elements are added.

    • On inserting an element you will only be prompted for required attributes. No support is available at this stage for inserting any other attributes. These must be typed in by hand.

    • The sheer number of defined character entities will mean that the sub-menus for Latin-1 Entities, Symbols and Greek Letters, and Other Special Characters will have to be broken up into, say, Latin-1 Entities (1), Latin-1 Entities (2), etc., in order to avoid the sub-menus becoming unmanageably long.

    • In order to ensure maximum compatibility with XHTML all elements and attributes will be lowercase, all attribute values quoted, and no optional closing tags will be omitted.

    • In readiness for XHTML support the templates for empty elements will be changed so that the end of the tag is no longer a string but a variable. In other words,

      (textel  "\C-c\C-m" nil  "Line Break"
               ("<br>\n"))
      

      will instead become,

      ;; Text that ends an empty element's tag; held in a variable so it can later differ for XHTML.
      (setq html-helper-end-empty-tag ">")
      …
      (textel  "\C-c\C-m" nil  "Line Break"
               ("<br" html-helper-end-empty-tag "\n"))
      

    These initial changes require the least disruption to the existing HTML Helper Mode code. Only mapcar 'html-helper-add-tag, mapcar 'html-helper-add-type-to-alist and html-helper-types-to-install need to be modified. Nevertheless, there is one "key" [uggh] area where there is potentially some considerable disruption: keymaps.

    • The current keyboard shortcuts in HTML Helper Mode are devised to be easily memorable given the existing menu structure. In changing the menu structure, and adding many more menu items, what will become of the current keyboard shortcuts? My current, and by no means final, solution is to leave all existing keyboard shortcuts as they are, and also not to introduce new ones for the new menu items. If someone else would like to come up with a proposal for keybindings I would be extremely pleased.

  2. Unfortunately, the next stage calls for an architectural change. HTML Helper Mode actually needs to define six new modes:

    • html-helper-html4-strict-mode,
    • html-helper-html4-transitional-mode,
    • html-helper-html4-frameset-mode,
    • html-helper-xhtml1-strict-mode,
    • html-helper-xhtml1-transitional-mode and
    • html-helper-xhtml1-frameset-mode.

    On top of that, HTML Helper Mode has to have hooks written for creating a new buffer, and opening and saving an existing file. The first change is required because all the subsequent changes depend upon it, and the second two are simply the most straightforward once the first has been achieved.

  3. HTML Helper Mode's menu must be changed from one which is statically generated at initialisation to one which is dynamically generated.

  4. Support can then finally be added for determining which element the point is within and offering on the menu only the elements that element can contain and the attributes it can have. This table shows what those elements and attributes are.

  5. tempo.el needs to be changed to provide a new kind of template: one which prompts for a number in order to insert an option from a finite list of values. (Required for attributes with a fixed number of possible values such as align.)
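To make step 5 concrete, here is a rough sketch of a numbered-choice element. It assumes that the installed tempo.el provides the tempo-user-elements list of handler functions, as the version distributed with current GNU Emacs does. If that assumption holds, a prototype could be tried without editing tempo.el itself. The choose element and all my-html- names are hypothetical.

```elisp
;;; Hypothetical sketch: `choose' and the `my-html-' names are invented.
(require 'tempo)

(defun my-html-nth-choice (n values)
  "Return element number N of VALUES, counting from 1, or nil if out of range."
  (and (integerp n) (>= n 1) (nth (1- n) values)))

(defun my-html-choice-prompt (name values)
  "Return a prompt for NAME that numbers each of VALUES from 1."
  (let ((i 0))
    (format "%s (%s): " name
            (mapconcat (lambda (value)
                         (setq i (1+ i))
                         (format "%d=%s" i value))
                       values ", "))))

(defun my-html-tempo-choose (element)
  "Handle a template element of the form (choose NAME VALUE...).
Return the chosen value, the empty string if the number typed is out
of range, or nil if ELEMENT is not a `choose' element."
  (when (eq (car-safe element) 'choose)
    (let ((name (cadr element))
          (values (cddr element)))
      (or (my-html-nth-choice
           (read-number (my-html-choice-prompt name values))
           values)
          ""))))

;; Let tempo try this handler for elements it does not recognise itself.
(add-to-list 'tempo-user-elements #'my-html-tempo-choose)
```

A template could then contain, for example, (choose "align" "left" "right" "center" "justify") wherever the attribute value belongs.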

Supporting CSS Levels 1 and 2

It is not my present proposal that CSS support be added directly to HTML Helper Mode. Rather my proposal is that we take the only existing CSS mode that I currently know of, css-mode.el, by Lars Marius Garshol and "HTML-Helper-Mode-ise" it. That is, Lars' CSS Mode only provides syntax highlighting. We can add to it a CSS menu and templates for CSS properties and values using tempo templates just as is done in HTML Helper Mode.
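As a rough illustration of what "HTML-Helper-Mode-ising" a CSS mode might involve, the sketch below defines two tempo templates for CSS properties and a menu that calls them. The keymap, menu and template names are hypothetical. A real implementation would attach the menu to the CSS mode's own keymap, which is not shown here.

```elisp
;;; Hypothetical sketch: all `my-css-' names are invented for illustration.
(require 'tempo)
(require 'easymenu)

(defvar my-css-helper-map (make-sparse-keymap)
  "Stand-in keymap; a real version would use the CSS mode's keymap.")

;; Each call defines a command named tempo-template-NAME.
(tempo-define-template
 "my-css-color"
 '("color: " (p "Colour value: ") ";" n)
 nil
 "Insert a CSS color declaration.")

(tempo-define-template
 "my-css-text-align"
 '("text-align: " (p "Alignment: ") ";" n)
 nil
 "Insert a CSS text-align declaration.")

(easy-menu-define my-css-helper-menu my-css-helper-map
  "Menu of CSS property templates."
  '("CSS"
    ["color" tempo-template-my-css-color t]
    ["text-align" tempo-template-my-css-text-align t]))
```

With tempo-interactive set to a non-nil value the templates prompt for the value in the minibuffer. Otherwise they leave a mark at the position where the value belongs.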

I hope to provide soon, as the next stage of this part of the proposal, a definitive listing of all CSS properties and values, broken down into a menu structure, as was done above for HTML. However, one issue needs to be resolved before that is done: should the CSS properties be broken down by level or not, and if so how?

Next comes attempting to integrate the two packages, CSS Mode and HTML Helper Mode. At present, my interest would be in investigating whether Multiple Major Modes can be used to achieve this integration, and if it can, whether we can use that as a prototype for adding support for scripting languages.

Adding Auxiliary Support for Scripting Languages

In one sense this is the most distant part of the current proposal, and yet in another it is the nearest. The former because, considered in isolation to the proposals in this document, it depends upon how support for CSS is implemented (the idea being that support for CSS is used as a test case for support for other languages). The latter because this is the issue that is being most actively worked on by HTML Helper Mode's maintainer, Gian Uberto Lauri. At present, there are no specific proposals here. As work progresses, both on issues specific to this proposal and by Gian, we will have a clearer idea.
