Semantically Annotated Web Forms

Specification Metadata
This version:
https://dashlane.github.io/SAWF/
Author:
Daniel Glazman (Dashlane)
Source:
https://github.com/Dashlane/SAWF
Issue Tracking:
https://github.com/Dashlane/SAWF/issues

Abstract

Dashlane is an application that aims to fix the UX of the Internet. On the Web, it must interact with Web pages to detect and recognize the forms they contain. This document contains a set of guidelines for Web authors to help Dashlane, and with it the whole ecosystem of browser Web Extensions, interact better with their Web sites.

Status of this document

Dashlane authored this document of "Best Practices" with the whole industry of Web Extensions that interact with Web pages in mind, including our competitors. Please follow the "Issue Tracking" link in the header of this document to submit feedback.

1. Forms

1.1. The form element

There are, semantically speaking, many types of forms on the Web. Here is a short list of different forms a user hits on a daily basis:

Most of these forms materialize as a <form> element in the markup of their host Web page, but not always.

Avoid, unless absolutely necessary, form field elements that have no enclosing <form> element.

When there is no enclosing <form> element, detecting the type of a form requires finding the deepest common ancestor of form field elements that are spatially grouped together, and using that ancestor as the form. This is expensive, error-prone, and too often semantically wrong.
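To illustrate the cost of that fallback, here is a minimal sketch of the "deepest common ancestor" step only. It is not Dashlane's implementation. The TreeNode type and both function names are hypothetical stand-ins for DOM elements, chosen so the sketch runs without a browser; deciding which fields are grouped together is assumed to have happened already.

```typescript
// Minimal stand-in for a DOM element (hypothetical type): only the parent
// link matters for this sketch, so it runs without a browser.
interface TreeNode {
  label: string;
  parentNode: TreeNode | null;
}

// Returns the ancestors of a node, ordered from the root down to its parent.
function ancestorsFromRoot(node: TreeNode): TreeNode[] {
  const chain: TreeNode[] = [];
  let current: TreeNode | null = node.parentNode;
  while (current !== null) {
    chain.unshift(current);
    current = current.parentNode;
  }
  return chain;
}

// Returns the deepest node containing every given field, or null when the
// list is empty or the fields do not share any ancestor.
function deepestCommonAncestor(fields: TreeNode[]): TreeNode | null {
  if (fields.length === 0) {
    return null;
  }
  let common = ancestorsFromRoot(fields[0]);
  for (let i = 1; i < fields.length; i++) {
    const chain = ancestorsFromRoot(fields[i]);
    // Count how many leading ancestors the two chains have in common.
    let shared = 0;
    while (
      shared < common.length &&
      shared < chain.length &&
      common[shared] === chain[shared]
    ) {
      shared++;
    }
    common = common.slice(0, shared);
  }
  return common.length > 0 ? common[common.length - 1] : null;
}
```

Even in this simplified form, every candidate field requires a walk up to the root, and the result is only as good as the grouping that selected the fields. An enclosing <form> element removes both problems.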

1.2. The data-form-type attribute for form elements

A number of [html] or [WAI-ARIA-1.1] attributes carried by <form> elements can hold human- or machine-readable information that goes beyond purely programmatic or stylistic data. For instance:

To determine the semantic type of a given Web form, code (for instance, code living inside a WebExtension [WebExtensions]) has no choice but to try to read and understand this arbitrary human- or machine-readable data. This is doable but remains extremely expensive and error-prone and, all in all, amounts to an educated guess.

We therefore suggest a new attribute compatible with the [html] specification: the data-form-type attribute.

Add a data-form-type attribute to all your forms

The data-form-type attribute carries type information about the form carrying the attribute. It is an unordered list of comma-separated case-insensitive values. Its default value is "other". The valid values for this attribute are:

Additionally, the step value should be added to the data-form-type attribute if the current form is only one step in a single multi-form process, and the final value should be added alongside the step value if the current form is the last step in that process.

The billing value represents a form allowing the user to enter a billing address.

The change_password value represents a form allowing the user to change his/her credentials for the current Web page/Web site.

The contact value represents a form allowing the user to contact the editors of a Web page/Web site and/or provide feedback.

The final extra value should be added to the other values if the current form is the last step in a multi-form process where all forms have the same data-form-type (except for the final value).

The forgot_password value represents a form allowing a user to retrieve his/her current credentials for the current Web page/Web site.

The identity value represents a form allowing a user to provide identity details (ID, passport or more general information like name, birthdate, etc.) outside of another form type (e.g. register or contact) to the current Web page/Web site.

The login value represents a form allowing a user to authenticate himself/herself on a given Web page/Web site by providing credentials.

The newsletter value represents a form allowing a user to subscribe to or unsubscribe from a newsletter sent by email.

The other value is used when no other value of the data-form-type attribute can accurately represent the form.

The payment value represents a form allowing a user to enter payment information (credit card, IBAN, etc.).

The register value represents a form allowing a user to create an account on a given Web page/Web site.

The search value represents a search form, basic or advanced.

The shipping value represents a form allowing the user to enter a shipping address.

The shopping_basket value represents a form allowing a user to visualize the current contents of his/her shopping basket.

The step extra value should be added to the other values if the current form is only one step in a multi-form process where all forms have the same data-form-type (except for the final value).
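The following is a minimal sketch of how client code might read the attribute on a form, given the rules above (comma-separated, case-insensitive, unordered, default "other"). The names parseFormType, ParsedFormType and FORM_TYPE_VALUES are hypothetical. Trimming whitespace around tokens and collecting unrecognized tokens separately are assumptions of this sketch, not requirements stated in this document.

```typescript
// Main values defined in this section ("step" and "final" are extra values).
const FORM_TYPE_VALUES: string[] = [
  "billing", "change_password", "contact", "forgot_password", "identity",
  "login", "newsletter", "other", "payment", "register", "search",
  "shipping", "shopping_basket",
];

interface ParsedFormType {
  types: string[];   // recognized main values, lowercased, sorted, no duplicates
  isStep: boolean;   // the "step" extra value is present
  isFinal: boolean;  // the "final" extra value is present
  unknown: string[]; // tokens that match nothing defined in this section
}

// Parses the raw attribute value; pass null when the attribute is absent.
function parseFormType(attribute: string | null): ParsedFormType {
  const parsed: ParsedFormType = { types: [], isStep: false, isFinal: false, unknown: [] };
  const tokens = (attribute === null ? "" : attribute).split(",");
  for (const rawToken of tokens) {
    // Trimming is an assumption; lowercasing implements case-insensitivity.
    const token = rawToken.trim().toLowerCase();
    if (token === "") {
      continue;
    }
    if (token === "step") {
      parsed.isStep = true;
    } else if (token === "final") {
      parsed.isFinal = true;
    } else if (FORM_TYPE_VALUES.indexOf(token) === -1) {
      parsed.unknown.push(token);
    } else if (parsed.types.indexOf(token) === -1) {
      parsed.types.push(token);
    }
  }
  if (parsed.types.length === 0) {
    parsed.types.push("other"); // default value
  }
  parsed.types.sort(); // the list is unordered, so normalize the order
  return parsed;
}
```

With this sketch, the values "login,step" and "STEP, Login" parse to the same result, and a missing attribute yields the single type "other".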

The following real-life form should carry data-form-type="login"

login form at walmart.com

The following real-life form should carry data-form-type="login,step"

login form at google.com

The following real-life form should carry data-form-type="search"

search form at airfrance.com

1.3. The data-form-type attribute for form field elements

Similarly, it can be extremely difficult to detect the semantic type of a given form field, like an <input> or <button> element, contained in a form. Let's take some real-life examples.

Most Web sites implement the Show/Hide buttons inside password fields in two different ways:

  1. the type attribute of the <input> element switches from "password" to "text" and back,
  2. a deck of two strictly overlapping <input> elements, one with type="password" and the other with type="text", sharing the same value; the Web page shows only the one corresponding to the user's request.

In both cases, it can be extremely complex for code to understand that an <input type="text"> element is indeed a password field if nothing in its context (attributes, label, etc.) indicates it.

The French IRS's Web site has a login form that is pretty common for such governmental Web sites. The user must use an ID that is a "tax number" or a fiscal reference to log in. Here is a screenshot and the markup:

impots.gouv.fr login

impots.gouv.fr login markup

As you can see, it is pretty difficult to understand from the markup of the form that the input field should host a user ID.

We therefore propose to reuse the data-form-type attribute described above to provide client code with semantic information about the real usage of each form field element.

Add a data-form-type attribute to all your form field elements

The data-form-type attribute carries type information about the form field element carrying the attribute. It is an unordered list of comma-separated case-insensitive values and extra values. Its default value is "other".

The valid values for this attribute are the following ones (for each item, the sublist defines the valid extra values for that value):

Values

The action value represents a form field or a link that serves as a submission button or UI element allowing the user to submit the form or move to the next part of the form.

The address value represents a form field element that can contain a physical or postal address or a part of a physical or postal address.

The age value represents a form field that can contain an age or define a range of ages (e.g. between 40 and 49 years old).

The company value represents a form field that can contain information about a company, corporation, professional organization, etc.

The consent value represents a form field that is either a consent about Terms, Policies, etc. or a link to such documents.

The date value represents a form field element that can contain a date or part of a date.

The email value represents a form field element that can contain an email address or a part of an email address.

The id_document value represents a form field element that can contain information about an identity document such as an ID card, a passport, a driving license, etc.

The name value represents a form field element that can contain the name of an individual or organization or a part of the name of an individual or organization.

The other value represents a form field element that cannot be classified using another value. This is the default value for the attribute.

The otp value represents a form field element that can contain a One-Time Password, for instance a confirmation code the user receives on his/her mobile phone.

The password value represents a form field element that can contain a password or a passphrase.

The payment value represents a form field element that can contain information about a payment method like a bank account, a credit card or wire transfer information.

The phone value represents a form field element that can contain a phone number or a part of a phone number.

The query value represents a form field element that can contain the subject of a search request.

The title value represents a form field element that can contain title or gender information about an individual.

The username value represents a form field element that contains a login name, a pseudonym, etc. used to authenticate a user on a Web site.

The website value represents a form field element that can contain a URL.
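As a sketch of how these values could be consumed, the code below splits a field's attribute into main values and extra values, then uses the result to recognize a password field even while a Show/Hide button has switched its type attribute to "text". The names parseFieldType, ParsedFieldType, FIELD_TYPE_VALUES and isPasswordField are hypothetical. Treating every token that is not a main value as an extra value is an assumption of this sketch, as is trimming whitespace.

```typescript
// Main values defined in this section.
const FIELD_TYPE_VALUES: string[] = [
  "action", "address", "age", "company", "consent", "date", "email",
  "id_document", "name", "other", "otp", "password", "payment", "phone",
  "query", "title", "username", "website",
];

interface ParsedFieldType {
  values: string[]; // main values, lowercased, sorted, no duplicates
  extras: string[]; // every other token, assumed here to be an extra value
}

// Parses the raw attribute value; pass null when the attribute is absent.
function parseFieldType(attribute: string | null): ParsedFieldType {
  const parsed: ParsedFieldType = { values: [], extras: [] };
  const tokens = (attribute === null ? "" : attribute).split(",");
  for (const rawToken of tokens) {
    const token = rawToken.trim().toLowerCase();
    if (token === "") {
      continue;
    }
    const bucket =
      FIELD_TYPE_VALUES.indexOf(token) !== -1 ? parsed.values : parsed.extras;
    if (bucket.indexOf(token) === -1) {
      bucket.push(token);
    }
  }
  if (parsed.values.length === 0) {
    parsed.values.push("other"); // default value
  }
  parsed.values.sort();
  parsed.extras.sort();
  return parsed;
}

// True when the field is a password field, whatever its current type attribute.
function isPasswordField(inputType: string, attribute: string | null): boolean {
  return (
    inputType.toLowerCase() === "password" ||
    parseFieldType(attribute).values.indexOf("password") !== -1
  );
}
```

With the annotation in place, isPasswordField("text", "password") holds while the password is revealed, whereas an unannotated text field gives client code nothing to rely on.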

Extra values

Action extra values

Code extra values
Company extra values
Consent extra values
Date extra values
Email extra values
ID extra values
Location extra values

Location type extra values

Name extra values
Partial extra values
Payment extra values
Phone extra values
Repetition extra values
Title extra values

Use the "other" value to ignore a field

Should you need to make a form field "ignored" by client code, you can set its data-form-type attribute to "other".
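One possible client-side reading of this rule is sketched below. The FieldHandling type and the decideFieldHandling function are hypothetical. This document does not specify what a client does with a field that carries no attribute at all; the sketch assumes the client then falls back to its own heuristics, and only an explicit "other" causes the field to be skipped.

```typescript
// Possible outcomes for a field (hypothetical names).
type FieldHandling = "ignore" | "annotated" | "guess";

// Pass the raw attribute value, or null when the attribute is absent.
function decideFieldHandling(attribute: string | null): FieldHandling {
  if (attribute === null) {
    return "guess"; // assumption: no annotation, fall back to heuristics
  }
  const tokens = attribute
    .split(",")
    .map((token) => token.trim().toLowerCase())
    .filter((token) => token !== "");
  if (tokens.length === 0) {
    return "guess"; // assumption: an empty attribute carries no information
  }
  const meaningful = tokens.filter((token) => token !== "other");
  // Only "other" was given: the author asked for the field to be ignored.
  return meaningful.length === 0 ? "ignore" : "annotated";
}
```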

Examples

Subscription page for the Washington Post in April 2020:

washington post register

On that page, the form should have a data-form-type attribute set to "register".

The two radio buttons at the top don't need to be tagged with a data-form-type since they don't belong to the form, markup-wise.

The first input field should have a data-form-type attribute set to "email". The second should have the attribute set to "password" and the third to "password,confirmation".

MyFrys page for frys.com.

MyFrys page at frys.com

On that page, there are two forms. The first one, on the left-hand side of the screen, should have a data-form-type attribute set to "login". The second one, on the right-hand side of the screen, should have a data-form-type attribute set to "register".

The attribute values for the various input fields should be:

Login form
  Email: email
  Password: password
  Checkbox: other
  Button: action,login

Register form
  First name: name,first
  Last name: name,last
  Email address: email
  Confirm email address: email,confirmation
  Create password: password
  Confirm password: password,confirmation
  Zip code: address,zip
  Checkbox: consent,newsletter
  Button: action,register
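To show what these annotations buy a client, the sketch below encodes the register form of this example as plain data and looks fields up by value and optional extra value. The AnnotatedField type, the frysRegisterFields constant and the fieldsMatching function are hypothetical; the labels and attribute values are the ones listed above.

```typescript
// A field label as shown on the page, with its suggested attribute value.
interface AnnotatedField {
  label: string;
  formType: string;
}

// Register form of the MyFrys example, as annotated above.
const frysRegisterFields: AnnotatedField[] = [
  { label: "First name", formType: "name,first" },
  { label: "Last name", formType: "name,last" },
  { label: "Email address", formType: "email" },
  { label: "Confirm email address", formType: "email,confirmation" },
  { label: "Create password", formType: "password" },
  { label: "Confirm password", formType: "password,confirmation" },
  { label: "Zip code", formType: "address,zip" },
  { label: "Checkbox", formType: "consent,newsletter" },
  { label: "Button", formType: "action,register" },
];

// Returns the labels of the fields carrying the given value and, when one is
// requested, the given extra value.
function fieldsMatching(
  fields: AnnotatedField[],
  value: string,
  extra?: string
): string[] {
  return fields
    .filter((field) => {
      const tokens = field.formType
        .split(",")
        .map((token) => token.trim().toLowerCase());
      const hasValue = tokens.indexOf(value) !== -1;
      const hasExtra = extra === undefined || tokens.indexOf(extra) !== -1;
      return hasValue && hasExtra;
    })
    .map((field) => field.label);
}
```

A password generator, for example, could fill every field returned by fieldsMatching(frysRegisterFields, "password") with the same new password, without inspecting labels or placeholders.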

Register form at darty.fr.

register form at darty.fr

On that page, the form should have a data-form-type attribute set to "register".

The attribute values for the various input fields should be:

Particulier: other
Professionnel: other
Mme.: title,gender
M.: title,gender
Nom: name,last
Prénom: name,first
Adresse email: email
Confirmer adresse email: email,confirmation
Mot de passe: password
Confirmer le mot de passe: password,confirmation
Date de naissance: date,birth
Téléphone (mobile): phone,mobile
Téléphone (fixe): phone
Adresse: address
Nom de la société, cedex...: address,extra
Bât.: address,building
Esc.: address,stairway
Étage: address,floor
Porte: address,door
Pays: address,country
Code postal: address,zip
Ville: address,city
Checkbox (SMS): consent,newsletter
Checkbox (email): consent,newsletter
Opt-out: consent,newsletter
Button: action,register

Login form at twitter.com.

login form at Twitter

On that page, the form should have a data-form-type attribute set to "login".

In this form, the first field should have a data-form-type attribute set to "phone,email,username" and the second one to "password".
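A field that lists several values, as the first field does here, tells client code that any of those kinds of data is acceptable. The sketch below shows one way a client might choose what to fill. The StoredCredential type, the IDENTIFIER_PREFERENCE order and the pickIdentifier function are hypothetical; since the attribute is an unordered list, the preference order is entirely the client's own choice and an assumption of this sketch.

```typescript
// What a client might have saved for a site (hypothetical shape).
interface StoredCredential {
  username?: string;
  email?: string;
  phone?: string;
  password: string;
}

// Client-side preference among identifier kinds (an assumption of this sketch).
const IDENTIFIER_PREFERENCE: Array<"username" | "email" | "phone"> = [
  "username",
  "email",
  "phone",
];

// Returns the stored identifier to fill into a field annotated with the given
// attribute value, or null when the field accepts none of the stored ones.
function pickIdentifier(
  attribute: string,
  credential: StoredCredential
): string | null {
  const accepted = attribute
    .split(",")
    .map((token) => token.trim().toLowerCase());
  for (const kind of IDENTIFIER_PREFERENCE) {
    const candidate = credential[kind];
    if (accepted.indexOf(kind) !== -1 && candidate !== undefined) {
      return candidate;
    }
  }
  return null;
}
```

For the field above, a credential that stores only an email address is filled with that address, while the same credential is never offered to the second field, which accepts only "password".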

A. References

A.1. Normative references

[html]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[WAI-ARIA-1.1]
Joanmarie Diggs; Shane McCarron; Michael Cooper; Richard Schwerdfeger; James Craig. Accessible Rich Internet Applications (WAI-ARIA) 1.1. 14 December 2017. W3C Recommendation. URL: https://www.w3.org/TR/wai-aria-1.1/

A.2. Informative references

[WebExtensions]
https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions