Accessible Technology Test Adapter API

This document presents an architecture and attendant components that together enable interaction between the [[Web-Protocol-Tests]] framework, the [[testharness.js]] library that is part of that framework, and the Accessibility API on any platform. It is loosely based upon work done in the ARIA Working Group and its ancestor, the Protocols and Formats Working Group, but takes its inspiration from the way that the various Accessibility API Mapping specifications provide a consistent way to expose and interrogate aspects of essentially any window on any platform.

Testable Statements

A testable statement is a simple assertion about the behavior of a system that, when examined, should evaluate to true. In the case of ARIA specifications, the testable statements are maintained in various wikis. They include this assertion, and also include supporting information to inform the test system which aspects of the underlying systems need to be examined to evaluate the assertion.

It is possible that in some environments an ATTA will not be available. Consequently, additional properties of the Testable Statement can help a manual tester understand how to perform a similar test without ATTA assistance. Examples include the URL and description properties mentioned below.

Consider the following example:

== Testable Statements for ARIA 1.1 features ==

SpecURL:  https://www.w3.org/TR/wai-aria11

=== gridcell ===

==== Description ====

When an element has a role with a value of 'grid', its aria-rowcount
and aria-colcount attributes describe the dimensions of the grid
represented by that element and its contents.

Reference: #some_id

<pre>
If given
  <div role='grid' id='test' aria-rowcount='3' aria-colcount='2'>
   <div role='row'>
      <span role='gridcell'>
        <button>hello</button>
     </span>
   </div>
  </div>
then role: grid, colcount: 2, rowcount: 3
</pre>

This example presents a description of the test's purpose, a reference to the section of the specification that defines the behavior under test, and an HTML fragment and conditions that should be met. Beyond this, the testable statement needs to show how these conditions are evaluated through each defined API mapping. An example of one such would be:

  |colspan=5|event test:focus
  |-
  |colspan=5|element test
  |-
  |rowspan=4|ATK
  |property
  |role
  |is
  |ROLE_TABLE
  |-
  |property
  |objectAttributes
  |isType
  |List
  |-
  |property
  |objectAttributes
  |contains
  |xml-roles:grid
  |-
  |property
  |interfaces
  |contains
  |Table
  |-
  |property
  |interface
  |contains
  |Selection
  |-
  |colspan=5|attribute test2:aria-selected "true"

This is an example of a MediaWiki-style table fragment. In this case it is a series of rows and columns that describe the various items that the ATK ATTA will evaluate. The WPT ARIA test generator tool parses tables like this and generates an HTML file containing a JSON representation of the data, structured so that it can be processed by the ATTA Communication Module. The JSON for the examples above looks like:

    { "title": "grid",
      "reference": "https://www.w3.org/TR/wai-aria11#some_id",
      "description": "When an element has a role with a value of 'grid', its aria-rowcount and aria-colcount attributes describe the dimensions of the grid represented by that element and its contents.",
      "steps": [
        {
          "type":  "event",
          "title": "step 1",
          "element": "test",
          "event": "focus"
        },
        {
          "type":  "test",
          "title": "step 2",
          "element": "test",
          "test" : {
             "ATK" : [
                [
                   "property",
                   "role",
                   "is",
                   "ROLE_TABLE"
                ],
                [
                   "property",
                   "objectAttributes",
                   "isType",
                   "List"
                ],
                [
                   "property",
                   "objectAttributes",
                   "contains",
                   "xml-roles:grid"
                ],
                [
                   "property",
                   "objectAttributes",
                   "contains",
                   "rowcount:3"
                ],
                [
                   "property",
                   "objectAttributes",
                   "contains",
                   "colcount:2"
                ],
                [
                   "property",
                   "interfaces",
                   "contains",
                   "Table"
                ],
                [
                   "property",
                   "interface",
                   "contains",
                   "Selection"
                ]
             ],
             ....
          }
        },
        {
          "type":  "attribute",
          "title": "step 3",
          "element": "test2",
          "attribute": "aria-selected",
          "value": "true"
        }
      ]
    }
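The first half of that conversion, splitting a wiki table fragment into rows of cells, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `parse_wiki_rows` is hypothetical, only the `|-` row separator and `colspan=`/`rowspan=` cell attributes seen in the fragment above are handled, and it is not the actual WPT generator.

```python
def parse_wiki_rows(fragment):
    """Split a MediaWiki-style table fragment into rows of cell strings."""
    rows, current = [], []
    for raw in fragment.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "|-":
            # "|-" separates one table row from the next.
            if current:
                rows.append(current)
            current = []
        elif line.startswith("|"):
            cell = line[1:]
            # Drop layout attributes such as "colspan=5|" or "rowspan=4|".
            if cell.startswith(("colspan=", "rowspan=")) and "|" in cell:
                cell = cell.split("|", 1)[1]
            current.append(cell.strip())
    if current:
        rows.append(current)
    return rows


FRAGMENT = """
|colspan=5|event test:focus
|-
|colspan=5|element test
|-
|rowspan=4|ATK
|property
|role
|is
|ROLE_TABLE
|-
|property
|objectAttributes
|isType
|List
"""

rows = parse_wiki_rows(FRAGMENT)
```

Rows that follow a `rowspan` cell omit the API name, so a fuller converter would need to carry "ATK" forward to those rows before emitting the nested arrays shown in the JSON.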

The system will automatically generate [[WPT]]-compatible test cases that are formatted as defined in this specification. Test cases can also be written by hand. Examples and a template are available in the wai-aria/tools directory of the WPT GitHub repository.

The architecture of this system allows for multiple "steps" per test (see Test Steps). However, the current syntax for Testable Statements cannot express multiple "steps". It is possible that this will be added in the future.

In the future, it is possible that the detailed per-platform API assertions will be automatically created from a basic assertion by applying the rules from the Accessibility API Mapping document associated with each specification. See Future Plans for more information.

Assertions

Each platform accessibility API has unique facilities. The per-platform API assertion structure is identical, but the contents are necessarily different. The general structure is:

Test Class

The class of thing being tested. Examples include a relation, property, result (of a method call), or event. As a special case, a Test Class of TBD is a placeholder that means the assertions are "to be determined". An ATTA that sees a Test Class of TBD SHOULD respond with a result of "FAIL" and a message that includes sufficient detail about the accessibility tree associated with the element in question that a test developer could review that information and use it to develop the correct assertions for a given platform.

Test Type

The type within the Test Class. Examples include role, name, subrole, etc.

Assertion Type

A verb that indicates how the assertion should be evaluated. These include exists, is, isNot, contains, doesNotContain, isLT, isLTE, isGT, isGTE, isType, and isAny.
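As an illustration of how an ATTA might dispatch on these verbs, here is a minimal Python sketch. The semantics chosen for each verb are assumptions (the text names the verbs but does not define them precisely), the function name `evaluate` is hypothetical, and `isType` is left out because it depends on a platform-specific notion of type.

```python
import operator

# Assumed semantics for each verb; not taken from the specification text.
COMPARATORS = {
    "is": operator.eq,
    "isNot": operator.ne,
    "isLT": operator.lt,
    "isLTE": operator.le,
    "isGT": operator.gt,
    "isGTE": operator.ge,
    "contains": lambda actual, expected: expected in actual,
    "doesNotContain": lambda actual, expected: expected not in actual,
    "isAny": lambda actual, expected: actual in expected,
}


def evaluate(assertion_type, actual, expected):
    """Return True when `actual` satisfies the assertion against `expected`."""
    if assertion_type == "exists":
        # Assumption: a missing item is represented as None, and the
        # expected value is the string "true" or "false".
        return (actual is not None) == (expected == "true")
    if assertion_type not in COMPARATORS:
        raise ValueError("unsupported assertion type: " + assertion_type)
    return COMPARATORS[assertion_type](actual, expected)
```

For example, `evaluate("contains", ["xml-roles:grid"], "xml-roles:grid")` checks that a list of object attributes includes the expected entry.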

Value

A value (or value list) against which to evaluate using ASSERTION TYPE. If there are multiple values, they are enclosed in brackets and separated with commas and optional whitespace (e.g., [value1, value2, value3]). When an Assertion Type of exists is used, legal values are true and false.

At this time commas embedded in values within a value list are NOT supported.
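Because embedded commas are not supported, a value list can be split with a plain comma split. The following Python sketch shows one way to do that; the function name `parse_value` is hypothetical, and treating `[]` as an empty list is an assumption.

```python
def parse_value(text):
    """Return a list for "[a, b, c]" and the stripped string otherwise."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        inner = text[1:-1].strip()
        if not inner:
            return []
        # Whitespace around each comma-separated value is optional.
        return [item.strip() for item in inner.split(",")]
    return text
```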

In most cases, the value is untyped. However, when asserting the "type" of something via the isType Assertion Type, legal values include:

Undefined: The system has no value for the item.
Boolean: The item is either true or false.
Constant: The item is represented in the implementation as a static reference to a value (e.g., an element of an enum in C, a Symbol in JavaScript).
Number: The item is "numeric" in that it is something on which mathematical operations can be performed. There is no distinction between Integer and non-Integer values at this time.
List: The item is a collection (e.g., dictionary, hash, list, array) in which there are zero or more values represented as an index and a value associated with that index. For these purposes, whether something is a List or a reference to a List is an implementation detail. Moreover, while Lists are typically iterable, whether that iteration has predictable order is also an implementation detail.
Object: The item is an "Accessible Object" - a representation of the Accessibility properties associated with a component of the system (e.g., a DOM element).
String: The item is a series of characters in a system-defined encoding.
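For an ATTA written in Python, the type names above could be derived from native values roughly as follows. This is a sketch under assumptions: `AccessibleObject` is a hypothetical stand-in for whatever object the platform binding returns, `None` is taken to mean "no value", and enum members are taken to be the "Constant" case.

```python
from enum import Enum


class AccessibleObject:
    """Hypothetical stand-in for a platform accessible object."""


def type_name(item):
    """Map a Python value to one of the isType names listed above."""
    if item is None:
        return "Undefined"
    if isinstance(item, bool):
        # Checked before Number because bool is a subclass of int.
        return "Boolean"
    if isinstance(item, Enum):
        return "Constant"
    if isinstance(item, (int, float)):
        return "Number"
    if isinstance(item, str):
        return "String"
    if isinstance(item, (list, tuple, dict, set, frozenset)):
        return "List"
    if isinstance(item, AccessibleObject):
        return "Object"
    raise TypeError("no isType name for " + type(item).__name__)
```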

The TEST CLASS of event is special in that each occurrence of the TEST TYPE type establishes a context for the event rows that follow it in the table. Each subsequent event row is an assertion about an event of the type referenced in that first special row. That context continues until the end of the table, until a row that is NOT an event assertion, or until a row that is an event but with a TEST TYPE of type.
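The context rule for event rows can be sketched as a single pass over the assertion rows. The function name `group_event_assertions` and the returned dictionary layout are hypothetical; event rows that appear before any `type` row are simply ignored here, which is an assumption.

```python
def group_event_assertions(rows):
    """Group event rows under the most recent event row whose type is "type"."""
    groups = []
    current = None
    for row in rows:
        test_class, test_type = row[0], row[1]
        if test_class != "event":
            # A non-event row closes the current context.
            current = None
            continue
        if test_type == "type":
            # A new "type" row opens a fresh context.
            current = {"type_row": row, "assertions": []}
            groups.append(current)
        elif current is not None:
            current["assertions"].append(row)
    return groups
```

Each group can then be matched against one captured event: the `type_row` selects the event, and the remaining rows are assertions about that event.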

The remainder of this section defines the specific contents.

ATK Assertion Contents

AXAPI Assertion Contents

IAccessible2 Assertion Contents

MSAA Assertion Contents

UIA Assertion Contents

Accessible Technology Test Adapter

Protocol

The ATTA protocol is a simple command/response protocol relying upon HTTP. By default, an ATTA SHOULD listen on port 4119 (A11Y). It is NOT RESTful. Instead, it uses JSON messages and wraps them in a simple request where the path portion indicates the command. A typical conversation between the ATTA and the Framework might look like this:

[Figure: Communication between Test, Library, and ATTA]
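As a rough illustration of the request shape described above, the following Python sketch builds (but does not send) a request for a given command. The POST method, the `localhost` host, the `Content-Type` header, and the function name `build_request` are assumptions; the text specifies only the default port, the JSON body, and that the path names the command.

```python
import json
from urllib.request import Request

ATTA_PORT = 4119  # default port given in the text


def build_request(command, payload=None, host="localhost"):
    """Build an HTTP request whose path is the ATTA command name."""
    body = json.dumps(payload or {}).encode("utf-8")
    return Request(
        url="http://{}:{}/{}".format(host, ATTA_PORT, command),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the text does not name the HTTP method
    )


req = build_request(
    "start",
    {"test": "name of the overall test case",
     "url": "http://example.test/grid.html"},  # hypothetical URL
)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is omitted so the sketch runs without an ATTA listening.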

Message Structure

Queries and responses are all in JSON.

Commands

start

The start message is designed to ensure that the ATTA is ready, that it can find the window under test, and also to help ATTAcomm understand which API and API version are supported.

test: The name of the overall test case
url: The url the test case was loaded from

The reply for this message takes the form:

status: An indication of whether the ATTA could find the window in which the test will take place given the test name and url. Values are READY or ERROR
statusText: A message about any error
ATTAname: The name of the ATTA
ATTAversion: A version string about the ATTA
API: The name of the Accessibility API supported by ATTA
APIversion: A version string about the Accessibility API
log: optional, ATTA-specific information about the accessibility tree associated with the window being evaluated. This SHOULD be preformatted information that can be embedded by the ATTA Communication Module in the test page to assist a test developer in developing or debugging a test.

      {
        "test":  "name of the overall test case",
        "url":   "url from which the content is loaded"
      }

      {
        "status":      "READY",
        "statusText":  "",
        "ATTAname":    "WPT Sample ATTA",
        "ATTAversion": 1,
        "API":         "myAPI",
        "APIversion":  "myAPIversion",
        "log":         "formatted logging information"
      }
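A caller could sanity-check a start reply before continuing with the test. The sketch below is one way to do that in Python; the function name `check_start_reply` is hypothetical, and treating every member except log as required is an assumption based on log being the only member marked optional.

```python
REQUIRED_MEMBERS = (
    "status", "statusText", "ATTAname", "ATTAversion", "API", "APIversion",
)


def check_start_reply(reply):
    """Return a list of problems found in a start reply (empty if none)."""
    problems = ["missing member: " + key
                for key in REQUIRED_MEMBERS if key not in reply]
    status = reply.get("status")
    if status not in ("READY", "ERROR"):
        problems.append("unexpected status: {!r}".format(status))
    elif status == "ERROR":
        problems.append("ATTA reported an error: " + reply.get("statusText", ""))
    return problems
```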

startlisten

The startlisten message is optional. If it is sent, it means one or more subtests are going to have assertions that need to be evaluated against events, and the ATTA is expected to listen for those events.

events: A list of event names the ATTA should capture for the duration of this test.

The reply for this message takes the form:

status: An indication of whether the ATTA could set up listeners for the requested events. Values are READY or ERROR
statusText: A message about any error
log: optional, ATTA-specific information about the events being listened for. This SHOULD be preformatted information that can be embedded by the ATTA Communication Module in the test page to assist a test developer in developing or debugging a test.

      {
        "events":  [ "focused", "expanded", "selected" ]
      }

      {
        "status":      "READY",
        "statusText":  "",
        "log":         "formatted logging information"
      }

stoplisten

The stoplisten message is optional. If it is sent, it means the ATTA must stop listening for any events that it was previously asked to listen for.

The reply for this message takes the form:

status: An indication of whether the ATTA stopped listening. Values are READY or ERROR
statusText: A message about any error
log: optional, ATTA-specific information about the operation. This SHOULD be preformatted information that can be embedded by the ATTA Communication Module in the test page to assist a test developer in developing or debugging a test.

      {
        "status":      "READY",
        "statusText":  "",
        "log":         "formatted logging information"
      }
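On the ATTA side, the startlisten and stoplisten pair amounts to maintaining a set of event names. The following Python sketch keeps that state in memory; the class name `ListenerState`, the error text, and the choice to reject an empty event list are all assumptions, and hooking the names up to real platform event listeners is out of scope here.

```python
class ListenerState:
    """Tracks which event names the ATTA has been asked to capture."""

    def __init__(self):
        self.events = set()

    def startlisten(self, message):
        events = message.get("events")
        if not isinstance(events, list) or not events:
            return {"status": "ERROR", "statusText": "no events requested"}
        self.events.update(events)
        return {"status": "READY", "statusText": ""}

    def stoplisten(self):
        # Stop listening for everything requested so far.
        self.events.clear()
        return {"status": "READY", "statusText": ""}
```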

test

The test message contains one or more assertions that the ATTA is to evaluate against the Accessibility API. For each assertion in the request, the ATTA is to return information on the result of checking the assertion.

name

The name of the test

element

The 'id' of an element that will be examined by the ATTA

data

A JSON structure containing the a11y data to check. Note that this is an array in which each member is a nested array of information from a 'row' in the testable statement. The exact structure and values of the information vary by Accessibility API, but in general each member has a structure like:

class: The class of assertion to evaluate (e.g., state, property, relation, result)
type: The type of item in the class
assertion: What we are asserting about this class+type
values: The value or values to evaluate with the assertion.

Note that, to keep test results comparable across platforms, the 'rows' in the testable statement are evaluated and their results are collected into a single result for the overall 'test statement'. Any failure or other messages are included in the result for that overall 'test statement'.

The reply for this message takes the form:

status

An indication of whether the ATTA could evaluate the assertions or not. Values are OK and ERROR

statusText

A message about any error

log

optional, ATTA-specific information about the accessibility tree associated with the element being evaluated. This SHOULD be preformatted information that can be embedded by the ATTA Communication Module in the test page to assist a test developer in developing or debugging a test.

results

A list of result objects corresponding to each array member in data above. The result objects have the following members:

result: one of PASS or FAIL
message: information that explains a failure (if any). This information is attached to the failure information along with the assertion that was being evaluated so a tester can help to correct the problem.
log: optional, ATTA-specific information about the accessibility tree that was examined when evaluating the assertion. When log data is returned, it is augmented with information about the assertion being examined. Used when debugging or developing tests.

      {
        "name":      "subtest name",
        "element":   "id of element to check",
        "data":      [ [ "thingClass", "thingType", "thingAssertion", "thingValue" ],
                       [ "thingClass", "thingType", "thingAssertion", "thingValue" ]
                     ]
      }

      {
        "status":      "OK",
        "statusText":  "",
        "log":         "",
        "results":     [
                         { "result":  "PASS",
                           "log":     "Logging information about test run"
                         },
                         { "result":  "FAIL",
                           "message": "Some diagnostic about the failure",
                           "log":     "detailed logging"
                         }
                       ]
      }
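The per-row results in a test reply are meant to be collapsed into one result for the overall test statement. A minimal Python sketch of that reduction follows; the function name `summarize`, the message format, and the decision to treat a status of ERROR as an overall FAIL are assumptions.

```python
def summarize(reply):
    """Collapse a test reply into a single PASS/FAIL result with messages."""
    if reply.get("status") != "OK":
        # Assumption: an ATTA-level error fails the whole test statement.
        return {"result": "FAIL", "message": reply.get("statusText", "")}
    failures = [
        "assertion {}: {}".format(index, item.get("message", ""))
        for index, item in enumerate(reply.get("results", []))
        if item.get("result") != "PASS"
    ]
    return {
        "result": "FAIL" if failures else "PASS",
        "message": "; ".join(failures),
    }
```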

end

The end message just indicates the current test case is complete. It has no payload and expects nothing in the response. The ATTA SHOULD do a complete cleanup of anything in its environment before it replies to the message.

Terminology

ATTA

The ATTA is a tool that acts as the interface between the WPT child window and the platform accessibility API. It interprets requirements defined in testable statements and evaluates them against the characteristics of the child window to help evaluate the conformance of the platform accessibility API to the requirements in related W3C Recommendations.

WPT

The collection of tests and tools that the W3C relies upon to exercise the Web Platform (where that is the collection of technologies that comprise the World Wide Web).

Testable Statement

A simple (hopefully atomic) definition about the behavior of some component. In the context of this system, a testable statement includes information about what behaviors should or should not be attached to an element with a unique ID within the child window.

Accessibility API

Operating systems and other platforms provide a set of interfaces that expose information about objects and events to assistive technologies. Assistive technologies use these interfaces to get information about and interact with those widgets. Examples of accessibility APIs are Microsoft Active Accessibility [[MSAA]], Microsoft User Interface Automation [[UI-AUTOMATION]], MSAA with UIA Express [[UIA-EXPRESS]], the Mac OS X Accessibility Protocol [[AXAPI]], the Linux/Unix Accessibility Toolkit [[ATK]] and Assistive Technology Service Provider Interface [[AT-SPI]], and IAccessible2 [[IAccessible2]].

Accessibility API Mapping

A specification that defines the relationship between ARIA roles, properties, states, and events to their corresponding aspects of a specific platform's implementation.

Accessibility Subtree

An accessible object in the accessibility tree and its descendants in that tree. It does not include objects which have relationships other than parent-child in that tree. For example, it does not include objects linked via aria-flowto unless those objects are also descendants in the accessibility tree.

Accessibility Tree

Tree of accessible objects that represents the structure of the user interface (UI). Each node in the accessibility tree represents an element in the UI as exposed through the accessibility API; for example, a push button, a check box, or container.

Accessible Description

An accessible description provides additional information, related to an interface element, that complements the accessible name. The accessible description might or might not be visually perceivable.

Accessible Name

The accessible name is the name of a user interface element. Each platform accessibility API provides the accessible name property. The value of the accessible name may be derived from a visible (e.g., the visible text on a button) or invisible (e.g., the text alternative that describes an icon) property of the user interface element. See related accessible description.

A simple use for the accessible name property may be illustrated by an "OK" button. The text "OK" is the accessible name. When the button receives focus, assistive technologies may concatenate the platform's role description with the accessible name. For example, a screen reader may speak "push-button OK" or "OK button". The order of concatenation and specifics of the role description (e.g., "button", "push-button", "clickable button") are determined by platform accessibility APIs or assistive technologies.

Accessible object

A node in the accessibility tree of a platform accessibility API. Accessible objects expose various states, properties, and events for use by assistive technologies. In the context of markup languages (e.g., HTML and SVG) in general, and of WAI-ARIA in particular, markup elements and their attributes are represented as accessible objects.

Activation behavior

The action taken when an event, typically initiated by users through an input device, causes an element to fulfill a defined role. The role may be defined for that element by the host language, or by author-defined variables, or both. The role for any given element may be a generic action, or may be unique to that element. For example, the activation behavior of an HTML or SVG <a> element shall be to cause the user agent to traverse the link specified in the href attribute, with the further optional parameter of specifying the browsing context for the traversal (such as the current window or tab, a named window, or a new window); the activation behavior of an HTML <input> element with the type attribute value submit shall be to send the values of the form elements to an author-defined IRI by the author-defined HTTP method.

Assistive Technologies

Hardware and/or software that:

relies on services provided by a user agent to retrieve and render Web content
works with a user agent or web content itself through the use of APIs, and
provides services beyond those offered by the user agent to facilitate user interaction with web content by people with disabilities

This definition may differ from that used in other documents.

Examples of assistive technologies that are important in the context of this document include the following:

screen magnifiers, which are used to enlarge and improve the visual readability of rendered text and images;
screen readers, which are most-often used to convey information through synthesized speech or a refreshable Braille display;
text-to-speech software, which is used to convert text into synthetic speech;
speech recognition software, which is used to allow spoken control and dictation;
alternate input technologies (including head pointers, on-screen keyboards, single switches, and sip/puff devices), which are used to simulate the keyboard;
alternate pointing devices, which are used to simulate mouse pointing and clicking.

Attribute

In this specification, attribute is used as it is in markup languages. Attributes are structural features added to elements to provide information about the states and properties of the object represented by the element.

Class

A set of instance objects that share similar characteristics.

Deprecated

A deprecated role, state, or property is one which has been outdated by newer constructs or changed circumstances, and which may be removed in future versions of the WAI-ARIA specification. User agents are encouraged to continue to support items identified as deprecated for backward compatibility. For more information, see Deprecated Requirements in the Conformance section.

Desktop focus event

Event from/to the host operating system via the accessibility API, notifying of a change of input focus.

Element

In this specification, element is used as it is in markup languages. Elements are the structural elements in a markup language that contain the data profile for objects.

Event

A programmatic message used to communicate discrete changes in the state of an object to other objects in a computational system. User input to a web page is commonly mediated through abstract events that describe the interaction and can provide notice of changes to the state of a document object. In some programming languages, events are more commonly known as notifications.

Expose

Translated to platform-specific accessibility APIs as defined in the WAI-ARIA User Agent Implementation Guide. [[WAI-ARIA-IMPLEMENTATION]]

Graphical Document

A document containing graphic representations with user-navigable parts. Charts, maps, diagrams, blueprints, and dashboards are examples of graphical documents. A graphical document is composed using any combination of symbols, images, text, and graphic primitives (shapes such as circles, points, lines, paths, rectangles, etc).

Hidden

Indicates that the element is not visible, perceivable, or interactive to any user. An element is considered hidden if it or any one of its ancestor elements is not rendered or is explicitly hidden.

Informative

Content provided for information purposes and not required for conformance. Content required for conformance is referred to as normative.

Keyboard Accessible

Accessible to the user using a keyboard or assistive technologies that mimic keyboard input, such as a sip and puff tube. References in this document relate to WCAG 2.0 Guideline 2.1: Make all functionality available from a keyboard [[WCAG20]].

Landmark

A type of region on a page to which the user may want quick access. Content in such a region is different from that of other regions on the page and relevant to a specific user purpose, such as navigating, searching, perusing the primary content, etc.

Live Region

Live regions are perceivable regions of a web page that are typically updated as a result of an external event when user focus may be elsewhere. These regions are not always updated as a result of a user interaction. This practice has become commonplace with the growing use of Ajax. Examples of live regions include a chat log, stock ticker, or a sport scoring section that updates periodically to reflect game statistics. Since these asynchronous areas are expected to update outside the user's area of focus, assistive technologies such as screen readers have either been unaware of their existence or unable to process them for the user. WAI-ARIA has provided a collection of properties that allow the author to identify these live regions and process them: aria-live, aria-relevant, aria-atomic, and aria-busy. Pre-defined live region roles are listed in the Choosing Between Special Case Live Regions ([[WAI-ARIA-PRACTICES]], Section 5.3).

Primary Content Element

An implementing host language's primary content element, such as the body element in HTML.

Managed State

Accessibility API state that is controlled by the user agent, such as focus and selection. These are contrasted with "unmanaged states" that are typically controlled by the author. Nevertheless, authors can override some managed states, such as aria-posinset and aria-setsize. Many managed states have corresponding CSS pseudo-classes, such as :focus, and pseudo-elements, such as ::selection, that are also updated by the user agent.

Nemeth Braille

The Nemeth Braille Code for Mathematics is a braille code for encoding mathematical and scientific notation. See Nemeth Braille on Wikipedia.

Node

Basic type of object in the DOM tree or accessibility tree. DOM nodes are further specified as Element or Text nodes, among other types. The nodes of an accessibility tree are accessible objects.

Normative

Required for conformance. By contrast, content identified as informative or "non-normative" is not required for conformance.

Object

In the context of user interfaces, an item in the perceptual user experience, represented in markup languages by one or more elements, and rendered by user agents.

In the context of programming, the instantiation of one or more classes and interfaces which define the general characteristics of similar objects. An object in an accessibility API may represent one or more DOM objects. Accessibility APIs have defined interfaces that are distinct from DOM interfaces.

Ontology

A description of the characteristics of classes and how they relate to each other.

Operable

Usable by users in ways they can control. References in this document relate to WCAG 2.0 Principle 2: Content must be operable [[WCAG20]]. See Keyboard Accessible.

Owned Element

An 'owned element' is any DOM descendant of the element, any element specified as a child via aria-owns, or any DOM descendant of the owned child.

Owning Element

An 'owning element' is any DOM ancestor of the element, or any element with an aria-owns attribute which references the ID of the element.

Perceivable

Presentable to users in ways they can sense. References in this document relate to WCAG 2.0 Principle 1: Content must be perceivable [[WCAG20]].

Property

Attributes that are essential to the nature of a given object, or that represent a data value associated with the object. A change of a property may significantly impact the meaning or presentation of an object. Certain properties (for example, aria-multiline) are less likely to change than states, but note that the frequency of change difference is not a rule. A few properties, such as aria-activedescendant, aria-valuenow, and aria-valuetext are expected to change often. See clarification of states versus properties.

Relationship

A connection between two distinct things. Relationships may be of various types to indicate which object labels another, controls another, etc.

Role

Main indicator of type. This semantic association allows tools to present and support interaction with the object in a manner that is consistent with user expectations about other objects of that type.

Root WAI-ARIA node

The primary element containing non-metadata content. In many languages, this is the document element but in HTML, it is the <body>.

Semantics

The meaning of something as understood by a human, defined in a way that computers can process a representation of an object, such as elements and attributes, and reliably represent the object in a way that various humans will achieve a mutually consistent understanding of the object.

State

A state is a dynamic property expressing characteristics of an object that may change in response to user action or automated processes. States do not affect the essential nature of the object, but represent data associated with the object or user interaction possibilities. See clarification of states versus properties.

Sub-document

Any document created from a <frame>, <iframe> or similar mechanism. A sub-document may contain a document, an application or any widget such as a calendar pulled in from another server. In the accessibility tree there are two accessible objects for this situation—one represents the <frame>/<iframe> element in the parent document, which parents a single accessible object child representing the spawned document contents.

Target Element

An element specified in a WAI-ARIA relation. For example, in <div aria-controls="elem1">, where "elem1" is the ID for the target element.

Taxonomy

A hierarchical definition of how the characteristics of various classes relate to each other, in which classes inherit the properties of superclasses in the hierarchy. A taxonomy can comprise part of the formal definition of an ontology.

Text node

Type of DOM node that represents the textual content of an attribute or an element. A Text node has no child nodes.

Understandable

Presentable to users in ways they can construct an appropriate meaning. References in this document relate to WCAG 2.0 Principle 3: Information and the operation of user interface must be understandable [[WCAG20]].

User Agent

Any software that retrieves, renders and facilitates end user interaction with Web content. This definition may differ from that used in other documents.

Valid IDREF

A reference to a target element in the same document that has a matching ID.

Widget

Discrete user interface object with which the user can interact. Widgets range from simple objects that have one value or operation (e.g., check boxes and menu items), to complex objects that contain many managed sub-objects (e.g., trees and grids).
