Data Model Issues and Assumptions
Assumptions
- We model only one person (and possibly their assistant), not groups
of individuals. However, presentities can be an aggregate name for a
group of people. For example, the unified presence information for all
sales staff can be represented by the presentity
sales@example.com. An outside observer cannot tell whether
sales@example.com is a single individual or a group whose
presence information has been aggregated into one presentity.
- While a person describable by SIMPLE presence information can
physically be in only one place, the devices reporting that person's
state may (inadvertently or intentionally) be in several places. An
observer (watcher or composer) can thus only assign probabilistic
estimates as to where the person "really" is. (We will, however, not
attempt to describe people by Schroedinger equations.)
- We describe presentities as a set of <tuple>, <device> and
<person> elements. (There can be multiple tuples and devices; the
number of person elements is contentious.)
- Activities, mood, place-type, privacy, sphere, status-icon and
time-zone describe a person, even if the information is derived from a
device or service. They are contained within a <person> element.
- <user-input> describes a service (tuple) or device.
- A watcher may be able to resolve the same URI appearing in two
tuples into two different services by mechanisms such as caller
preferences or session-level negotiation.
- The data model MUST work with elementary composers that only
concatenate or replace existing tuples. (It MUST also work with smart
composers to be specified in the future, but that seems less likely to
be a problem.)
- Presence data for a single person or device may be published by
different publishers (e.g., different devices) that are unaware of each
other's existence.
- In some cases, the watcher is smarter than the composer,
particularly because the watcher may be human and the composer is almost
always a program. Such watchers should not be denied information due to
constraints in the data model.
- We cannot assume that a composer will be able to resolve all
contradictions or even recognize a contradiction. (For example, if
activities "on-the-phone" and "in-a-meeting" are reported, both may be
true, or the reports may conflict.)
- Work correctly, including replacement, with composers that do not
understand a certain service URI scheme and, in particular, may not know
whether a URI is "special" (e.g., an AOR or GRUU).
- Allow the watcher (rather than forcing the composer) to deal with
uncertainty and contradictory information.
- Be able to construct non-lossy composers, i.e., composers that pass
all data to the watcher as in some cases the watcher may have better
information than the composer about the reliability or relevance of
information. (One example is if the watcher is itself part of another
PA that aggregates information from multiple PAs and may have access to
external information or algorithms.)
- We cannot rely solely on publication time to override earlier data,
since later information may not be better. (Example: What happens if the
phone stops publishing "on-the-phone" after a call that overrode
"in-a-meeting"?)
- Alignment with XCAP mechanisms and tuple identification is desirable.
- At some later point, views should be labeled with source
information. We don't have to solve the problem of this metadata now,
but need to provide the right data model so that source labels are
likely to work well.
Proposal
- URIs are not used for comparison and replacement, only element
identifiers for tuples, persons and devices.
- Default composition policy is to take the most recent of tuples with
the same tuple/person-id, and retain all tuples with different
tuple/person-ids (even if they have identical contact URIs), using the
interpretation of multiple <person> elements defined earlier.
- Different composition policies that never publish more than
one <person> record are possible by having the composer discard or merge
information.
- Persons are somewhat different from regular tuples and devices since
person information for one person is very frequently collected
from a variety of device and service sensors that see aspects of the
person. Examples include calendars, phones ("on-the-phone") and devices
(location information, including categorical location information).
Each <person> is labeled with a view-id (or source-id?). The same
rule as for tuples applies, i.e., a <person> element replaces one
with the same view-id.
A watcher treats multiple <person> elements as alternate views of
the state of the person. In the future, source-describing meta data may
enable the watcher to better judge the value of these elements.
Initially, information such as publication time or external information
may help. (Example: my calendar publishes "lunch" at noon every day,
based on EST. If the watcher knows that I'm in Japan, it will discount
that information, even without knowing the source details. It is unlikely
that a composer would be able to do this.)
If a watcher supports caller preferences or other source selection
mechanisms not based on the URI (e.g., Accept headers in HTTP), it can
render the multiple tuples with the same service URI as distinct
contacts. If not, it can merge the OPEN tuples for user interface
purposes since they are indistinguishable to the watcher. The
capabilities are then the union of the capabilities of the tuples.
- The same considerations apply to devices if it is considered likely
that the state of one device will be published by several entities.
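The default composition policy and the watcher-side merging described
above can be sketched as follows. This is a minimal illustration under
stated assumptions, not a normative algorithm: the Element class, its
field names (including the "published" timestamp) and both function
names are hypothetical and do not appear in any presence schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for a <tuple>, <person> or <device> element."""
    kind: str                  # "tuple", "person" or "device"
    ident: str                 # tuple id, person view-id or device id
    published: float           # publication time in seconds (assumed field)
    contact: str = ""          # contact URI; never used for replacement
    status: str = ""           # "open" or "closed" for tuples
    capabilities: frozenset = frozenset()


def compose(publications: Iterable[Element]) -> List[Element]:
    """Keep the most recent element per (kind, ident); retain all others.

    Contact URIs are ignored, so two tuples with different ids survive
    even if their contact URIs are identical.
    """
    latest: Dict[Tuple[str, str], Element] = {}
    for el in publications:
        key = (el.kind, el.ident)
        current = latest.get(key)
        if current is None or el.published >= current.published:
            latest[key] = el
    return list(latest.values())


def merge_open_by_contact(elements: Iterable[Element]) -> Dict[str, Set[str]]:
    """Watcher-side view for a watcher that cannot tell same-URI tuples apart.

    OPEN tuples sharing a contact URI are merged; the resulting
    capabilities are the union of the capabilities of those tuples.
    """
    merged: Dict[str, Set[str]] = {}
    for el in elements:
        if el.kind == "tuple" and el.status == "open":
            merged.setdefault(el.contact, set()).update(el.capabilities)
    return merged


pubs = [
    Element("tuple", "t1", 10.0, "sip:alice@example.com", "open",
            frozenset({"audio"})),
    Element("tuple", "t2", 11.0, "sip:alice@example.com", "open",
            frozenset({"video"})),
    # Same tuple id as the first entry, so it replaces that entry.
    Element("tuple", "t1", 12.0, "sip:alice@example.com", "open",
            frozenset({"audio", "text"})),
    # Two <person> views from different sources; both are retained.
    Element("person", "12xy", 9.0),
    Element("person", "12ab", 13.0),
]

doc = compose(pubs)
merged = merge_open_by_contact(doc)
```

In this sketch the composer never inspects the contact URI, so it works
even if it does not understand the URI scheme. Both <person> views reach
the watcher, which is left to judge between them.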
Meta Data
Longer term, presence data should be taggable with meta data
identifying its source, reliability and other information that allows
the recipient to judge which pieces of contradictory data to believe. As
part of the proposal above, one could envision meta data for each
element, e.g., <person>, as in the wholly fictitious example below:
   <person view="12xy">
     <source>
       <provider>calendar</provider>
       <provider-domain>yahoo.com</provider-domain>
       <input>manual</input>
     </source>
     ... other person information ...
   </person>
   <person view="12ab">
     <source>
       <provider>body-sensor</provider>
       <provider-domain>bigbrother.com</provider-domain>
       <input>sensor</input>
     </source>
     ... other person information ...
   </person>
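As a rough sketch of how a watcher might read such source labels, the
following uses Python's standard xml.etree.ElementTree module. The
<presence> wrapper element, the absence of namespaces and the function
name person_views are assumptions made only so that the fictitious
example parses as one document.

```python
import xml.etree.ElementTree as ET

# Fictitious document; the <presence> wrapper is an assumption.
DOC = """
<presence>
  <person view="12xy">
    <source>
      <provider>calendar</provider>
      <provider-domain>yahoo.com</provider-domain>
      <input>manual</input>
    </source>
  </person>
  <person view="12ab">
    <source>
      <provider>body-sensor</provider>
      <provider-domain>bigbrother.com</provider-domain>
      <input>sensor</input>
    </source>
  </person>
</presence>
"""


def person_views(xml_text):
    """Map each <person> view id to the fields of its <source> element."""
    root = ET.fromstring(xml_text)
    views = {}
    for person in root.findall("person"):
        source = person.find("source")
        fields = {}
        if source is not None:
            for child in source:
                fields[child.tag] = (child.text or "").strip()
        views[person.get("view")] = fields
    return views


views = person_views(DOC)

# A watcher could then apply its own policy to the views, for example
# listing those whose input was measured rather than entered manually.
sensor_views = [view for view, src in views.items()
                if src.get("input") == "sensor"]
```

Which views to believe remains a watcher decision; the sketch only
shows that the labels make such a decision possible.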
If elements from multiple sources are mixed, the RPID definition
would have to allow multiple instances of each element, including
<status>. Also, each element would have to be able to refer, by some
name, to an external element describing the source information. Such
external references ("pointers") increase the difficulty of data
management, as the source information needs to be removed, for privacy
reasons, once all referring elements have been removed. Conversely,
they introduce additional error cases such as dangling pointers.
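The two data management problems just mentioned can be illustrated with
a small sketch. It assumes, purely for illustration, that each element
carries at most one named reference to a source description; the
function name prune_sources and the identifiers in the example are
hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def prune_sources(
    elements: Dict[str, Optional[str]],
    sources: Dict[str, dict],
) -> Tuple[Dict[str, dict], List[str]]:
    """Return (sources still referenced, ids of elements with dangling refs).

    elements maps an element id to the name of the source it refers to,
    or None if it carries no source reference.
    """
    referenced = {ref for ref in elements.values() if ref is not None}
    # Privacy: drop source descriptions that no element refers to any more.
    kept = {name: info for name, info in sources.items()
            if name in referenced}
    # Error case: elements pointing at a source that does not exist.
    dangling = sorted(eid for eid, ref in elements.items()
                      if ref is not None and ref not in sources)
    return kept, dangling


elements = {
    "status-1": "src-cal",
    "mood-1": "src-sensor",
    "activities-1": "src-gone",   # refers to a source that was never sent
    "place-type-1": None,         # no source label at all
}
sources = {
    "src-cal": {"provider": "calendar"},
    "src-sensor": {"provider": "body-sensor"},
    "src-unused": {"provider": "phone"},  # all referring elements removed
}

kept, dangling = prune_sources(elements, sources)
```

Here "src-unused" is removed because nothing refers to it, and
"activities-1" is reported as holding a dangling pointer. Inline source
labels, as in the <person> example, avoid both cases at the cost of
repeating the source information.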
Critical Open Issues
Non-Critical Open Issues
These issues may be deferrable.
- Does CLOSED for a person imply global unreachability? Is there a
need for a global override?
OMA
PAG spec