Feedback from OASIS UBL TC to Draft Core Components Specification.8

document id
Version 0.2
editor Bill Burcham
April 8, 2002
Sterling Commerce

Much of the contention over element naming in UBL stems from the imprecise treatment of properties in the UN Core Components Technical Specification [CC-UN]. While that specification does talk extensively about property terms, which are part of a dictionary entry name for a data element (a la [NAMING-ISO]), we are left to infer the existence and makeup of a first-class property concept. The term "property" is used often in that specification, but it is never formally defined. Additionally, the term "child field" is used in some of the examples in that specification. That term is used synonymously with "property", and is also left undefined. Further, it never appears in any of the conceptual diagrams.

We are trying to give property terms to things. What things are we trying to give them to? Well, [CC-UN] doesn't tell us. We propose:

Proposal 1 The CC model must include the concept of property. Property is the model element named by a property term, in the same way as a BIE or a CC is the model element named by an object class (name).

Proposal 2 A Property relates an Aggregate Core Component to a Basic Core Component it contains.

What is property's relationship to the other elements of the CC meta-model? This simply formalizes the prose already in the specification. A Property allows us to identify (name) the use of a Basic Core Component by an Aggregate Core Component. This concept (Property) corresponds to field in database models, attribute in ER modeling, member in Java, local element in XML.

Proposal 3 The "is derived from" relationship of Core Component Type to Representation Term is eliminated. A Representation Term describes the role of a Core Component Type as it is used in an Aggregate Core Component, through a Basic Core Component. The new relationship is: Representation Term:Property.
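As an illustration only, Proposals 1 to 3 can be sketched as a small object model. The class and field names below, the "Address" example, and the "Object Class. Property Term. Representation Term" layout of the dictionary entry name are assumptions made for the sketch; they are not taken from [CC-UN].

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class CoreComponentType:
    """A CCT, e.g. a text or code type (names here are hypothetical)."""
    name: str


@dataclass(frozen=True)
class BasicCoreComponent:
    """A BCC: a single piece of business data typed by one CCT."""
    name: str
    cct: CoreComponentType


@dataclass(frozen=True)
class Property:
    """Proposal 2: relates an ACC to a BCC it contains.

    Proposal 3: the Representation Term hangs off the Property,
    not off the Core Component Type.
    """
    property_term: str
    representation_term: str
    content: BasicCoreComponent


@dataclass
class AggregateCoreComponent:
    """An ACC, named by its object class, holding first-class properties."""
    object_class: str
    properties: List[Property] = field(default_factory=list)

    def dictionary_entry_names(self) -> List[str]:
        # Assumed naming layout: object class, property term, representation term.
        return [
            f"{self.object_class}. {p.property_term}. {p.representation_term}"
            for p in self.properties
        ]


# Hypothetical example data.
text_type = CoreComponentType("TextType")
code_type = CoreComponentType("CodeType")

address = AggregateCoreComponent(
    "Address",
    [
        Property("Street", "Text", BasicCoreComponent("Street Name", text_type)),
        Property("Country", "Code", BasicCoreComponent("Country Code", code_type)),
    ],
)

entry_names = address.dictionary_entry_names()
for entry in entry_names:
    print(entry)
```

In this sketch the property term names a concrete model element (Property), which is the point of Proposal 1: the thing being named exists in the model rather than being implied by the dictionary entry name.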
Section 5.6 lines 838-85; section 5.6.2 lines 892-94
UBL needs to clearly define the role of CCTs and RTs, since the Core Components spec fails to do this. I want to go over the history of the development of these concepts, so that we can make sure that we produce a set of definitions that are sufficient to the needs of UBL.

Originally, there were *NO* CCTs. All we had in the CC work was a set of semantic "primitives" that were called RTs. The list in the spec has not changed much since then. At one of the ebXML meetings it was realized that although some of the RTs were single bits of data, others were actually data composites. For each RT, one (and exactly one, in theory) CCT was indicated, to suggest the semantics of the properties needed to fully express the use of the RT.

As we start trying to use the spec set up in this way (UBL and others), we have been realizing that there are some significant failings in this system. One great example of this is a Location Code, a common bit of data that cannot be described as a set of enumerated values - what we traditionally and typically think of as a 'code list' - even though its business function is that of a code. Core Components did not account for this phenomenon. Because it functions as a "code", it has a semantic primitive RT of "Code" - this means that it has a "Code" CCT, which allows you to point to a code-list and supporting properties. Unfortunately, this doesn't work. For this type of "code" you need a different set of properties in your physical expression of the business data. (In the case of location code, this is a pattern, not an enumeration - a different type of simple datatype in XSD, for example.)

This is true not only of codes, but also of other kinds of data. We need to specify *both* the semantic primitive of the model (the Representation Term, traditionally) *and* the set of properties by which it is made manifest (the CCT), and there is neither a one-to-one relationship between RTs and CCTs, nor a many-to-one relationship between RTs and CCTs.
It is, by the practical dictates of business information, a many-to-many relationship, unless we substantially increase the range of our RTs and CCTs. We must either extend these lists significantly, or we must allow them to be combinatorial. We cannot know how to clearly define CCTs and RTs until we understand whether they are combinatorial or not.

There is another requirement that highlights this need, and it is something that has not been addressed so far in UBL, other than as a comment against the draft order from the LCSC. This is the absolute need for a solid description of the physical representation of data, at a finer level of detail than is currently possible. Take, for example, the degree of precision of a price. This is a fundamental kind of datatype issue, since the degree of precision in prices defines the tolerances used in the calculations for essential business processes such as Order/ASN/Invoice reconciliation (aka "book-keeping"). If we are to describe a "price" using the current system, here is what we would know about it:

RT = "Amount" (which is always a monetary amount according to the CC definitions)
CCT = "AmountType" (which gives us a number and a currency code)

The degree of precision of the price cannot be specified in the semantic model, given these capabilities. All monetary amounts are the same. But in reality, prices have a very different specificity than some other monetary amounts. This means we cannot simply assume a single precision for all monetary amounts, and make that part of our syntax binding. In other words, there is a need to capture in the model some distinction between a price and another kind of monetary amount, since they have different requirements in terms of how they are represented. Please note that, in this example, precision is *not* syntax-specific. It is a critical property of the business data itself, and can be described equally well in many different syntaxes.

SUGGESTED APPROACH AND DEFINITIONS:

I would suggest the following approach to solving these difficulties:

(1) Have a set of Representation Terms, which function as "semantic primitives," as originally intended by the CC group in ebXML. (The list may need to be altered slightly to include some missing types, but will not undergo wholesale expansion.) This indicates what the business purpose of the data is, in an abstract sense, wholly separate from how it will be represented when syntax bound.

(2) The list of CCTs should be expanded to reflect the actual needs of expressing business data in a syntax, to cover those cases where the syntax itself is not the determining factor, but rather the representation of the business data (as is the case for numeric precision of prices). Alternatively, the CCTs could have properties added to them so that the range of possible representation formats is one of the properties. This would work very neatly for something like datetimes, for example, but might become very confusing and clunky in referring to numeric formats. In either case, the range of expressive possibilities for CCTs should be expanded.
(3) Each Representation Term could be combined with some specified subset of the available CCTs, as determined by the requirements of reality. An "Identifier" RT could be a CodeType, or it could be a TextType, but we would no longer have ambiguous types like "IdentifierType". The CCT would now define an exact set of properties that would describe the actual representation of the data, rather than describing its semantic or business function. (This is essentially abstracting the idea of simple types in XSD up a level, but still referring to the actual physical representation of the data, rather than its semantics.) This would result in a significant increase in the number of CCTs (or their expressive range) and would result in the removal of some of the existing ones. The current list assumed a many-to-one relationship between RTs and CCTs that is not usable in reality. We need to re-work the list to reflect this finding.

NOTE: The names "Representation Term" and "CCT" are not particularly good, so if you want to reverse them, you could. I wanted to avoid confusion within CC, however, by not redefining them as their exact opposites. Traditionally, the RT was the semantic
primitive, and the CCT was a construct that encapsulated the property set for representing that primitive.

Proposal 4 A CCTProperty relates a Core Component Type to the (Content and Supplementary) Components it contains.

For the same reasons a property is needed to relate an ACC to the BCCs it contains, a property is needed to relate a CCT to the components it contains.

As a result of P0-P3, Figure 6. Core Components Metamodel should now be as shown in the Core Components Metamodel box in this diagram:

[Figure: the proposed Core Components Metamodel and its XML implementation. Boxes: Core Components Metamodel, XML Schema, XML Instance, XML Implementation. Core Components Metamodel elements: ACC, Property, BCC, Representation Term, CCT, CCTProperty, Content Component, Supplementary Component. XML Schema elements: TypeName, TypeDefinition, ElementDeclaration, TagName. XML Instance elements: Element (parent, child). Association labels: identifies, describes, represents, contains, implements.]

Notes in the figure: For the same reasons ACC needs a Property to relate it to its constituents, CCT needs a property to relate it to its constituents. There is a derived association (..*,..*) between RT and CCT through Property and BCC. This concept is not explicitly present in the CC technical specification. It is mentioned extensively, but never really defined.
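The derived association between RT and CCT noted in the figure can be sketched as follows. This is a minimal illustration, not part of either specification: the class names, the component names inside each CCT, and the three sample properties are hypothetical. The sample data follows the earlier suggestion that an "Identifier" RT could be a CodeType or a TextType.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple


@dataclass
class CCTProperty:
    """Proposal 4: relates a CCT to a component it contains."""
    name: str
    kind: str  # "content" or "supplementary"


@dataclass
class CoreComponentType:
    name: str
    properties: List[CCTProperty] = field(default_factory=list)


@dataclass
class BasicCoreComponent:
    name: str
    cct: CoreComponentType


@dataclass
class Property:
    property_term: str
    representation_term: str
    content: BasicCoreComponent


def derived_rt_cct_pairs(properties: Iterable[Property]) -> Set[Tuple[str, str]]:
    """Derive (RT, CCT) pairs by walking Property -> BCC -> CCT."""
    return {(p.representation_term, p.content.cct.name) for p in properties}


def is_many_to_many(pairs: Set[Tuple[str, str]]) -> bool:
    """True if some RT maps to several CCTs and some CCT to several RTs."""
    ccts_by_rt: Dict[str, Set[str]] = {}
    rts_by_cct: Dict[str, Set[str]] = {}
    for rt, cct in pairs:
        ccts_by_rt.setdefault(rt, set()).add(cct)
        rts_by_cct.setdefault(cct, set()).add(rt)
    return (any(len(v) > 1 for v in ccts_by_rt.values())
            and any(len(v) > 1 for v in rts_by_cct.values()))


# Hypothetical CCTs; the component names are placeholders.
code_type = CoreComponentType("CodeType", [
    CCTProperty("Content", "content"),
    CCTProperty("ListIdentifier", "supplementary"),
])
text_type = CoreComponentType("TextType", [
    CCTProperty("Content", "content"),
    CCTProperty("Language", "supplementary"),
])

# Hypothetical properties of some ACC.
sample_properties = [
    Property("Country", "Code", BasicCoreComponent("Country Code", code_type)),
    Property("Party", "Identifier", BasicCoreComponent("Party Identifier", code_type)),
    Property("Document", "Identifier", BasicCoreComponent("Document Identifier", text_type)),
]

pairs = derived_rt_cct_pairs(sample_properties)
print(sorted(pairs))
print(is_many_to_many(pairs))
```

Nothing in this sketch links a Representation Term directly to a CCT. The pairing only appears once properties are declared, which is how the many-to-many relationship arises without a fixed RT-to-CCT table.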
The preceding diagram shows how the proposed Core Components metamodel (with properties modeled) is syntax bound to XSD by UBL. The syntax binding process to XSD involves creating XSD complex types for ACCs and CCTs. These complex types consist of (local) element declarations, one for each property of the source ACC/CCT. The element's tag name is identical to the name of the source property (Property/CCTProperty).

Representation Terms may be modeled in XSD via various mechanisms. For example, an attribute called RepresentationTerm could be defined on elements of complex types representing ACCs and CCTs. The attribute could be given a default value selected from a distinguished list of terms, one for each Representation Term.

Once we identify and describe properties, what shall we call them? Could a set of rules around role definition satisfy our need to capture recurring component usage patterns (and name them)? Perhaps the central tenet would be:

P4: Role-based property naming: a Property's name ("property term" in the dictionary entry name) should reflect the role played by that property's content relative to the ACC in which that property is declared. Similarly, a CCTProperty's name should reflect the role played by that property's content relative to the CCT in which that property is declared.

References

[CC-UN] UN/CEFACT Draft Core Components Specification, Part, 8 February, 2002, version.8

[NAMING-ISO] ISO/IEC 79, Final committee draft, Parts -6.