Lessons learned from (teaching) ontology design in OIL

Vojtech Svatek, working document, 6 June 2002

The following summarises my recent experience of training myself and some 20 undergraduate students (from my course on Knowledge Modelling) to develop simple ontologies in OIL. The observations range from rather trivial modelling errors to (probably) inherently hard problems.

In the table below, I present the general problems, concrete examples, and the tentative solutions/explanations I gave to the students.

The target audience of this draft is primarily people involved in the SIG on Ontology Language Standards of the OntoWeb project. Since the students had a background (mainly in Business Information Systems) similar to that of prospective Semantic Web ontology developers in industry, I hope that some of the observations might help in developing a future list of FAQs on practical issues in OIL-based modelling. Such a list could extend or complement the current OIL FAQ at http://www.ontoknowledge.org/oil/oil-faq.html.

Note that I, like the students, consider myself a relative newcomer (in particular as regards the logical background of OIL), and the document has been written somewhat hastily. My "answers" to the problems should thus be taken cautiously, and definitely deserve to be amended by real experts. Any corrections or comments (to svatek@vse.cz) are heartily welcome!


No. 1
Problem: Effort to formalise all concepts from the previously built informal glossary of a broad domain. Since the time available for proper formalisation is usually limited, the resulting ontology is just a shallow taxonomy with few slots.
Example: An ontology containing the whole variety of professions of employees in a large company.
Tentative solution/explanation: Although a formal listing of the concepts of interest might be of some value, the true power of OIL lies in reasoning over interconnected classes and slots. It seems useful to formalise smaller but compact clusters of knowledge separately. Further, the informal glossary should not be replaced by the formal model: it will not be used by computers, but it can still ease mutual understanding among humans. It may hold a superset of the concepts present in the formal model.

No. 2
Problem: Inclusion, "within" the classes, of unrestricted (string- or number-valued) slots corresponding to data that might be processed in a hypothetical information system.
Example: Slots such as first_name, last_name, street_name, flat_number.
Tentative solution/explanation: An ontology is a knowledge-level model, not a data model; it should be a basis for inferencing rather than for computation. Unrestricted slots are of limited value and can be left out.

No. 3
Problem: Attributes with a small, closed set of values (even boolean) always modelled by datatype slots.
Example: Boolean slot subject_to_taxation used with class income.
Tentative solution/explanation: For important, "conceptual" properties, it is worth considering modelling the attribute as a collection of complementary subclasses (such as income_subject_to_taxation) of the given class. This makes them usable in further definitions.

No. 4
Problem: Semantically vague slots based on surface natural language.
Example: (child) has (parent); could easily become symmetric!
Tentative solution/explanation: A slot is not just a word, it is a conceptual relation...

No. 5
Problem: Improper reification of properties.
Example: Class skill-level with instances high, low.
Tentative solution/explanation: Model either as a slot or as a "real" class such as highly-skilled-worker; the decision follows no. 3.

No. 6
Problem: How to accommodate relations with arity >2 (possibly imported from Ontolingua, CML, Prolog...)?
Example: Relation between the "driver", "delivery-unit" and "vehicle" (or, more abstractly, "agent", "patient" and "means") in a delivery-service application.
Tentative solution/explanation: The relation has to be decomposed and partly reified: e.g. the driver (class) carries-out (slot) a delivery-action (class), the latter being equipped with slots for the delivery unit and the vehicle. This is, however, often not a natural modelling commitment...

No. 7
Problem: Inclusion of "demo-individuals".
Example: "John" and "Peter" as instances of person.
Tentative solution/explanation: Individuals should be included only if significant for the ontology, in particular if needed in class expressions.

No. 8
Problem: Ad hoc classes, used only as the range of a single slot.
Example: (none given)
Tentative solution/explanation: Anonymous classes (a distinctive feature of languages based on description logic) would suffice. Unfortunately, there are few realistic examples of really complex class expressions in existing demo ontologies!

No. 9
Problem: Ad hoc class definitions.
Example: Similar to this one from the demo "people" ontology: animal lover is a person who has at least 3 pets...
Tentative solution/explanation: Unclear whether this is bad or not. My standpoint is that a serious ontology should contain stable and generally agreed-upon knowledge. This should, however, not be required of student/demo ontologies.

No. 10
Problem: Attempts to model higher-order relations.
Example: A driver can_drive a vehicle iff s/he has_document driving licence, and the category_of the licence is equal to one of the values required by the type of vehicle.
Tentative solution/explanation: Probably several ways (but I have not properly tested them). We can (1) model vehicle types as classes and licence categories as individuals, and impose specific constraints on each class such as "lorry requires category A"; (2) use slot "(driver) requires_licence_to_drive (vehicle)" with subslots such as requires_licence_A_to_drive; (3) model even the vehicle types as individuals, and use relation instances instead of constraints; the last option is, however, not conceptually clean and ignores the possible hierarchy of vehicle types.
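
The decomposition suggested under no. 6 can be illustrated outside OIL. The following is only a minimal Python sketch, not OIL syntax: all class names, field names and instance labels (DeliveryAction, carried_out_by, "parcel-7" and so on) are assumptions made up for the illustration. It shows a ternary relation being turned into one reified "action" object per tuple, from which the binary carries-out link can be recovered.

```python
from dataclasses import dataclass

# All names below are illustrative assumptions, loosely following
# the delivery-service example; none of them comes from OIL itself.

@dataclass(frozen=True)
class Driver:
    name: str

@dataclass(frozen=True)
class DeliveryUnit:
    label: str

@dataclass(frozen=True)
class Vehicle:
    plate: str

@dataclass(frozen=True)
class DeliveryAction:
    """Reified ternary relation: one object per (driver, unit, vehicle) tuple."""
    carried_out_by: Driver       # reverse direction of the binary slot carries-out
    delivery_unit: DeliveryUnit  # binary slot from the action to the delivery unit
    vehicle: Vehicle             # binary slot from the action to the vehicle

def reify(triples):
    """Turn ternary tuples into DeliveryAction objects."""
    return [DeliveryAction(d, u, v) for d, u, v in triples]

def carries_out(driver, actions):
    """Binary slot driver -> delivery-action, read off the reified actions."""
    return [a for a in actions if a.carried_out_by == driver]

# Hypothetical instance data.
triples = [
    (Driver("driver-1"), DeliveryUnit("parcel-7"), Vehicle("van-A")),
    (Driver("driver-1"), DeliveryUnit("parcel-8"), Vehicle("lorry-B")),
    (Driver("driver-2"), DeliveryUnit("parcel-9"), Vehicle("van-A")),
]
actions = reify(triples)
```

Calling carries_out(Driver("driver-1"), actions) returns the two actions of that driver, each still carrying its own delivery unit and vehicle. The price is the extra DeliveryAction class, which is the "not natural" commitment noted in the table.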

Finally, a really tasty bit: in a social-benefit ontology, a student included the slot was-born (with domain person, range unspecified) and, in order to use an "advanced modelling" feature, also declared died as its inverse slot!
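
Why this is a modelling error becomes clear once the inverse of a slot is read as a relation with the two arguments of every pair swapped. The sketch below is plain Python, not OIL. The instance data are hypothetical, and since the source leaves the range of was-born unspecified, a year-valued range is assumed purely for illustration.

```python
def inverse(pairs):
    """Inverse of a binary relation: every (x, y) becomes (y, x)."""
    return {(y, x) for x, y in pairs}

# Hypothetical instances; the year-valued range is an assumption.
was_born = {("person-1", 1950), ("person-2", 1975)}

# What declaring died as the inverse of was-born actually asserts:
died_as_declared = inverse(was_born)
```

Under this reading, died relates each birth year back to the person, i.e. it states that 1950 "died" person-1. The inverse says nothing about deaths; it merely reverses the direction of was-born.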


Vojtech Svatek, updated Jun 6, 2002