Literate Documentation for XML Schema

The Certificate of Advanced Study Project of Kevin Reiss at the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign.

Friday, June 22, 2007

Next Steps

6/21 Conversation with Allen

METS-ODD Project Next Steps

  1. Examine publicly available METS Profiles for interesting constraints expressed in natural language
    1. This is not a scientific, line-by-line survey, but an informal review
    2. Don’t get bogged down trying to differentiate between syntactic and semantic constraints
    3. Any constraint that can possibly be represented in a schema can be considered syntactic
  2. Choose 4-12 interesting constraint patterns to examine in detail
    1. All profile constraints are currently expressed by natural language statements of varying length and detail.
    2. Consider what schema language could be used to express the constraint

      i. Ignore those constraints that can be represented with a DTD
      ii. Consider whether W3C XML Schema, RELAX NG, or Schematron can represent more challenging constraints
      iii. Don’t worry about trying to prove definitively that a constraint cannot be represented in a particular schema language. This is interesting but beyond the immediate scope of this project.

  3. Create a table containing an analysis of the selected constraints. Each constraint listing will include (1) a brief, identifying phrase, (2) a short description and an account of how the constraint could be represented in an existing schema language, and (3) appropriate examples taken from the public METS profiles showing how the constraint is currently expressed in natural language.
  4. Create small XML vocabularies that allow authors to represent the selected constraints in the TEI ODD literate programming format.
    1. Extend the TEI ODD vocabulary to include these vocabularies
    2. Create an example ODD fragment for each XML vocabulary
  5. Create templates that will process the ODD fragments to generate a machine-readable expression of these constraints. Experiment with two ways of doing this:
    1. XSLT
    2. Schematron
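
As a rough illustration of the last step, here is a minimal Python sketch that turns one constraint fragment into a Schematron rule. Everything about the input is an assumption made for illustration: the `profileConstraint` element, its `ident` attribute, its `desc`, `context`, and `test` children, and the checksum constraint itself are all hypothetical and are not taken from any METS profile or from the ODD vocabulary. Only the ISO Schematron and METS namespace URIs are real. The plan above calls for XSLT or Schematron templates, so Python stands in here only to show the shape of the transformation.

```python
import xml.etree.ElementTree as ET

SCH_NS = "http://purl.oclc.org/dsdl/schematron"  # ISO Schematron namespace
METS_NS = "http://www.loc.gov/METS/"             # METS namespace

# Hypothetical constraint fragment. The element and attribute names are
# invented for this sketch and are not part of TEI ODD or METS.
FRAGMENT = """
<profileConstraint ident="file-checksum">
  <desc>Every file element must carry a CHECKSUM attribute.</desc>
  <context>mets:file</context>
  <test>@CHECKSUM</test>
</profileConstraint>
"""


def to_schematron(fragment: str) -> str:
    """Build a one-rule Schematron schema from a constraint fragment."""
    spec = ET.fromstring(fragment)
    ET.register_namespace("sch", SCH_NS)

    schema = ET.Element(f"{{{SCH_NS}}}schema")
    # Declare the prefix used inside the XPath expressions.
    ET.SubElement(schema, f"{{{SCH_NS}}}ns", prefix="mets", uri=METS_NS)

    pattern = ET.SubElement(schema, f"{{{SCH_NS}}}pattern", id=spec.get("ident"))
    rule = ET.SubElement(
        pattern, f"{{{SCH_NS}}}rule", context=spec.findtext("context").strip()
    )
    # The natural language description becomes the assertion message.
    assertion = ET.SubElement(
        rule, f"{{{SCH_NS}}}assert", test=spec.findtext("test").strip()
    )
    assertion.text = spec.findtext("desc").strip()
    return ET.tostring(schema, encoding="unicode")


if __name__ == "__main__":
    print(to_schematron(FRAGMENT))
```

The point of the sketch is that the natural language statement and the machine-checkable test live in the same source fragment, so the prose and the generated rule are less likely to drift apart.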

Tuesday, June 12, 2007

Digital Humanities Poster Materials

Poster Title: Literate Documentation for XML: TEI ODD - METS
Presented at Digital Humanities 2007 on June 6, 2007 in Urbana-Champaign, Illinois.

Monday, November 13, 2006

Detailed Abstract

Detailed Project Abstract 11-15

This document contains a brief description of the project rationale and a sketch of the demonstration application I plan to complete.

Thursday, September 07, 2006

Project Notes and Outline

Detailed Project Outline (Unfinished)

The major hole in this document is the lack of instructive examples drawn from real-world document instances, schemas, and documentation for XML applications. I hope to identify some shortly to help guide the creation of the project demonstration application.

The Document Types I'm looking at are:
  1. METS
  2. Atom
  3. OAI-PMH
  4. Open Office Document Format
  5. XHTML/TEI/DocBook, in order to provide a document-centric example

I also need to develop a much fuller discussion of how this system relates to other proposals for dealing with semantics and XML.

I also need more discussion regarding the capabilities of the TEI One Document Does it All (ODD) literate programming system as a documentation tool for important semantic characteristics of XML.

Proposed Application Sketch

Use the existing TEI Tagset Documentation (Chapter 27 of the Guidelines) as the base of the application. Extend this to support the general-purpose authoring of XML schemas using the ODD format. This extension will contain the prescribed DSS elements outlined in the Project Outline.

The major component of this will be structured natural language prose that clearly describes the semantics of the markup language. This extension will also involve modifying the TEI stylesheet package to make sure that well-formatted documentation for the extensions to the ODD will be created.
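
To make the documentation side concrete, here is a minimal Python sketch of pulling the prose out of an ODD-style element specification and formatting it for a reader. It uses the `elementSpec`, `gloss`, and `desc` elements from the TEI tagset documentation, but the sample content is invented for illustration, and the plain-text output format is an assumption. The real work would be done by the TEI stylesheets, not by a script like this.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"  # TEI namespace in ElementTree notation

# Illustrative fragment. The description text is invented for this sketch.
ODD_FRAGMENT = """
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="fileGrp">
  <gloss>file group</gloss>
  <desc>Groups the files that make up one version of a digital object.</desc>
</elementSpec>
"""


def render_doc(fragment: str) -> str:
    """Format the prose of one element specification as plain text."""
    spec = ET.fromstring(fragment)
    ident = spec.get("ident")
    gloss = spec.findtext(f"{TEI}gloss", default="").strip()
    desc = spec.findtext(f"{TEI}desc", default="").strip()
    # Heading shows the element name, plus the gloss when one is given.
    heading = f"<{ident}>" + (f" ({gloss})" if gloss else "")
    return f"{heading}\n{desc}"


if __name__ == "__main__":
    print(render_doc(ODD_FRAGMENT))
```

Any new elements added to the ODD vocabulary would need an equivalent rendering rule, which is why the stylesheet changes are part of the application and not an afterthought.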

Several papers from this year's Extreme Markup Languages conference seem highly relevant to this project. I plan to give these a more thorough review. They are:

Implementing XML Schema Naming and Design Rules

A Natural Language Approach to Modeling: Why is some XML so difficult to write?

Thursday, July 13, 2006

Project Documents

Current Project Documents: