Date: Tue, 22 Oct 2002 17:31:37 +1000
From: Brian White
To: htdig-dev@lists.sourceforge.net
Subject: [htdig-dev] Patch for defaults.xml - first cut

Ok. I have the first cut of the defaults.xml patch. It is all in the
attached file - it is patched against htdig-3.2.0b4-20021013.

The remainder of this email is the README file from the tar file.

Regs
Brian White

===============================================================
Documentation attached to initial version of defaults.xml patch
===============================================================

1. Overview of what it does
====================================

 * Adds defaults.xml and defaults.dtd, plus manage_attributes.pl for
   managing access to them
 * Adds make_defaults_cc.pl for creating defaults.cc
 * Completely rewrites the cf_generate.pl that creates attrs.html,
   cf_byprog.html and cf_byname.html
 * Reduces the ConfigDefaults structure to just "name" and "value"

The version of defaults.xml as it exists in this patch is valid
against defaults.dtd.

The patch is done against htdig-3.2.0b4-20021013.

2. Affected Files
====================================

New Files:
 * htcommon/defaults.dtd
 * htcommon/manage_attributes.pl
 * htcommon/make_defaults_cc.pl

Replaced Files:
 * htcommon/defaults.xml
 * htdoc/cf_generate.pl

(Note that most of the patches are only 1 or 2 lines - the biggest is
probably about 10)

Patched Files:
 * htcommon/Makefile.am.patch
 * htdoc/Makefile.am.patch
 * htdoc/attrs_head.html.patch
 * htdoc/attrs_tail.html.patch
 * htdoc/cf_byname_head.html.patch
 * htdoc/cf_byprog_head.html.patch
 * htlib/Configuration.h.patch
 * htdb/htdb_dump.cc.patch
 * htdb/htdb_load.cc.patch
 * htdb/htdb_stat.cc.patch

Files to be removed from CVS:
 * defaults.cc

3. Creating Descriptions
====================================

The description is essentially an html snippet, with the following
differences:

 * It is limited to p, br, ol, ul, table, em, strong, code and a
   elements, with two additions:

   1) This is used to provide block code or html snippets. An example
      of this would be

        <SELECT NAME="search_algorithm">
        <OPTION VALUE="exact:1 prefix:0.6 synonyms:0.5 endings:0.1" SELECTED>fuzzy
        <OPTION VALUE="exact:1">exact
        </SELECT>

   2) This is used to link to programs, faqs and other attributes.
      Some examples are:

        build_select_lists
        htdig
        4.1

      The purpose of doing this is to allow the info to be reused,
      and to remove the dependency on html files in a particular place.

 * The only allowed attributes in the description are:

     table : border, width
     td,th : align, valign, rowspan, colspan
     dl    : compact="true"

4. A Discussion of XML Validation
====================================

Ideally the code should validate the XML against the DTD, and should
check for well-formedness. Unfortunately that requires an XML parser,
and I did not want to add an extra dependency at this stage!

What I did as a compromise was to create an API that is used to load
and then query the XML data - this API is documented and implemented
in htcommon/manage_attributes.pl. At the moment the internal data
structures are populated using standard perl pattern matching - it
*assumes* that defaults.xml is valid against defaults.dtd, but it does
not test it.

The aim is that when an XML parser is readily available, it can be
used to populate the internal data structures - and everything else
should just work!

5. The Current state of defaults.xml
====================================

The version of defaults.xml that is presented in this patch is valid
against defaults.dtd, but is desperately in need of a cleanup.
However:

 a) I don't have time to clean it up at the moment
 b) It is currently completely generated from the old defaults.cc,
    which makes it easier to adjust until its form becomes stable

What I would like to do is get the patch in place and, once it has
stabilized, embark on a bit of a cleanup.

6. Possible Issues
====================================

 * Examples are just entered as values - the "name : " prefixes are
   now automatically generated. This may seem limiting, but it is
   exactly what is in the docs at the moment.
 * There are a few remaining links to other parts of the
   documentation that I have left as a elements, simply because I
   couldn't see an obvious way to include them

-------------------------
Brian White
Step Two Designs Pty Ltd
Knowledge Management Consultancy, SGML & XML
Phone: +612-93197901
Web: http://www.steptwo.com.au/
Email: bwhite@steptwo.com.au
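
Illustrative sketch for section 4 (not part of the patch)
====================================

The load-then-query approach described in section 4 could look roughly
like the following. This is only a sketch in Python, not the Perl code
in htcommon/manage_attributes.pl. The element and attribute names
(attribute, name, default), the sample data, and the function names
load_defaults and get_default are all assumptions made for
illustration; the real layout of defaults.xml is defined by
defaults.dtd, which is not reproduced in this README. Like the
approach described above, it assumes the input is already valid and
does no validation of its own.

```python
import re

# Hypothetical sample data; the real defaults.xml layout may differ.
SAMPLE_XML = """
<attribute name="search_algorithm">
  <default>exact:1</default>
</attribute>
<attribute name="build_select_lists">
  <default></default>
</attribute>
"""

# Plain pattern matching stands in for an XML parser. These patterns
# only work if the input already follows the assumed layout.
ATTRIBUTE_RE = re.compile(
    r'<attribute\s+name="([^"]+)"[^>]*>(.*?)</attribute>', re.DOTALL)
DEFAULT_RE = re.compile(r'<default>(.*?)</default>', re.DOTALL)


def load_defaults(xml_text):
    """Populate the internal table: attribute name -> default value."""
    defaults = {}
    for name, body in ATTRIBUTE_RE.findall(xml_text):
        match = DEFAULT_RE.search(body)
        defaults[name] = match.group(1).strip() if match else ""
    return defaults


def get_default(defaults, name):
    """Query the table built by load_defaults."""
    return defaults[name]


if __name__ == "__main__":
    table = load_defaults(SAMPLE_XML)
    for name in sorted(table):
        print(name, ":", get_default(table, name))
```

Because callers only go through load_defaults and get_default, the
pattern matching inside load_defaults could later be swapped for a
real XML parser without changing anything else.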