X-Git-Url: http://gitweb.michael.orlitzky.com/?a=blobdiff_plain;f=doc%2FREADME.schemagen;h=d32075ba0083f9fe3a1bcfa59997bf66e7743ab5;hb=8047508d31d70c7a8f8050fdccaa3aa0a2038865;hp=8dbd3a91829343e512da82ff074a5df4842b7511;hpb=88fcb0daa973b6e5321b9c8ebadc943e9b41697f;p=dead%2Fhtsn-import.git

diff --git a/doc/README.schemagen b/doc/README.schemagen
index 8dbd3a9..d32075b 100644
--- a/doc/README.schemagen
+++ b/doc/README.schemagen
@@ -9,7 +9,7 @@ construct a database into which to insert the XML. How do we know if
 to know how many times it can appear. So we need some form of
 specification. And reading all of the XML files one at a time to
 count the number of s is impractical. So, we would like to generate
-the DTDs manually.
+the DTDs automatically.
 
 The process should go something like,
 
@@ -39,5 +39,11 @@ root. XML-Schema-learner will be invoked on each subfolder of
 folder.
 
 Most of the production schemas are generated this way; however, a few
-needed manual tweaking. Any hand-modified schemas can be found in the
-"schema" folder in the project root.
+needed manual tweaking. The final, believed-to-be-correct schemas for
+all supported document types can be found in the "schema" folder in
+the project root. Having the "correct" DTDs available means you
+don't need XML-Schema-learner available to install htsn-import.
+
+As explained in the man page, there is a second type of weatherxml
+document that we don't parse at the moment. An example is provided as
+schemagen/weatherxml/20143655.xml.