From: Michael Orlitzky
Date: Fri, 4 Jul 2014 07:23:07 +0000 (-0400)
Subject: Move the deployment section from README.development to the man page.
X-Git-Tag: 0.0.6~38
X-Git-Url: https://gitweb.michael.orlitzky.com/?a=commitdiff_plain;h=5e06d6a189fd5bc1cbc67a349bbee5e168d3bf24;p=dead%2Fhtsn-import.git

Move the deployment section from README.development to the man page.
Remove another entry from the TODO.
---

diff --git a/doc/README.development b/doc/README.development
index b1df698..e58b418 100644
--- a/doc/README.development
+++ b/doc/README.development
@@ -27,23 +27,3 @@ If there's an error, you'll see something like the following:
 contents: IRL - Firestone 600 - Final Results Texas Motor Sp... []
-
-
-== Creating the Database Schema (Deployment) ==
-
-When deploying for the first time, the target database will most
-likely be empty. The schema will be migrated when a new document type
-is seen, but this has a downside: it can be months before every
-supported document type has been seen once. This can make it difficult
-to test the database permissions.
-
-Since all of the test XML documents have old timestamps, one easy
-workaround is the following: simply import all of the test XML
-documents, and then delete them. This will force the migration of the
-schema, after which you can set and test the database permissions.
-
-Something as simple as,
-
-  $ find ./test/xml -iname '*.xml' | xargs htsn-import -c foo.sqlite
-
-should do it.
diff --git a/doc/TODO b/doc/TODO
index fefb5da..9b065db 100644
--- a/doc/TODO
+++ b/doc/TODO
@@ -65,14 +65,5 @@
  * WNBA_Individual_Stats_XML
  * WNBATeamScheduleXML
 
-6. Add a note about the NULL vs. empty string policy in the man page.
-
-7. Create an XmlImportFkTeams class and use it for JFile, Odds, and
-   ScheduleChanges.
-
-8. Consolidate all of the make_game_time functions.
-
-9. Document how to get an empty database set up (import test xml then
-   delete).
-
-10. Add/update dbschema diagrams for JFile, Odds, and ScheduleChanges.
+6. Consolidate all of the make_game_time functions which take a
+   date/time and produce a combined time.
diff --git a/doc/man1/htsn-import.1 b/doc/man1/htsn-import.1
index 352eb2b..7a215b1 100644
--- a/doc/man1/htsn-import.1
+++ b/doc/man1/htsn-import.1
@@ -106,6 +106,29 @@ type are provided with the \fBhtsn-import\fR documentation, in the
 should be considered a bug if they are incorrect. The diagrams are
 created using the pgModeler tool.
 
+.SH NULL POLICY
+.P
+Normally in a database one makes a distinction between fields that
+simply don't exist, and those fields that are
+\(dqempty\(dq. Translating from XML, there is a natural way to
+determine which one should be used: if an element is present in the
+XML document but its contents are empty, then an empty string should
+be inserted into the corresponding field. If on the other hand the
+element is missing entirely, the corresponding database entry should
+be NULL to indicate that fact.
+.P
+This sounds well and good, but the XML must be consistent for the
+database consumer to make any sense of what he sees. The feed XML uses
+optional and blank elements interchangeably, and without any
+discernable pattern. To propagate this pattern into the database would
+only cause confusion.
+.P
+As a result, a policy was adopted: both optional elements and elements
+whose contents can be empty will be considered nullable in the
+database. If the element is missing, the corresponding field is
+NULL. Likewise if the content is simply missing. That means there
+should never be a (completely) empty string in a database column.
+
 .SH XML SCHEMA GENERATION
 .P
 In order to parse XML, you need to know the structure of your
@@ -264,6 +287,28 @@ it would greatly complicate things. The first form is more common, so
 that's all we support for now. An example is provided as
 schemagen/weatherxml/20143655.xml.
 
+.SH DEPLOYMENT
+.P
+When deploying for the first time, the target database will most
+likely be empty. The schema will be migrated when a new document type
+is seen, but this has a downside: it can be months before every
+supported document type has been seen once. This can make it difficult
+to test the database permissions.
+.P
+Since all of the test XML documents have old timestamps, one easy
+workaround is the following: simply import all of the test XML
+documents, and then delete them using whatever script is used to prune
+old entries. This will force the migration of the schema, after which
+you can set and test the database permissions.
+.P
+Something as simple as,
+.P
+.nf
+.I $ find ./test/xml -iname '*.xml' | xargs htsn-import -c foo.sqlite
+.fi
+.P
+should do it.
+
 .SH OPTIONS
 .IP \fB\-\-backend\fR,\ \fB\-b\fR