From: Michael Orlitzky
Date: Fri, 10 Jan 2014 04:17:02 +0000 (-0500)
Subject: Add an empty man page.
X-Git-Tag: 0.0.1~111
X-Git-Url: http://gitweb.michael.orlitzky.com/?a=commitdiff_plain;h=ec6c5b56f8e3096786e8f0a0d3c5c3c1610b69f7;p=dead%2Fhtsn-import.git

Add an empty man page.
---

diff --git a/doc/man1/htsn-import.1 b/doc/man1/htsn-import.1
new file mode 100644
index 0000000..0816be3
--- /dev/null
+++ b/doc/man1/htsn-import.1
@@ -0,0 +1,33 @@
+.TH htsn-import 1
+
+.SH NAME
+htsn-import \- Import XML files from The Sports Network into an RDBMS.
+
+.SH SYNOPSIS
+
+\fBhtsn-import\fR [OPTIONS] [FILES]
+
+.SH DESCRIPTION
+
+.SH DATABASE SCHEMA
+.P
+At the top level, we have one table for each of the XML document types
+that we import. For example, the documents corresponding to
+\fInewsxml.dtd\fR will have a table called \(dqnews\(dq.
+.P
+These top-level tables will often have children. For example, each
+news item has zero or more locations associated with it. The child
+table will be named <parent>_<children>, which in this case
+corresponds to \(dqnews_locations\(dq.
+.P
+To relate the two, a third table exists with name
+<parent>__<child>. Note the two underscores. This prevents
+ambiguity when the child table itself contains underscores. As long as we
+never go more than one level down, this system should suffice. The
+table joining \(dqnews\(dq with \(dqnews_locations\(dq is thus called
+\(dqnews__news_locations\(dq.
+.P
+Wherever possible, children are kept unique to prevent pointless
+duplication. This slows down inserts, but speeds up reads (which we
+assume are much more frequent). The current rate at which the feed
+transmits XML is much too slow to cause problems inserting.
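
The naming scheme described under DATABASE SCHEMA can be sketched as follows. This is only an illustration in Python against an in-memory SQLite database, not code from htsn-import: the helpers child_table, join_table, and location_id are hypothetical, and so are all column names and the sample data.

```python
import sqlite3


def child_table(parent: str, children: str) -> str:
    """Child table name: parent, one underscore, children."""
    return f"{parent}_{children}"


def join_table(parent: str, child: str) -> str:
    """Join table name: parent, TWO underscores, full child table name."""
    return f"{parent}__{child}"


parent = "news"
child = child_table(parent, "locations")  # news_locations
join = join_table(parent, child)          # news__news_locations

conn = sqlite3.connect(":memory:")
# Column names below are assumptions made for this sketch.
conn.executescript(f"""
    CREATE TABLE {parent} (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE {child} (
        id INTEGER PRIMARY KEY,
        city TEXT NOT NULL,
        country TEXT NOT NULL,
        UNIQUE (city, country)  -- children are kept unique
    );
    CREATE TABLE {join} (
        {parent}_id INTEGER NOT NULL REFERENCES {parent} (id),
        {child}_id INTEGER NOT NULL REFERENCES {child} (id)
    );
""")


def location_id(city: str, country: str) -> int:
    """Insert the location only if it is new, then return its id."""
    conn.execute(
        f"INSERT OR IGNORE INTO {child} (city, country) VALUES (?, ?)",
        (city, country),
    )
    row = conn.execute(
        f"SELECT id FROM {child} WHERE city = ? AND country = ?",
        (city, country),
    ).fetchone()
    return row[0]


# Two news items that share a single location.
for title in ("First item", "Second item"):
    news_id = conn.execute(
        f"INSERT INTO {parent} (title) VALUES (?)", (title,)
    ).lastrowid
    conn.execute(
        f"INSERT INTO {join} VALUES (?, ?)",
        (news_id, location_id("Baltimore", "USA")),
    )

locations = conn.execute(f"SELECT COUNT(*) FROM {child}").fetchone()[0]
links = conn.execute(f"SELECT COUNT(*) FROM {join}").fetchone()[0]
print(child, join, locations, links)
```

Both news items end up pointing at the same news_locations row, so the child table holds one row while the join table holds two. The extra lookup on each insert is the cost the man page accepts in exchange for faster reads.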