X-Git-Url: http://gitweb.michael.orlitzky.com/?a=blobdiff_plain;f=doc%2Fman1%2Fhtsn-import.1;h=b0b4f9c3050ed17d0f009bfdc25469421d4430e9;hb=83902c16cf946f81ea733f707d432632aa124084;hp=66506c6febaf300dd8be82ce7e5f9c1b624627fb;hpb=65b7b8eccef710e67fa6050dd4bdaffbbea708a7;p=dead%2Fhtsn-import.git

diff --git a/doc/man1/htsn-import.1 b/doc/man1/htsn-import.1
index 66506c6..b0b4f9c 100644
--- a/doc/man1/htsn-import.1
+++ b/doc/man1/htsn-import.1
@@ -68,19 +68,27 @@ that we import. For example, the documents corresponding to
 These top-level tables will often have children. For example, each
 news item has zero or more locations associated with it. The child
 table will be named _, which in this case
-corresponsds to \(dqnews_locations\(dq.
+corresponds to \(dqnews_locations\(dq.
 .P
-To relate the two, a third table exists with name __. Note the two underscores. This prevents
-ambiguity when the child table itself contains underscores. As long we
-never go more than one level down, this system should suffice. The
-table joining \(dqnews\(dq with \(dqnews_locations\(dq is thus called
+ambiguity when the child table itself contains underscores. The table
+joining \(dqnews\(dq with \(dqnews_locations\(dq is thus called
 \(dqnews__news_locations\(dq.
 .P
-Wherever possible, children are kept unique to prevent pointless
-duplication. This slows down inserts, and speeds up reads (which we
-assume are much more frequent). The current rate at which the feed
-transmits XML is much too slow to cause problems inserting.
+Where it makes sense, children are kept unique to prevent pointless
+duplication. This slows down inserts, and speeds up reads (which are
+much more frequent). There is a tradeoff to be made, however. For a
+table with a small, fixed upper bound on the number of rows (like
+\(dqodds_casinos\(dq), there is great benefit to de-duplication. The
+total number of rows stays small, so inserts are still quick, and many
+duplicate rows are eliminated.
+.P
+But, with a table like \(dqodds_games\(dq, the number of games grows
+quickly and without bound. It is therefore more beneficial to be able
+to delete the old games (though an ON DELETE CASCADE, tied to
+\(dqodds\(dq) than it is to eliminate duplication. A table like
+\(dqnews_locations\(dq is somewhere in-between.
 .P
 UML diagrams of the resulting database schema for each XML document
 type are provided with the \fBhtsn-import\fR documentation.
@@ -101,7 +109,7 @@ Default: Sqlite
 
 .IP \fB\-\-connection-string\fR,\ \fB\-c\fR
 The connection string used for connecting to the database backend
-given by the \fB\-\-baclend\fR option. The default is appropriate for
+given by the \fB\-\-backend\fR option. The default is appropriate for
 the \fISqlite\fR backend.
 
 Default: \(dq:memory:\(dq
@@ -135,12 +143,6 @@ not work.
 
 Default: disabled
 
-.IP \fB\-\-username\fR,\ \fB\-u\fR
-Your TSN username. A username is required, so you must supply one
-either on the command line or in a configuration file.
-
-Default: none
-
 .SH CONFIGURATION FILE
 .P
 Any of the command-line options mentioned above can be specified in a