name:           htsn-import
version:        0.0.3
cabal-version:  >= 1.8
author:         Michael Orlitzky
maintainer:     Michael Orlitzky <michael@orlitzky.com>
category:       Utils
license:        GPL-3
license-file:   doc/LICENSE
build-type:     Simple
extra-source-files:
  doc/dbschema/*.png
  doc/htsn-importrc.example
  doc/man1/htsn-import.1
  doc/README.dbschema
  doc/README.schemagen
  doc/TODO
  makefile
  schema/*.dtd
  schemagen/Auto_Racing_Schedule_XML/*.xml
  schemagen/Heartbeat/*.xml
  schemagen/injuriesxml/*.xml
  schemagen/Injuries_Detail_XML/*.xml
  schemagen/newsxml/*.xml
  schemagen/Odds_XML/*.xml
  schemagen/weatherxml/*.xml
  test/shell/*.test
  test/xml/*.xml
  test/xml/*.dtd
synopsis:
  Import XML files from The Sports Network into an RDBMS.
description:
  /Usage/:
  .
  @
  htsn-import [OPTIONS] [FILES]
  @
  .
  The Sports Network <http://www.sportsnetwork.com/> offers an XML feed
  containing various sports news and statistics. Our sister program
  /htsn/ is capable of retrieving the feed and saving the individual
  XML documents contained therein. But what to do with them?
  .
  The purpose of /htsn-import/ is to take these XML documents and
  get them into something we can use: a relational database management
  system (RDBMS), loosely known as a SQL database. The structure of a
  relational database is, well, relational, and the feed XML is not. So
  there is some work to do before the data can be inserted.
  .
  First, we must parse the XML. Each supported document type (see below)
  has a full pickle/unpickle implementation (\"pickle\" is simply a
  synonym for serialize here). That means that we parse the entire
  document into a data structure, and if we pickle (serialize) that data
  structure, we get the exact same XML document that we started with.
  .
  This is important for two reasons. First, it serves as a second level
  of validation. The first validation is performed by the XML parser,
  but if that succeeds and unpickling fails, we know that something is
  fishy. Second, we don't ever want to be surprised by some new element
  or attribute showing up in the XML. The fact that we can unpickle the
  whole thing now means that we won't be surprised in the future.
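  .
  As a rough illustration only, a round-trip check written with the
  HXT pickler combinators might look like the sketch below. The
  Message type, the toPair helper, and the \"message\" element with
  its \"id\" attribute are hypothetical and are not taken from the
  htsn-import sources. The sketch checks the value-to-XML-to-value
  direction, which is the easier one to state in a few lines.
  .
  @
  import Text.XML.HXT.Core
  .
  data Message = Message Int String deriving (Eq, Show)
  .
  toPair :: Message -> (Int, String)
  toPair (Message n body) = (n, body)
  .
  xpMessage :: PU Message
  xpMessage = xpElem \"message\" (xpWrap (uncurry Message, toPair) (xpPair (xpAttr \"id\" xpInt) xpText))
  .
  roundTrips :: Message -> Bool
  roundTrips m = unpickleDoc xpMessage (pickleDoc xpMessage m) == Just m
  @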
  .
  The aforementioned feature is especially important because we
  automatically migrate the database schema every time we import a
  document. If you attempt to import a \"newsxml.dtd\" document, all
  database objects relating to the news will be created if they do not
  exist. We don't want the schema to change out from under us without
  warning, so it's important that no XML be parsed that would result in
  a different schema than we had previously. Since we can
  pickle/unpickle everything already, this should be impossible.
  .
  Examples and usage documentation are available in the man page.

executable htsn-import
  build-depends:
    base                 == 4.*,
    cmdargs              >= 0.10.6,
    configurator         == 0.2.*,
    directory            == 1.2.*,
    filepath             == 1.3.*,
    hslogger             == 1.2.*,
    htsn-common          == 0.0.1,
    hxt                  == 9.3.*,
    groundhog            == 0.4.*,
    groundhog-postgresql == 0.4.*,
    groundhog-sqlite     == 0.4.*,
    groundhog-th         == 0.4.*,
    MissingH             == 1.2.*,
    old-locale           == 1.0.*,
    tasty                == 0.7.*,
    tasty-hunit          == 0.4.*,
    time                 == 1.4.*,
    transformers         == 0.3.*,
    tuple                == 0.2.*

  main-is:
    Main.hs

  hs-source-dirs:
    src/

  other-modules:
    Backend
    CommandLine
    Configuration
    ConnectionString
    ExitCodes
    OptionalConfiguration
    TSN.Codegen
    TSN.Database
    TSN.DbImport
    TSN.Picklers
    TSN.XmlImport
    TSN.XML.Heartbeat
    TSN.XML.Injuries
    TSN.XML.InjuriesDetail
    TSN.XML.News
    TSN.XML.Odds
    TSN.XML.Weather
    Xml

  ghc-options:
    -Wall
    -fwarn-hi-shadowing
    -fwarn-missing-signatures
    -fwarn-name-shadowing
    -fwarn-orphans
    -fwarn-type-defaults
    -fwarn-tabs
    -fwarn-incomplete-record-updates
    -fwarn-monomorphism-restriction
    -fwarn-unused-do-bind
    -O2

  ghc-prof-options:
    -prof
    -fprof-auto
    -fprof-cafs
    -- The following unbreak profiling with template haskell. We have
    -- to build the program twice; once without profile and again with
    -- these flags.
    -hisuf hi_p
    -osuf o_p


test-suite testsuite
  type: exitcode-stdio-1.0
  hs-source-dirs: src test
  main-is: TestSuite.hs
  build-depends:
    base                 == 4.*,
    cmdargs              >= 0.10.6,
    configurator         == 0.2.*,
    directory            == 1.2.*,
    filepath             == 1.3.*,
    hslogger             == 1.2.*,
    htsn-common          == 0.0.1,
    hxt                  == 9.3.*,
    groundhog            == 0.4.*,
    groundhog-postgresql == 0.4.*,
    groundhog-sqlite     == 0.4.*,
    groundhog-th         == 0.4.*,
    MissingH             == 1.2.*,
    old-locale           == 1.0.*,
    tasty                == 0.7.*,
    tasty-hunit          == 0.4.*,
    time                 == 1.4.*,
    transformers         == 0.3.*,
    tuple                == 0.2.*

  -- It's not entirely clear to me why I have to reproduce all of this.
  ghc-options:
    -Wall
    -fwarn-hi-shadowing
    -fwarn-missing-signatures
    -fwarn-name-shadowing
    -fwarn-orphans
    -fwarn-type-defaults
    -fwarn-tabs
    -fwarn-incomplete-record-updates
    -fwarn-monomorphism-restriction
    -fwarn-unused-do-bind
    -O2


test-suite doctests
  type: exitcode-stdio-1.0
  hs-source-dirs: test
  main-is: Doctests.hs
  build-depends:
    base    == 4.*,
    -- Additional test dependencies.
    doctest == 0.9.*

  -- It's not entirely clear to me why I have to reproduce all of this.
  ghc-options:
    -Wall
    -fwarn-hi-shadowing
    -fwarn-missing-signatures
    -fwarn-name-shadowing
    -fwarn-orphans
    -fwarn-type-defaults
    -fwarn-tabs
    -fwarn-incomplete-record-updates
    -fwarn-monomorphism-restriction
    -fwarn-unused-do-bind
    -rtsopts
    -threaded
    -optc-O3
    -optc-march=native
    -O2


-- These won't work without shelltestrunner installed in your
-- $PATH. Maybe there is some way to tell Cabal that.
test-suite shelltests
  type: exitcode-stdio-1.0
  hs-source-dirs: test
  main-is: ShellTests.hs

  build-depends:
    base                 == 4.*,
    cmdargs              >= 0.10.6,
    configurator         == 0.2.*,
    directory            == 1.2.*,
    filepath             == 1.3.*,
    hslogger             == 1.2.*,
    htsn-common          == 0.0.1,
    hxt                  == 9.3.*,
    groundhog            == 0.4.*,
    groundhog-postgresql == 0.4.*,
    groundhog-sqlite     == 0.4.*,
    groundhog-th         == 0.4.*,
    MissingH             == 1.2.*,
    old-locale           == 1.0.*,
    process              == 1.1.*,
    tasty                == 0.7.*,
    tasty-hunit          == 0.4.*,
    time                 == 1.4.*,
    transformers         == 0.3.*,
    tuple                == 0.2.*



source-repository head
  type: git
  location: http://michael.orlitzky.com/git/htsn-import.git
  branch: master