gitweb.michael.orlitzky.com - dead/htsn-import.git/blob - htsn-import.cabal
Add extra docs to the source tarball.
name: htsn-import
version: 0.0.1
cabal-version: >= 1.8
author: Michael Orlitzky
maintainer: Michael Orlitzky <michael@orlitzky.com>
category: Utils
license: GPL-3
license-file: doc/LICENSE
build-type: Simple
extra-source-files:
  doc/dbschema/*.png
  doc/htsn-importrc.example
  doc/man1/htsn-import.1
  doc/README.dbschema
  doc/README.schemagen
  doc/TODO
  makefile
  test/xml/*.xml
  test/xml/*.dtd
  schema/*.dtd
  schemagen/Heartbeat/*.xml
  schemagen/injuriesxml/*.xml
  schemagen/Injuries_Detail_XML/*.xml
  schemagen/newsxml/*.xml
  schemagen/Odds_XML/*.xml
  schemagen/weatherxml/*.xml
synopsis:
  Import XML files from The Sports Network into an RDBMS.
description:
  /Usage/:
  .
  @
  htsn-import [OPTIONS] [FILES]
  @
  .
  The Sports Network <http://www.sportsnetwork.com/> offers an XML feed
  containing various sports news and statistics. Our sister program
  /htsn/ is capable of retrieving the feed and saving the individual
  XML documents contained therein. But what to do with them?
  .
  The purpose of /htsn-import/ is to take these XML documents and
  get them into something we can use: a relational database management
  system (RDBMS), loosely known as a SQL database. The structure of a
  relational database is, well, relational, and the feed XML is not. So
  there is some work to do before the data can be inserted.
  .
  First, we must parse the XML. Each supported document type (see below)
  has a full pickle/unpickle implementation (\"pickle\" is simply a
  synonym for serialize here). That means that we parse the entire
  document into a data structure, and if we pickle (serialize) that data
  structure, we get the exact same XML document that we started with.
  .
  This is important for two reasons. First, it serves as a second level
  of validation. The first validation is performed by the XML parser,
  but if that succeeds and unpickling fails, we know that something is
  fishy. Second, we don't ever want to be surprised by some new element
  or attribute showing up in the XML. The fact that we can unpickle the
  whole thing now means that we won't be surprised in the future.
  .
  The aforementioned feature is especially important because we
  automatically migrate the database schema every time we import a
  document. If you attempt to import a \"newsxml.dtd\" document, all
  database objects relating to the news will be created if they do not
  exist. We don't want the schema to change out from under us without
  warning, so it's important that no XML be parsed that would result in
  a different schema than we had previously. Since we can
  pickle/unpickle everything already, this should be impossible.
  .
  Examples and usage documentation are available in the man page.

executable htsn-import
  build-depends:
    base == 4.*,
    cmdargs >= 0.10.6,
    configurator == 0.2.*,
    directory == 1.2.*,
    filepath == 1.3.*,
    hslogger == 1.2.*,
    htsn-common == 0.0.1,
    hxt == 9.3.*,
    groundhog == 0.4.*,
    groundhog-postgresql == 0.4.*,
    groundhog-sqlite == 0.4.*,
    groundhog-th == 0.4.*,
    MissingH == 1.2.*,
    old-locale == 1.0.*,
    tasty == 0.7.*,
    tasty-hunit == 0.4.*,
    time == 1.4.*,
    transformers == 0.3.*,
    tuple == 0.2.*

  main-is:
    Main.hs

  hs-source-dirs:
    src/

  other-modules:
    Backend
    CommandLine
    Configuration
    ConnectionString
    ExitCodes
    OptionalConfiguration
    TSN.Codegen
    TSN.Database
    TSN.DbImport
    TSN.Picklers
    TSN.XmlImport
    TSN.XML.Heartbeat
    TSN.XML.Injuries
    TSN.XML.InjuriesDetail
    TSN.XML.News
    TSN.XML.Odds
    TSN.XML.Weather
    Xml

  ghc-options:
    -Wall
    -fwarn-hi-shadowing
    -fwarn-missing-signatures
    -fwarn-name-shadowing
    -fwarn-orphans
    -fwarn-type-defaults
    -fwarn-tabs
    -fwarn-incomplete-record-updates
    -fwarn-monomorphism-restriction
    -fwarn-unused-do-bind
    -rtsopts
    -threaded
    -optc-O3
    -optc-march=native
    -O2

  ghc-prof-options:
    -prof
    -fprof-auto
    -fprof-cafs
    -- The following unbreak profiling with template haskell. We have
    -- to build the program twice; once without profile and again with
    -- these flags.
    -hisuf hi_p
    -osuf o_p


test-suite testsuite
  type: exitcode-stdio-1.0
  hs-source-dirs: src test
  main-is: TestSuite.hs
  build-depends:
    base == 4.*,
    cmdargs >= 0.10.6,
    configurator == 0.2.*,
    directory == 1.2.*,
    filepath == 1.3.*,
    hslogger == 1.2.*,
    htsn-common == 0.0.1,
    hxt == 9.3.*,
    groundhog == 0.4.*,
    groundhog-postgresql == 0.4.*,
    groundhog-sqlite == 0.4.*,
    groundhog-th == 0.4.*,
    MissingH == 1.2.*,
    old-locale == 1.0.*,
    tasty == 0.7.*,
    tasty-hunit == 0.4.*,
    time == 1.4.*,
    transformers == 0.3.*,
    tuple == 0.2.*

  -- It's not entirely clear to me why I have to reproduce all of this.
  ghc-options:
    -Wall
    -fwarn-hi-shadowing
    -fwarn-missing-signatures
    -fwarn-name-shadowing
    -fwarn-orphans
    -fwarn-type-defaults
    -fwarn-tabs
    -fwarn-incomplete-record-updates
    -fwarn-monomorphism-restriction
    -fwarn-unused-do-bind
    -rtsopts
    -threaded
    -optc-O3
    -optc-march=native
    -O2


test-suite doctests
  type: exitcode-stdio-1.0
  hs-source-dirs: test
  main-is: Doctests.hs
  build-depends:
    base == 4.*,
    -- Additional test dependencies.
    doctest == 0.9.*

  -- It's not entirely clear to me why I have to reproduce all of this.
  ghc-options:
    -Wall
    -fwarn-hi-shadowing
    -fwarn-missing-signatures
    -fwarn-name-shadowing
    -fwarn-orphans
    -fwarn-type-defaults
    -fwarn-tabs
    -fwarn-incomplete-record-updates
    -fwarn-monomorphism-restriction
    -fwarn-unused-do-bind
    -rtsopts
    -threaded
    -optc-O3
    -optc-march=native
    -O2


source-repository head
  type: git
  location: http://michael.orlitzky.com/git/htsn-import.git
  branch: master