X-Git-Url: http://gitweb.michael.orlitzky.com/?a=blobdiff_plain;f=doc%2Fman1%2Fhtsn-import.1;h=07e8c32240a8710bdabf5c60bb2f1223bccde95e;hb=7f1806d4303f413a434c7f53cc9533e30a2ffa2d;hp=7a215b142c420e2931045a895daa1d3ebba974c6;hpb=5e06d6a189fd5bc1cbc67a349bbee5e168d3bf24;p=dead%2Fhtsn-import.git

diff --git a/doc/man1/htsn-import.1 b/doc/man1/htsn-import.1
index 7a215b1..07e8c32 100644
--- a/doc/man1/htsn-import.1
+++ b/doc/man1/htsn-import.1
@@ -268,6 +268,21 @@ construct the DTDs ourselves, the results are sometimes inconsistent. Here
 we document a few of them.
 
 .IP \[bu] 2
+\fInewsxml.dtd\fR
+
+The TSN DTD for news (and almost all XML on the wire) suggests that
+there is exactly one (possibly-empty) element present in each
+message. However, we have seen an example (XML_File_ID 21232353) where
+an empty followed a non-empty one:
+
+.fi
+Odd Man Rush: Snow under pressure to improve Isles quickly
+
+.nf
+
+We don't parse this case at the moment.
+
+.IP \[bu]
 \fIOdds_XML.dtd\fR
 
 The elements here are supposed to be associated with a set of
@@ -285,7 +300,19 @@ There appear to be two types of weather documents; the first has
 contained within . While it would be possible to parse both,
 it would greatly complicate things. The first form is more common, so
 that's all we support for now. An example is provided as
-schemagen/weatherxml/20143655.xml.
+test/xml/weatherxml-type2.xml.
+
+We are however able to identify the second type. When one is
+encountered, an informational message (that it is unsupported) will be
+printed. If the \fI\-\-remove\fR flag is used, the file will be
+deleted. This prevents documents that we know we can't import from
+building up.
+
+Another problem that comes up occasionally is that the home and away
+team elements appear in the reverse order. As in the other case, we
+report these as unsupported and then \(dqsucceed\(dq so that the
+offending document can be removed if desired. An example is provided
+as test/xml/weatherxml-backwards-teams.xml.
 
 .SH DEPLOYMENT
 .P