From: Michael Orlitzky
Date: Wed, 30 Jul 2014 08:26:15 +0000 (-0400)
Subject: Make the game_info schedule_id optional and add a test case for it.
X-Git-Tag: 0.1.1~8
X-Git-Url: http://gitweb.michael.orlitzky.com/?p=dead%2Fhtsn-import.git;a=commitdiff_plain;h=b0fac40d71bca72312293eb33c33c0f0933d0a28

Make the game_info schedule_id optional and add a test case for it.
---

diff --git a/src/TSN/Parse.hs b/src/TSN/Parse.hs
index 9fe5125..d707f5f 100644
--- a/src/TSN/Parse.hs
+++ b/src/TSN/Parse.hs
@@ -113,6 +113,30 @@ parse_message_int child xmltree =
+-- | Parse an optional 'Int' from a direct descendent of the
+-- (top-level) \ element in an XmlTree. This is just like
+-- 'parse_message_int', except we expect the element/value to be
+-- missing sometimes.
+--
+-- To handle the fact that the element/value is optional, we pattern
+-- match on the 'ParseError' that comes back in case of failure. If
+-- we didn't find anything, we turn that into a \"successful
+-- nothing\". But if we find a value and it can't be parsed, we let
+-- the error propagate, because that shouldn't happen. Of course, if
+-- the parse worked, that's nice too: we wrap the parsed value in a
+-- 'Just' and return that wrapped in a 'Right'
+--
+parse_message_int_optional :: String
+                           -> XmlTree
+                           -> Either ParseError (Maybe Int)
+parse_message_int_optional child xmltree =
+  case (parse_message_int child xmltree) of
+    Left (ParseNotFound _)     -> Right Nothing
+    Left pm@(ParseMismatch {}) -> Left pm
+    Right whatever             -> Right (Just whatever)
+
+
+
 -- | Extract the \"XML_File_ID\" element from a document. If we fail
 -- to parse an XML_File_ID, we return an appropriate 'ParseError'
 -- wrapped in a 'Left' constructor. The reason should be one of two
@@ -135,53 +159,21 @@ parse_xmlfid = parse_message_int "XML_File_ID"
 -- | Extract the \ element from within the top-level
 -- \ of a document. These appear in the "TSN.XML.GameInfo"
--- documents. Unlike the \ and \
--- elements, the \ can be missing from GameInfo
--- documents. So even the 'Right' value of the 'Either' can be
--- \"missing\". There are two reasons that the parse might fail.
---
--- 1. No such elements were found. This is expected sometimes, and
--- should be returned as a 'Right' 'Nothing'.
---
--- 2. An element was found, but it could not be read into an
--- 'Int'. This is NOT expected, and will be returned as a
--- 'ParseError', wrapped in a 'Left'.
---
--- Most of implementation for this ('parse_message_int') is shared,
--- but to handle the fact that game_id is optional, we pattern match
--- on the 'ParseError' that comes back in case of failure. If we
--- didn't find any game_id elements, we turn that into a
--- \"successful nothing\". But if we find a game_id and it can't be
--- parsed, we let the error propagate, because that shouldn't
--- happen. Of course, if the parse worked, that's nice too: we wrap
--- the parsed value in a 'Just' and return that wrapped in a 'Right'
+-- documents. Unlike the \ elements, the \
+-- can be missing from GameInfo documents, so for our implementation
+-- we use 'parse_message_int_optional' instead.
 --
 parse_game_id :: XmlTree -> Either ParseError (Maybe Int)
-parse_game_id xml =
-  case (parse_message_int "game_id" xml) of
-    Left (ParseNotFound _)     -> Right Nothing
-    Left pm@(ParseMismatch {}) -> Left pm
-    Right whatever             -> Right (Just whatever)
+parse_game_id = parse_message_int_optional "game_id"
 -- | Extract the \ element from within the top-level
--- \ of a document. These appear in the
--- "TSN.XML.GameInfo" documents. If we fail to parse a schedule_id,
--- we return the reason wrapped in an appropriate 'ParseError'. The reason
--- should be one of two things:
---
--- 1. No such elements were found.
---
--- 2. An element was found, but it could not be read
--- into an Int.
---
--- Both of these are truly errors in the case of schedule_id. The
--- implementation for this ('parse_message_int') is shared among a
--- few functions.
+-- \ of a document. Identical to 'parse_game_id' except
+-- for the element name.
 --
-parse_schedule_id :: XmlTree -> Either ParseError Int
-parse_schedule_id = parse_message_int "schedule_id"
+parse_schedule_id :: XmlTree -> Either ParseError (Maybe Int)
+parse_schedule_id = parse_message_int_optional "schedule_id"
@@ -262,6 +254,7 @@ parse_tests =
     "TSN.Parse tests"
     [ test_parse_game_id,
       test_parse_missing_game_id,
+      test_parse_missing_schedule_id,
       test_parse_schedule_id,
       test_parse_xmlfid ]
   where
@@ -312,3 +305,15 @@ test_parse_missing_game_id =
     let actual = parse_game_id xmltree
     let expected = Right Nothing
     actual @?= expected
+
+
+-- | The schedule_id element can be missing, so we test that too.
+--
+test_parse_missing_schedule_id :: TestTree
+test_parse_missing_schedule_id =
+  testCase "missing schedule_id is not an error" $ do
+    let path = "test/xml/gameinfo/recapxml-no-game-schedule-ids.xml"
+    xmltree <- unsafe_read_document path
+    let actual = parse_schedule_id xmltree
+    let expected = Right Nothing
+    actual @?= expected
diff --git a/src/TSN/XML/GameInfo.hs b/src/TSN/XML/GameInfo.hs
index 2830295..d165c19 100644
--- a/src/TSN/XML/GameInfo.hs
+++ b/src/TSN/XML/GameInfo.hs
@@ -102,8 +102,9 @@ data GameInfo =
                             -- They provide foreign keys into any tables storing
                             -- games with their IDs.
-    schedule_id :: Int, -- ^ Required foreign key into any table storing a
-                        -- schedule along with its ID.
+    schedule_id :: Maybe Int, -- ^ Optional key into any table storing a
+                              -- schedule along with its ID. We've noticed
+                              -- them missing in e.g. recapxml.dtd documents.
     time_stamp :: UTCTime,
     xml :: String }
   deriving (Eq, Show)
@@ -190,7 +191,7 @@ test_accessors = testCase "we can access a parsed game_info" $ do
   let a4 = game_id t
   let ex4 = Just 39978
   let a5 = schedule_id t
-  let ex5 = 39978
+  let ex5 = Just 39978
   let a6 = take 9 (xml t)
   let ex6 = ""
   let actual = (a1,a2,a3,a4,a5,a6)
diff --git a/test/xml/gameinfo/recapxml-no-game-schedule-ids.xml b/test/xml/gameinfo/recapxml-no-game-schedule-ids.xml
new file mode 100644
index 0000000..07037e0
--- /dev/null
+++ b/test/xml/gameinfo/recapxml-no-game-schedule-ids.xml
@@ -0,0 +1 @@
+ 21246048 AAD;RECAP-SEA-TAM-P1 Recaps MLB American League Game Summary - Seattle at Tampa Bay Seattle

St. Petersburg, FL (SportsNetwork.com) - Endy Chavez's two-out single broke a scoreless tie and started a five-run ninth inning that helped the Seattle Mariners beat the Tampa Bay Rays, 5-0, in the rubber match of a three-game series at Tropicana Field on Sunday.

Brad Miller smacked a two-out triple into the right-field corner on an 0-2 slider from Grant Balfour (0-2). After a Willie Bloomquist walk, Chavez's sharp grounder on another 0-2 pitch went under the glove of diving shortstop Yunel Escobar. James Jones then tripled over the head of right fielder Kevin Kiermaier, who was playing shallow, to make it a 3-0 advantage.

Robinson Cano was walked on four pitches before Kyle Seager's two-run double to right increased the cushion to five for the Mariners, winners in seven of their last eight contests.

Yoervis Medina (3-1) walked one in the eighth inning to earn the win after taking over for Seattle starter Felix Hernandez. King Felix struck out a career-high 15 batters and allowed just four hits in seven innings in a no- decision.

June 8, 2014, at 05:00 PM ET
\ No newline at end of file
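
Note on the optional-parse pattern: the behaviour described in the comment for 'parse_message_int_optional' can be tried outside the htsn-import tree with a minimal sketch like the one below. Everything apart from the shape of the case expression is an assumption made for illustration. The 'ParseError' type here is a hypothetical stand-in (only its two constructor names appear in the patch, and their fields are guessed), 'Doc' is a hypothetical association list standing in for XmlTree, and 'parseInt' / 'parseIntOptional' are illustrative names, not the project's functions.

```haskell
module Main where

import Text.Read (readMaybe)

-- | Hypothetical stand-in for the project's 'ParseError'. Only the
--   constructor names come from the patch; the fields are assumed.
data ParseError
  = ParseNotFound String          -- ^ name of the element we looked for
  | ParseMismatch String String   -- ^ element name and the unreadable text
  deriving (Eq, Show)

-- | Hypothetical stand-in for an XmlTree: element names paired with
--   their text content.
type Doc = [(String, String)]

-- | Required parse: a missing element and an unreadable value are both
--   reported as errors.
parseInt :: String -> Doc -> Either ParseError Int
parseInt child doc =
  case lookup child doc of
    Nothing  -> Left (ParseNotFound child)
    Just raw -> maybe (Left (ParseMismatch child raw)) Right (readMaybe raw)

-- | Optional parse: a missing element becomes a successful 'Nothing',
--   while an unreadable value is still propagated as an error.
parseIntOptional :: String -> Doc -> Either ParseError (Maybe Int)
parseIntOptional child doc =
  case parseInt child doc of
    Left (ParseNotFound _)     -> Right Nothing
    Left pm@(ParseMismatch {}) -> Left pm
    Right value                -> Right (Just value)

main :: IO ()
main = do
  print (parseIntOptional "schedule_id" [("schedule_id", "39978")])
  print (parseIntOptional "schedule_id" [])
  print (parseIntOptional "schedule_id" [("schedule_id", "abc")])
```

Run as written, this prints `Right (Just 39978)`, `Right Nothing`, and `Left (ParseMismatch "schedule_id" "abc")`, one per line. Matching on each constructor separately, rather than collapsing every 'Left' to 'Nothing', is what keeps a malformed value visible as an error while still tolerating an absent element.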