From: Michael Orlitzky
Date: Sat, 3 Oct 2009 19:20:58 +0000 (-0400)
Subject: Added some possible optimizations to the project overview.
X-Git-Url: https://gitweb.michael.orlitzky.com/?a=commitdiff_plain;h=5bdd6e837232451fe5ae6dd9f726763587eb631e;p=dead%2Fcensus-tools.git
Added some possible optimizations to the project overview.
---
diff --git a/doc/project_overview/index.xhtml b/doc/project_overview/index.xhtml
index 051e567..3201609 100644
--- a/doc/project_overview/index.xhtml
+++ b/doc/project_overview/index.xhtml
@@ -269,5 +269,79 @@
States.
+ Possible Optimizations
+
+ There are a number of possible optimizations that can be made
+ should performance ever become prohibitive. To date, these have
+ been eschewed for lack of flexibility and/or development time.
+
+
+ De-normalization of TIGER/SF1 Block Data
+
+ Currently, the TIGER/Line block data is stored in a separate table
+ from the Summary File 1 block data. The two are combined at query
+ time via SQL
+ JOINs. Since we import the TIGER data first, and use a custom
+ import script for SF1, we could de-normalize
+ this design to increase query speed.
+
+
+
+ This would slow down the SF1 import, of course; but the import
+ only needs to be performed once. The procedure would look like the
+ following:
+
+
+
+ -
+ Add the SF1 columns to the TIGER table, allowing them to be
+ nullable initially (since they will all be NULL at first).
+
+
+ -
+ Within the SF1 import, we would,
+
+ - Parse a block
+ -
+ Use that block's blkidfp00 to find the corresponding row in
+ the TIGER table.
+
+ -
+ Update the TIGER row with the values from SF1.
+
+
+
+
+ -
+ Optionally set the SF1 columns to NOT NULL. This may have
+ some performance benefit, but I wouldn't count on it.
+
+
+ -
+ Fix all the SQL queries to use the schema.
+
+
+
+
+ Switch from GiST to GIN Indexes
+
+ When the TIGER data is imported via shp2pgsql, a GiST
+ index is added to the geometry column by means of the
+ -I flag. This improves the performance of the
+ population calculations by a (wildly-estimates) order of
+ magnitude.
+
+
+
+ Postgres, however, offers another type of similar index —
+ the GIN
+ Index. If performance degrades beyond what is acceptable, it
+ may be worth evaluating the benefit of a GIN index versus the GiST
+ one.
+
+