Tuesday, May 19, 2015

Where Does Land Value Live?

VALUE DIG HISTORY -- Back in 2005, I had a real estate investment client who asked this rough question:

  "Hey, where does *REAL* land value 'LIVE' in Kendall County Texas?"

With the added conditional:

  "And don't give me those appraisal 'compares' -- they are just made up by some guy at tax office."

The actual question was a bit more complex -- yet basically boiled down to the above question and conditional.

KEY PURSUIT -- This "real" value question drills down almost to the core mission of this blog: "How do we know what we think we know?"  This posting will tickle the edges for key insights.

Mission Probable: 
Buying the Desired Answer?

SALES vs TAX APPRAISAL -- It is true that most property tax appraisal districts "make up" some -- but not all -- of the property values.  Recent and past "actual" sales are also used to help determine changes in property "value."  The key difference -- and this was buried in my client's question -- is how to DISTINGUISH "real" sales values from "made up" values used for tax appraisal purposes.  Was there a method to pull these different kinds of value apart for examination?

'MARK TO MARKET' PROBLEM -- Sales Trends vs Long Term Holdings and "Pollution" of values. 

To be more fair to tax appraisal districts -- consider the long term value trend identification issue: Some properties remain in the same hands for 10 / 20 / 40 years and never manifest a "real market price" via a realized market sale.  Yet in order to collect "fair and balanced" tax values -- tax appraisal districts must "guess" the property value for taxes -- roughly every year.

While the guess (estimation) process is heavily vetted and subject to many rules -- and in most tax jurisdictions attempts to be "fair" and "reasonable" -- the estimation process can be biased -- either accidentally -- thru blunders or poor attention to detail -- or for more mischievous reasons (revenue / political influence).  The "Vet" and "Zen" of "real" property value versus price skullduggery can be tricky to identify when just looking at one to 10 lots in the field.  Sometimes larger patterns can be "seen" when a few 1000 parcels are compared -- via one possible approach -- outlined below.

Skullduggery?  Yes, who can "out-guess" the official "guesser" for a property "value?"   Whether the "guesser" is "official" or "consultant" -- the raw tax data routinely fails to identify the guesser's orientation -- and the bias factors will change from year-to-year -- and decade-to-decade.  A "price bias" -- either high or low -- can remain in the data for many years.

LAST SALE OVER-RIDE -- Some county GIS tax databases *do* flag "real" sales vs some other valuation process -- and will sometimes report a "last sale" date -- presumably to trigger some "superior" valuation process -- like "Hey, this tax value is too old -- we need an updated estimate for next year's appraisal."   

However -- not all county tax GIS-databases retain "last sale" information -- and if retained -- there are no guarantees the "last sale" dates are realistic or were maintained in a careful manner.  Plus an adjoiner sale value -- *NOW* -- will often influence or over-ride any past considerations.  Experience with tax appraisal databases from AK / PA / LA / ND / NM / TX / and others -- indicates "last sale" maintenance -- and tax value significance -- varies greatly from county-to-county -- and state-to-state (much seems to depend on local tax jurisdiction policies, "personality," data absorption and appraisal skills).

UPSHOT?  REAL GOAL? -- My client wanted to DIFFERENTIATE a "real sale price" from systemic valuation bias (i.e. county-wide guesses) in the reported property values.   My client wanted the "data behind the data" to aid his investment search.

RAW DATA and QUICKLOOK -- GIS data was obtained from a public source for Kendall County, Texas (KCAD Data).  The data was "fresh" and up to date -- in the sense that a very dedicated and skilled GIS camper had brought the KCAD tax data -- and the parcel data -- up to a good digital standard.

Review of the data -- and "testing" on an "example" sub-division -- produced an interesting 1st cut surprise.  Maps and details below to anchor the eyeballs and brain for "Alamo Springs":

2004 Kendall County TX Parcel Map
and Location of Alamo Springs in Green
(click for larger)

And a Log-Log Plot of the raw ACRES (x-axis) -- versus -- "Sale or Tax Value" (y-axis) yielded an interesting scatter plot pattern:

Log-Log Value Chart
Acres (x) vs Sale or Tax Value (y)
Green Dots: Alamo Springs Parcels
(click for larger)

Dot Value-vs-Color Legend

1ST BRUSH SIGNIFICANCE -- The clustering of Alamo Springs sub-division parcels (green dots) along the X-axis was no surprise -- in year 2004 -- the sub-division "lots" were roughly between 2 and 15 acres.  

The clustering in dollar value -- along the Y-axis -- was more interesting -- especially when "compared" to other parcel valuations across the county (blue to red dots):  The lot values were scatter clustered between $10k and $100k -- and were "stabbed" by a lower-left to upper-right linear clustering line pattern.  
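
For readers who want to poke at this style of chart themselves -- a minimal Python sketch is below.  It assumes the KCAD parcel attributes have been exported to a flat CSV with hypothetical column names ACRES, VALUE and SUBDIV -- the real DBF schema will differ -- so treat it as a recipe, not the original workflow.

# --- Minimal sketch: log-log scatter of parcel acres vs value ---
# --- Hypothetical CSV export of the KCAD attribute table with columns
# --- ACRES, VALUE, SUBDIV -- adjust names to match the real DBF schema.
import csv
import matplotlib.pyplot as plt

acres, values, colors = [], [], []
with open("kendall_parcels_2004.csv") as f:        # hypothetical export file
    for row in csv.DictReader(f):
        a, v = float(row["ACRES"]), float(row["VALUE"])
        if a <= 0 or v <= 0:                       # log axes cannot take zeros
            continue
        acres.append(a)
        values.append(v)
        colors.append("green" if row["SUBDIV"] == "ALAMO SPRINGS" else "blue")

plt.scatter(acres, values, c=colors, s=4)
plt.xscale("log")
plt.yscale("log")
plt.xlabel("Acres")
plt.ylabel("Sale or Tax Value (dollars)")
plt.show()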

This linear clustering pattern was found -- via stats and regression analysis -- to generally be along a trend line of approx $7000 per acre.  Rough descriptive trend equation was found to be of the form:

  Value (dollars) = $7000 x Acres^(0.98)

Where the exponent of Acres -- the value "0.98" -- is very close to "1.00" -- and an exponent of 1.0 means the value scales directly with acreage -- so the $7000 coefficient reads as a roughly constant "dollars per acre" price.
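
For the curious -- a minimal sketch of that trend fit: ordinary least squares on the log-log data, reusing the hypothetical acres / values lists from the plotting sketch above.  The printed equation is whatever the fit returns for your own data -- the 2004 Kendall numbers quoted above are from the original analysis, not from this sketch.

# --- Minimal sketch: fit the power-law trend  Value = k * Acres^b ---
import numpy as np

log_a = np.log10(acres)                  # acres / values from the sketch above
log_v = np.log10(values)
b, log_k = np.polyfit(log_a, log_v, 1)   # slope = exponent b, intercept = log10(k)
k = 10.0 ** log_k

print("Value (dollars) ~= %.0f * Acres^%.2f" % (k, b))
# The posting's 2004 Kendall County fit landed near:  7000 * Acres^0.98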

Other trend lines -- above and below this one -- were also evident -- and numerically explored.  These trend line scatter patterns suggested some "hidden hand" in the property market was "setting" land value vs acres in some deliberate and mathematically deterministic way.  My client wanted to know:  Was this trend line from tax appraisal estimated values?  Or a "real" market "signal" in sale prices?  A trend that could be leveraged for better "buy in" decisions.

DOTS with ID -- Each of the Log-Log scatter plot dots were "addressable" -- and each dot had a "link back" to the original parcel data attributes -- parcel ID number, landowner name, last sale date, land acres, improvements, etc.

VALUE SPACE QUERY FOR REAL SPACE LOCATION -- In short -- and this is the key point of this posting -- dots on the Log-Log value chart could be interrogated in Log-Log space -- and the significance of their map geo-location -- if any -- could be determined.  Basically a rapid technique for flipping back and forth between "comparables" in land value space and their map locations -- and noting any significant acre-vs-value patterns.
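
A minimal sketch of that "flip back and forth" idea -- a rectangular "blob" selection in acres-vs-dollars space that returns parcel IDs for mapping.  The field names (PARCEL_ID / ACRES / VALUE) and the selection box numbers are hypothetical placeholders -- not the actual queries behind the charts below.

# --- Minimal sketch: "blob query" in value space -> parcel IDs for the map ---
import csv

def blob_query(parcels, acre_range, value_range):
    """Return the parcel IDs whose (acres, value) fall inside the selection box."""
    a_lo, a_hi = acre_range
    v_lo, v_hi = value_range
    return [p["PARCEL_ID"] for p in parcels
            if a_lo <= float(p["ACRES"]) <= a_hi
            and v_lo <= float(p["VALUE"]) <= v_hi]

with open("kendall_parcels_2004.csv") as f:        # hypothetical export file
    parcels = list(csv.DictReader(f))

# Example: a "small acres, medium value" style selection box
town_lots = blob_query(parcels, acre_range=(0.1, 2.0), value_range=(50e3, 300e3))
print("%d parcels selected" % len(town_lots))
# Feed the ID list back to the GIS as a selection / definition query
# to "light up" the matching parcels on the county map.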

1st TEST -- SMALL ACRES, MEDIUM VALUE Compares -- My client wanted to examine small acreages -- usually highly developed and in urbanized "towns" -- versus -- the much more ranch-land rural parts of Kendall County.  More surprises -- a Value-vs-Acres selection -- in the Log-Log chart space -- like thus:


Chart: "Value" Selection in Dollars 
vs Acres in Log-Log Scatter Cluster Space
(click for larger)


MANIFEST IN THE "MAP SPACE" -- This blob selection "Lit Up" the Kendall county parcel map -- in yellow -- thus:


Kendall County Map: Parcel Selection by
1st "Blob Query" in Log-Log Value Space
(click for larger)

INTERESTING POINT -- The Log-Log selection blob "found" parcels -- on the map -- that were generally in urban and "developed" areas.  Lots located in the towns of Boerne and Comfort, Texas.

2nd TEST -- LARGER ACRES, MEDIUM VALUE Compares -- Searching for a "Bargain" property.  A second "Blob" selection in the Log-Log value space "found" Kendall parcels in very different map locations:


Chart: "Value" Selection in Dollars 
vs Acres in Log-Log Scatter Cluster Space
(click for larger)


IN THE "MAP SPACE" -- This 2nd blob selection "Lit Up" the Kendall county parcel map -- in yellow -- for larger sub-division areas -- outside the "town" areas:


Kendall County Map: Parcel Selection by
2nd "Blob Query" in Log-Log Value Space
(click for larger)

MYSTERY COMPARES -- Significant Outliers -- In the Log-Log value space plots:  There were other, more mysterious "offset" points -- out near the edges.  The best example called out in the charts:  the Benedictine Sisters Orphanage located on the former Kronkosky property -- a significant outlier with a very large valuation:  Approx $19.6 million for some 40 acres -- $488,400 per acre in 2004.  This large valuation would make NO fiscal sense for a private landowner -- as the taxes would be far out of proportion to comparable properties.  However -- since the orphanage is a charity -- and probably tax exempt -- there would be no tax bill.

Negotiation Leverage?  One possible interpretation for this significant outlier?  Sharing this example with various appraisal experts -- the speculation?  This large valuation is used to self-report and to establish a trend:  In the event the orphanage is forced to sell -- via eminent domain or a related taking for public space, road or utility corridor -- or chooses to sub-divide and sell a lot or two -- the orphanage negotiators can begin with a very high price history -- and appear price flexible -- as the negotiations unfold.

SUMMARY -- The above Log-Log value space analysis has been repeated for Bexar County, Texas -- a county with approx 600,000 parcels -- with some surprisingly similar results -- and some unique differences.  Because Bexar County is "more urban" -- and more "mixed use" -- than Kendall County -- the value space chart patterns are very different -- and exhibit many "localized" value personalities.  Stay tuned for a future example.

Monday, May 18, 2015

Enumerating ESRI Shapefile DBF Attributes

CHALLENGE -- Mystery, cold-boot ingest and analysis of ESRI shapefiles often requires analysis and reporting tools that fail to exist in the ArcGIS-ArcMAP GUI interface.  Or are "buried" under a really obscure workflow that is tricky to repeat -- over and over again -- with dozens of mouse clicks.  Lots of room for repeat failure.


SOLUTION -- Simple Python command line enumeration of ESRI Shapefile DBF attributes -- reporting column name, column length and data type -- the DBF "schema" -- inside a "naked" Win32 command line environment.  Suitable for Cut-n-Paste reporting and analysis.




APPROACH -- Python Script -- heavily modified from ESRI web site hints and kinks -- listed below.

# =====================================================================

#  Abstract:  Script to Enumerate ESRI Shapefile Attributes 
#             for Analysis and Reporting
#
#  Revised:   2010-11-18 -- New create from ESRI web hints 
#                           and examples

#  Version:   Update for ArcGIS 10.x and ArcPY 
#             "GeoProcessing" imports

#  Contact:   R.Marten
#             Q7 GeoComputing
#             24165 IH-10 West #217
#             SATX 78257
#             Email: Q7GeoComputing_at_WarpMail_dot_Net

#  From:
#   http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#\
#              //000v0000001p000000

# =====================================================================


# import arcgisscripting, os, sys   # -- ArcGIS 9.x imports
# gp = arcgisscripting.create(9.3)  # -- ArcGIS 9.x instantiation

import arcpy, sys, os   # -- ArcGIS 10.x imports

# For each field in the feature class (shapefile or GDB), 
#  print the field name, type, and length.
#
# ----- GDB data store Approach ----
# fieldList = arcpy.ListFields("C:/Data/Municipal.gdb/Hospitals")
#
# ----- Shapefile data store Approach ----
# fieldList = arcpy.ListFields("c:\gis\usgs\usgs_tx_24k_centroids_wgs84.shp")

# --- Check the number of command line arguments.
if len(sys.argv) == 1:
  print "Script requires a file argument -- feature class (Shp or GDB) "
  print "Try Local Dir: %s my_file.shp" % ( sys.argv[0] )
  print "Try Full Path: %s c:\gis\usgs\usgs_tx_24k_centroids_wgs84.shp " % ( sys.argv[0] )
else:
  shp_file = sys.argv[1]
  # fieldList = gp.ListFields( shp_file ) # --- version 9.x and earlier
  fieldList = arcpy.ListFields( shp_file )
  print "Field Name  Len  Name "
  print "==========  ===  ==============================="
  for field in fieldList:
    print("{0:10}  {1:3}  {2:8}"
      .format(field.name, field.length, field.type ))  

# === end script ===

USE & PROCEDURE STEPS -- Rough invocation steps are:

(1) Cut and paste the above script into a simple ASCII text file -- save it with a name like "z_Enum_Shp_Fields_v10.py" -- in a "safe" and easy to find folder location.  Suggest "c:\Downloads\ArcPY_Examples"

(2) On a WinXP or Win7 platform -- with ArcGIS 10.x installed -- create the "safe" working folder -- something like "c:\Downloads\ArcPY_Examples."  This may be accomplished from the GUI / Windows "explorer" -- or via the command line thus -- with typed commands in bold:


 c:\where_ever_cmd_starts> mkdir c:\Downloads
 c:\where_ever_cmd_starts> mkdir c:\Downloads\ArcPY_Examples


(3) Open a Win32 command window -- and type the following commands:


 c:\where_ever_cmd_starts> cd c:\Downloads\ArcPY_Examples
 c:\Downloads\ArcPY_Examples> 
     z_Enum_Shp_Fields_v10.py Parcels_20130730_nad27.shp


RESULTS #1 -- Screen capture of command line invocation of above script --- and targeting an example suite of tax parcels:


click for larger view

RESULTS #2 -- Screen capture of command line invocation and script results -- note the bracketed output:

click for larger view

CHALLENGES and POTENTIAL FRICTION POINTS --

(a) The 1st time this script is invoked -- especially on a freshly booted machine -- the ArcGIS Python modules and libraries will not be "memory resident" -- or fresh in the disk drive buffer queue -- and script startup may require 20 secs to 2 mins -- time depends upon many system factors.  After 1st invocation -- the script will execute promptly for any other shapefile -- as the necessary Python modules and libraries will be "handy" to the Operating System.

(b) ArcGIS License?  Got one?  If the ArcGIS installation is "boot-legged" -- the script may fail to start -- as the import of the ArcPY modules also performs a valid license check.

OTHER EXAMPLES -- The above is just one, simple, low-impact example of using ArcPY at the "raw" command line -- to extract the DBF "database" schema.  Other examples developed in the past -- ideas to add to the problem solving toolbox:

(a) Generating regular measurement and field sample grids via ArcGIS FISHNET -- useful for GPS and seismic field layouts and cross measurement checks.  Many oil & gas seismic layouts may be generated -- or "re-created" and/or "repaired" from design parameters via ArcGIS FISHNET.

Typical Seismic Design 
Parameters

(b) Exporting shapefile "layers" based upon some categorical attribute data in a DBF table -- many AutoCAD folks will struggle to understand shapefile categorization via DBF attributes -- and will ask for a shapefile "layer" that has "only the NO ENERGY" -- or "only the NO MULCH" -- or "only the NO FLY" -- or whatever.   When there are more than 3-5 attribute categories -- a script is a more SURE FIRE (deterministic) and REPEATABLE method of exporting each category value as a shapefile layer (see the sketch after this list).

Categorical Map vs Legend "Layers"

(c) Generation of XY "shotgun scatter" pseudo-random points for low-bias statistical sampling and "attribute pulls."  Basically setting up a map -- with randomly scattered XY locations -- to "pickup" a third / fourth / fifth / etc attribute -- Z1 / Z2 / Z3 / etc -- and perform some summary stats for "hot spots" and "heat maps" and "peaks & pits" identification.

XY Scatter Search for Peaks & Pits
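
For flavor -- a minimal sketch of the category export idea in item (b) above.  This is NOT the original production script -- just an illustration assuming ArcGIS 10.x / ArcPY, a hypothetical input shapefile and a hypothetical text column named CATEGORY:

# =====================================================================
#  Sketch:  Export one shapefile "layer" per category value in a
#           DBF attribute column -- hypothetical names throughout.
# =====================================================================
import arcpy, os

src_shp   = r"c:\Downloads\ArcPY_Examples\field_work.shp"   # assumed input
out_dir   = r"c:\Downloads\ArcPY_Examples\by_category"      # assumed output folder
cat_field = "CATEGORY"                                      # assumed DBF column

if not os.path.exists(out_dir):
    os.makedirs(out_dir)

# --- Collect the unique category values from the DBF table
values = set()
rows = arcpy.SearchCursor(src_shp)
for row in rows:
    values.add(row.getValue(cat_field))
del rows

# --- Write one shapefile per category value
delim = arcpy.AddFieldDelimiters(src_shp, cat_field)
for val in sorted(values):
    where   = "%s = '%s'" % (delim, val)
    out_shp = os.path.join(out_dir, "cat_%s.shp" % str(val).replace(" ", "_"))
    arcpy.Select_analysis(src_shp, out_shp, where)
    print "Wrote %s for %s" % (out_shp, where)

The WHERE clause quoting is the usual friction point -- AddFieldDelimiters wraps the field name with the delimiters the data source expects -- so the same loop works against a shapefile or a geodatabase feature class.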

SUMMARY -- Sometimes there is no replacement for a deterministic, repeatable and document-able script to capture some key workflow in data mining.

Friday, May 8, 2015

Google "BigBox" Store & the $10 Cover Charge

DIGITAL PROPERTY TRANSACTION COSTS: Recently I designed, developed and sold a specialized web site and digital property for a number-crunch & marketing client.  The web property was primarily a cross-communication portal to enable certain kinds of field operations.  After determining my client's needs and interests -- and evaluating the site security -- and a balanced need for search engine exposure -- I used the "Google Cloud" of services to create and effect this web property.

ODD BILLING -- What was striking about this web site -- and digital property -- "inside" the Google "Walled Garden" -- was the need for a $10 "cover charge" (Google Play gift card) to cover the "small noise" billing.  While most Google services are Ad supported -- and generally considered "free" -- some are not -- and are "billed" in very, very small increments -- that accumulate -- and can surprise & trip workflows "nested" inside the Google cloud.


2009 Book: The Google Way 

THE GOOGLE WAY: Now there is the "right way" -- and the "wrong way" -- to build a web site and digital property.  And then there is the "Google Way."

The "Google Way" is not necessarily "better" or "worse" than any other hosting and content management "cloud solution" -- the "Google Way" is just one more "walled garden" on the internet -- as is Apple -- or Facebook -- or MS-Bing -- or Yahoo -- or Zoho -- or whatever.

Perhaps the BigBox store metaphor will aid understanding.  Consider, when you choose to drive to / visit / enter / shop in a BigBox store -- be they WalMart / Target / HEB / Home Depot / etc -- you are presented with a "Walled Garden" marketing & shopping approach.  And -- indirectly -- you "bought into" the BigBox store's idea of what a shopping "experience" in a "Walled Garden" should be.  

Why "Walled Garden?"   In short -- A "Walled Garden" is "pretty" -- and "nice" -- and "predictable" -- and "comfortable" -- and things are "easy" to find.  And most important for the store owner -- customer interests and whims are almost always focused and "folded back into" the store presentation.  Internet walled gardens attempt to manifest a similar presentation "style" -- and "dominate" your approach to internet communications / shopping / web marketing / presentation solutions.

SMALL INCREMENT MONEY: The "Google Way" and "Google Cloud" suite of software services manifests a peculiarity similar to a BigBox store:  Small charges "rack up" as you walk around the store: Stop at the coffee bar -- pay for a single cup on the spot.  Ditto for the "juice bar" and the "sandwich cafe."   Billing for stuff you consume on the spot -- stuff too difficult to track as you walk around the store -- and out the door.

Yet unlike a BigBox store -- the Google Cloud billing increments are so small as to be almost un-measurable.  And Google seems unsure exactly how small these billing increments really are -- or should be.  It is a tricky problem: How to bill clients in increments of 0.000000000001 cents? (i.e. one trillionth of a penny).  Or 1.0e-18 pennies? (one "atto-penny")  How do you "pay" on the spot?


From Wikipedia: "Short Scale" Orders of Magnitude

Why are these billing increments so small?  Why is it easy to "price" a transaction in coffee or sandwiches -- yet difficult to price Google services?  A contrasting history in technology development can help.

COST of Moving Bits -- HISTORY -- Like Little Orphan Annie -- back in the dark-ages of 1986 -- when I was a newly minted instrumentation and Geophysics nerd -- I was helping a Daddy Warbucks project in the wilds of New Mexico.  My project leader was building instruments for space probes -- ultimately "tasked" to support something called -- at the time -- the Partial Nuclear Test Ban Treaty.

From Wikipedia

And yes -- the "bad guys" will test a nuke bomb in the high atmosphere -- maybe low Earth orbit -- to escape attention -- by the watchful "BigBro" Nations.  Thus the need for specialized space instruments to "sniff out" the nuke bomb "poo" generated by sneak test "proliferators."  Nuke test mischief was just possible in 1979 -- see wiki hints below.  Almost impossible in 1987.  And very probably impossible in 2015.


Approximate Location of 1979 Vela Incident
From Wikipedia

ORBIT COST:  In 1987 -- our monitoring & nuke bomb sniffing instruments "cost" Daddy Warbucks something like $1.2 million USD per kilogram (2.2 pounds) to design / promote / develop / test / "make ready" for launch / boost into orbit.  Developing a "bad" instrument was both financially bad -- and a career changer.  

COST AVOIDANCE vs BANDWIDTH BURN RATE: Each preliminary "realization" of a nuke bomb sniffing instrument -- in metal -- cost some $50,000 in machine shop drilling & milling costs (1987 dollars).  My project leader wanted to test many dozens and dozens of instrument designs -- via simulations in the computer -- before cutting any "test metal" in the machine shop -- and especially before committing to final "flight design" hardware. 

My task was running 1000s of small instrument shape changes and design modifications thru the Cray super computers -- to "test" -- at least with physics and numerical simulations -- and help determine the "best" instrument shape and "front end" design that would speak to the design goals -- goals set for test ban treaty monitoring and mission requirements.

The input for 1000s of small design tests did not require massive storage.  Nor did the output results.  Design changes were handled via a single sheet of paper -- as were the Cray super computer results.  The bandwidth "dial up" burn time -- and design "hatch time" -- these costs were a different matter.

In 1986-87 -- I was using a 9600 baud speed modem -- to "dial in" and access the super computers via telephone.  I was "online" some 6-7 hours per day -- five to six days per week -- shepherding the 100s of small design changes thru the super computers.  


Telephone "Wireline Centers" Across USA
Map From Wikipedia

LONG DISTANCE -- Due to some really strange telephone regulatory LATA rules governing "local" exchanges -- my "home & compute bunker phone" was "long distance" back to the super computers.  Daddy Warbucks provided a "calling card" to cover the long distance billing -- and I saw these bills in the project management reports -- $1200 to $5000 per month.  A very big bandwidth cost.  Yet -- in the long run -- "cheaper" than a "bad" design going into orbit.

CALC: 1987 BACK TO THE FUTURE -- Assuming that my "dial in" to the super computers was "averaging" $2500 per month -- and was "full bore" 9600 baud 40 hours per week -- I would have been moving Bytes (ASCII characters) at the rate of:

 (960 bytes / sec) * (3600 sec/hr) * (40 hrs/wk) * (4 wks/month) = 552,960,000 bytes/month

About 553 megabytes per month.  Or an average of:

  (552.96e6 bytes / 2500 $ ) = 221,184 bytes / dollar

 221,184 bytes per dollar -- or as a reciprocal -- about 4.52 millionths (4.52e-06) of a dollar per byte -- roughly 4.5e-07 dollars per binary bit (at 10 serial bits per byte)

NEAR ZERO? -- 4.52e-06 dollars per byte may seem like an amazingly small number -- yet please consider -- since 1987 -- the cost of moving bits has collapsed to almost zero -- almost to an un-measurable level -- mostly due to technology advances -- and some due to statutory and regulatory changes regarding how telephone circuits are "priced" and "managed."

YES, NEAR ZERO -- 64,000x PRICE REDUCTION RATIO -- As a simple compare example -- at my 2015 compute bunker -- my "fully price loaded" DSL internet connection costs about $55 per month (taxes / entitlement burdens / licenses included).  And is "on" 24x7 -- not like old time dial-up service.

At full bore -- the DSL line can sustain 3.0 megabits per second -- about 7.78e12 bits per month -- roughly 778 gigabytes per month (keeping the same 10-bits-per-byte serial framing as the 1987 calc) -- if data were moving 24 hours per day for 30 days.  That is approx 14,138 megabytes transported per dollar.  Or, in the reciprocal, about 7 trillionths of a dollar per binary bit.  Relative to the 1987 "dial-up" bandwidth cost -- this is:

  (14.138e9 bytes / dollar) / (221,184 bytes / dollar) = 63,920

roughly a 64,000-fold cost reduction.  In other words -- my DSL internet provider is moving bits about 64,000 times less expensively than the equivalent bit-and-byte motion in 1987.  And my compute bunker is a "poor" comparison to bit-and-byte movement inside "Walled Gardens" like Google.  And you can bet that my internet provider and telephone company are moving the bits inside their wireline system far, far, far less expensively than at my "retail" full price.
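
For anyone who wants to re-run or argue with the arithmetic -- a minimal Python sketch of the compare, using the same rough figures and the same 10-bits-per-byte serial framing assumed above:

# --- Back-of-envelope: 1987 dial-up vs 2015 DSL cost of moving bytes ---
BITS_PER_BYTE = 10.0                       # 8 data bits + start/stop framing

# 1987 dial-up: 9600 baud ~ 960 bytes/sec, 40 hrs/wk, 4 wks/month, $2500/month
dialup_bytes_month   = 960.0 * 3600 * 40 * 4
dialup_bytes_per_usd = dialup_bytes_month / 2500.0

# 2015 DSL: 3.0 megabits/sec, 24 hours/day, 30 days/month, $55/month
dsl_bytes_month   = (3.0e6 * 3600 * 24 * 30) / BITS_PER_BYTE
dsl_bytes_per_usd = dsl_bytes_month / 55.0

print("1987: %.3e bytes/month, %.0f bytes per dollar" %
      (dialup_bytes_month, dialup_bytes_per_usd))
print("2015: %.3e bytes/month, %.0f bytes per dollar" %
      (dsl_bytes_month, dsl_bytes_per_usd))
print("Cost reduction factor: about %.0f x" %
      (dsl_bytes_per_usd / dialup_bytes_per_usd))
# Expected: ~5.53e8 bytes and ~221,184 bytes/$ for 1987;
#           ~7.78e11 bytes and ~1.41e10 bytes/$ for 2015; ratio ~64,000x.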

HOW TO AMORTIZE?  ASSUMING that a similar process is afoot -- and a much greater price collapse has occurred inside "Walled Gardens" like Google -- it is not surprising that Google -- and others -- are not sure how much to charge for the transport of one megabyte -- or one gigabyte -- or even a few dozen terabytes.  Google -- and the other walled gardens -- DO know the cost of the buildings / computers / power / air-conditioning -- and the costs to install -- and operate.  The only open question: How to estimate the amortization of the infrastructure on a per megabyte / gigabyte / terabyte / whatever basis.  And then "price" this amortization "base cost" -- and "marked up" profit -- out to a customer like me.

SUMMARY?  Just guessing from this simple compare calc -- Google and other "cloud providers" -- really do not know what their internal "cost of production" is -- or should be -- and are just guessing how to price their systems and services.  Thus the surprise web site & hosting trip points -- and the need for the Google "cover charge" -- via a $10 gift card purchase for this specific web property. 

And this collapsing cost-per-bit of movement has really just begun ... stay tuned.

VIDEO "Extra Treat" -- BASE COST Moving Bits -- One possible realization of a "Cloud Data Center" by Google -- 45,000 "servers" -- housed in 1000 "containers" -- with a 10 megawatt IT equipment load:  Video Container Based Cloud System

Tuesday, May 5, 2015

How Far for How Many Folks??

Real Estate Market Potential vs Distance

KICKSTART QUESTION:  Back in late 2007 -- I had a real estate investor client ask: 

Q: "Well how many folks are in range of the site prospect?  I do not want to spend big bucks on marketing for a 'bum' site!  It is 'OK' to 'get in' -- or better to just move onto another prospect."   

A preliminary answer required a couple of days to aggregate the demographic and map data and provide an answer to the question.  Eyeball anchor graphics and interesting results below.

HISTORY -- Demographics is Destiny -- as the saying goes.  Yet a little econo-metric background will help others understand the context of the question -- and see the magic biz decision tradeoffs.  

TEXAS POPULATION SHIFTS -- Reaching back to 1850s Texas -- this animated graphic should help others see the changing population patterns -- and how real estate investors must "map bet" on the "right spot" -- to catch a population wave:


SIGNIFICANCE and CHALLENGE -- A population shift that "washes over" your real estate prospect will float your boat to success.  A rising tide raises the ship.  If the investor picks wrong -- the population tide rolls away from his boat -- and leaves his investment stuck in the mud.  My real estate investor wanted to "see the wave" relative to his potential site -- sorted by ZIP code -- to aid marketing cost analysis.  Data resolved to the county level is OK for state-wide demographics -- yet not so good for ZIP code marketing.  Please recall there were no ZIP codes before the 1960s.

ALTERNATE SNAPSHOT APPROACH -- Time snapshot "wave" -- like a strobe light catching a frozen body position of a dancer: US Census data can be a rough snapshot of a frozen population wave.  In order to simulate a "data wave" of population -- the ZIP code population data needed to be "sorted" and made a "function" of radial distance from my client's site -- to see the "time wave" snapshot.


Zip Code Lines and Regions in Reno Nevada

ODD DETAILS:  US Census data is roughly "anchored" to the map via something the Census Bureau calls "ZIP Code Tabulation Areas" -- ZCTA.

The kick in the head?  ZIP Codes are not really "areas" -- as the US Post Office uses ZIP Codes for lines (roads / trails / rail lines) -- and for points (single buildings or foreign post office exchanges) -- and sometimes for areas.  In other words, ZIP Codes are "optimized" for US Post Office mail delivery -- and not for population demographics and mapping.

The US Census has derived ZCTA that more-or-less "anchor" a ZIP Code to an area -- and "blur" the line and point details.  ZCTA make it *mostly* possible to tie a ZIP Code to a lat-long point on the globe (database table "Join" or "GeoLink" via ZIP Code).

RESULTS -- The results below are a quick-look demographic of 2004 populations within a given radial range or distance from a selected site -- in this example -- a parking lot site [29.7771º N, 98.8071º W] -- well known to the client -- and roughly near Boerne, Texas.


Smoke Rings of Population Accumulation
Click Image for Full Size Map

SMOKE RINGS -- Expanding numerical "smoke rings" of distance were swept over the 2004 Census zip code centers -- and the population value associated with each zip center was treated as a differential impulse -- aggregating population within successive rings of range or distance.   Each impulse was "added up" (cumulative distribution curve).  Result chart:


Red: ZCTA Population Impulses vs Range
Blue: Sum of Population Impulses
Click on Image for Full Size Chart
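
For the curious -- a minimal sketch of the "smoke ring" sweep in Python.  It assumes a hypothetical flat CSV of ZCTA centers with 2004 population estimates (columns ZIP / LAT / LON / POP) -- the real workflow used GIS tables and joins rather than a flat file:

# --- Minimal sketch: cumulative population vs range from the prospect site ---
import csv
import math

SITE_LAT, SITE_LON = 29.7771, -98.8071      # parking lot site near Boerne TX

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in statute miles."""
    r = 3958.8                               # mean Earth radius, miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2.0 * r * math.asin(math.sqrt(a))

impulses = []                                # (range_miles, population) pairs
with open("zcta_centers_2004.csv") as f:     # hypothetical ZCTA center table
    for row in csv.DictReader(f):
        d = haversine_miles(SITE_LAT, SITE_LON,
                            float(row["LAT"]), float(row["LON"]))
        impulses.append((d, int(row["POP"])))

impulses.sort()                              # sweep the rings outward
total = 0
for rng, pop in impulses:
    total += pop                             # cumulative population within rng
    print("%8.1f miles   %10d folks in range" % (rng, total))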

SUMMARY -- My real estate investor client used the above chart (and various generated XLS tables and maps) to estimate marketing outreach costs for his prospect site.  This was a "knife edge" quick look test before placing an option on the prospect.  Marketing costs were a "real" cost to "get in" and promote the site -- and "range analysis" such as the above often helps with the problem question "Hey, is the market wave coming my way -- or not?"