Example with PHP and Spatialite, part 2

This post is part of a series [1][2][3][4]

So I’m ready to take the next steps with my little project. This is a continuation of my previous post about my little journey. At this point I am connecting to a physical database file that I loaded with some sample data (12-digit watersheds). Now I’m going to practice with queries and you can see the results below.

Here are the base data.

12-digit US watersheds (HUC12) and the example data set used here. Found in Southern Maine, Cumberland

And here are some close ups of the data. These are fairly dense polygons.

Example of polygon vertices

Example of polygon vertices

In fact, it looks like this query is testing the relationship between the point and polygons formed by 144,700 coordinate pairs (vertices) by scanning without the help of an index.

Lots of little points to check

Lots of little points to check

At this point I’m just going to perform basic queries without using spatial indexes. You will almost always want to use spatial indexes, but I’m going to practice this in phases so these examples won’t use indexes.

Note that unlike tradition database indexes, spatial databases like Spatialite and PostGIS (and their GiST/R-tree indexes) do not use indexes for spatial queries unless you explicitly tell them to (though PostGIS seems to use them by default some of the time). You must smartly craft the use of indexes the same way that you do the SQL itself… or at least it seems this way to me.

In Spatialite, the indexes are just tables and you have to add subqueries to your query to grab the bounding rectangles from the Rtrees to pre-filter your queries for the index-driven speed-up.

And here are the spatial indexes that spatialite sees.

spatialite_indexes

Spatial indexes used in spatialite

So what’s in these indexes? Boxes…as we see below. Hopefully you can imagine how we get boxes from Xmin, Ymin, Xmax, Ymax extents. There is one box for each polygon HUC12 feature (note the PK_UID is the primary key of the main geometry layer). These simple boxes are much simpler to test for spatial relationships that the multitude of vertices we saw above. But also much less accurate; especially for funny shaped things like watersheds. But, we can use these simple boxes to pre-filter the number of features that have to be tested by the more accurate (but lengthy) spatial test – hence speeding up the overall operation in many cases.

What is in a name... or an Rtree index.

What is in a name... or an Rtree index.

 

Below is an example of the spatial query used in the code below. Translated, it says, “show me the name of the HUC12 that contains this point.”

The free gui provided by spatialite and a spatial query

The free gui provided by spatialite and a spatial query

Here are the files in the project so far. Of course you’re not normally going to be putting a  loadable extension library (libspatialite.so) in a web server file directory. But, this is just practice.

Files so far for this project

Files so far for this project

Here’s the code of db.php. This isn’t using spatial indexes, so it’s scanning 183 features and a whole bunch of vertices to figure out which polygon actually contains that point… and doing a couple simpler things like opening a connection, asking some simple questions, and closing the connection all in about 0.4 seconds.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
<html>
<head>
<title>Testing SpatiaLite on PHP</title>
</head>
<body>
<h1>Testing SpatiaLite on PHP</h1>

<?php
// Start TIMER
// -----------
$stimer = explode( ' ', microtime() );
$stimer = $stimer[1] + $stimer[0];
// -----------
try {
/*** connect to SQLite database ***/
$db = new SQLite3('db.sqlite');

/*** a little message to say we did it ***/
echo 'database connected';
}
catch(PDOException $e)
{
echo $e->getMessage();
}
# loading SpatiaLite as an extension
$db->loadExtension('libspatialite.so');

# reporting some version info
$rs = $db->query('SELECT sqlite_version()');
while ($row = $rs->fetchArray())
{
print "<h3>SQLite version: $row[0]</h3>";
}
$rs = $db->query('SELECT spatialite_version()');
while ($row = $rs->fetchArray())
{
print "<h3>SpatiaLite version: $row[0]</h3>";
}

/* SELECT HU_12_NAME FROM huc12 WHERE ST_Contains(Geometry, MakePoint(-70.250,43.802));
*/

/*
* Create a query
*/

$sql = "SELECT DISTINCT Count(*), ST_GeometryType(Geometry), ";
$sql .= "ST_Srid(Geometry) FROM huc12";
$rs = $db->query($sql);
while ($row = $rs->fetchArray())
{
# read the result set
$msg = "There are ";
$msg .= $row[0];
$msg .= " entities of type ";
$msg .= $row[1];
$msg .= " SRID=";
$msg .= $row[2];
print "<h3>$msg</h3>";
}

$sql = "SELECT HU_12_NAME FROM huc12 WHERE ST_Contains(Geometry, MakePoint(-70.250,43.802))";
$rs = $db->query($sql);
while ($row = $rs->fetchArray())
{
# read the result set
$msg = "Your point is in the HUC12: ";
$msg .= $row[0];
print "<h3>$msg</h3>";
}
/*
* do not forget to release all handles !
*/


# closing the DB connection
$db->close();

// End TIMER
// ---------
$etimer = explode( ' ', microtime() );
$etimer = $etimer[1] + $etimer[0];
echo '<p style="margin:auto; text-align:center">';
printf( "Script timer: <b>%f</b> seconds.", ($etimer-$stimer) );
echo '</p>';
// ---------

?>

</body>
</html>

 

Not too bad, but I want this faster because I want to feed it much larger data in my final project.

Results of the first try at this

Results of the first try at this

Example with PHP and Spatialite, part 1

This post is part of a series [1][2][3][4]

So, PHP supports SQLite out of the box (http://www.sqlite.org/, now at 3.7.10), making it a nice combo if you want to do some reads from your page. My impression is that SQLite is not recommended if you want stuff with database writes and you have more than a couple visitors. But reading seems to be fine.

I think this fits my use cases just fine as I just want to hang out some very basic utility services that can run on the “single beige box in the corner” or the “beige cloud in the sky” with few resources needed. First, I want to create a simple REST service that, when passed a pair of long/lat coordinates, will do nothing but return that name of the county they are in. Then I’ll do one for watershed identifiers (USDA WBD HUC12 to be exact), and eventually maybe I’ll work up to a nearest place service (http://www.geonames.org/) and so on. Maybe even some downstream/upstream routing with Spatialite’s network utilities [1], [2], [3].

Of course, these are spatial functions so I’m using the spatial extension to SQLite called Spatialite found at http://www.gaia-gis.it/gaia-sins/. I find Spatialite to be a profoundly elegant amalgam of existing projects (SQLite and GEOS) and new, efficient and pragmatic programming that fills an empty niche. Here that niche is don’t make me deploy anything more than I need. I simply want some basic, re-usable services that don’t do much, including have their data updated – and I don’t want to run a heavy spatial infrastructure just to cheaply answer some basic questions on a $6/month virtual, private, cloud LAMP box with 256MB of memory or this little Pentium 4 appliance running in my snack drawer at work.

So, I began compiling spatialite, and at the time I was using 3.0beta1a, so I just kept running with it. I’m still learning the basics of spatialite so I dinked around a bit. Then I followed the instructions for getting spatialite running within PHP  [here] and/or [here] (not sure which one is the official guide. The site has been migrating to a new infrastructure lately).

After making way too many typos, I got it working and am getting the expected output. I also added some timer code which tells me that from my Ubuntu VM running on my 6-month-old laptop I’m completing these ~30,000 operations in about 6 seconds against the in-memory database, including opening and closing the connection to a database and tables that are created each page load.

Sample spatialite with PHP screen

Sample spatialite with PHP screen

My next exercise will be to figure out how to connect to an existing disk-based DB and try some simpler operations. My goal will be to get my operations out the door in about 1 second on modest hardware under no load.

[code below]
Continue reading

Notes from Where Camp Boston 2011

I had a great time at this little event and it was good prep for the Maine GIS User group’s Ignite meeting tonight. Hopefully I’ll be able to attend more stuff like this in the future. But, before then, I’d better get some materials together about Spatialite in return for the kind gentleman giving the spatial sql talk at my request (from the session board).

So what follows are my notes from Where Camp Boston 2011, and in generally chronological order.

Session board

The session board at Where camp Boston 2011

sailing, snow, halloween, boston, crazy, awesome!

sailing, snow, halloween, boston, crazy, awesome!

Crazy folks out sailing. When I left at 6pm-ish it was raining hard with snow mixed in and nearly dark out – they were still sailing. I take my hand off to them.

Tom McWright keynote speaker: technology, spatial curmudgeon

<Soapbox>: Data without metadata are useless crap

There was some talk about various projects using (1) publicly available historical data, or (2) data from a range of public, crowd-sourced, etc sources. There were also related discussions about collecting data for broad public use (use case 2). But all the discussions left out an important topic — which is a common malady in many discussions in the open, free, non-government (not necessarily related, but often coincident concepts) circles: documentation/metadata. I love open data. But when I use it,  I need to have convenient, parse-able, understandable, hopefully standards-based information about this stuff I am about to use. It doesn’t have to be elaborate, but it needs to be present.

 

I wish the spatial community would agree on some simple, general, common approach to embed metadata right into the data so that the two could never be separated. When you create/maintain/manage/supply data, you’d better assume responsibility for the metadata and the data, as they should be two sides of the same coin. So, in your spatial DBs, maybe always include a table called.. DOCUMENTATION, link to the information_schema and create records contains Dublin Core RDF, or SOMETHING. Now I just need to follow my own advice…

</Soapbox>

See:

Scratch pad

EPA Supports WATERS Geospatial data through OGC WMS/WFS Services

EPA appears to be using ESRI’s ArcIMS 9.2 OGC connectors to publish OGC WMS/WFS services of their WATERS data.

See below for threads
that mention the limitations and issues with working with these semi-standard
services from ESRI.

http://mailman-viper.python-hosting.com/pipermail/users/2006-July/000103.html

http://openlayers.org/pipermail/users/2007-May/001571.html

http://www.mail-archive.com/users@openlayers.org/msg04240.html
- Little thread that I started a while ago to check on support for pure ESRI
services.

Good review of Mapnik for generating online maps

In my opinion you can’t beat UMN Mapserver or Mapnik for making gorgeous online maps. Mapserver has installed and run for me flawlessly every time I’ve tried it — without days of arguing with it. Mapnik has taken me nearly a week to get running sometimes. But, just like you wouldn’t want to shovel snow with a pitch fork, — always use the right tool for the right job. Mapnik is a focused tool for quickly turning geographic data (think shapefiles) into good looking static images (tiles) for serving to modern internet tiling maps.

If you need something like this (and a few other things) then read this excellent article.

http://mike.teczno.com/notes/mapnik.html

Plus it seems to be a good blog to check out from time to time.

History of Rasters in PostGIS and a glimpse into the mechanics of an Open Source Project

I love this post thread so much I’m going to stick it into my blog. First because if you read into it and follow the links you can see the history of raster support in PostGIS (or reasons for lack thereof) for the PostGIS newbie readers. You can also see Paul’s encouragement and explanation of how this open source project works. Open Source developers develop stuff for their own needs. However, if you can make a well educated and clear argument for why they should spend their time putting your functionality into the project, they might just do it. Otherwise, they usually say, “it’s open source. code it yourself.”

——-

----- Original Message ----From: Paul Ramsey 
To: Pierre Racine 
Cc: warmerdam@pobox.com; nicolas.gignac@msp.gouv.qc.ca; smarshall@wsi.com; PostGIS Users Discussion 
Sent: Monday, July 14, 2008 10:11:05 PMSubject: Re: [postgis-users] PostGIS WKT Raster

Pierre,

Firstly, let me commend you for your understanding of the open sourcecommunity process! Not being shy, not taking "no" for an answer, andneedling the people you need to move are all excellent ways to keepthe wheels turning, particularly when done with good humor as you haveshown.  Other first timers could/should learn from your example!

So now I owe you a reply, no doubt:

You have gotten over my first hurdle, in that you have an actual usecase for raster, that is more involved than "I want to my images,because I hear database are faster".

I have long thought that analysis, in particular a fusing of vectorand raster analysis was the only really compelling use case for rasterin database, so again, you're on the side of the angels (me).

So, is your *design* good?

Probably as good as possible, but let me point out places of concernand see what you think:

- You propose to do this because their is a "great demand for it", butthe great demand is generally the stupid demand for images-in-database"just because". You must either somehow *stop* those people, or ensureyour solution is capable of meeting their needs.  Since you propose tomeet the demand, presumably you aim for the latter.  This *will*involve a good deal of non-analytical pre-processing infrastructure toconvert people's multi-resolution, overlapping/underlapping filelibraries into a uniform resolution coverage.  I suggest you ignorethese people.- You are going to carry a certain amount of information into thedatabase and process data into bands, potentially for externalstorage.  Why not store external data in a format like TIFF that canhold all the bands and pyramids, etc?- Once you have the idea of external storage, you could as easily moveto internal storage with the BLOB interface.  Not that I recommendthis, it's easier to back up and restore a filesystem of files than ahuge database full of BLOBs.- If you step back a bit and don't even bother splitting up therasters into tiles, you can use an existing raster access library likeGDAL to work with serialized data.- Or you could use GRASS as your serialization format and hook intothat.  Added bonus: free algorithms.- Basically the less you muck with creating a "disk format for raster"(solved problem) and the more you muck with "integrating raster andvector analysis in SQL" (unsolved problem) the more leverage you willget.- I like your external storage idea. What are the implications forCREATE TABLE foo AS SELECT...?

Summary: your proposal is better than any I have seen, addressessolving problems that if solved will provide actual new functionalityand benefit to users, and clearly you've thought this through oversome time.

So, I look forward to seeing your initial plan of attack for development  :)  

As you begin looking at the code base, you'll find a few stubs ofraster support that Sandro built at my request for rasterizing vectorsinto CHIPs in the database... basically the first tip toe down thepath you have written up so completely.

Go with God.

Paul
On Mon, Jul 14, 2008 at 12:43 PM, Pierre Racine wrote: Paul Ramsey hasn't yet said my design was: -a "meme" (http://blog.cleverelephant.ca/2008/06/x-my-l.html), -a "waste of time" (http://postgis.refractions.net/pipermail/postgis-devel/2007-July/002653.html) -"useless" (http://postgis.refractions.net/pipermail/postgis-users/2007-May/015578.html) -that it was "pointless" (http://postgis.refractions.net/pipermail/postgis-users/2007-October/017250.html) -or that I was a "fool" (http://postgis.refractions.net/pipermail/postgis-users/2007-October/017239.html) So I conclude it's a good design!!! :-) He asked for a good design some time ago for raster in the database (http://postgis.refractions.net/pipermail/postgis-users/2006-October/013628.html) Where is the clever elephant? Pierre

—–Message d’origine—–

De : postgis-users-bounces@postgis.refractions.net

[mailto:postgis-users-bounces@postgis.refractions.net] De la

part de Pierre Racine

Envoyé : 7 juillet 2008 11:30

À : postgis-users@postgis.refractions.net

Cc : warmerdam@pobox.com; nicolas.gignac@msp.gouv.qc.ca;

smarshall@wsi.com

Objet : [postgis-users] PostGIS WKT Raster

Hi raster people,

I’m starting a 2-3 year project to develop a web-based application to

automate certain GIS tasks commonly needed in ecological research. The

tool will support queries over VERY large extent based on a Canada-wide

assemblages of raster and vector data layers. The queries will

typically

involve intersecting those layers with buffers defined around points or

transects.

We would like to implement this system with PostGIS, but it currently

does not to support raster data. We could convert all our data to a

single format (raster or vector) and use other tools, however PostGIS

seems to us the best and most powerful vector analysis tool available

and we would prefer to use it. We would like to develop a unified

toolkit so that the application mostly need not worry about

whether base

layers are in vector or raster format. We are then strong proponents of

having raster functionality in PostGIS. I have reviewed every thread on

this list on the subject. I have analysed them and I have compiled them

in the wiki

(http://postgis.refractions.net/support/wiki/index.php?RasterNotes).

The argument goes like this: The geodatabase paradigm has been a major

recent enhancement in GIS technology, I feel that the seamless

integration of raster and vector analysis should be one of the next.

Spatial analysis is emerging, beside the making and publishing of maps,

as one of the main desktop and web-GIS applications. Rastor/vector

seamless integration is already done for display (in most GIS and with

MapServer or ArcIMS on the web) but definitely not for

analysis. Desktop

analysts must still learn to use two distinct toolsets within most

(all?) GIS packages: one toolset for raster and another for

vector data.

Would not it be easier to build and use applications if we had a unique

data query and analysis toolset independent of the data model? I don’t

know any tool having this approach right now. A PostGIS foundation that

addressed this problem would offer better directions to application

developers than the dichotomic one proposed by ESRI since their

beginning. It would provide the necessary abstraction to develop GIS

with ONE set of analysis tools. I feel this is one good reason to

integrate raster in PostGIS. There is “GIS” in the word “PostGIS”.

PostGIS should then provide a COMPLETE GIS data foundation (read

“base”!) for geoapplication developers. I think Steve Marshall’s design

is a good step in this direction

(http://postgis.refractions.net/pipermail/postgis-users/2006-De

cember/01

4059.html).

At the same time, we would have a chance to reconsider the raster model

as a coverage instead of a series of images. Spatial databases got rid

of map sheets by allowing complete vector coverages to be stored as

single table (e.g. the entire United States) This approach should works

for raster data as well. Isn’t this the approach taken by ArcSDE?

In summary, I think we must stop thinking about PostGIS as a

mere vector

data repository to support mapping applications. This is the way most

objectors to raster integration seem to view it. We must think about

PostGIS as a powerfull indexed data analysis tool. Seamless

raster/vector analysis in the database could lead to a major

simplification of geoapplications.

I prepared a PowerPoint presentation with a complete specification of

raster in PostGIS and an analysis of what should be the result of

overlay operation between raster and vector layers. You can download

this PPT at: http://www.cef-cfr.ca/uploads/Membres/WKTRasterPostGIS.ppt

Please have a look at it, feel free to answer questions I ask and

comment.

Here are the main propositions of this design:

-Like a vector feature, rasters are stored as a subtype of “geometry”(a

new WKB/WKT geometry type called RASTER instead of POINT or POLYGON)

-There is no distinction between a raster and a tile. A tile

is a raster

and a table of rasters can be seen as a tile coverage. Hence, contrary

to Oracle GeoRaster, only one table is needed to store a coverage and

there is only one type.

-It support raster in the database AND raster out of the database. This

mailing list has shown that both approaches have their pros and cons.

Let’s not impose one approach over the other.

-It supports: multi-band, pyramids and variable nodata value, raster

size, pixel size and pixel type for each row.

-It supports import/export from/to Tiff and JPEG.

But moreover seamless raster/vector integration is materialized by

theses propositions:

-A lot of existing vector-only functions are adapted to work

with raster

also.

-Most new raster functions are also implemented to work with vector.

Only basic raster derivation functions are proposed.

-Most vector/vector existing functions are adapted to work seamlessly

with raster/vector or raster/raster

I also want to ask:

-What is the status of PGRaster? Is any development now underway?

We have some resources to devote to this project over the next

few years

and are very interested in forming collaborations to move this work

forward.

Pierre Racine

GIS/Programmer Analyst

University Laval

http://www.cef-cfr.ca/index.php?n=Membres.PierreRacine

Putting your own content into a website with Google Maps

So I’m doing a little research tonight on some requirements for a client:

1. Embed a map in the client’s home page (non-database driven ASP.Net which is essentially HTML) that depicts the county’s watersheds and some number of additional layers
2. Let users click each watershed polygon to trigger an event that leads to them learning more about the conditions within the polygon (think old school image maps). This could be as simple as take the user to a static HTML page about the watershed.

The client currently supports ArcIMS (v9.1) and Google Maps. There is no immediate plan to move to ArcGIS server, though we will learn more next week.

I think that embedding ESRI in their pages for such a simple need is overkill and the client has already won awards for their use of Google Maps. So I’m researching options to use the Google Maps API and Google Gadgets to meet their needs