Dropping columns in sqlite/spatialite

SQLite doesn’t have a nice ALTER TABLE DROP COLUMN command and neither does spatialite. Instead, you get to run a long sequence of commands like this. Here I wanted to drop all the extra columns from my huc12 layer for the country.

So, starting with a table that looks like this

CREATE TABLE "huc12" (
PK_UID INTEGER PRIMARY KEY AUTOINCREMENT,
"OBJECTID" INTEGER,
"HUC_8" TEXT,
"HUC_10" TEXT,
"HUC_12" TEXT,
"ACRES" DOUBLE,
"NCONTRB_A" DOUBLE,
"HU_10_GNIS" TEXT,
"HU_12_GNIS" TEXT,
"HU_10_DS" TEXT,
"HU_10_NAME" TEXT,
"HU_10_MOD" TEXT,
"HU_10_TYPE" TEXT,
"HU_12_DS" TEXT,
"HU_12_NAME" TEXT,
"HU_12_MOD" TEXT,
"HU_12_TYPE" TEXT,
"META_ID" TEXT,
"STATES" TEXT,
"GlobalID" TEXT,
"SHAPE_Leng" DOUBLE,
"SHAPE_Area" DOUBLE, "Geometry" MULTIPOLYGON)

and ending with

CREATE TABLE "huc12" (
PK_UID INTEGER PRIMARY KEY AUTOINCREMENT,
"HUC_12" TEXT,
"ACRES" DOUBLE,
"HU_12_GNIS" TEXT,
"HU_12_NAME" TEXT,
"HU_12_MOD" TEXT,
"HU_12_TYPE" TEXT,
"GlobalID" TEXT,
"Geometry" MULTIPOLYGON)

I ran this stuff in the middle.

BEGIN TRANSACTION;
CREATE TEMPORARY TABLE "huc12sm" (
PK_UID INTEGER PRIMARY KEY AUTOINCREMENT,
"HUC_12" TEXT,
"ACRES" DOUBLE,
"HU_12_GNIS" TEXT,
"HU_12_NAME" TEXT,
"HU_12_MOD" TEXT,
"HU_12_TYPE" TEXT,
"GlobalID" TEXT,
"Geometry" MULTIPOLYGON);
INSERT INTO huc12sm SELECT PK_UID, HUC_12, ACRES, HU_12_GNIS, HU_12_NAME, HU_12_MOD, HU_12_TYPE, GlobalID, Geometry FROM huc12;
DROP TABLE huc12;
CREATE TABLE "huc12" (
PK_UID INTEGER PRIMARY KEY AUTOINCREMENT,
"HUC_12" TEXT,
"ACRES" DOUBLE,
"HU_12_GNIS" TEXT,
"HU_12_NAME" TEXT,
"HU_12_MOD" TEXT,
"HU_12_TYPE" TEXT,
"GlobalID" TEXT,
"Geometry" MULTIPOLYGON);
INSERT INTO huc12 SELECT PK_UID, HUC_12, ACRES, HU_12_GNIS, HU_12_NAME, HU_12_MOD, HU_12_TYPE, GlobalID, Geometry FROM huc12sm;
DROP TABLE huc12sm;
COMMIT;
VACUUM;
HUC_12, ACRES, HU_12_GNIS, HU_12
SELECT COUNT(*), GeometryType("Geometry"), Srid("Geometry"), CoordDimension("Geometry")
FROM "huc12"
GROUP BY 2, 3, 4;

SELECT RebuildGeometryTriggers('huc12', 'Geometry')

I wish someone would package this stuff into a nice function… hint hint.

 

 

Of file sizes and nearest neighbors

I’m continuing my exploration of Spatialite and decided to try the classic problem of identifying the nearest feature to a selected feature. This is commonly known as a nearest neighbor question and Regina and Leo at Boston GIS cover the issues quite well in their posts here, here, and here.

But, of course each computer system is a little different and generic solutions that work on many systems are rarely the most efficient. In spatial databases, how one uses indexes is a common difference and Spatialite doesn’t implement spatial indexes in the same was as say PostGIS. Therefore, the approach for accessing the indexes through SQL statements also differs.

So, I’ll keep running with my classic point that I’ve been using in this series. The new question is:

 “What is the nearest “place feature” (another point) to the point location at -70.25 E, 43.802 N?”

As with my HUC12 question, I want to try to make Spatialite slow by asking a simple question against a lot of data. I want to push its boundaries or determine that its performance is essentially always within the “excellent” range when faced with any reasonable amount of data.

So, I needed to find a big point data set.. and hopefully oen that I might actually use later. While I think the Geonames dataset has like 7 million points, I decided to keep things closer to home and use the US national dataset from GNIS at USGS. In the US national GNIS file there are about 2.2 million points of varying types.

Lots of dots

Lots of dots


All the dots

All the dots

and here is the dot density for New York City area.

Recognize this shape?

Recognize this shape?

Which leads me to a <digression>

These GNIS data are available as a pipe-delimited text file. But The first few characters in the first row (right before the first column name, or maybe it’s some kind of magic binary header) are always screwed up.. which confounds some software that starts with Arc… .

The offending characters in the GNIS text files

The offending characters in the GNIS text files

So, I always end up having to bring the data through Microsoft Access to cleanse them.. then back out again usually as a comma-delimited, double-quoted text file (CSV). In this case, I ran that “fixed” text file through ArcCatalog to make a Shapefile that I could import into Spatialite.. because I seem to like to torture myself. Of course, halfway through the import Spatialite informed me that it couldn’t proceed because one of the geometries was corrupted. Throwing my hands up, I switched to using Spatialite’s heavenly text import routine to import the pipe-delimited file into a table.

 

Then I simply ran these commands from a tutorial at Spatialite’s site:

1
 SELECT AddGeometryColumn('XYGNIS', 'Geometry',  4326, 'POINT', 'XY', 0);

Where the value explanations are :

1
AddGeometryColumn( TABLE String , COLUMN String , srid INTEGER , geom_type String , dimension String [ , not_null INTEGER ] ) : INTEGER

Voila. This created the needed structures in the database to hold the indexes, setup needed triggers, etc. Then I created the point geometry with a recipe like the following:

1
UPDATE XYGNIS SET Geometry = MakePoint(PRIM_LONG_DEC, PRIM_LAT_DEC,4326);

Lastly, I had to tell Spatialite to build the spatial indexes. I just did this through the GUI by right-clicking on my new Geometry field.

Why am I telling you all this? Well, for it was interesting to see the differences in file sizes representing each data type. Each has varying amount of indexes (the Access .accdb has none, but Spatialite and the Shapefiles have spatial indexes). But it’s still interesting to see the differences in the amount of room needed to store the same records in various file formats. For education I also made an ArcGIS 9.3 Personal Geodatabase and File Geodatabase as shown below. I’m not going to test performance differences here. But maybe when I finally upgrade to ArcGIS 10.X and I have a handy SQL layer option I’ll try something like that.

 </digression>

 

But now back to my question. Here is the performance without using my indexes.

1
2
3
4
5
SELECT feature_name, feature_class, ST_Distance(Geometry,
MakePoint(-70.250, 43.802)) AS Distance
FROM XYGNIS
WHERE distance &lt; 0.1
ORDER BY distance LIMIT 1
No indexes, nearest point feature within 0.1 degrees

No indexes, nearest point feature within 0.1 degrees

 

44.5 seconds. Not as bad one would think. I’m using a radius of 0.1 degrees to limit how many results I get back. I played around with this value and you should too. To find the very nearest feature, I LIMIT the number of my results to 1, and because I’ve sorted, or ordered my results ascending by distance, I can get a single answer that is the very nearest feature.

 

I still want my queries to finish in less than a second, so with a little help from an email response from Sandro, here is the same query against 2.2 million points that finishes in 0.036 seconds. Barely enough time to open the database connection I think.

1
2
3
4
5
6
7
8
9
10
SELECT feature_name, feature_class, ST_Distance(Geometry,
MakePoint(-70.250, 43.802)) AS Distance
FROM XYGNIS
WHERE distance &lt; 0.1
AND ROWID IN (
SELECT ROWID FROM SpatialIndex
WHERE f_table_name = 'XYGNIS'
AND search_frame =
BuildCircleMbr(-70.250, 43.802, 0.1))
ORDER BY distance LIMIT 10
Still fast enough

Still fast enough

Pretty cool. BTW, if I wanted more than one result, I can just change the LIMIT to a higher number, like 10.

When 1 isn't enough

When 1 isn't enough

Spatialite Speed Test

This post is part of a series [1], [2], [3], [4]

Based on my earlier tests I felt confident that I could expand the size of the dataset to my intended extent. So I grabbed 12-digit HUCS for the entire United States. I was confident this would crush spatialite and finally make the response time for my question extend to close to 1 second. But just in case it didn’t, I would make my question harder at the same time.

While I’m playing with these queries I need a reference guide. Of course the starting point is the spatialite SQL guide found here http://www.gaia-gis.it/gaia-sins/spatialite-sql-3.0.0.html. But, while this is a nice list of functions, I need more help than it provides. Thankfully, spatialite shares many technical concepts with PostGIS (follows OGC, etc. SQL and data standards, and they both use the GEOS spatial library). So, I’ve been successfully poaching help from the PostGIS documentation (http://www.postgis.org/docs/ST_GeomFromText.html) … which is quite good and gives plenty of use examples.

So, getting back to the data, here is my new queryable data set.

12-digit hydrologic units for the entire US

12-digit hydrologic units for the entire US

The pink blob on the right my new, harder question.. searching with not one pair of coordinates, but a bunch of pairs in the form of a pink polygon. Pink scares computers, so this should hurt it a little. It might also be scary that my database file representing the HUCs is now 1.9GB in size (lots of coordinates and the indexes to describes them).

Pink test polygon

Pink test polygon

Because I’m likely going to be using coordinate pairs passed in from some kind of Web application, I converted the polygon to well-known text using

1
SELECT ST_AsText(geom) FROM test_polys;

which of course gives us

select a geometry (polygon) as well-known text (wkt)

With the handy text string to describe my polygon given to me, I’m able to just copy and paste it into my text SQL. So let’s do that and the first query should really hurt because I’m not going to use an index. Note that I switched from Contains.. to Intersects since I want to detect anything that touches my pink poly.

1
2
SELECT HU_12_NAME FROM huc12
WHERE ST_Intersects(Geometry, ST_GeomFromText('POLYGON((-70.286127 43.839038, -70.305482 43.823696, -70.30855 43.798912, -70.292028 43.773421, -70.243169 43.769644, -70.231367 43.774601, -70.193601 43.811186, -70.186048 43.836206, -70.264648 43.839274, -70.264648 43.839274, -70.274089 43.840218, -70.286127 43.839038))')) = 1

How did it do? Surprisingly well. 40 seconds give or take. Of course, that won’t work for my application, so I’m keeping my fingers crossed that the index rescues me.

Polygon query with no index

Polygon query with no index

Now how about with the index? Here’s the query.

1
2
3
4
5
6
7
SELECT HU_12_NAME FROM huc12
WHERE ST_Intersects(Geometry, ST_GeomFromText('POLYGON((-70.286127 43.839038, -70.305482 43.823696, -70.30855 43.798912, -70.292028 43.773421, -70.243169 43.769644, -70.231367 43.774601, -70.193601 43.811186, -70.186048 43.836206, -70.264648 43.839274, -70.264648 43.839274, -70.274089 43.840218, -70.286127 43.839038))')) = 1
AND ROWID IN (
SELECT ROWID
FROM SpatialIndex
WHERE f_table_name = 'huc12'
AND search_frame = ST_GeomFromText('POLYGON((-70.286127 43.839038, -70.305482 43.823696, -70.30855 43.798912, -70.292028 43.773421, -70.243169 43.769644, -70.231367 43.774601, -70.193601 43.811186, -70.186048 43.836206, -70.264648 43.839274, -70.264648 43.839274, -70.274089 43.840218, -70.286127 43.839038))'));

And survey says! 0.186 seconds! Oh yeah.

Pink polygon query with spatial index

Pink polygon query with spatial index

So this all well and good, but the real reason why these queries are so fast is because the test geometry (the pink polygon) is so small. So lets push that a little.

Multipart polygon with lotsa geometry

Multipart polygon with lotsa geometry

So, here I’ve made a single multipart polygon with lots of vertices to keep my query simple. I’ll spare you the geometry and the query, but the pink polygon above, querying a whole country of HUC12s with the spatial index, took 1.4 seconds. So, we finally broke our time limit with enough testing geometry.

Spatialite and Spatial Indexes

This post is part of a series [1][2][3][4]

Tonight I continued dabbling with my little project and experimenting with spatial indexes in Spatialite. I quickly realized that while using indexes benefitted the queries, the questions were too easy and the queries were finishing very quickly regardless of using indexes or not (nice problem to have). Therefore, the benefits of using the indexes were being swamped out by little errors in timings.

So, I made a bigger dataset. As shown below, the number of features I’m testing against went from 183 to 2429. You’d think that would be enough, but stay tuned…

Bigger sheds

A bigger, badder testing dataset. Note the previous data set at 183 features (purple) and the much larger count of features in the new, blue polys.

So, armed with this larger dataset I proceeded to test sensitivity to using indexes.

I dinked around with trying to write own my R-tree index queries (the “some index” examples below) and they helped. Then Sandro (author of spatialite) replied to a post I made seeking guidance. There he re-wrote my query (best examples below)and suggested I read about a new, easier way to use spatial indexes here http://www.gaia-gis.it/gaia-sins/SpatialIndex-Update.pdf

The Original Query

So here is the original query.

1
2
SELECT HU_12_NAME FROM huc12
WHERE ST_Contains(Geometry, MakePoint(-70.250,43.802)) = 1

Faced with 2429 polygons, the original query takes about 1.9 seconds on average. My goal is to get results back from anywhere in the country in 1 second or less. Clearly this type of query isn’t going to cut it.

Timings without index

Timings without index

I’m also including the explain plans here. Obviously without an index the query has to do a full scan of the table to figure out which records it needs. Scans are bad. Scans of many records are very bad.

Explain plan with no index

Explain plan with no index

A Better Query

Ok, so to scan less you need to do some filtering based on something to reduce the records you’re talking to. In this case, we use the bboxes from the index tables to grab a more limited set of features to actually perform the more intensive ST_Contains test with just to make sure the point ACTUALLY falls within the set of polygons the BBOX suggests.

1
2
3
4
5
SELECT HU_12_NAME FROM huc12
WHERE ST_Contains(Geometry, MakePoint(-70.250,43.802))
AND
ROWID IN (
SELECT pkid FROM idx_huc12_Geometry WHERE xmin &lt; -69 AND xmax &gt; -71)

OK, now I was getting ~1.1 seconds pretty reliably. Yep, that’s detectable change between the two queries. So, we have enough records to see the benefits of the improved queries. This was an improvement, but not enough.

Timings with some index

Timings with some index

The explain plan gets better. We see the filtering and while we’re scanning, it’s more manageable.

Explain plan with some index

Explain plan with some index

 Best Query

So then came Sandro’s reply. In it he guided me to this type of query.. which rocks.

1
2
3
4
5
6
7
SELECT HU_12_NAME FROM huc12
WHERE ST_Contains(Geometry, MakePoint(-70.250,43.802)) = 1
AND ROWID IN (
SELECT ROWID
FROM SpatialIndex
WHERE f_table_name = 'huc12'
AND search_frame = MakePoint(-70.250,43.802));

And here we see true magic. The original query with no indexes ran for 1.9 seconds. Some index sped things up to 1.1. Now, hold on to your socks, Sandro’s more proper query comes in at 0.16 seconds. That kind of performance really gives me hope.

Timings with some index

Timings with some index

We don’t learn too much more from the explain plan. But hey, who needs to ask more questions of that kind of speed up.

Explain plan with optimal virtual index

Explain plan with optimal virtual index

That was fun. What does it do when stuck into PHP?

First let’s just do the same old “no index” version.

PHP page with more features and no index in query

PHP page with more features and no index in query

About 2.5 seconds on average. No surprises there.

How about with some spatial index.

PHP page with more features and some spatial index in query

PHP page with more features and some spatial index in query

 

About 1.8 seconds on average. Shaved quite a bit off, but we’re still way over the 1 second goal. How about with Sandro’s query?

PHP page with more features and best spatial index in query

PHP page with more features and best spatial index in query

Cool. We’re below the 1 second mark. I even saw a 0.7 second timing every so often. But, we’re very close to 1 second, so I’m starting to sweat a little. But wait, there’s more. I’ve still got those little filler queries happening on the page, and one of them is a bulky SELECT DISTINCT which is going to scan my larger dataset now. So, what happens when we get rid the crud?

PHP page with more features, best spatial index, and extra queries removed

PHP page with more features, best spatial index, and extra queries removed

Yes! At 0.26 seconds we’re way below 1 second again. Phew! What about the original query? What happens when we remove the cruft from that page but still use the un-optimized query?

PHP page with no index, but extra queries removed

PHP page with no index, but extra queries removed

1.9 seconds. Some improvement from the first page, but not where we need it. No worries, we’re moving on!

Example with PHP and Spatialite, part 2

This post is part of a series [1][2][3][4]

So I’m ready to take the next steps with my little project. This is a continuation of my previous post about my little journey. At this point I am connecting to a physical database file that I loaded with some sample data (12-digit watersheds). Now I’m going to practice with queries and you can see the results below.

Here are the base data.

12-digit US watersheds (HUC12) and the example data set used here. Found in Southern Maine, Cumberland

And here are some close ups of the data. These are fairly dense polygons.

Example of polygon vertices

Example of polygon vertices

In fact, it looks like this query is testing the relationship between the point and polygons formed by 144,700 coordinate pairs (vertices) by scanning without the help of an index.

Lots of little points to check

Lots of little points to check

At this point I’m just going to perform basic queries without using spatial indexes. You will almost always want to use spatial indexes, but I’m going to practice this in phases so these examples won’t use indexes.

Note that unlike tradition database indexes, spatial databases like Spatialite and PostGIS (and their GiST/R-tree indexes) do not use indexes for spatial queries unless you explicitly tell them to (though PostGIS seems to use them by default some of the time). You must smartly craft the use of indexes the same way that you do the SQL itself… or at least it seems this way to me.

In Spatialite, the indexes are just tables and you have to add subqueries to your query to grab the bounding rectangles from the Rtrees to pre-filter your queries for the index-driven speed-up.

And here are the spatial indexes that spatialite sees.

spatialite_indexes

Spatial indexes used in spatialite

So what’s in these indexes? Boxes…as we see below. Hopefully you can imagine how we get boxes from Xmin, Ymin, Xmax, Ymax extents. There is one box for each polygon HUC12 feature (note the PK_UID is the primary key of the main geometry layer). These simple boxes are much simpler to test for spatial relationships that the multitude of vertices we saw above. But also much less accurate; especially for funny shaped things like watersheds. But, we can use these simple boxes to pre-filter the number of features that have to be tested by the more accurate (but lengthy) spatial test – hence speeding up the overall operation in many cases.

What is in a name... or an Rtree index.

What is in a name... or an Rtree index.

 

Below is an example of the spatial query used in the code below. Translated, it says, “show me the name of the HUC12 that contains this point.”

The free gui provided by spatialite and a spatial query

The free gui provided by spatialite and a spatial query

Here are the files in the project so far. Of course you’re not normally going to be putting a  loadable extension library (libspatialite.so) in a web server file directory. But, this is just practice.

Files so far for this project

Files so far for this project

Here’s the code of db.php. This isn’t using spatial indexes, so it’s scanning 183 features and a whole bunch of vertices to figure out which polygon actually contains that point… and doing a couple simpler things like opening a connection, asking some simple questions, and closing the connection all in about 0.4 seconds.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
&lt;html&gt;
&lt;head&gt;
&lt;title&gt;Testing SpatiaLite on PHP&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
&lt;h1&gt;Testing SpatiaLite on PHP&lt;/h1&gt;

&lt;?php
// Start TIMER
// -----------
$stimer = explode( ' ', microtime() );
$stimer = $stimer[1] + $stimer[0];
// -----------
try {
/*** connect to SQLite database ***/
$db = new SQLite3('db.sqlite');

/*** a little message to say we did it ***/
echo 'database connected';
}
catch(PDOException $e)
{
echo $e-&gt;getMessage();
}
# loading SpatiaLite as an extension
$db-&gt;loadExtension('libspatialite.so');

# reporting some version info
$rs = $db-&gt;query('SELECT sqlite_version()');
while ($row = $rs-&gt;fetchArray())
{
print "&lt;h3&gt;SQLite version: $row[0]&lt;/h3&gt;";
}
$rs = $db-&gt;query('SELECT spatialite_version()');
while ($row = $rs-&gt;fetchArray())
{
print "&lt;h3&gt;SpatiaLite version: $row[0]&lt;/h3&gt;";
}

/* SELECT HU_12_NAME FROM huc12 WHERE ST_Contains(Geometry, MakePoint(-70.250,43.802));
*/

/*
* Create a query
*/

$sql = "SELECT DISTINCT Count(*), ST_GeometryType(Geometry), ";
$sql .= "ST_Srid(Geometry) FROM huc12";
$rs = $db-&gt;query($sql);
while ($row = $rs-&gt;fetchArray())
{
# read the result set
$msg = "There are ";
$msg .= $row[0];
$msg .= " entities of type ";
$msg .= $row[1];
$msg .= " SRID=";
$msg .= $row[2];
print "&lt;h3&gt;$msg&lt;/h3&gt;";
}

$sql = "SELECT HU_12_NAME FROM huc12 WHERE ST_Contains(Geometry, MakePoint(-70.250,43.802))";
$rs = $db-&gt;query($sql);
while ($row = $rs-&gt;fetchArray())
{
# read the result set
$msg = "Your point is in the HUC12: ";
$msg .= $row[0];
print "&lt;h3&gt;$msg&lt;/h3&gt;";
}
/*
* do not forget to release all handles !
*/


# closing the DB connection
$db-&gt;close();

// End TIMER
// ---------
$etimer = explode( ' ', microtime() );
$etimer = $etimer[1] + $etimer[0];
echo '&lt;p style="margin:auto; text-align:center"&gt;';
printf( "Script timer: &lt;b&gt;%f&lt;/b&gt; seconds.", ($etimer-$stimer) );
echo '&lt;/p&gt;';
// ---------

?&gt;

&lt;/body&gt;
&lt;/html&gt;

 

Not too bad, but I want this faster because I want to feed it much larger data in my final project.

Results of the first try at this

Results of the first try at this

Example with PHP and Spatialite, part 1

This post is part of a series [1][2][3][4]

So, PHP supports SQLite out of the box (http://www.sqlite.org/, now at 3.7.10), making it a nice combo if you want to do some reads from your page. My impression is that SQLite is not recommended if you want stuff with database writes and you have more than a couple visitors. But reading seems to be fine.

I think this fits my use cases just fine as I just want to hang out some very basic utility services that can run on the “single beige box in the corner” or the “beige cloud in the sky” with few resources needed. First, I want to create a simple REST service that, when passed a pair of long/lat coordinates, will do nothing but return that name of the county they are in. Then I’ll do one for watershed identifiers (USDA WBD HUC12 to be exact), and eventually maybe I’ll work up to a nearest place service (http://www.geonames.org/) and so on. Maybe even some downstream/upstream routing with Spatialite’s network utilities [1], [2], [3].

Of course, these are spatial functions so I’m using the spatial extension to SQLite called Spatialite found at http://www.gaia-gis.it/gaia-sins/. I find Spatialite to be a profoundly elegant amalgam of existing projects (SQLite and GEOS) and new, efficient and pragmatic programming that fills an empty niche. Here that niche is don’t make me deploy anything more than I need. I simply want some basic, re-usable services that don’t do much, including have their data updated – and I don’t want to run a heavy spatial infrastructure just to cheaply answer some basic questions on a $6/month virtual, private, cloud LAMP box with 256MB of memory or this little Pentium 4 appliance running in my snack drawer at work.

So, I began compiling spatialite, and at the time I was using 3.0beta1a, so I just kept running with it. I’m still learning the basics of spatialite so I dinked around a bit. Then I followed the instructions for getting spatialite running within PHP  [here] and/or [here] (not sure which one is the official guide. The site has been migrating to a new infrastructure lately).

After making way too many typos, I got it working and am getting the expected output. I also added some timer code which tells me that from my Ubuntu VM running on my 6-month-old laptop I’m completing these ~30,000 operations in about 6 seconds against the in-memory database, including opening and closing the connection to a database and tables that are created each page load.

Sample spatialite with PHP screen

Sample spatialite with PHP screen

My next exercise will be to figure out how to connect to an existing disk-based DB and try some simpler operations. My goal will be to get my operations out the door in about 1 second on modest hardware under no load.

[code below]
Continue reading

Remove all foreign keys from MySQL database

I needed to load some lookup data into a MySQL database littered with foreign keys. So I copied the empty DB and ran the output of the following:

1
2
3
SELECT CONCAT('alter table ',table_schema,'.',TABLE_NAME,' DROP FOREIGN KEY ',constraint_name,';')
FROM information_schema.table_constraints
WHERE constraint_type='FOREIGN KEY' AND table_schema = 'MY_DATABASE_NAME_HERE';

I loaded the values, and dumped just the values (only insert statements) that I will use to populate the version of the DB with my FK’s intact.

Creating an ST_ table and view in Oracle for ArcSDE

I created a table with ST_GEOMETRY as a column with SQL, inserted records, created a view with some filtering and registered both with ArcSDE.

They are both accessible and useable from ArcMap and ArcCatalog.

Steps:
create and load the table

 

1
2
3
4
5
6
7
8
9
CREATE TABLE TEST_GIS_PERMITS (OID INTEGER NOT NULL, permit_no nvarchar2(12), location sde.st_geometry);

CREATE INDEX IX1_TGP ON TEST_GIS_PERMITS (LOCATION) INDEXTYPE IS SDE.ST_SPATIAL_INDEX PARAMETERS('ST_GRIDS = 4644.5262325165 ST_SRID = 8 ST_COMMIT_ROWS = 10000  PCTFREE 0 INITRANS 4') NOPARALLEL;

CREATE UNIQUE INDEX UX1_TGP ON TEST_GIS_PERMITS (OID) NOLOGGING NOPARALLEL;

INSERT INTO TEST_GIS_PERMITS (OID, permit_no, location) (SELECT objectid, permit_no, shape FROM SW_PERMITS_09_2007);

commit;

then create the view

1
2
3
4
5
6
7
8
9
10
11
CREATE VIEW V_TEST_GIS_PERMITS

AS

SELECT oid, permit_no, location

FROM TEST_GIS_PERMITS

WHERE SUBSTR(permit_no,1,3) = 'A90';

    

register the layer with SDE

1
2
3
4
5
6
7
C:\Users\myname&gt;sdelayer -o register -l TEST_GIS_PERMITS,LOCATION -t ST_GEOMETRY -C OID,USER -u GA_DEV -p devGA0628 -s DIVS135GEODEV -i sde:oracle11g:/:GA_DEV -e p -R 1

ArcSDE 10.0  for Oracle11g Build 685 Fri May 14 12:05:43  2010
Layer    Administration Utility
-----------------------------------------------------

Successfully Created Layer.

 

 Register the view with SDE

1
2
3
4
5
6
7
8
9
C:\Users\myname&gt;sdelayer -o register -l V_TEST_GIS_PERMITS,LOCATION -t ST_GEOMETRY -C OID,USER -u GA_DEV -p devGA0628 -s DIVS135GEODEV -i sde:oracle11g:/:GA_DEV -e p -R 1

ArcSDE 10.0  for Oracle11g Build 685 Fri May 14 12:05:43  2010

Layer    Administration Utility

-----------------------------------------------------

<span>Successfully Created Layer.

Now we can read and write to this layer using ESRI’s ST_ SQL or ArcMap. and read the latest records from the view (which limits the out of date records since we are versioning using home grown methods in Oracle tables) with ArcGIS Server OR ArcMAP OR our Oracle/.Net application which doesn’t care about the geometry.

Read the Descriptions from all columns in MS Access database

From one of my developers. This VBA (couldn’t find an approach in .Net) will loop through all the tables and write out the description from all the fields.

 

TODO’s include grabbing descriptions for tables and then tweaking below to create ALTER statements so that the comments can be applied to a server RDBMS after migration from Access (since almost none of the migration utilities I’ve seen migrate this documentation from Access).

 

Option Explicit

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
'call readAllTables
Public Function readAllTables()

Dim DB As Database, tbl As TableDef, fld As DAO.Field

Dim RS As Recordset
Dim Table As String
Dim allDesc As String

Set DB = CurrentDb()

For Each tbl In DB.TableDefs
If Left$(tbl.Name, 4) &lt;&gt; "MSys" Then
'Debug.Print "In Table " &amp; tbl.Name '&amp; "      " &amp; tbl.DateCreated &amp; "      " &amp; tbl.LastUpdated &amp; "     " &amp; tbl.RecordCount
allDesc = allDesc &amp; vbNewLine &amp; "Table:" &amp; tbl.Name
' optional code to print all the fields
On Error Resume Next
For Each fld In tbl.Fields
'Debug.Print fld.Name
allDesc = allDesc &amp; vbNewLine &amp; fld.Name &amp; ":" &amp; fld.Properties("Description")
Next fld
End If
Next tbl

WriteToATextFile (allDesc)
End Function

Sub WriteToATextFile(ByVal outputStr)
'first set a string which contains the path to the file you want to create.
'this example creates one and stores it in the root directory
Dim MyFile As String
MyFile = "c:\" &amp; "TableFieldsWithDesc.txt"
'set and open file for output
Dim fnum As Integer
fnum = FreeFile()
Open MyFile For Output As fnum
'write project info and then a blank line. Note the comma is required
Write #fnum, outputStr
Write #fnum,
Close #fnum
End Sub