Benchmarking MySQL

So I’m trying to figure which old machine to turn into a little mySQL number cruncher. So, I’m going to do some clean installs of Ubuntu server on each and run this little script (with the same my.cnf) and see how they fair. Perhaps you will this useful, run it a few times.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
#!/bin/sh
# jcz 2011-April-22
#
# This script will time your MySQL database in a repeatable way
#
# Date and other variables pretty self explanatory, S is seconds
# date format is currently YYYYMMDD_HHMMSS
    dater=$(date +%Y%m%d_%H%M%S)
    dayer=$(date +%a)
    myhost=$(hostname)
    directory=$(pwd)
    outfile="slapout.txt"

# THE MYSQL USER
    super="username"

# THE MYSQL SUPERPWORD
    superword="password"

# THE MYSQL HOSTNAME or IP.
    hoster="localhost"

echo "------------------ BEGIN --------------------" >> $outfile
date >> $outfile
echo $myhost >> $outfile
echo $directory >> $outfile

# COPY THE COMMAND BELOW
mysqlslap -u$super -p$superword  -h$hoster -v --concurrency=1 --iterations=2 --number-int-cols=4 --number-char-cols=5 --auto-generate-sql --auto-generate-sql-secondary-indexes=3 --engine=myisam,innodb  --auto-generate-sql-add-autoincrement --auto-generate-sql-load-type=mixed  --number-of-queries=2 >> $outfile

 echo "................................" >> $outfile
 # PASTE THE COMMAND BELOW BETWEEN THE QUOTES OR EDIT BOTH. I CAN'T FIND ANOTHER WAY TO RECORD IT
echo "mysqlslap -u$super -p$superword  -h$hoster -v --concurrency=1 --iterations=2 --number-int-cols=4 --number-char-cols=5 --auto-generate-sql --auto-generate-sql-secondary-indexes=3 --engine=myisam,innodb  --auto-generate-sql-add-autoincrement --auto-generate-sql-load-type=mixed  --number-of-queries=2" >> $outfile
echo "................................" >> $outfile
echo "The above command was executed to produce the results above it." >> $outfile
echo "------------------ END --------------------" >> $outfile
echo "" >> $outfile
echo "" >> $outfile
echo "" >> $outfile

The above script makes output like below

Continue reading

Run queries in sequence in Access

I’ve gotten so used to using “real” databases, that I find myself frustrated when I have to switch back to Microsoft Access. But, hey, it’s good for a lot of things.

Annoyingly, if you just want to run some SQL back-to-back, or one after another, you have to call it in VBA. So, create a module and do something silly like.

Sub update_results()
DoCmd.SetWarnings False
DoCmd.OpenQuery “q_1-4000″
DoCmd.OpenQuery “q_4001-8000″
DoCmd.OpenQuery “q_the_rest”
DoCmd.SetWarnings True
End Sub

The SetWarnings stuff stops pop ups from annoying you about the fact that you are going to update some data.

Screen capture programs

I always need to capture computer screens into graphic files. Even for my Mom who needs the occasional over-email tutorial. So here is a list of screen capture utitlities that I’ve used over the years. Snag-it is by far my tool of choice. But really, all will work. Do you know of any more?

Do you run Firefox? If so, try this.

https://addons.mozilla.org/en-US/firefox/addon/awesome-screenshot-capture-/?src=discovery-learnmore

 or..

 I bought this, can’t live without it for $50

http://www.techsmith.com/snagit/

 

 

This is free and I’ve liked it over the years

http://cropper.codeplex.com/releases/view/56509

http://blogs.geekdojo.net/brian/articles/Cropper.aspx

 

Some people like this, but I never got into it

http://www.gadwin.com/printscreen/

 

Tables of Tables from MySQL

From Matthew Crowley on the MySQl forums (http://forums.mysql.com/read.php?101,8004), this php script will output a DESC of all databases and tables in MySQL. It needs some formatting for the output, but it works and might be handy later. I really just need to get around to figuring out how to do this in a procedure or something.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
$connection = mysql_connect("localhost","root","PASSWORD");

$showDB = mysql_query("show databases");
if($myrow=mysql_fetch_array($showDB))
{
do
{
$DB = $myrow["Database"];
echo "$DB \n";
$showTable = mysql_query("show tables from $DB");
if($myrow=mysql_fetch_array($showTable))
{
do
{
$col = "Tables_in_".$DB;
$Table = $myrow["$col"];
echo "$Table \n";
$describeTable = mysql_query("describe $DB.$Table");
if($myrow=mysql_fetch_array($describeTable))
{
do
{
$field = $myrow["Field"];
$type = $myrow["Type"];
$null = $myrow["Null"];
$key = $myrow["Key"];
$default = $myrow["Default"];
$extra = $myrow["Extra"];
echo "$field \t $type \t $null \t $key \t $default \t $extra \n";
}
while ($myrow=mysql_fetch_array($describeTable));
}
}
while ($myrow=mysql_fetch_array($showTable));
}
}
while ($myrow=mysql_fetch_array($showDB));
}

Handy database documenter for MySQL

UPDATE: see the next iteration on this project [here]

 

Here’s a view that will spit out just about everything MySQL (5.1) knows about the tables and fields it maintains for you. The first field can be joined to the output of something like

1
SELECT * FROM AZ_CA_NV_UT_species_LOCAL PROCEDURE ANALYSE(10000, 4000);

to see before and after “optimal” (1)   (2) field types and lengths predicted by the internal PROCEDURE ANALYSE.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v_field_table_data` AS
SELECT
CONCAT_WS('.',`tables`.`TABLE_SCHEMA`,`columns`.`TABLE_NAME`,`columns`.`COLUMN_NAME`) AS `FIELD_NAME`,
`tables`.`TABLE_COMMENT`             AS `TABLE_COMMENT`,
`columns`.`TABLE_NAME`               AS `TABLE_NAME`,
`columns`.`COLUMN_NAME`              AS `COLUMN_NAME`,
`columns`.`COLUMN_TYPE`              AS `COLUMN_TYPE`,
`columns`.`DATA_TYPE`                AS `DATA_TYPE`,
`columns`.`COLUMN_KEY`               AS `COLUMN_KEY`,
`tables`.`TABLE_SCHEMA`              AS `TABLE_SCHEMA`,
`tables`.`TABLE_TYPE`                AS `TABLE_TYPE`,
`tables`.`ENGINE`                    AS `ENGINE`,
`tables`.`VERSION`                   AS `VERSION`,
`tables`.`ROW_FORMAT`                AS `ROW_FORMAT`,
`tables`.`TABLE_ROWS`                AS `TABLE_ROWS`,
`tables`.`AVG_ROW_LENGTH`            AS `AVG_ROW_LENGTH`,
`tables`.`DATA_LENGTH`               AS `DATA_LENGTH`,
`tables`.`MAX_DATA_LENGTH`           AS `MAX_DATA_LENGTH`,
`tables`.`INDEX_LENGTH`              AS `INDEX_LENGTH`,
`tables`.`AUTO_INCREMENT`            AS `AUTO_INCREMENT`,
`tables`.`CREATE_TIME`               AS `CREATE_TIME`,
`tables`.`UPDATE_TIME`               AS `UPDATE_TIME`,
`tables`.`TABLE_COLLATION`           AS `TABLE_COLLATION`,
`tables`.`CHECKSUM`                  AS `CHECKSUM`,
`tables`.`CREATE_OPTIONS`            AS `CREATE_OPTIONS`,
`columns`.`ORDINAL_POSITION`         AS `ORDINAL_POSITION`,
`columns`.`COLUMN_DEFAULT`           AS `COLUMN_DEFAULT`,
`columns`.`IS_NULLABLE`              AS `IS_NULLABLE`,
`columns`.`CHARACTER_MAXIMUM_LENGTH` AS `CHARACTER_MAXIMUM_LENGTH`,
`columns`.`CHARACTER_OCTET_LENGTH`   AS `CHARACTER_OCTET_LENGTH`,
`columns`.`NUMERIC_PRECISION`        AS `NUMERIC_PRECISION`,
`columns`.`NUMERIC_SCALE`            AS `NUMERIC_SCALE`,
`columns`.`CHARACTER_SET_NAME`       AS `CHARACTER_SET_NAME`,
`columns`.`COLLATION_NAME`           AS `COLLATION_NAME`,
`columns`.`EXTRA`                    AS `EXTRA`,
`columns`.`PRIVILEGES`               AS `PRIVILEGES`,
`columns`.`COLUMN_COMMENT`           AS `COLUMN_COMMENT`,
NOW()                                AS `RUN_DATETIME`
FROM (`information_schema`.`tables`
JOIN `information_schema`.`columns`
ON (((`tables`.`TABLE_SCHEMA` = `columns`.`TABLE_SCHEMA`)
AND (`tables`.`TABLE_NAME` = `columns`.`TABLE_NAME`))))

MySQL Functions

I just want to record these for future reference. I’m actually using the first now.

From the MySQL 5.0 Online manual

An example of how to make the first letter in a string uppercase – analogous to UCFIRST

1
SELECT CONCAT(UPPER(SUBSTRING(firstName, 1, 1)), LOWER(SUBSTRING(firstName FROM 2))) AS properFirstName

a user-defined function in MySQL 5.0+ similar to PHP’s substr_count(), since I could not find an equivalent native function in MySQL. (If there is one please tell me!!!)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
delimiter ||
DROP FUNCTION IF EXISTS substrCount||
CREATE FUNCTION substrCount(s VARCHAR(255), ss VARCHAR(255)) RETURNS TINYINT(3) UNSIGNED LANGUAGE SQL NOT DETERMINISTIC READS SQL DATA
BEGIN
DECLARE COUNT TINYINT(3) UNSIGNED;
DECLARE offset TINYINT(3) UNSIGNED;
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET s = NULL;

SET COUNT = 0;
SET offset = 1;

REPEAT
IF NOT ISNULL(s) AND offset > 0 THEN
SET offset = LOCATE(ss, s, offset);
IF offset > 0 THEN
SET COUNT = COUNT + 1;
SET offset = offset + 1;
END IF;
END IF;
UNTIL ISNULL(s) OR offset = 0 END REPEAT;

RETURN COUNT;
END;

||

delimiter ;

Use like this:

1
SELECT substrCount('/this/is/a/path', '/') `count`;

`count` would return 4 in this case. Can be used in such cases where you might want to find the “depth” of a path, or for many other uses.

It’s pretty easy to create your own string functions for many examples listed here

1
2
3
4
5
6
## COUNT substrings

CREATE FUNCTION substrCount(x VARCHAR(255), delim VARCHAR(12)) RETURNS INT
RETURN (LENGTH(x)-LENGTH(REPLACE(x, delim, '')))/LENGTH(delim);

SELECT substrCount('/this/is/a/path', '/') AS COUNT;

+——-+
| count |
+——-+
|     4 |
+——-+

1
SELECT substrCount('/this/is/a/path', 'is') AS COUNT;

+——-+
| count |
+——-+
|     2 |
+——-+

1
2
3
4
5
6
## Split delimited strings

CREATE FUNCTION strSplit(x VARCHAR(255), delim VARCHAR(12), pos INT) RETURNS VARCHAR(255)
RETURN REPLACE(SUBSTRING(substring_index(x, delim, pos), LENGTH(substring_index(x, delim, pos - 1)) + 1), delim, '');

SELECT strSplit("aaa,b,cc,d", ',', 2) AS SECOND;

+——–+
| second |
+——–+
| b      |
+——–+

1
SELECT strSplit("a|bb|ccc|dd", '|', 3) AS third;

+——-+
| third |
+——-+
| ccc   |
+——-+

1
SELECT strSplit("aaa,b,cc,d", ',', 7) AS 7th;

+——+
| 7th  |
+——+
| NULL |
+——+

1
2
3
4
5
6
## UPPER CASE FIRST letter, UCFIRST OR INITCAP

CREATE FUNCTION ucfirst(x VARCHAR(255)) RETURNS VARCHAR(255)
RETURN concat( UPPER(SUBSTRING(x,1,1)),LOWER(SUBSTRING(x,2)) );

SELECT ucfirst("TEST");

+—————–+
| ucfirst(“TEST”) |
+—————–+
| Test            |
+—————–+

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
##Or a more complicated example, this will repeat an INSERT after every nth POSITION.

DROP FUNCTION insert2;
DELIMITER //
CREATE FUNCTION insert2(str text, pos INT, delimit VARCHAR(124))
RETURNS text
DETERMINISTIC
BEGIN
DECLARE i INT DEFAULT 1;
DECLARE str_len INT;
DECLARE out_str text DEFAULT '';
SET str_len=LENGTH(str);
WHILE(i<str_len) DO
SET out_str=CONCAT(out_str, SUBSTR(str, i,pos), delimit);
SET i=i+pos;
END WHILE;
-- trim delimiter from end of string
SET out_str=TRIM(TRAILING delimit FROM out_str);
RETURN(out_str);
END//
DELIMITER ;

SELECT insert2("ATGCATACAGTTATTTGA", 3, " ") AS seq2;

+————————-+
| seq2                    |
+————————-+
| ATG CAT ACA GTT ATT TGA |
+————————-+

—————————-

Accessing versioned geodata in ArcSDE with SQL

This is excerpted from the following forum post:

“We’re running SDE and have several featuresets with versioning turned on…. We have numerous non-GIS applications that use SQL queries to access information from the spatial data. However, we don’t get all the features that I’m expecting to be returned. How do I access the features that have been added/deleted (and are, therefore, “hidden” from a straight SQL query)?”

As long as it is just the attributes that you are after, you can set up a multi-versioned view using sdetable.exe.

1
sdetable.exe -o create_mv_view -T mvv_wMeter -t wMeter

Then on your db connection, you execute the set_current_version stored proc to set the version, then issue your select statement on the multi-versioned view:

1
2
3
exec sde.set_current_version 'SDE.SOMEOTHERVERSION'
GO
Select COUNT(*) from mvv_wMeter ;

i don’t have it my example, but sdetable.exe needs db connection info arguments: -s,-i,-u,-p,-D in some combination depending on your config (I usually just set the SDEDATABASE,SDEINSTANCE,SDESERVER env vars in a batch file–sdetable use the env vars if they exist). – Jay Cummins Aug 6 ’10 at 19:39

I think this ESRI support page may be relevant.

I believe it is worth posting the warnings on that page here:

Never use database management system (DBMS) tools to update any row ID (object ID) field maintained by ArcSDE in the database. These object ID fields are allocated and managed by the geodatabase and, therefore, should not be altered using SQL.

Never edit the DEFAULT version of the geodatabase using SQL. Starting an edit session on a version obtains an exclusive lock on the state that the version references. If you lock the DEFAULT version, you prevent ArcGIS users from connecting to the geodatabase.

In the 9.3 help page they also warned against editing non-simple feature class attributes (Geometric Networks, Topologies, etc.) via SQL.

ITIS Organism Flat table from MySQL

I need to make a table to look up common (vernacular) names for some organisms. So I imported the USGS ITIS database from text files into MySQL and created this little view. Hopefully this helps someone else.

CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v_flat_table` AS
SELECT
  `taxonomic_units`.`tsn`         AS `tsn`,
  `taxonomic_units`.`unit_ind1`   AS `unit_ind1`,
  `taxonomic_units`.`unit_name1`  AS `unit_name1`,
  `taxonomic_units`.`unit_ind2`   AS `unit_ind2`,
  `taxonomic_units`.`unit_name2`  AS `unit_name2`,
  `taxonomic_units`.`unit_ind3`   AS `unit_ind3`,
  `taxonomic_units`.`unit_name3`  AS `unit_name3`,
  `taxonomic_units`.`unit_ind4`   AS `unit_ind4`,
  `taxonomic_units`.`parent_tsn`  AS `parent_tsn`,
  `taxonomic_units`.`update_date` AS `update_date`,
  `taxon_unit_types`.`rank_name`  AS `rank_name`,
  `vernaculars`.`vernacular_name` AS `vernacular_name`,
  `longnames`.`completename`      AS `completename`
FROM (((`taxon_unit_types`
     JOIN `taxonomic_units`
       ON (((`taxon_unit_types`.`rank_id` = `taxonomic_units`.`rank_id`)
            AND (`taxon_unit_types`.`kingdom_id` = `taxonomic_units`.`kingdom_id`))))
    LEFT JOIN `vernaculars`
      ON ((`vernaculars`.`tsn` = `taxonomic_units`.`tsn`)))
   LEFT JOIN `longnames`
     ON ((`longnames`.`tsn` = `taxonomic_units`.`tsn`)))

Loading tab-delimited data to local MySQL

I’m looking at some data from the Avian Knowledge Network and Microsoft Access just isn’t up to dealing with the volume of records. So I switched over to MySQL.

I’m using the Positive Observation Essentials format as queried from their database. Here’s an example of the data:

I did use a sed command (created and verified in Excel) on the original files to fix the spaces in the column names (if you are in the business of making data for people, never, ever, ever, ever create large tabular files with spaces or special characters in the column names. Avoid dashes as well, and preferably use all caps). Note that the commands below will REPLACE your files. So be sure to back up your originals in case anything goes wrong.

find . -name ‘*AKN*’ -exec sed -i ‘s/Record\ Number/Record_Number/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Global\ Unique\ Identifier/Global_Unique_Identifier/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Collection\ Code/Collection_Code/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Scientific\ Name/Scientific_Name/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Latitude/Latitude/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Longitude/Longitude/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Country/Country/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/State\ Province/State_Province/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Observation\ Count/Observation_Count/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Obs\ Count\ At\ Least/Obs_Count_At_Least/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Obs\ Count\ At\ Most/Obs_Count_At_Most/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Observation\ Date/Observation_Date/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Duration\ In\ Hours/Duration_In_Hours/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Time\ Observations\ Started/Time_Observations_Started/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Time\ Observations\ Ended/Time_Observations_Ended/g’ {} \;
find . -name ‘*AKN*’ -exec sed -i ‘s/Sampling\ Event\ Identifier/Sampling_Event_Identifier/g’ {} \;

Then I create a basic staging table (commands also from Excel). I don’t know why I used InnoDB, when MyIsam would have been faster. But, notice the absence of indexes for faster loading.

CREATE TABLE `nv_akn` (
  `Record_Number` VARCHAR(30) DEFAULT NULL,
  `Global_Unique_Identifier` VARCHAR(30) DEFAULT NULL,
  `Collection_Code` VARCHAR(30) DEFAULT NULL,
  `Scientific_Name` VARCHAR(150) DEFAULT NULL,
  `Latitude` VARCHAR(30) DEFAULT NULL,
  `Longitude` VARCHAR(30) DEFAULT NULL,
  `Country` VARCHAR(50) DEFAULT NULL,
  `State_Province` VARCHAR(50) DEFAULT NULL,
  `Observation_Count` VARCHAR(30) DEFAULT NULL,
  `Obs_Count_At_Least` VARCHAR(30) DEFAULT NULL,
  `Obs_Count_At_Most` VARCHAR(30) DEFAULT NULL,
  `Observation_Date` VARCHAR(30) DEFAULT NULL,
  `Duration_In_Hours` VARCHAR(30) DEFAULT NULL,
  `Time_Observations_Started` VARCHAR(30) DEFAULT NULL,
  `Time_Observations_Ended` VARCHAR(30) DEFAULT NULL,
  `Sampling_Event_Identifier` VARCHAR(50) DEFAULT NULL
) ENGINE=INNODB DEFAULT CHARSET=latin1

I’m just doing this locally, so XAMPP is my friend. So from the xampp /mysql/bin directory, I ran mysql.exe. I chose my database and ran the following on the text files produced by the above sed cleaners.

mysql> LOAD DATA LOCAL INFILE ‘c:/xampp/mysql/bin/Nevada_Pos_obs_Essent_15-MAR-2011.txt’ INTO TABLE nv_akn
FIELDS TERMINATED BY ‘\t’
LINES TERMINATED BY ‘\n’ IGNORE 1 LINES;

I used LOCAL since the database is on my workstation. Note the full path to the windows file, with forward slashes. Fields are tab-delimited, lines seem to just use carriage returns (or at least it doesn’t look like I need another line ender) and I’m ignoring the column header row.