Monday, December 04, 2006

MySQL Recipes: promoting a slave to master or changing masters

In the corporate world, promoting a slave to a master requires many years of hard work, a suitable business degree, an MBA, and a taste for promotion. The alternative is getting that break as an entrepreneur, where you start as the master of your own destiny and end up being the master when the money runs out, or you take the money and move on.

In MySQL land, failover for redundancy, disaster recovery or load balancing is handled with master and slave databases, the most popular method being binlog replication.

Note: These commands are valid for MySQL 3.23, 4.0, 4.1, 5.0 and 5.1.

For older versions:

  1. replace RESET MASTER with FLUSH MASTER.
  2. replace RESET SLAVE with FLUSH SLAVE.

In a simple configuration you might have one master and one slave.

Recipe to promote a slave to a master for simple replication.

On Master:

  1. FLUSH LOGS;

On Slave:

  1. STOP SLAVE;
  2. RESET MASTER;
  3. Optional: CHANGE MASTER TO MASTER_HOST='';

More complex setups:

You might have a chain of replication which uses a slave as a pseudo master for another slave. Kind of like a king giving orders to a local sheriff, who then tells the serf.

eg: master M1 -> slave S1 -> slave S2

Note: Slave S1 has --log-slave-updates enabled

Recipe to promote a slave to a master for simple chained replication.

On Master M1:

  1. FLUSH LOGS;

On Slave S1:

  1. STOP SLAVE;
  2. RESET MASTER;
  3. Optional: CHANGE MASTER TO MASTER_HOST='';

On Slave S2:

  1. STOP SLAVE;
  2. RESET SLAVE;
  3. START SLAVE;

You might have multiple slave replication, with one master and two or more slaves.

eg: slave S2 <- master M1 -> slave S1

Recipe to promote a slave to a master for multiple slave replication.

Note: Slave S2 is going to use the newly promoted slave S1 as its master.

On Master M1:

  1. FLUSH LOGS;

On Slave S1:

  1. STOP SLAVE;
  2. RESET MASTER;
  3. Optional: CHANGE MASTER TO MASTER_HOST='';

On Slave S2:

  1. STOP SLAVE;
  2. CHANGE MASTER TO MASTER_HOST = 'SLAVE S1 hostname';
  3. RESET SLAVE;
  4. START SLAVE;
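Step 2 above shows just the host name; in practice CHANGE MASTER TO usually also needs the replication user and the binlog coordinates taken from SHOW MASTER STATUS on S1 after its RESET MASTER. A rough sketch, where the host, user, password, binlog file and position are all placeholders:

-- On S1 (the new master), note the current binlog file and position:
SHOW MASTER STATUS;

-- On S2, in place of step 2 above (all values are placeholders):
CHANGE MASTER TO
MASTER_HOST = 'slave-s1.example.com',
MASTER_USER = 'repl',
MASTER_PASSWORD = 'repl_password',
MASTER_LOG_FILE = 'slave-s1-bin.000001',
MASTER_LOG_POS = 98;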

Verification:

Once you have used one of the recipes and are keen to taste the results, run the command

SHOW SLAVE STATUS;

Make sure that the binlog for the master is actually the correct one.
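For example, the fields I tend to eyeball (names as per 4.1/5.0; use \G in the mysql client for vertical output):

SHOW SLAVE STATUS;
-- Check that Master_Host and Master_Log_File point at the intended master,
-- and that Slave_IO_Running and Slave_SQL_Running both say Yes.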

One of the joys of using MySQL is the simple commands required to perform tasks which can require some fancy footwork on other RDBMSs.

There are a couple more methods of replication which aren't covered here. Also see the MySQL Replication FAQ. The methods are the same, but the formatting of the procedure(s) there is less than ideal.

Have Fun

Paul

Monday, October 16, 2006

SQLserver performance tricks

I came across this post at Adam's website on using TOP and ORDER BY to achieve intermediate materialization of an inline view.
I promised Adam in the comments that I would pass it on, and I have: via email to my DBA mates, who I am sure haven't seen this technique, and now here.

This is a very good technique. I was going to type trick, however a trick is something which conjures up an image of magic; once you use it yourself enough it becomes a skill.

Adam's article, whilst not mentioning it, also shows the iterative process of tuning. At times the steps can be large and seem weird to the outsider, i.e. how did you get from here to there without all those intermediate steps?

I tried Adam's technique on MySQL 5.0. Maybe I am missing something with MySQL, however I couldn't get MySQL to use the derived table on the join... sigh.
I noticed the documentation (MySQL calls them derived tables, not inline views) mentions that they are always evaluated for optimisation!?!
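For reference, my reading of Adam's TOP trick in SQL Server terms looks something like the query below, written against the same ratings table I use in the MySQL tests that follow; the TOP ... ORDER BY in the derived table is what (in Adam's tests) coaxes the optimizer into materialising the aggregate once:

select a.userid, b.theCount
from ratings a
join (
select top 2147483647 movieid, count(*) as theCount
from ratings
group by movieid
order by movieid
) b on a.movieid = b.movieid;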



Simple join

mysql> explain select a.userid,count(*) from ratings a
-> join ratings b on a.movieid = b.movieid
-> group by a.userid;
+----+-------+--------------------+--------------------+---------+------------------+--------+--------------------------+
| id | type  | possible_keys      | key                | key_len | ref              | rows   | Extra                    |
+----+-------+--------------------+--------------------+---------+------------------+--------+--------------------------+
|  1 | index | ix_ratings_movieid | ix_ratings_userid  | 5       | NULL             | 100165 |                          |
|  1 | ref   | ix_ratings_movieid | ix_ratings_movieid | 5       | movie1.a.movieid |     28 | Using where; Using index |
+----+-------+--------------------+--------------------+---------+------------------+--------+--------------------------+

2 rows in set (0.00 sec)

Using INLINE view

mysql> explain select a.userid,b.theCount from ratings a
-> join
-> (select movieid,count(*) theCount from ratings b group by movieid) b on a.movieid = b.movieid
-> ;
+----+-------+--------------------+--------------------+---------+-----------+--------+-------------+
| id | type  | possible_keys      | key                | key_len | ref       | rows   | Extra       |
+----+-------+--------------------+--------------------+---------+-----------+--------+-------------+
|  1 |       | NULL               | NULL               | NULL    | NULL      |   1682 |             |
|  1 | ref   | ix_ratings_movieid | ix_ratings_movieid | 5       | b.movieid |     28 | Using where |
|  2 | index | NULL               | ix_ratings_movieid | 5       | NULL      | 100165 | Using index |
+----+-------+--------------------+--------------------+---------+-----------+--------+-------------+
3 rows in set (0.11 sec)


Using INLINE with ORDER BY n LIMIT m

mysql> explain select a.userid,b.theCount from ratings a
-> join
-> (select movieid,count(*) theCount
-> from ratings group by movieid order by movieid limit 2147483647) b on a.movieid = b.movieid
-> ;
+----+-------+--------------------+--------------------+---------+-----------+--------+-------------+
| id | type  | possible_keys      | key                | key_len | ref       | rows   | Extra       |
+----+-------+--------------------+--------------------+---------+-----------+--------+-------------+
|  1 |       | NULL               | NULL               | NULL    | NULL      |   1682 |             |
|  1 | ref   | ix_ratings_movieid | ix_ratings_movieid | 5       | b.movieid |     28 | Using where |
|  2 | index | NULL               | ix_ratings_movieid | 5       | NULL      | 100165 | Using index |
+----+-------+--------------------+--------------------+---------+-----------+--------+-------------+
3 rows in set (0.11 sec)



Have Fun

Friday, September 29, 2006

Infrequent blog posting

It seems a while since I have posted here, and recently I haven't posted anything here about databases. But I continue to write and post blog articles; I have been writing some articles for my new employer over at Pythian.

I went through a period where I didn't have a lot of time to keep up to date with what has been going on, and some of the sources which I had been using as a quick summary have declined badly. Ahh, you get what people vote for.

Prepare for a brain dump...

It is interesting to see the change in content, and also what it takes to get an article promoted to the first couple of pages. It is very interesting how quickly people form blocs or cliques which provide an almost guaranteed high rank.

In a way purchase decisions are like voting; however, unlike the slight peer pressure of voting (hard to define: is it being cool, being a "friend"?), your personal money is on the line. There has been plenty of work done on purchasing decisions and money management where your own money is on the line as opposed to other people's money.

A long line of economists from Adam Smith to Hayek used the idea that it is impossible to determine how people act, or what part priorities or self-interest play in buying and selling, hence the impossibility of trying to set a price. I agree: even agent-based modeling, whilst better, does not capture the essence of each purchase or sale decision. Stop, however, before we take that idea and turn it into the relativist idea that everything is subjective. What they were saying is that the price contains many, many decisions reflecting the opinions of many people.
If at a point in time you can start to model and predict the purchase decisions of a majority of people, or a sufficient number for a profitable outcome, you have gained an edge.

I wonder if this is still true where the decision has no effect on the hip pocket (free voting), or the effect is hidden, for example taxpayer subsidies or taxpayer-funded services. There is plenty of real evidence that you get the lowest common denominator or excessive free riders (the tragedy of the commons).

Would a subscription model give news aggregator websites a cleaner picture? Would the voting be limited to the subscribers (who are paying) and therefore fit a certain demographic model? What about places like Yahoo, who use the concept of futures or shares for different areas to get a feel for the buzz?

Can we use data from purchasing decisions a la Amazon to track what is hot? Most books, whilst not too expensive, provide at least one insight into what is happening. I am not sure about you, but Amazon is still a personal thing, so most of my purchases from Amazon are my own, not a family member's or a friend's.

What other sources exist?
Do new academic papers provide a clue to what could be hot or what has been hot? The purchase decision isn't so much monetary as more fundamental: normally a large investment of time by the writer(s) and also the reviewers.
Unfortunately those decisions are guided, like any media, by editors.

So what other sources exist?

Have Fun

Saturday, September 02, 2006

Little gem found whilst websurfing

Do you still websurf?

Or is it all google and RSS and aggregators?

Is what you read presented to you as the result of a popularity contest at digg, reddit, or even google (page rank is a popularity contest)?
I am beginning to understand, partially as some of those sources' signal to noise ratio decreases, and partially from watching how articles become popular and how there is a huge demand for Search Engine Optimization (SEO).

I was bouncing from one article to another, seeing something interesting and opening a Firefox tab to read or bookmark for later digestion, and came across an interesting article, or rather transcript, of Bruce Sterling on pervasive computing. A gem!

I have listened to a couple of Bruce Sterling's presentations via podcasts from ITConversations,
and old Bruce has a way with words. I rate his presentations highly and feel they are actually just as effective heard as audio as I guess they would be seen in person.

I am in mind expansion mode at the moment, and it is time to learn more again.

Ever seen people use the word "grok" before? If you don't know the background to the word, check out Robert Heinlein's "Stranger in a Strange Land". I am not grok-ing stuff at the moment; I am purely sucking in new concepts before I can begin to understand.

Have Fun

Monday, August 14, 2006

Great complex systems thinking diagram

Go check out the website called Idiagram: The Art of Insight and Action, specifically this page, which not only shows a great diagram of complex systems thinking but also an impressive way to blend JavaScript with graphics to present a freeform, user-driven diagram.

I found this great website via the Anecdote blog entry

Have Fun

Paul

Thursday, August 03, 2006

Design or Evolve?

Here are an interesting couple of articles from Steve Jurvetson about weighing up the benefits of designing your algorithms versus letting them evolve. Read the comments as well, as they are pretty good.

  1. The Dichotomy of Design and Evolution

  2. Recapitulation in Nested Evolutionary Dynamics
More work is required in using Bayesian reasoning on evolved algorithms, essentially using the power of Bayesian methods to determine the set of information which you don't know, and continuing to work on whittling that set down.
Directed evolution would potentially help as well, dependent on understanding to the point of exhaustion what your assumptions are in setting the direction or guidelines or rules (framework).

Have Fun

Paul

Howto: copy of a production mysql database using mysqlhotcopy backup

Do you need to see what the data in your tables was like yesterday, or last week?

Here are the steps to create a copy of a mysql database on the same machine, with no downtime.

Assumption: You have made backups using mysqlhotcopy and are using myisam tables.
Caveat: Make sure you copy the backup into the new directory, otherwise you will nuke the current database. You have been warned.
  1. Create a new directory in your mysql data directory (/var/lib/mysql): mkdir new_db
  2. Copy the backup you want into that new directory: cp /backup_dir/current_db/* new_db/
  3. Log into mysql and create the new database: CREATE DATABASE IF NOT EXISTS new_db;
  4. If any tables have been created since the backup you will have to create them in new_db, using this command: CREATE TABLE IF NOT EXISTS new_db.table_name like current_db.table_name;
  5. Check/Validate all the tables in new_db are ok using command: CHECK TABLE new_db.table_name;
You can use this method to create a copy to check for changes between now and the last backup; this might be for recovery or audit reasons.
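Pulling the SQL side of steps 3-5 together, once the backup files have been copied into /var/lib/mysql/new_db the session looks roughly like this (database and table names are placeholders):

CREATE DATABASE IF NOT EXISTS new_db;
-- only needed for tables created since the backup was taken:
CREATE TABLE IF NOT EXISTS new_db.table_name LIKE current_db.table_name;
-- validate each copied MyISAM table:
CHECK TABLE new_db.table_name;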

The reason I created this post was that I spent at least 30 minutes searching Google for a solution to this, and whilst there were plenty of articles about backups, nothing mentioned making a copy locally, on the same system.

Have Fun

Paul

Wednesday, August 02, 2006

How inflation happens or What inflation is

So what is inflation, how does inflation happen?

It is not increasing prices; they are the effect, not the cause.

The cause is an increase in the amount of money in the system.

Think of prices as ratios, e.g. a kilo bag of potatoes costs a certain number of dollars.

Prices are signals, signals of demand and supply.

Recently Australian bananas have been really expensive due to a couple of reasons
  1. Cyclone destroyed a large growing region in North Queensland.
  2. Imports are restricted due to quarantine requirements.
Hence the supply was restricted but demand remained the same, so what happens to the price, the ratio? It increases, so it took more dollars to buy the limited bananas.

OK, so what is inflation, and what do banana prices have to do with the increase in prices of a whole swag of goods and services? Prices are still ratios, so something has happened to either the supply of goods and services on one hand, or the supply of money on the other.

There is an increase in the money in the system (monetary inflation) which causes the same or fewer goods to increase in price.

Where is this money coming from? It is happening because there is a demand for money which the Reserve Bank of Australia (RBA) is supplying to maintain its monetary policy (the interest rate).

Australia doesn't live in a closed world; we are affected by other countries' demand for our goods as well. Hence when China, Japan, South Korea, Europe and the US want our commodities, they will bid the price up, sending the signal that supply is restricted. This leads to a long chain of price signaling all the way through the supply chain.

Why is there such a demand for money at the moment? The RBA has had to increase the interest rate again to try and dampen the demand. See this statement.

This is rational behaviour from investors and everyone else: if you believe that prices are going to be higher in the future, then goods and services are cheaper now than in the future...
The money you have now is going to be worth less in the future, so you exchange it for something else.

If the price signal mechanism is constrained due to regulation or monopolies, or the supply is inelastic (e.g. it takes 5-10 years to get a mine going, it takes 10 years to become a good doctor),
then the rational thing to do is to demand as much money as you can now, to buy any goods and services which are available now, or to borrow to build to satisfy that demand. This is arbitrage at its best.

I have said this in the past: the RBA is acting like an overbearing nanny! It believes it can calm down the arbitrage process by making money more expensive. It is looking into the future and predicting that some of the current demand is transient (caused by other arbitrageurs) and that there will be a massive over-investment in supply.

Why is it rational to take as much money as you can get and buy goods or services?
Why is it rational to get rid of any money you get as quickly as possible?

Steps for being rational:
  1. I download this spreadsheet from the RBA
  2. I add some percentage changes over the last year for M1 (Currency and Currency in the bank) raw money.
  3. If I start at Jan 2006, I see that last year M1 increased by 6.95%!!
  4. Interest rates are 5.50%!!
  5. That is a negative real interest rate (roughly 5.50% - 6.95% = -1.45%)! Leave the money in the bank and you will get nothing; worse, you will have something which is potentially worth less at the end of the year.
More recently the month-over-month percentages have been crazy!
If you check May 2006, year over year M1 increased by 8.98%!!

Have Fun

Paul

Friday, July 21, 2006

Gapminder visualization of demographic info

Hat-tip to Catallarchy for this awesome visualizer of demographic info.

The ability to move through time takes a snapshot in time and turns it into a good indication of behaviour.

Try a movie (playing the data) of children per woman (fertility rate) against life expectancy.

or

Child Mortality against Fertility rate.

People are rational.

Have Fun

Paul


Monday, July 17, 2006

RSS and the subscribe model

There has been an idea (a meme) building for a while around the use of RSS (or Atom) as a possible way of sharing and propagating information both internally and externally in a business.

Adam Bosworth has a great podcast on this subject when he was talking about Google and his version of a distributed DNS-like database.

As I was reviewing my blog aggregator (Sage) for Firefox, I found another link to a page dedicated to RSS in the enterprise.

RSS builds on the acceptance of XML as a basis for sharing information between companies; there are various formats out there for your standard B2B solution. With Oracle getting into the vertical integration game, and almost every major software vendor pushing Service Oriented Architecture (SOA), I am sure it won't be long before RSS in some form appears as a solution for retrieving data from diverse data sources.

This is the database replication model: RSS tells you what has changed, so you get everything incrementally, which is faster. You could also get only what you want to receive, pulling just the data you want, or even only what you are allowed to see, which provides security and privacy.

The possibilities are big... what about taking one or more feeds, adding some value, and passing the result on as a new RSS feed? This is what people call mashups or remixing.
For all the database people, this is like a view on one or more tables: your VIEW of the data which is fed to you.

From a monitoring point of view, wouldn't it be nice to be able to pull the data you want and forget the rest.

The last interesting idea is based on a metaphor from this article: if you pass around XML, you could pass S-expressions, and if you pass around S-expressions you can pass not only data but code. I had a thought, after reading that article, that maybe the data in a database is really code after all, the tables being just functions (or relations in the original sense), the data model actually reflecting the function of the business (the business model) in abstraction.

So you could in fact use RSS not only to share data or a view of data but also share code or functions.

Have Fun

Friday, July 07, 2006

SQL Server 2005: Creating V$ACTIVE_SESSION_HISTORY

Are you a DBA like me who works on both Oracle and SQL Server (and MySQL)?

Do you miss Oracle performance views like v$session, v$sql and in Oracle 10G v$active_session_history?

I have created a series of tables and procedures which provide roughly the same functionality in SQL server 2005, basically mirroring the same view that is in Oracle 10G. I will backport the stuff to SQL server 2000 at some stage.

These scripts use the new dynamic management views (DMV) available to SQL Server 2005.
Specifically,
  1. sys.dm_exec_requests
  2. sys.dm_exec_sessions
I have tried to map the columns in v$active_session_history to columns from the DMVs, whilst keeping the overhead of selecting the data under control, which means limiting the number of joins. So unfortunately there is no rowwait data included in this iteration.


The steps to install this on your SQL Server 2005 instance
  1. Copy and paste these scripts to a sql file.
  2. Open up SQL Server Management Studio.
  3. Select the instance and database where you want to create the tables. I suggest MSDB as the database or a dba specific database if you already have one.
  4. Open the file as a query.
  5. Execute the query. This will create the view, table and two stored procedures.
  6. Create a job which executes the stored procedure usp_ins_sql2005_ASH every n secs. It would be prudent to start with 3-5 secs and see how much overhead that causes.
  7. Create a job which executes the stored procedure usp_purge_sql2005_ASH. This stored procedure takes a parameter for the number of hours to keep.

Note: Check the growth of your table to determine how much data you want to keep, as busy systems are likely to have many rows per sample.

E.g. Sampling every 2 secs will create 30 samples per minute, 1800 samples per hour. If you have on average 5 sessions running or waiting per sample that will be 9000 rows per hour.

I will provide some useful reports from the data you have collected in the next article.

The scripts are attached at the end of this article. Due to HTML parsing < as a tag, you will have to check the purge script before running it.


/* Recreating V$ACTIVE_SESSION_HISTORY in SQL Server 2005
Version: 0.1
Created: Paul Moen 2006
*/


drop view uv_active_session_history
go
create view uv_active_session_history
as
select
getdate() as sample_time,
req.session_id,
req.sql_handle as sql_id,
req.plan_handle as sql_plan_hashvalue,
req.database_id,
req.user_id,
req.command as sql_opcode,
req.status as session_state,
req.blocking_session_id as blocking_session,
req.wait_type as event,
req.wait_time,
sess.program_name as program,
sess.client_interface_name as module
from sys.dm_exec_requests req join sys.dm_exec_sessions sess
on req.session_id = sess.session_id
where req.user_id <> 1
go

drop table active_session_history
go
select top 0 * into active_session_history
from uv_active_session_history
go

drop procedure usp_ins_sql2005_ASH
go
create procedure usp_ins_sql2005_ASH
as
insert into active_session_history
select * from uv_active_session_history
go

drop procedure usp_purge_sql2005_ASH
go

create procedure usp_purge_sql2005_ASH
@hour int
as
delete from active_session_history
where sample_time < dateadd(hh,-1*@hour,getdate())
go
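Once the objects are created, a quick manual smoke test (before the jobs are scheduled) might look like this; the TOP 10 and the one-hour filter are arbitrary choices of mine:

exec usp_ins_sql2005_ASH
go
exec usp_purge_sql2005_ASH @hour = 24
go
-- first look at the collected samples: top wait events over the last hour
select top 10 event, count(*) as samples
from active_session_history
where sample_time > dateadd(hh,-1,getdate())
group by event
order by samples desc
go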

Thursday, June 29, 2006

SQLserver 2005 query efficiency sql scripts

Until recently, if you wanted to determine which queries or stored procedures were the least efficient, you had to capture all SQL either via SQL Profiler, as described in these earlier posts on automating SQL Profiler and tuning using SQL Profiler, or have a scheduled job which dumped the running SQL from sysprocesses.

Now with the new dynamic performance views available in SQLserver 2005 you can run queries which allow you to determine these poor performers without much extra work.

As I noted in my last entry, the SQLserver 2005 team have a series of blogs, and this inspired me to port my existing Oracle scripts which determine query efficiency to SQLserver 2005.
One of my favourite Oracle scripts uses the number of logical reads per execution as a good indicator of poorly performing SQL. Logical reads per execution is also a reasonable estimate of CPU per execution.

Normally on a poorly performing system I tend to follow these steps, in this case rewritten to use the new SQLserver 2005 dynamic views:
  1. Quickly check Task Manager or perfmon to verify that the CPU or IO hog is in fact SQLserver and not IIS or the SQL full-text indexing service (or something else).
  2. Check for contention: is there one process blocking all the others? (A quick check is sketched below.)
  3. Run the script to find sql with the highest elapsed time per execution.
  4. Run the script to find sql with the highest physical reads (PIO) per execution.
  5. Run the script to find sql with the highest logical reads (LIO) per execution.
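For step 2, a quick blocking check against the same DMVs might look like this (just a sketch; it only lists requests that report a blocker):

select r.session_id, r.blocking_session_id, r.wait_type, r.wait_time, r.status
from sys.dm_exec_requests r
where r.blocking_session_id <> 0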
So here are the efficiency scripts:

-- SQL Efficiency by Elapsed Time. Paul Moen 2006

select qs.creation_time
, qs.execution_count "Exec"
, qs.total_elapsed_time "Elapsed"
, total_physical_reads "PIO"
, total_logical_reads "LIO"
, round(qs.total_elapsed_time/qs.execution_count,1) "Time/Exec"
, round(qs.total_physical_reads/qs.execution_count,1) "PIO/Exec"
, round(qs.total_logical_reads/qs.execution_count,1) "LIO/Exec"
, st.text
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as st
where qs.execution_count > 0
order by "Time/Exec" desc

-- SQL Efficiency by Physical Reads per execution. Paul Moen 2006

select qs.creation_time
, qs.execution_count "Exec"
, qs.total_elapsed_time "Elapsed"
, total_physical_reads "PIO"
, total_logical_reads "LIO"
, round(qs.total_elapsed_time/qs.execution_count,1) "Time/Exec"
, round(qs.total_physical_reads/qs.execution_count,1) "PIO/Exec"
, round(qs.total_logical_reads/qs.execution_count,1) "LIO/Exec"
, st.text
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as st
where qs.execution_count > 0
order by "PIO/Exec" desc


-- SQL Efficiency by Logical Reads per execution,
-- which is a good estimate for CPU per execution. Paul Moen 2006

select qs.creation_time
, qs.execution_count "Exec"
, qs.total_elapsed_time "Elapsed"
, total_physical_reads "PIO"
, total_logical_reads "LIO"
, round(qs.total_elapsed_time/qs.execution_count,1) "Time/Exec"
, round(qs.total_physical_reads/qs.execution_count,1) "PIO/Exec"
, round(qs.total_logical_reads/qs.execution_count,1) "LIO/Exec"
, st.text
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as st
where qs.execution_count > 0
order by "LIO/Exec" desc

Sunday, June 25, 2006

SQLserver 2005 team blogs

I have been scanning some blogs by different teams involved with SQLserver 2005.
Here is a short list
  1. SQLserver 2005 Database engine tips
  2. SQLserver 2005 Query Optimizer
  3. SQLserver Query optimization - talking about SQL/TSQL optimization
  4. SQLserver storage Engine
There is a bit of cross-linking happening, so some articles are repeated on one or more blogs; nothing like helping the SEO along.

This article from the database engine blog about finding the top N worst SQL brought a (bemused) smile to my face. Finally SQLserver has views like Oracle; no more do you need to run SQL Profiler 24x7, or some script which captures the running SQL every n secs, to have an historical record of what has been running.

Guess this means I can start porting my Oracle scripts from using v$sql, v$sqlarea and in Oracle 10G R2 v$active_session_history to using sys.dm_exec_query_stats, sys.dm_exec_sql_text and sys.dm_exec_query_plan.

The trouble with plenty of relational databases has been the lack of exposure of the metadata/catalog of the database and the data within that catalog. Until recently plenty of the internal stuff in SQLserver had to be queried using various DBCC calls. Similarly, MySQL versions prior to 5 only expose this kind of thing through SHOW TABLE STATUS, SHOW FULL PROCESSLIST etc.

There is no nice way to see what the spread of data within a column is. It is good that these vendors are exposing this to the outside world via SQL rather than a tool which requires the output to be further parsed.

Have Fun

Friday, June 16, 2006

Interesting web articles

I am getting back into the groove, catching up on all those interesting blog entries and web articles which I missed in Canada, not because I didn't have internet access, just that my whole RSS feed reader wasn't there, and neither were my bookmarks/favourites. My setup is the result of many hours of filtering which I don't want to spend time recreating from memory.

I am working from home in the interim, which has been great; depending on what I am doing, I enjoy working or studying to some music. As I have mentioned in the past, Pandora provides a great resource for music. I tend to listen to a genre called vocal trance, which is basically dance music; other times it might be classical, jazz or even metal in various forms. I like my music to be packed with plenty of action and depth.

Here are couple of articles worth reading:
  1. Mosh Pit Learning
  2. Python Sudoku Solver
I find the area of constraint programming interesting; there is a Prolog Sudoku solver as well, which uses Prolog constraints to solve the puzzles in a similar way. The idea that the program moves forward and reduces the problem set might be useful in tuning situations, where the need to eliminate non-issues (sometimes in a hurry) is important.

Lots of developers have issues with using constraints in databases, mainly relating to error handling. That is kinda weird, as the database model is really an expression of a business or application model, i.e. a sale must have a product and a customer.
Foreign key, unique and check constraints allow the database to police those business rules (or constraints) so that the data remains accurate, without depending on any application.
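For example, the sale rule above might be declared like this (made-up tables; note that on MySQL the tables need to use InnoDB for the foreign keys to actually be enforced):

create table customer (customerid int not null primary key);
create table product (productid int not null primary key);
create table sale (
saleid int not null primary key,
customerid int not null,
productid int not null,
constraint fk_sale_customer foreign key (customerid) references customer (customerid),
constraint fk_sale_product foreign key (productid) references product (productid)
);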

In some sense the data model provides insight into the business. What problem (for the customer) is the business trying to solve...

Have Fun

Sunday, June 11, 2006

Back to Sydney

My three weeks in Canada is almost over.

I am killing a couple of hours in Vancouver before heading back to Sydney.

If you get a chance, go and have a read of the Pythian blog. Like I said before, these blokes and gals are smart. Christo is a damn fine StarCraft player as well. I thought I was reasonable, but hey, 6 minutes and it's over is really, really good.

I am going to be spending time getting the Australian office happening, and then Pythian goes 24x5!!

My time is limited again.

Craig Sham. Orapref has done some interesting work on performance forecasting. This I believe is the next tuning wave, after ratios, waits and response time.
Hotsos is doing some fine stuff with queuing theory as well.

Enough for now. I will be back on Aussie time in about 20 hours

Have Fun

Sunday, June 04, 2006

Kaplan-Meier analysis

The blokes at Pythian are damn smart.
Paul Vallee mentioned the use of a statistical analysis of survival. Specifically Kaplan-Meier survival analysis.
After Paul explained how he was using it, I thought of other areas where you could use it. It comes down to how or what you class as being survival and death.

  1. Database performance. At what point does performance hang (or die), and what is the mean time of survival? Survival could be defined as a reasonable response time for a known query.
  2. Stock survival. The ASX, whilst having large blue chip stocks, contains a lot of speculative mining and biotech startups. It would be interesting to run some survival analysis over them.
This tool doesn't tell you what is required to survive, and that would be heavily influenced by the choice of survival indicators.

Anyway I have to go. Going to play computer games all afternoon. Should be good.

Have Fun

Sunday, May 28, 2006

First week in Canada or cleaning internet cafe computers

Finally I have a chance to use a computer without the threat of spyware, adware and rootkits. I thought I could use an internet cafe a couple of blocks from where I am staying, but the last visit required 40 minutes of downloading Spybot, Ad-Aware, firewalls and the Sysinternals rootkit revealer just to make the machine OK to use. It was badly infected with spyware and adware.

Anyway I chose to come into the Pythian office and use the computer I have been given for the duration of my stay as I know it is secure.
So it is Saturday morning Canadian (EDT) time just before lunchtime.

The first week has been busy, the learning curve steep, but I have jumped into handling the daily processes and alerts as a way to get up to speed as quickly as possible. Doing actual work as opposed to reading documentation and having meetings (whilst important) is a much faster way to "get a grip" on the new work environment, processes, clients etc.

The system that Pythian has for handling the large number of clients and the larger number of databases (client -> databases is a one to many relationship :) ) is very, very good.
Like any successful company, systems and processes are one of the keys to allowing the company to expand; they also provide a method of storing or maintaining what is called organizational learning or knowledge. This way, if someone leaves the company, the knowledge doesn't leave as well.

I won't post much in the way of new direction and database related posts until I get back to Australia.

More later, have fun.

Saturday, May 20, 2006

Blogging from Canada

As part of my new role I am going to meet the Pythian team based in Ottawa, Canada. It will be great to meet the guys and girls face to face.

So at the moment I am sitting in the departure area of Vancouver Airport waiting for the connection to Ottawa. I have time to burn so I thought it might be fun to write an article from an airport internet terminal. Humour me...

I have a couple of ideas for new articles and maybe an extra direction for the site relating to ASX listed stocks.

Some ideas I want to pursue in the near future are

  1. Using analytical functions in Oracle, in combination with Oracle statistics gathering, to provide an in-depth and potentially new way to view performance metrics.
  2. Use the Oracle Data Miner and prediction functions to actually produce predictive performance analysis based on the Oracle stats.
  3. Review the state of SQLserver dates as stored by the datetime datatype. We have seen some weird dates which do not match what the online books (doco) say are the limits.
  4. Further review and detailed outlines of installing Oracle on Ubuntu.
  5. Expansion on the dead money comparisons (rent vs mortgage interest) to truly show the tax effective nature of buying and leasing for the first 7-10 years of the loan.
  6. Build on my expanding knowledge of Analytics to produce some unique reports on ASX listed stocks.

I am also going to write some blog articles for Pythian on their blog once I get going. The first one relates to extracting a complete copy of the CTXSYS schema to facilitate moving a schema with text or domain indexes.

Plenty of work ahead, and as I mentioned, the more popular a subject is, the more I will use that to guide the articles I write in the future.

My time is limited to the coins I have so I will sign off.

Have Fun

Paul

Monday, May 15, 2006

Google Ads and feed links

All those returning visitors will have noticed that there are now Google ads and some RSS feed links.
This is an exercise in reviewing and working with the various web technologies rather than a chance to make any money; given the site cracked 300 visitors this month (as measured by statcounter.com), you see what I mean.
I have been playing around with including Google ads for a while. I might also upgrade the Amazon stuff and actually start writing some reviews of the books I have read in my lifetime, although most of those books are not related to the theme of this blog.

The most popular pages of this blog are:

Given the signals I am receiving from the market (of web surfers and search engine results), I should start writing some more about those areas. Signal to Noise ratio is a bit low at the moment...

Have Fun

Paul

Sunday, May 14, 2006

Market Profile perl script

I posted ages ago on an ASX stockmarket forum that I had developed a perl script to display a market profile.
What is a market profile?
  1. Here are links to the CBOT education site with another link to Market Profile 101
  2. In a nutshell, a market profile seeks to show the distribution of prices intraday from tick or trade-by-trade data.
  3. It is simple to draw, which allowed floor traders to draw it by hand (before portable devices).
  4. It can reveal a different trading pattern to what the standard OHLC price bar suggests.
I have access to historical tick data from the ASX and so was using perl to test various theories I had thought up about the market. Developing the market profile script was more of a challenge.

Anyway, I still get requests for the perl script via email even though that post was back in 2001.

As part of the move away from perl to SQL for other stuff, I am going to load my tick data into my Oracle XE database, and I should be able to rewrite the perl script as SQL and then as a function.

If you want a copy of the perl script, email me. roobaron-/*?at*?/-yahoo.com. Take out the regex breaking bit.
or
Cut and paste the script from here

Have Fun

Paul

### CUT HERE ###

# Market Profile
# Paul Moen 05/12/2001
# Explanation:
# Takes csv file from Weblink and converts into market profile
# Traditionally market profile was split on 30 minute increments
# Each 1/2 hour is given a letter starting at A
# Range is determined by Price based on ASX tick movements
# i.e <> 1.00 increments 0.01
# So. each TP (time point) will be 5 ticks. eg price > $1.00 TP = 0.05
# Program logic based on manual procedure. i.e as data arrives time is checked
# and price is checked to see if resides with existing TP or needs a new TP
# Read the attached readme_market_profile.txt for more info on the program logic.


# Read csv file piped via sort (sorting on price)

# $title = ;

# Format of file is expected to be like this
# No, Date, Time, price, volume, sale, C Code
# Therefore @line[3] (4th element) is the price.

# Set variables

$tpr_flag = 0;
$last_price = 0;
$period_change_flag = 0;
$tp_list = '';
@mp_list = '';

while ($line = <STDIN>) {
chop($line);
@line = split(/,/,$line); # split line into array elements.

# grab the date from 2nd element
$date = @line[1];
# Grab the time and determine what letter it is
# Note: Any time before 10am is ignored and time after 4.30pm is ignored
# Letter increment by 1/2 hour

# Now grab the price
$price = @line[3];
# We are going to calculate the tpr only once so check to see if it set already
if ($tpr_flag != 1){
# tpr is the time price range
if ($price > 1.00){
# tick is 0.01, tpr is 0.05
$tpr = 0.05;
$tpr_flag = 1;
} elsif ($price <= 1.00 && $price > 0.10){
# tick is 0.005, tpr is 0.025
$tpr = 0.025;
$tpr_flag = 1;
} else {
# price below 0.10: tick is 0.001, tpr is 0.005
# NOTE: this branch, the period lookup and the first-trade handling below
# were mangled by the blog's HTML parsing; they are reconstructed from the
# surrounding comments and fragments, so treat the details as approximate.
$tpr = 0.005;
$tpr_flag = 1;
}
}

# Grab the time and determine what letter (period) it is.
# Assumes the time column is HH:MM:SS; each half hour from 10:00 gets a
# letter starting at A, anything outside 10:00-16:30 is 'N/A'.
$time = @line[2];
($hh,$mm) = split(/:/,$time);
if ($hh < 10 || $hh > 16 || ($hh == 16 && $mm > 30)) {
$period = 'N/A';
} else {
$period = chr(ord('A') + ($hh - 10)*2 + int($mm/30));
}

# Flag a change of period
if ($period ne $current_period) {
$period_change_flag = 1;
} else {
$period_change_flag = 0;
}

if ($last_price == 0) {
# First trade: seed the profile with this price and period
$tp_list = $price;
$tp_list .= ' ';
$tp_list .= $period;
$mp_length = push(@mp_list,$tp_list);
$last_price = $price;
# Calculate the timepoint range tpr
$price_ge = $last_price + $tpr;
$price_le = $last_price - $tpr;
} elsif ($period eq 'A' || $period_change_flag == 0) {
# Reconstructed condition: the original appeared to fully handle only the
# first period / unchanged period case here.
if (($price < $price_ge) && ($price > $price_le)) {
# Price is within tpr and no change to period, so no change to list
} elsif ( $price > $price_ge ) {
# price is greater than the price + tpr
$tp_list = $price_ge; # Price is in new tpr
$tp_list .= ' ';
$tp_list .= $period;
# push tp_list onto the top of mp_list (main list)
# @mp_list[length -1] becomes the last point.
$mp_length = push(@mp_list,$tp_list);
$last_price = $price;
# Calculate the timepoint range tpr
$price_ge = $price_ge + $tpr;
$price_le = $price_le - $tpr;
} elsif ($price < $price_le) {
# price is less than price - tpr
$tp_list = $price_le; # Price is in new tpr
$tp_list .= ' ';
$tp_list .= $period;
# unshift tp_list onto the bottom of the mp_list (main list)
# @mp_list[0] becomes the start.
$mp_length = unshift(@mp_list,$tp_list);
$last_price = $price;
# Calculate the timepoint range tpr
$price_ge = $price_ge + $tpr;
$price_le = $price_le - $tpr;
} else {
print "something is wrong\n";
}
} else {
# Period is not equal to A need to work with list of scalars
if (($price < $price_ge) && ($price > $price_le)) {
# Price is in current tpr but the period is different, there should be
# a tp already in existence which requires an additional tp to be added.

} # period A check
} # everything else
$current_period = $period;
} # end of while
# end of market_profile.pl

### CUT HERE ###

Wednesday, May 03, 2006

Commentary of RBA Interest rate rise

The Reserve Bank of Australia (RBA) decided to raise its money market rate by 25 basis points (0.25%) to 5.75%. It also released a media statement in conjunction with the announcement.

A quick summary (although you should go and read it)
  1. International growth is driving strong growth in business profits, which is enabling business to invest and borrow on the strength of those increased cashflows.
  2. Domestic consumers are able to borrow more as wage growth increases borrowing capacity.
  3. Wage growth is driven by business expansion.
  4. The RBA thinks the current borrowing is too much and/or too quick and wants to dampen the demand for credit by increasing the rate.
As much as the RBA would like to reduce the demand for credit, given it is backed by strong increases in cashflows for both businesses and consumers (on the back of business demand for workers), how far will it go?
If the business growth is coming from a stronger export sector, that sector will be getting its demand signals and its increasing cash flows from overseas consumers.
If those consumers are intermediate producers or producers of consumer goods for export, those producers will be taking their demand signals, stronger cashflows and ability to borrow more from their own overseas customers.

The RBA can't win: it can't set the Chinese interest rate, the US interest rate, the Japanese rate or the EU rate, or stop individuals or companies fulfilling demand either domestically or internationally.
Instead it acts like an overbearing nanny, scolding businesses for borrowing to invest (in capital) so they can expand production, so they can produce and spend more.

The headlines have all been about the poor old Australian consumer with the massive mortgage and how the rate hike will make things harder, and it will for highly leveraged consumers.
The consumer is collateral damage; this rate rise is aimed directly at reducing business borrowing for investment. When businesses cough, consumers catch a cold.

What the RBA has done is make it harder for Australian businesses to expand their production in the face of increased demands. This will limit the ability of those businesses to supply both domestic and international customers, therefore limiting their ability to spend but also limiting their ability to increase profits and most likely productivity.
The RBA is worried about price inflation; however, by restricting the ability to increase supply when demand is increasing, its decision will cause prices to rise. That is what PRICE increases are signaling to businesses: INCREASE supply, INVEST in production.

The RBA, like any other public institution, needs to be seen to act, otherwise its reason for existing risks being questioned. In increasing or decreasing the rate, the RBA changes the whole production structure within Australia, a structure it most likely can't fully comprehend.
It would be better off leaving the rate stable, never again changing it. This would allow businesses to quit worrying about the effect of changes in interest rates on their profits and ability to expand, and allow those businesses to make decisions based on price signals. Businesses try to achieve some stability by hedging interest rates, however most hedges are relatively short term.

If the RBA said that the rate was not going to change in 30 years, that stability would allow businesses to make much longer range decisions, enabling longer chains of production to emerge. Instead it plays around with the rate, decreasing it to increase the attractiveness of borrowing, then having remorse two or three years later and deciding to increase it again.

The joys of having a single Command and Control institution controlling the most important price signal of all i.e. the price of money.

Have Fun

Monday, May 01, 2006

Downgrading from Enterprise Edition to Standard Edition

The business decided that for one of the databases I support there was no requirement to run Enterprise Edition. So today and early tomorrow morning we (the junior DBA and I) are going through the process of downgrading to Standard Edition.
Here are the steps Oracle recommends:
  1. Export the whole database to file using exp or expdp, depending on the version.
  2. Deinstall the Enterprise Edition using Oracle installer.
  3. Reinstall the Standard Edition software using Oracle software.
  4. Create a new database and import the data from export file.
What we did instead, to minimize downtime to just the export and import process, was:

  1. Install Standard Edition in a different Oracle home and different Oracle home path eg. $ORACLE_HOME becomes /oracle/product/920
  2. Create a new database using dbca or from scripts. Applying all necessary patches.
  3. Export the whole database from old database.
  4. Shutdown old database.
  5. Import into new database.
  6. Switch listener to point at new database.
This reduced the actual downtime from at least 2 hours to 20 minutes, which is just the export/import process (the db is small and the box fast).

A couple of gotchas when using dbca to create the database.
  1. LOG_ARCHIVE_DEST_n doesn't work for Standard Edition. This parameter was set by the dbca in its standard init.ora file. This parameter setting returns the error ORA-439 feature not enabled: Managed Standby. Use LOG_ARCHIVE_DEST instead for Standard Edition.
  2. When using ssh -X hostname you can't use su or sudo to change to oracle. Tunneling X through ssh (the -X option) requires ssh -X oracle@hostname to get the right DISPLAY set.
Have Fun

Thursday, April 27, 2006

Trade deficit and Foreign Debt - 6

It has been a while since I wrote something about Australia's Trade Deficit and Foreign Debt. In fact the last article was way back in June 2005!!

The Aus. Bureau of Stats. (ABS) recently released the figures for International Trade in Goods and Services, February 2006, and the original figures are so close to balanced there was a minor $AUD 105 million in it.
The increase in goods credits is still driven by the demand for ore and minerals. Australia's commodity (BBQ) economy continues to provide the fuel for growing economies. Almost all the goods were up by at least 12-15% over the previous month's numbers, suggesting that the numbers were adversely affected by the shorter February month.
If you review the pdf you can see the tables for the original series: the deficit gets close to balance and then extends again. Civilian aircraft make an important impact on the debit side of the ledger, however fuel debits are up 43% over 8 months compared to last February.
Not signing the K. Protocol means that Australia has not been penalised for having a comparative advantage in coal, given coal and coking coal are now by far our largest export.

The other stuff hasn't changed much: the US is still our biggest trading partner and the partner with which we run the highest trade deficit.
For all the worry and concern about China destroying our manufacturing, the picture is benign, with the overall trend in our favour: our exports to China rose $AUD 3.6 billion, 48% over 8 months (Feb 05 to Feb 06), whereas our imports only rose $AUD 1.9 billion, 15%, over the same period.

A quick check of the finances saw non-financial corporations borrowing heavily; this is most likely to pay for expansion to fuel the demand. An early signal that corporations feel they have expanded enough would be a continued fall in demand for imported capital goods, although the original series still shows continued demand for what would seem to be mining related equipment. It might also be seen in a reduction in corporate credit, although this could last longer as corporations use strong cash flows to borrow for mergers and acquisitions (M&A) to continue to expand.

Takeover fever really hasn't touched the booming mining industry, as high prices allow companies to revisit unexplored leases both in Australia and elsewhere, rather than choosing to buy working mines to bolster production whilst keeping costs at bay. M&A is happening in areas which are feeling the pinch on margins.

Mining costs will increase as lower yield ores are sourced to fill existing orders, i.e. the mine life can be extended, as most calculations are based on a specific price; increases in price make mining lower grade ores possible, and if the company is smart it will mine those lower grade ores now at a higher cost whilst saving the lower cost, higher yielding ores for when prices drop. The aim after all is extending the life and profit of the mine for as long as possible. This method is more long term, ending up with more profit at the end, rather than having massive profits now at the expense of few or none later. I am sure there are companies doing both...

Have Fun

Previous articles:
Trade Deficit and Foreign Debt - 1
Trade Deficit and Foreign Debt - 2
Trade Deficit and Foreign Debt - 3
Trade Deficit and Foreign Debt - 4
Trade Deficit and Foreign Debt - 5
Related Articles:
Australian Trade Partners

Wednesday, April 26, 2006

Battlefield 2 fun #3

I made the 5000 point Staff Sergeant rank last week and took the upgrade to the assault weapon. I have been playing around with a tool called FRAPS which displays the framerate in the game and can also take movies. Every time I do something which would be interesting to see again, it is too late, unfortunately.

I have been disappointed in every upgraded weapon apart from the medic upgrade; its scope and semi-auto accuracy are very good. The rest of the weapons (apart from the sniper) seem less accurate: I still seem to shoot everything but the enemy in close combat, even with the sights trained squarely on them.

I would love EA to have kept historical data so that I could review my performance over time. I am definitely a better player now, but as the stats are aggregate, my earlier low scores still have too much influence on the points per minute and accuracy counters etc.

It is still great fun, even better to kill people who are clearly using some kind of dodgy aim device. I see it in every round, with 5-10% of people immediately doing the dive/roll and shot on seeing you. More often than not that shot is a headshot. Are they just that good? From that distance? With that weapon? When I am moving at the same time?
As my muscle memory improves, I now instinctively jump or dodge for cover (mostly in vain); sometimes I survive the first shot and kill them (much to their amazement, I guess). I had one guy do the move 3 times and I still got him, as I kept a building between me and his headshots.

If I feel I am getting better, and the game is now almost 12 months old, does that mean the other players are also going to be better rather than rookies?
Or have the best left for the next game, and only the latecomers, and therefore rookies, are left?

Have Fun

New Job

I have resigned from my current job (still working due to 4 weeks' notice and no replacement) and taken a role with Pythian.
The new role starts on the 22nd of May 2006, and I will be traveling over to Canada to catch up with Paul Vallee and the team. After that I will be back in Australia to provide coverage for the North American evening and early morning, at least in the beginning.
The role should be great: plenty of variety, with the bonus of working for a DBA company (finally).

We live in a 3 bed house with a granny flat (a one bedroom/bathroom detached studio) out the back. That is going to become the office for a while. It is good, as now with two small children we have reached the conclusion that we need that space for the study/office, so each kid can have their own bedroom. Given we rent, as Sydney house prices are bad unless you want to live a 1 1/2 to 2 hour commute from the CBD, we plow the difference between rent and a massive mortgage (if we were to buy the equivalent) into our own property and the added expenses of children :)

Have Fun

UNION and Venn diagrams

I had a developer come around today with a problem piece of SQL, where he needed to get a distinct list of users (userids) for which the application hadn't set preferences correctly.
The joys of no constraints in the db: here was a classic example where a constraint/rule in the beginning, i.e. a user must have a preference (or at least a default), would have saved this whole data cleansing process now.
This was MySQL 4.0, so there were no subqueries (not that they were needed anyway).

We spent, well I spent, about 10 minutes understanding not just the SQL he provided but the overall aim of the SQL.
In the end I got out the trusty pad, drew some Venn (set) diagrams of the JOINs and shaded the areas he wanted back. This was enough for us both to see the existing SQL was going to return the wrong result!

So where does UNION come into it?

UNION exists in almost every relational database and is used to combine two rowsets of data. UNION will remove duplicates, UNION ALL will leave duplicates.
So instead of using a temporary table to store a rowset and issuing another SQL statement, with UNION you can get the desired rowset in one go. This type of operation follows a performance rule, i.e. visit the same data once or as few times as possible.

Example:

select userid from t1 join t2 on t1.id = t2.id
union
select userid from t3 join t4 on t3.id = t4.id

If the same table appears on both sides of the UNION there is potential to replace it with the appropriate join or subquery; sometimes, though (unless performance is a problem), it is easier to leave it as a UNION for readability.

The end result was SQL which provided the result required, without the complexity of temporary tables. It hasn't fixed the poor database design, though, as we still need to use DISTINCT and worry about dupes.

Have Fun

Paul

Tuesday, April 25, 2006

Great article as always

The number of blogs which I check via RSS grows over time. I use a Firefox extension called Sage to perform the aggregation and checking.

The blog: Creating Passionate Users is always a great read. This article covers the data to wisdom hierarchy with plenty of great links.

I have thought that the killer app (or at least one) is transforming data into wisdom. Given my functional programming slant, plus exposure to datawarehousing and ETL (Extract Transform Load) ideas, I see this process as passing the data, and then the result of each step, through to a compiled form which can be used to give experience, understanding and wisdom.

I kind of see this compiled form being the input into a 3D computer environment, with the understanding and experience gained via something like a hero's journey or quest.
This allows the reuse of existing components, in the form of 3D gaming engines and Massively Multiplayer Online communities such as EverQuest, World of Warcraft etc.

Whilst this is high tech, plenty of illiterate tribes have used stories and myths to pass understanding onto the next generation. So gaining understanding and wisdom through stories should be natural and easy to understand.

I am keen to see other people's comments; if you have links to other sites which are doing this, that would be great.

Have Fun

Wednesday, April 12, 2006

Expanding Oracle knowledge

I have been using Oracle XE on Ubuntu to build the basic stock indicators. I have been using SQL to get an idea; later I am going to rewrite them as functions and bundle them into a package.

The main reasons for using Oracle over writing the equivalent in perl or something else are:

  1. The data now resides in a database; selecting and manipulating data is a breeze using a language I am intimate with (SQL).
  2. There is no need to reinvent the wheel or roll your own; after some playing with examples, using the analytical functions in Oracle is straightforward (see the sketch below).
  3. It exposes me to new areas of Oracle and improves my skill set. I can now take my expanded knowledge and apply it to other areas of my DBA work. I have a couple of ideas brewing along those lines.
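As an example of the analytic functions, a simple 20-day moving average can be written like this; daily_prices and its columns are a hypothetical table, not my actual schema:

select code, trade_date, close_price,
avg(close_price) over (partition by code order by trade_date
rows between 19 preceding and current row) as sma_20
from daily_prices
order by code, trade_date;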
I have come across a couple of great blogs,
  1. Oracle Data Mining and Analytics
  2. Mark Rittman
I have always enjoyed learning new stuff, both in database land and elsewhere. Pushing the envelope by expanding your knowledge, even within your specialty, means that area remains fresh.
I have known plenty of people who have become DBAs only to move on as it was too boring, or become so stale that they leave the IT area completely.

Have Fun

Paul

Saturday, April 08, 2006

Performance Tuning: Proactive tuning

Mark Rittman has a good review of a talk by Mogens on the different tuning methods and some comments about technology.

Tim Gorman had a talk about using the analytical functions LEAD/LAG to turn statspack or AWR snapshots into a datawarehouse, allowing you to track the performance of SQL across time. This lets you identify the troublemakers that Mogens is talking about.
This link contains the whitepaper and slides from his talk to RMOUG.
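
As a rough sketch of the idea (not Tim Gorman's code), LAG can be used to turn the cumulative counters in AWR's dba_hist_sqlstat view into per-snapshot deltas; the column names below are from 10g and are worth double-checking on your version:

select sql_id,
       snap_id,
       elapsed_time_total
         - lag(elapsed_time_total) over (partition by sql_id order by snap_id) as elapsed_delta,
       executions_total
         - lag(executions_total) over (partition by sql_id order by snap_id) as execs_delta
from dba_hist_sqlstat
order by sql_id, snap_id;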

Identifying and removing troublesome SQL before it becomes a problem is proactive tuning. Rather than let SQL build up, become a performance issue and then REACT, it is better to find and handle it early.
The same applies to capacity planning. This task allows you to track the growth of tables and extrapolate where they will be. Combining the growth data with performance data allows you to see what impact table growth is having on performance.

The scripts I have developed on SQLserver, which use SQL Profiler to capture SQL performance, allow me to perform a similar process (although with far fewer metrics) and track the performance over time.

On a sidenote: daylight savings went off without a problem after we implemented the blackout schedule for log shipping. Everything I read indicated that there wasn't a problem going backwards for the SQL Agent (next_date is the problem), however it was better to be safe than sorry.

Have Fun

Related: Automatic SQL Profiler

Wednesday, April 05, 2006

SQL Server vs Oracle: Ease of Administration

Oracle has released a study which shows that Oracle is easier to administer as a DBA than SQL Server.
Here is a blog article about the results. I would have to agree with that blog article that any commissioned study has to be read very carefully with an eye to bias.

My take on both databases is this, and I am in a position to know as I am a DBA for both Oracle and SQLserver.
As long as nothing goes wrong SQLserver is a breeze to admin via Enterprise Manager.

However if anything goes wrong or requires some analysis, Oracle is better and easier to diagnose and tune, because the SQL Server GUI hides the complexity. This is simple stuff like determining what SQL ran over a period (out-of-the-box). Auditing is also easier.

The real killer continues to be the SQLserver locking method, it sucks! I hope SQL 2005 snapshot isolation will help, but the whole readers-blocking-writers behaviour is awful.

I have been gathering some interesting info regarding the datetime datatype in SQL 2000. That is the next blog post.

Have Fun

Saturday, March 25, 2006

Oracle XE on Ubuntu

Finally I had a chance to fix up my install of Oracle XE (Oracle Express Edition) on my ubuntu linux box. This humble machine Pentium 3 660, 512M RAM is my play area for unix.
At some point last year I was completely over the stuffing around with Redhat (or Fedora) and tried Debian. That was a nightmare as well, with the install hanging and no idea as to why.
So I asked my friendly unix team what they would recommend and use, and they mentioned Ubuntu, which is a Debian-based distribution but nicer for dumb installers like me (yeah, like I have never installed unix before).
So I grabbed a copy of the CD and installed it with no problems. Devices were all found and it has a very clean feel.
Rant: I am over any OS (including Windows XP) showing too much of the complexity to the user. Whilst I can jump through the hoops, after a couple of rounds I am over the whole process and want something which just works properly. Ubuntu does just that.

Ok, now to install Oracle XE. After playing around with MySQL, I decided to grab Oracle XE and install it. This was NOT straightforward.
I found two good blog articles about installing Oracle XE on Ubuntu; however, I had to combine both.
  1. Check how much space is on / using df -k. I didn't have enough space, so I needed to make a symbolic link to somewhere else. This actually makes sense anyway, as /usr/lib/oracle is not somewhere you want to install the Oracle software. On our enterprise boxes the software is on a separate partition/filesystem, either in /home/oracle/product/... or /oracle/product/... So the command is: sudo ln -s /home/oracle /usr/lib/oracle
  2. This article covers the install mostly.
  3. Whilst apt-get might work, I had to go and download and install libaio1 and libc6 manually. I got them from Ubuntu Packages. Use dpkg -i to install both packages.
  4. This article also covers how to add extra swap if the installer complains. I had to add additional swap space.
  5. Run the converted rpm file as per article in step 2.
So now I have a version of Oracle XE running. I can now see how easy it is to load my stock files into Oracle compared to MySQL's LOAD DATA INFILE. Then I can start using a database with a more complete set of statistical functions :)
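
As a minimal sketch of the external table approach (the directory path, file name and columns are assumptions for my EOD files, not a recipe), the Oracle side might look like this:

create or replace directory eod_dir as '/home/oracle/eod';

create table eod_ext (
  symbol      varchar2(10),
  trade_date  varchar2(8),
  close_price number
)
organization external (
  type oracle_loader
  default directory eod_dir
  access parameters (
    records delimited by newline
    fields terminated by ','
  )
  location ('eod.csv')
);

-- the external table can then be queried directly or used in an insert ... select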

Have Fun

Paul

Saturday, March 18, 2006

SQL Server procedure cache weirdness

We have had an ongoing problem where a sqlserver instance logging web activity and running reports takes 100% of CPU. This is a dual Xeon CPU box, with each Xeon hyperthreaded, which is almost equivalent to 4 CPUs.

Yesterday I had the luxury of looking at everything about this problem as it was occurring without the need to return the system to a more stable state rapidly.
I looked at everything, and there was nothing unusual in the perfmon counters and no bad waits/blockers in the instance. Eventually I removed all the sessions which were consuming CPU without any response and tried to rerun the stored procedure they were calling.
That proceeded to hang as well. The normal duration of this call is 4-6 secs. For a report that is not too bad given the size of the tables.
Given that the procedure should have worked, that suggested there was some problem with the stored procedure. I extracted the SQL which it runs and executed it. The SQL worked and proceeded to return data (without hanging).
This suggested that there was something wrong with the actual calling or executing of the stored procedure.
Background: SQLserver parses the stored procedure and stores it, together with the determined execution plan, in memory in an area called the procedure cache. Given this stored procedure had been parsed in the past, this potentially meant there was a problem with the cached stored procedure and most likely its execution plan.

So I executed the stored procedure with the recompile option.
eg. EXEC usp_report_this WITH RECOMPILE

The result was the stored procedure now worked and returned a result within the expected duration. When I removed the recompile option, the call proceeded to hang again. This confirmed that the stored procedure in the procedure cache was corrupt in some form.
So to get rid of the problem, I used DBCC FREEPROCCACHE to flush the procedure cache of all cached stored procedures and execution plans.
The retest of the call with the recompile option now worked as well.

Given I am still unable to determine why the procedure cache was corrupted or how that occurs, we recreated the stored procedure and added the WITH RECOMPILE option, so that all future calls of that stored procedure will be recompiled.
eg.
create procedure usp_fred
    -- parameter list goes here
with recompile
as
    -- procedure body (declare ..., etc.)

The reason for the corrupt stored procedure is unknown at this point, and searches on Microsoft and Google didn't reveal anything meaningful. If I find out the reason I will update this blog entry or produce a new one with a more thorough explanation.

Adding the WITH RECOMPILE option is something which should only be used as a last resort, as parsing and determining the execution plan can add significant CPU overhead to the execution of SQL or stored procedures. We are lucky the frequency of these reports is low enough not to be a significant problem.

Have Fun

US real interest rate and emerging market financial crisis

This article talks about the correlation between US real rates and emerging market financial crisis.

Essentially a carry trade of borrowing in a low rate country and buying/investing/speculating in a high rate country is affected by rapid increases in the real rate in the low rate country.

There are plenty of investments which rely on and extrapolate current growth rates to justify a higher Net Present Value (NPV), via higher future cash flows, without taking into consideration the effect of foreign-sourced investment dollars bidding up the prices of services, goods and investments within the high rate country.

Have a play with the NPV calculator in Excel or whatever spreadsheet you use to see what effect increasing the growth rate of cash flows has on the NPV.

Have Fun

Paul

Monday, March 13, 2006

Modeling connection pooling

There is a good direction in database land away from the idea that tuning is all seat-of-the-pants, smoke-and-mirrors stuff. In Oracle land it was the recognition of Oracle's wait events and the ability to see what Oracle was waiting on, and hence discover a probable bottleneck. This replaced the hit ratio and other ratio crowd, as the logical and objective nature of using Oracle waits proved more beneficial.
There is still a way to go, as there is plenty of good work being done in areas which directly relate to database work. Unfortunately for lots of people, math causes some serious FUD (Fear, Uncertainty and Doubt).
I use Sage as my RSS feed reader in Firefox; whilst the list of new blogs grows, the ability to read the content in a single sitting doesn't diminish. I came across an article from one of the couple of math-related blogs I follow. This blog talks about using Markov Chains to model resource pools.

I will be having fun tomorrow gathering the necessary information to check the connection pools for the databases I support. A couple have excessive pool sizes (which waste memory resources) and with better modeling we should be able to recommend some lower numbers.

Have Fun

Sunday, March 12, 2006

Replacing Perl with MySQL and maybe Oracle XE

I have a series of Perl scripts which use Perl modules to perform some statistical calculations on stock market End-Of-Day (EOD) data. I was facing the prospect of writing yet another Perl script to return the values of a statistical calculation.
So I decided it was time to convert from flat files to storing the data in a database. I downloaded MySQL 5.1 and installed it on a box running Ubuntu. It was a reasonably straightforward process; I had some fun and games with the default data directory being in /var/lib.
This was solved by shutting down MySQL (mysqladmin shutdown -u root -p), moving the files to a more appropriate location and creating a symbolic link (directory) pointing at the new location.
The next problem was the lack of some important statistical functions such as correlation. I had a look around on the net and found that Oracle has the ANSI SQL standard CORR(x,y) function.
So off to www.oracle.com to download Oracle XE.
Given the box in question has 512M of RAM, the 1Gig limit isn't a problem. Plus I will get the CORR function and plenty of other analytical functions to boot.
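
As a quick sketch of why CORR is handy (the table and stock symbols here are hypothetical), correlating the closing prices of two stocks becomes a simple query:

select corr(a.close_price, b.close_price) as price_correlation
from eod_prices a
join eod_prices b on a.trade_date = b.trade_date
where a.symbol = 'BHP'
  and b.symbol = 'RIO';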

The next step is to get to grips with loading the data, most likely with external tables. The MySQL statement LOAD DATA INFILE 'filename' INTO TABLE is very easy to understand and use (see the example below).
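
For example (the file path, table and column list are assumptions), loading a comma-separated EOD file into MySQL looks something like:

LOAD DATA INFILE '/tmp/eod.csv'
INTO TABLE eod_prices
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(symbol, trade_date, close_price);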

Have Fun

Paul

Thursday, March 09, 2006

Interesting articles and Dynamic SQL

I was websurfing and came across Steve Yegge website/blog from reddit.com.
I spent time last night and today digesting the articles on the site. I would recommend you go and have a read of some of his articles. He has also put his top 7 most popular (by traffic) at the top.

I have played around a bit with various languages since starting with computers. Given my area is more technical and support than development or software engineering, I haven't had the need to learn any language beyond being able to understand the gist of the code and the overall logic.

The thing which makes me want to spend more time with Lisp is its macros, as they remind me of dynamic SQL. Lisp macros are more powerful and expressive, but the ability to generate code from code is a similar concept.

Good security suggests that any object permissions should be granted to roles and those roles granted to individual users/logins. Here is a good example of dynamic SQL in action.

-- Create the role read_only
-- Prompts for a schema name (owner)
-- Generates GRANT SELECT ON owner.table_name TO read_only

create role read_only;

set echo off
set pagesize 0
set verify off
spool /tmp/read_only_grants.sql
select 'GRANT SELECT ON '||owner||'.'||object_name||' TO read_only;'
from dba_objects
where owner = '&owner'
and object_name not like '%$%'
and object_type in ('TABLE','VIEW')
order by object_name;
spool off
@/tmp/read_only_grants.sql


You can extend this even further using DECODE or CASE to handle granting execute permissions on packages,procedures and functions.

select 'GRANT '||
CASE
when object_type in ('TABLE','VIEW') then 'SELECT'
when object_type in ('PACKAGE','PROCEDURE','FUNCTION') then 'EXECUTE'
else 'SELECT' end
||' ON '||owner||'.'||object_name||' TO read_only;'
from dba_objects
where owner = '&owner'
and object_name not like '%$%'
and object_type in ('TABLE','VIEW',
'PACKAGE','PROCEDURE','FUNCTION')
order by object_name;


I use dynamic SQL all the time; it saves time and allows the quick generation of repetitive SQL.
It is like mail merge in Word or like Lisp macros.

Have Fun

Paul

Wednesday, March 08, 2006

Battlefield 2 fun #2

I have been playing Battlefield 2 (BF2) a bit over the last month or so, and it has been fun. There are some annoyances and suspect deaths (people shooting in the other direction and getting a head shot) but the online game play is good; most players who have managed those JFK ballistic miracles don't last long, as the admins are quick to ban suspect behaviour.

There are weapon upgrades for most kits as your player progresses up the ranks, from private to corporal all the way to a four star general. If you play on a ranked server you can get your scores updated to the main EA server.

The ranked servers which are most popular in Australia are located at Telstra's GameArena. There are lots of stats available to see how well you are playing online; you can also compare yourself to other players, including the highest ranking, rising stars (most change in points) etc.
At the moment I am averaging under 1 point per minute of game play, which is pretty ordinary. I blame the tools I use of course: an older P4 with 1.5G RAM but only an nVidia 5200 card means that sometimes my framerate and lag within the game are bad. I had one situation where I saw an opponent jump from another building, turn and fire in 3 lagged frames; naturally my ability to evade death is limited in that situation.
With all these handicaps :) I still have made Sergeant (2500 points) and now face the next rank of Staff Sergeant (5000 points).
At 1 point per minute of game play I am looking at about 2200 minutes = 36 hours of game play to reach the next level; even if I manage to double the rate, it is still 18 hours of free time to make the next rank. The highest rank is a whopping 100000 points, which is 97600 minutes of game play. If I was to play 8 hours every day it would take 203 days to reach that rank!!
I am not sure what the lifespan of the game is and whether the best players would have moved on by that stage, along with the majority of hosted servers to the next best thing by the end of 2006.

Naturally these calculations give pause to thoughts of extended solo games, and the social aspect of playing with mates in a squad kicks in, as do the weird and amazing kills and deaths which occur in the game.

Have Fun

Oracle: Performance feature 1

A lot of people have known that since Oracle 8 you have had the ability to create functional indexes. Basically functional indexes allow you to create an index of calculated values. The most common example provided is the infamous to_char (date-to-char) conversion function.

Consider this SQL, which returns the product and amount for today from the sales table.

select product,amount
from sales
where to_char(purchase_date,'YYYYMMDD') = '20060308'


If there is a normal index on purchase_date, it can't be used in this case because of the explicit conversion of purchase_date to a character string.

Here Oracle's functional index can come to the rescue. Creating an index of to_char(...) calculated values:

Create index ix_func_sales_pdate
on sales(to_char(purchase_date,'YYYYMMDD'))


Henceforth the above SQL can use the functional index to retrieve the calculated values of to_char, and speed access to rows which have those calculated values.

However, hat-tip to Tom Kyte of asktom.oracle.com fame, you can use the ability to create an index of calculated values to speed access for any user-defined function!
Oracle's normal indexes (B-Tree) don't index rows whose key values are entirely NULL. So you create a function which returns NULL for rows you don't want (which won't be indexed) and non-NULL values for the rows which you require.
The best use of this ability is for columns of low cardinality which are highly skewed to one value. For example, a table contains a column opt_out which tracks permission to see marketing information. The opt_out column has two values, 0 or 1, where 0 = No and 1 = Yes. Naturally it is skewed, with most values in the column being 0.

You have a nightly process called send_newsletter which requires a list of people who have opted in, your choice of index for this column should be a functional index not a BITMAP index. BITMAP indexes work better when there are multiple low cardinality columns.
You need to create a function which turns opt_out = 0 into NULL. A simple solution is to use CASE.
eg.
CASE
when opt_out = 0 then NULL -- the most common first
when opt_out = 1 then 1
else NULL end

Creating a functional index on that function will mean Oracle will only index rows which have opted IN to receive the newsletter. Retrieving rows using SQL like the example below will speed access via the functional index.

select email_addr
from customers
where opt_in(opt_out) = 1
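
To tie it together, here is a minimal sketch of what the opt_in function and its index might look like, using the names from the example above; note that the DETERMINISTIC keyword is required before Oracle will allow a user-defined function in a functional index, and treat the rest as illustrative:

create or replace function opt_in (p_opt_out in number)
return number
deterministic
is
begin
  -- return NULL for opted-out rows so they are not indexed
  return case when p_opt_out = 1 then 1 else null end;
end;
/

create index ix_func_cust_optin
on customers (opt_in(opt_out));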


This is a powerful feature of Oracle which can be used in much more complex functional examples. The main caveat is that inserts and updates will be slower, as the function values have to be calculated on insert or update. This penalty is less than the penalty of maintaining BITMAP indexes.

Have Fun

Monday, March 06, 2006

Oracle SQL equivalent to SQLserver TOP

In SQLserver, you can issue the following code to return only the TOP n rows.

Eg. To return the first 300 rows from a search table (to return to a webpage), ordered by price (lowest first), where suburb is 'Rich' and transport_rating is less than or equal to 3:

select TOP 300 * from search
where suburb = 'Rich'
and transport_rating <= 3
order by price


In Oracle, there is no TOP keyword to perform this; the simplest method is to wrap the SQL returning the rowset in an inline view and use ROWNUM to limit it. The same requirement as above:

select * from
(select * from search
where suburb = 'Rich'
and transport_rating <= 3
order by price)
where ROWNUM <=300


Both SQLserver and Oracle have HINTs which let you suggest to the database what type of plan to use when executing the SQL. In most cases, you want the rows back as fast as possible (for websites).
So for the SQLserver code, add the OPTION(FAST n) hint. This hint tells SQLserver you want the first n rows back as fast as possible, which normally means SQLserver will choose NESTED LOOP joins over MERGE or HASH joins.

select TOP 300 * from search
where suburb = 'Rich'
and transport_rating <= 3
order by price
OPTION(FAST 300)


In Oracle there is something similar called FIRST_ROWS; you can set it at the database level, for individual sessions, or in individual SQL. The Oracle SQL code is rewritten as follows:

select * from
(select /*+ FIRST_ROWS */ * from search
where suburb = 'Rich'
and transport_rating <= 3
order by price)
where ROWNUM <=300


You can also pass the number of rows to Oracle, as follows:

select * from
(select /*+ FIRST_ROWS(300) */ * from search
where suburb = 'Rich'
and transport_rating <= 3
order by price)
where ROWNUM <=300


I find the SQLserver method cleaner, however getting used to Oracle inline views opens up a world of powerful options (see the analytic alternative sketched below).
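
For example, one of those options is the analytic ROW_NUMBER function, which gives another way to express the same top-n requirement (a sketch, not necessarily the fastest plan):

select *
from (select s.*,
             row_number() over (order by price) as rn
      from search s
      where suburb = 'Rich'
        and transport_rating <= 3)
where rn <= 300
order by rn;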

Have Fun

Paul

p.s. I thought I would start to post more often. I am not sure I can do daily, however that is the plan for the next 30 days.

Saturday, March 04, 2006

Oracle 10G SQL Tuning Advisor

Over the last 9 months or so I have been supporting a datawarehouse which has been running on 10G R1. The experience has been fun, given the new features, and strange in other respects (see my post about PARALLEL_AUTOMATIC_TUNING behaviour).

Oracle 10G comes with a new PL/SQL package called DBMS_SQLTUNE, which is the API or command line way of running the SQL Tuning advisor, instead of through Oracle EM GUI.

Sometimes SQL or PL/SQL is complex, running to many lines and in some cases involving several levels of nested inline views. It can take time just to figure out what the SQL is trying to achieve, before you can actually start to review whether there are any potential tuning opportunities.

Enter the SQL Tuning Advisor: in a nutshell it takes SQL, either as text, referenced by SQL_ID, or from a snapshot or captured source, and does a more in-depth search for a better execution plan.

If you want a quick analysis to kick off a tuning exercise, this tool is probably the best spot to start.
I have added some simple scripts to run the SQL Tuning Advisor from the sqlplus command line at the end of this post. I used similar scripts to produce some analysis in which the advisor found a better execution plan (which I verified). It reduced the running time substantially.
DBMS_SQLTUNE has had extra work in Oracle 10G R2 and has lots of additional features.

Have Fun

Paul

### Cut here ###
-- Script to create tuning task and automatically run tuning task using default settings
-- Input: sql_id captured from v$sql or history tables
-- PM 2006/03/04

set serveroutput on size 80000

declare
  v_taskname varchar2(50);
begin
  v_taskname := dbms_sqltune.create_tuning_task(sql_id => '&sql_id');
  dbms_output.put_line('TaskName: '||v_taskname);
  dbms_sqltune.execute_tuning_task(v_taskname);
end;
/

### End of script ###

### Cut Here ###

-- Script to report results from tuning task
-- Input: TaskName
-- Modify linesize or longchunksize to suit the required output width.
-- PM 2006/03/04

set long 8000
set longchunksize 200
set linesize 200
set pagesize 3000

spool sqltune_report;
select dbms_sqltune.report_tuning_task('&task_name') from dual;
spool off;
### End of script ###
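
Once you are finished with the report, the task can be cleaned up with DBMS_SQLTUNE.DROP_TUNING_TASK; a small sketch (the input is the same task name used above):

### Cut Here ###
-- Script to drop a tuning task once the report has been reviewed
-- Input: TaskName
begin
  dbms_sqltune.drop_tuning_task(task_name => '&task_name');
end;
/
### End of script ###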

Friday, February 24, 2006

Database performance articles and the need to understand your database.

I was reading through various blogs and this site caught my eye. Lo and behold here are some great articles on SQLserver performance tuning by Dennis Forbes.

Whilst the articles are written with SQLserver in mind, the 95% saving or 20x faster is a metric I can understand and have seen from experience. If you have time read them now, otherwise bookmark for later.

I would add that whilst a lot of developers acknowledge the need for databases, they want their application to be database generic, essentially using the database as a black box. This leads to everything from the database equivalent of a flat file system with a SQL wrapper, to tables which mirror the object model and require multiple joins with complex relationships, to tables containing lots of redundant data.
I agree with Tom Kyte when he writes that he finds the demand for the database to be generic strange, given that the programming language, application server, webserver etc. all tend to be specific and definitely not generic.
This all leads to products like Hibernate which seek to remove the need to understand the database and relational model completely: essentially a rule-based translation tool to handle the mapping of object models to relational tables.

My advice is to read the concepts manual of the database you are going to be using, understand the architecture, understand ACID and, most importantly, understand your data; your content is an essential ingredient in how your company creates value for its customers.

Have Fun

Paul

Sunday, February 05, 2006

Charolais website

My sister and brother have a Charolais Cattle stud. I thought I would provide a link to their website.
They have been breeding for the tenderness and marbling genes and are pretty close to having all the genes required which have been discovered and can be tested for.

Have Fun

Sunday, January 22, 2006

SQL Performance reports from profiler traces

Over the December and January period I have been working with some developers to get their database performance improved. Not that there has been a problem with performance, rather we have been looking proactively at the slower stored procedures and reviewing the SQL and the logic behind each to see if we can improve.

This has led to the need to be able to compare the performance of stored procedures over different periods. After finally fixing the buggy SQL code which returns the stored procedure name or DML that was run, I reworked some older code to return a comparison of some performance metrics over two periods.
This code uses two inline views which are then joined on the stored procedure name or DML SQL to compare the number of calls, total duration and, the main one, duration per call. This enables us to produce a report which can quickly identify code that is running significantly slower.
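
As a rough illustration of the general shape of that comparison (this is not the code referred to above; the trace_data table, its columns and the date ranges are all hypothetical):

select coalesce(p1.sproc_name, p2.sproc_name) as sproc_name,
       p1.calls          as calls_p1,
       p2.calls          as calls_p2,
       p1.total_duration / nullif(p1.calls, 0) as duration_per_call_p1,
       p2.total_duration / nullif(p2.calls, 0) as duration_per_call_p2
from (select sproc_name, count(*) as calls, sum(duration) as total_duration
      from trace_data
      where starttime >= '20060101' and starttime < '20060108'
      group by sproc_name) p1
full outer join
     (select sproc_name, count(*) as calls, sum(duration) as total_duration
      from trace_data
      where starttime >= '20060108' and starttime < '20060115'
      group by sproc_name) p2
  on p1.sproc_name = p2.sproc_name
order by duration_per_call_p2 desc;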

Unfortunately I am not going to publish that code just yet. I want to rework it into a stored procedure which can be added to an expanding library of jobs and stored procedures for performance tuning SQLserver.
The plan is to have a complete solution to provide the tools to automate performance metric collection and reporting using tools provided by SQLserver. I am hoping that people will be able to download a zip file which provides an easy install.
Tuning based on the reports is always going to be harder, but capturing the information is the first step.

After some other workload is over (a database migration), I will have time to recode it into a stored procedure and also install and play with SQLserver 2005.
Given the new performance tables, and my existing library of Oracle scripts I should be in a position to remap those Oracle scripts to use the new SQLserver 2005 tables where possible.

Have Fun

Battlefield 2 fun

I haven't played FPS (First Person Shooters) in a while. Having a family with 2 little kids doesn't lend itself to having much time to be mucking around playing computer games.
However after I cleaned up the study (spare bedroom) in the house, Santa was good enough to give me Battlefield 2 for xmas.

I had been playing the demo on and off for 2 months and decided to buy the game. Demos are a great way to try before you buy. The full version of the game is always better than a crippled version. In the BF2 demo it was one map, but it was enough to showcase the game.

So I was happily smashing the computer AI bots in singleplayer, even on veteran level, so it was time to face reality and play against humans. You cannot code the computer to act like humans.
For example, in another multiplayer game, Starcraft (real time strategy, RTS), we were playing each other in a local LAN game of 6 players. One of the players nuked their own base to destroy an attack. The computer would never do that.

So I went to the GameArena Bigpond Games servers and, after some false starts due to network problems, I was in a 32 player map. I got my ar*e kicked. I think I managed about 10 kills over about 5 games, and was killed 10+ times in each of those.
The hardest part was that I knew no one in the game, plus I hadn't been using the chat and talk functions at all, so I couldn't communicate quickly enough, e.g. stop driving as there are mines ahead.
On the fun side of things I learned a couple of things
1) Head-ons at speed are a good way to kill more people, even if you die.
2) Some players are awesome flying jets, deadly accurate too.

The funniest situations were
1) Missing throwing a grenade into a helicopter three times... I need practice. I managed to bounce the grenade off the helicopter instead of landing inside which would have killed 4.
2) Being rocket-ed to death and returning the favour later.

More practice is required which should be fun

Have Fun