postgresql FAQ

Published on 09/28,2005

http://sraapowergres.com/en/newsletter/issue_01/faq.inc.html

This FAQ is geared for those people interested in the more common issues that one encounters with PostgreSQL. It will grow in time with every issue by adding 10 new FAQ's to our list. They have been tested against the 7.x version of PostgreSQL but they are equally valid for the 8.x series too.


Q:How do I datamine log files when the information loaded may already be present i.e. what do I do about data duplication?

One easy way is to upload the data into a table and then create a view using the DISTINCT clause that removes duplicate records:

CREATE TABLE mytable(comments text);
CREATE VIEW v_mytable AS SELECT DISTINCT ON(comments) oid,comments FROM mychild;
copy mytable from logfile.txt
You can now query the view v_mytable which returns only unique records:
SELECT comments FROM v_mytable ORDER BY oid;

Q:How do I get a date, either from the past or in the future that is relative to a particular one such as today's date?

There are two techniques demonstrated here. The first method involves using a variation of either of these two statements which returns the date 30 days (1 month) from today:

SELECT now() + '1 month'::interval;
SELECT now() + '30 days'::interval;
The second method involves using the function date_mii() which returns the same date as the above example:
SELECT date_mii(date(now()),-30);

Q:How do I load a file into a database?

One method of importing a file into a database is to take advantage of PostgreSQL's Large Object mechanism which can load form of text or binary file. Manipulating a large object can be carried out by psql the PostgreSQL console client.
The following is an example of an import. Successful imports always return the Object Identification number (oid).

psql -Umyuser -c "lo_import 'myphoto.png' 'a picture of me' " mydatabase
This is an example of an export and assumes the oid is 12345:
psql -Umyuser -c "lo_export '12345' 'myphoto.png' " mydatabase
One can also import and export a large object using the equivalent server-side function:
SELECT lo_import ('/tmp/latestnews.ogg');
lo_import
-----------
444611
(1 row)
Server side functions are restricted in that they can only be used by the superuser. It can only access files on the machine where the server is running and the superuser has permission to access (under normal circumstances this excludes home accounts).

Q:Isn't there a way of loading a binary file into a record just like normal data without using Large Objects?

As a matter of fact yes. All programming languages such as perl, tcl, python, php etc, with the necesary hooks into libpq, can insert data directly using ordinary SQL INSERT commands.
Here's an example using the psql client, with a binary file called "mydocument.pdf".
Create a table called myfiles in a database called temp:

CREATE TABLE myfiles(mybinary text);
Log into the database:
psql -Upostgres temp
Convert and assign the contents of the binary file to base64, using "openssl", to the internal variable named "content":
set content ''' `openssl base64 -in mydocument.pdf` '''
You can now insert this binary into the table:
INSERT INTO VALUES myfiles(:content);

Q:How do I carry out an SSL session between client and server?

SSL connections are possible only when the capability has been expressly compiled into the client, such as the command line utility psql (using the --with-ssl switch). The postgreSQL server can accept one of three states of encrypted communications: none (no SSL required), dedicated SSL (only SSL sessions are accepted), detected (the server attempts SSL session but falls back to none if the client doesn't have any)
The settings in the files pg_hba.conf, postgresql.conf defines the type of socket connection.
Insert the following line in the file, postgresql.conf:

ssl = true
Define the client's authentication in the file, pg_hba.conf, as for example;
#TYPE      DATABASE   USER         IP-ADDRESS    IP-MASK         METHOD 
hostssl ALL postgres 192.168.0.1 255.255.255.0 password
hostnossl ALL sraa_client 192.168.0.1 255.255.255.0 password
host ALL sraa_admin 192.168.0.1 255.255.255.0 password
You can define a number of connections with a mix of encrypted and unencrypted sessions. For example, the first line insists on encrypted SSL session for the super user logging in from 192.168.2.0.1 while the second line permits the client sraa_client a non encrypted session coming from the same ip address. The third line tries first an SSL connection otherwise it switches over to a plain socket connection.

Q:What if I want an encrypted session and my client doesn't support it natively?

If you are communicating accross a network, you can still have an encrypted session by running an SSH encrypted tunnel between client and server. The method is to run "Port Forwarding" between the two computers.
In this example, a ssh client is used on the database client side while the SSH server is resident on the PostgreSQL server side. The client executes the following command to establish a connection and goes into the background once a successfull authentication has been made:

ssh -fN -L 6262:192.168.2.1:5432 user@192.168.2.1
The pg_hba.conf file must be configured to permit login. You must usethe server's own ip address, 192.168.2.1. Here's an example configuration instruction:
# IPv4-style local connections:
# TYPE DATABASE USER IP-ADDRESS IP-MASK METHOD
host all postgres 192.168.2.1 255.255.255 trust
The postgresql.conf file must contain the line:
tcpip_socket = true
This example login is made by the psql console client, although any client would work. It connects to port 6262 on the client localhost, where it will be forwarded to port 5432 on the remote server where the PostgreSQL server resides:
 psql -U postgres -h 127.0.0.1 -p 6262 template1

CAVEATS

  • The SSL authentication and connection is not handled by thePostgreSQL i.e. the server thinks it's a plain tcp socket connection, since it's the SSH session that's handling the encryption.
  • A SSH account is required on the server.
  • You can eliminate using passwords by using public-key cryptography (refer to the SSH man page) Verify that your authentication will function with port forwarding i.e. start with as simple authentication as possible such as "trust" and increase it from there.

Q:How do I bind the PostgreSQL server to a particular ip address?

There are two ways of assigning the ip address. The first method is to Assign the port number to the PGPORT environment variable in the same shell that the server starts up, for example;

#!/bin/bash
export PGPORT=6262
pg_ctl -d data start
The second method is to edit the postgresql.conf file:
tcpip_socket = true
port = 6262

Q:I've made changes to template1 but now I want to remove them. How do I accomplish this?

Every database cluster requires template1. The advantage of adding objects to template1 is that you can simplify the process of adding functions, tables, triggers, procedural languages etc to a number of databases by first adding them first here and then creating these new databases with the createdb utility or the SQL command CREATE DATABASE.
Removing these additions from template1 can be difficult if you don't remember all the changes.
There's two ways creating a clean template1 database. The first method is to obtain a datadump of template1 before any changes have been made to it:

pg_dump -c -Upostgres template1>template1.db
You can restore this data defintion back to the database as required:
psql -Upostgres -f template1.db template1
The second method involves creating a new database, using template0 as the TEMPLATE, and dumping its data definition to a file. Restore template1 using this newly created datadump in a manner similar to what has been outlined in the first example:
createdb -Upostgres -T template0 replacement
pg_dump -c -Upostgres replacement >template1.db
psql -Upostgres -f template1.db template1

Q:How do I effect changes made in the configuration files without shutting down the server and losing client connections?

Most Unix based distributions include server scripts permitting the system administrator to run the PostgreSQL server. For example, in a Debian, Linux, distribution (a System V operating system) we can use:

/etc/init.d/postgresql reload
Another method is to invoke the PostgreSQL server utility, pg_ctl, directly:
pg_ctl reload
Yet a third method is to send a signal to the server process directly. This can be accomplished by obtaining the process id, where you can get it by grepping the output from the ps utility, and sending a SIGHUP signal using the "kill" utility. If the postgres server had a process number of 13751 , you could reload the configuration files by typing the following command in a terminal:
kill -s HUP 13751

Q:How do I determine the size of a table or database?

There is an optional module available in the PostgreSQL "contributions" section that will return the size of any relation in a database, or even the database itself. Look in your postgres library directory for a file named "dbsize.so" to see if you have the module, otherwise you will need to download the complete PostgreSQL source, including the contributions source, and compile it yourself.
If the module is present you will need to create two functions, called "database_size" and "relation_size", to your database before you can use it. Carry out the following SQL commands in the database you are currently logged into (note: these functions will only work in the database where the following SQL commands have been invoked):

CREATE FUNCTION database_size (name) RETURNS bigint
AS '$libdir/dbsize', 'database_size'
LANGUAGE C WITH (isstrict);

CREATE FUNCTION relation_size (text) RETURNS bigint
AS '$libdir/dbsize', 'relation_size'
LANGUAGE C WITH (isstrict);
Here is an example invocation that returns the size of the database "template1":
SELECT database_size('template1');
Here is an example invocation that returns the size of the table "myrelation" in the currently logged in database:
SELECT relation_size('myrelation');


Comments

Leave a Reply

 authimage