postgresql FAQ
http://sraapowergres.com/en/newsletter/issue_01/faq.inc.html
This FAQ is geared for those people interested in the more common issues that one encounters with PostgreSQL. It will grow in time with every issue by adding 10 new FAQ's to our list. They have been tested against the 7.x version of PostgreSQL but they are equally valid for the 8.x series too.
Q:How do I datamine log files when the information loaded may already be present i.e. what do I do about data duplication?
One easy way is to upload the data into a table and then create a view using the DISTINCT clause that removes duplicate records:
CREATE TABLE mytable(comments text);You can now query the view v_mytable which returns only unique records:
CREATE VIEW v_mytable AS SELECT DISTINCT ON(comments) oid,comments FROM mychild;
copy mytable from logfile.txt
SELECT comments FROM v_mytable ORDER BY oid;
Q:How do I get a date, either from the past or in the future that is relative to a particular one such as today's date?
There are two techniques demonstrated here. The first method involves using a variation of either of these two statements which returns the date 30 days (1 month) from today:
SELECT now() + '1 month'::interval;The second method involves using the function date_mii() which returns the same date as the above example:
SELECT now() + '30 days'::interval;
SELECT date_mii(date(now()),-30);
Q:How do I load a file into a database?
One method of importing a file into a database is to take advantage
of PostgreSQL's Large Object mechanism which can load form of text or
binary file. Manipulating a large object can be carried out by psql the
PostgreSQL console client.
The following is an example of an import. Successful imports always return the Object Identification number (oid).
psql -Umyuser -c "lo_import 'myphoto.png' 'a picture of me' " mydatabaseThis is an example of an export and assumes the oid is 12345:
psql -Umyuser -c "lo_export '12345' 'myphoto.png' " mydatabaseOne can also import and export a large object using the equivalent server-side function:
SELECT lo_import ('/tmp/latestnews.ogg');
lo_import
-----------
444611
(1 row)Server side functions are restricted in that they can only be
used by the superuser. It can only access files on the machine where
the server is running and the superuser has permission to access (under
normal circumstances this excludes home accounts).
Q:Isn't there a way of loading a binary file into a record just like normal data without using Large Objects?
As a matter of fact yes. All programming languages such as perl,
tcl, python, php etc, with the necesary hooks into libpq, can insert
data directly using ordinary SQL INSERT commands.
Here's an example using the psql client, with a binary file called "mydocument.pdf".
Create a table called myfiles in a database called temp:
CREATE TABLE myfiles(mybinary text);Log into the database:
psql -Upostgres tempConvert and assign the contents of the binary file to base64, using "openssl", to the internal variable named "content":
set content ''' `openssl base64 -in mydocument.pdf` '''You can now insert this binary into the table:
INSERT INTO VALUES myfiles(:content);
Q:How do I carry out an SSL session between client and server?
SSL connections are possible only when the capability has been expressly compiled into the client,
such as the command line utility psql (using the --with-ssl switch). The postgreSQL server can accept
one of three states of encrypted communications: none (no SSL required), dedicated SSL (only SSL sessions are accepted),
detected (the server attempts SSL session but falls back to none if the client doesn't have any)
The settings in the files pg_hba.conf, postgresql.conf defines the type of socket connection.
Insert the following line in the file, postgresql.conf:
ssl = trueDefine the client's authentication in the file, pg_hba.conf, as for example;
#TYPE DATABASE USER IP-ADDRESS IP-MASK METHODYou can define a number of connections with a mix of encrypted and unencrypted sessions. For example, the first line insists on encrypted SSL session for the super user logging in from 192.168.2.0.1 while the second line permits the client sraa_client a non encrypted session coming from the same ip address. The third line tries first an SSL connection otherwise it switches over to a plain socket connection.
hostssl ALL postgres 192.168.0.1 255.255.255.0 password
hostnossl ALL sraa_client 192.168.0.1 255.255.255.0 password
host ALL sraa_admin 192.168.0.1 255.255.255.0 password
Q:What if I want an encrypted session and my client doesn't support it natively?
If you are communicating accross a network, you can still have an
encrypted session by running an SSH encrypted tunnel between client and
server. The method is to run "Port Forwarding" between the two
computers.
In this example, a ssh client is used on the database client side while
the SSH server is resident on the PostgreSQL server side. The client
executes the following command to establish a connection and goes into
the background once a successfull authentication has been made:
ssh -fN -L 6262:192.168.2.1:5432 user@192.168.2.1The pg_hba.conf file must be configured to permit login. You must usethe server's own ip address, 192.168.2.1. Here's an example configuration instruction:
# IPv4-style local connections:The postgresql.conf file must contain the line:
# TYPE DATABASE USER IP-ADDRESS IP-MASK METHOD
host all postgres 192.168.2.1 255.255.255 trust
tcpip_socket = trueThis example login is made by the psql console client, although any client would work. It connects to port 6262 on the client localhost, where it will be forwarded to port 5432 on the remote server where the PostgreSQL server resides:
psql -U postgres -h 127.0.0.1 -p 6262 template1
CAVEATS
- The SSL authentication and connection is not handled by thePostgreSQL i.e. the server thinks it's a plain tcp socket connection, since it's the SSH session that's handling the encryption.
- A SSH account is required on the server.
- You can eliminate using passwords by using public-key cryptography (refer to the SSH man page) Verify that your authentication will function with port forwarding i.e. start with as simple authentication as possible such as "trust" and increase it from there.
Q:How do I bind the PostgreSQL server to a particular ip address?
There are two ways of assigning the ip address. The first method is to Assign the port number to the PGPORT environment variable in the same shell that the server starts up, for example;
#!/bin/bashThe second method is to edit the postgresql.conf file:
export PGPORT=6262
pg_ctl -d data start
tcpip_socket = true
port = 6262
Q:I've made changes to template1 but now I want to remove them. How do I accomplish this?
Every database cluster requires template1. The advantage of adding
objects to template1 is that you can simplify the process of adding
functions, tables, triggers, procedural languages etc to a number of
databases by first adding them first here and then creating these new
databases with the createdb utility or the SQL command CREATE DATABASE.
Removing these additions from template1 can be difficult if you don't remember all the changes.
There's two ways creating a clean template1 database. The first
method is to obtain a datadump of template1 before any changes have
been made to it:
pg_dump -c -Upostgres template1>template1.dbYou can restore this data defintion back to the database as required:
psql -Upostgres -f template1.db template1The second method involves creating a new database, using template0 as the TEMPLATE, and dumping its data definition to a file. Restore template1 using this newly created datadump in a manner similar to what has been outlined in the first example:
createdb -Upostgres -T template0 replacement
pg_dump -c -Upostgres replacement >template1.db
psql -Upostgres -f template1.db template1
Q:How do I effect changes made in the configuration files without shutting down the server and losing client connections?
Most Unix based distributions include server scripts permitting the system administrator to run the PostgreSQL server. For example, in a Debian, Linux, distribution (a System V operating system) we can use:
/etc/init.d/postgresql reloadAnother method is to invoke the PostgreSQL server utility, pg_ctl, directly:
pg_ctl reloadYet a third method is to send a signal to the server process directly. This can be accomplished by obtaining the process id, where you can get it by grepping the output from the ps utility, and sending a SIGHUP signal using the "kill" utility. If the postgres server had a process number of 13751 , you could reload the configuration files by typing the following command in a terminal:
kill -s HUP 13751
Q:How do I determine the size of a table or database?
There is an optional module available in the PostgreSQL
"contributions" section that will return the size of any relation in a
database, or even the database itself. Look in your postgres library
directory for a file named "dbsize.so" to see if you have the module,
otherwise you will need to download the complete PostgreSQL source,
including the contributions source, and compile it yourself.
If the module is present you will need to create two functions, called
"database_size" and "relation_size", to your database before you can
use it. Carry out the following SQL commands in the database you are
currently logged into (note: these functions will only work in the
database where the following SQL commands have been invoked):
CREATE FUNCTION database_size (name) RETURNS bigintHere is an example invocation that returns the size of the database "template1":
AS '$libdir/dbsize', 'database_size'
LANGUAGE C WITH (isstrict);
CREATE FUNCTION relation_size (text) RETURNS bigint
AS '$libdir/dbsize', 'relation_size'
LANGUAGE C WITH (isstrict);
SELECT database_size('template1');
Here is an example invocation that returns the size of the table "myrelation" in the currently logged in database:
SELECT relation_size('myrelation');