First we add an OS user. This user will be used to start/stop/restart the database. This user will also be the superuser for the cluster. The following command has to be executed as root or with
Next, we create a directory that will be the base directory for the new cluster. This could be anywhere. A standard location can be
/usr/local/pgsql/data. However, you might want to have the database cluster on a separate partition. In the event of an OS and associated file system crash, your database data remains intact. It can also be that you want to use faster spinning disks or Solid State Disks for the database cluster to improve performance. In short, performance and/or reliability concerns can make you choose a location other than the default location to initialize the database cluster. As root, we execute the commands:
[root@MyCentOS extension]# mkdir -p /pgdata/9.3
-p option ensures that the parent directory is also created if it is nonexistent:
[root@MyCentOS extension]# chown postgres /pgdata/9.3
Then, we switch user to
postgres. This is necessary because when we initialize the cluster, the user under which the command was executed becomes the owner of the cluster. The server process will also be owned by this user. We will go with the standard
user –postgres, which we created:
su - postgres
[postgres@MyCentOS ~]$ initdb -bash: initdb: command not found
This is because we haven't added the directory containing the PostgreSQL executables to the environment variable
PATH yet. We could provide the absolute path and make it work. However, it is better to set the variables.
postgres, we have the following lines:
PATH=$PATH:$HOME/bin export PATH
Just before the export
PATH line, add:
Then, try this:
[postgres@MyCentOS ~]$ which initdb /usr/bin/which: no initdb in (/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/postgres/bin)
Not surprising, as
.bash_profile doesn't get executed unless we source it or log out and log in. Log out, log in, and try again:
[postgres@MyCentOS ~]$ exit logout [root@MyCentOS ~]# su - postgres [postgres@MyCentOS ~]$ which initdb /usr/local/pgsql/bin/initdb
Now we are good to go! It's a good idea to execute the following:
initdb --help | more
Among the many parameters available, the important ones, in most cases, will be
--pgdata. This parameter is used to define the directory where the cluster will be initialized (where the database cluster should be stored). This is the only mandatory parameter. Another parameter that can be useful is
--pwprompt. Using this, we can set the password for the database superuser. So, we execute the following command:
initdb --pgdata=/pgdata/9.3 --pwprompt
As seen in the preceding screenshot, the process asks for the superuser password. Towards the end, it gives a warning that the trust authentication is enabled for local connections. This means that it will be possible to make connections from the localhost without being prompted for a password. It's a good idea to change this setting. We will come to this later. For more options available for
initdb, please refer to http://www.postgresql.org/docs/current/static/app-initdb.html.
As always, let's see what happened; which directories got created when we initialized the cluster:
cd /pgdata/9.3 [postgres@MyCentOS 9.3]$ find ./ -maxdepth 1 -type d ./ ./base ./pg_stat ./pg_clog ./pg_xlog ./pg_tblspc ./pg_twophase ./pg_subtrans ./global ./pg_notify ./pg_stat_tmp ./pg_snapshots ./pg_multixact ./pg_serial
[postgres@MyCentOS base]$ expo rt PGDATA=/pgdata/9.3/
PGDATA is the default data directory location.
[postgres@MyCentOS base]$ pg_ctl start server starting [postgres@MyCentOS base]$ LOG: database system was shut down at 2013-10-13 13:48:07 IST LOG: database system is ready to accept connections LOG: autovacuum launcher started
pg_ctl is a utility to start, stop, check the status, or restart the database cluster. Passing
initdb as a parameter results in the initialization of a cluster. More options can be explored at http://www.postgresql.org/docs/current/static/app-pg-ctl.html.
Now, we will go through the directories:
[postgres@MyCentOS base]$ pwd /pgdata/9.3/base [postgres@MyCentOS base]$ find ./ -type d ./ ./1 ./12891 ./12896 [postgres@MyCentOS base]$ oid2name All databases: Oid Database Name Tablespace ---------------------------------- 12896 postgrespg_default 12891 template0pg_default 1 template1 pg_default
[postgres@MyCentOS base]$ pwd /pgdata/9.3/base [postgres@MyCentOS base]$ psql psql (9.3.0) Type "help" for help. postgres=# \! ls 1 12891 12896 postgres=# CREATE DATABASE test; CREATE DATABASE postgres=# \! ls 1 12891 12896 16385 postgres=# DROP DATABASE test; DROP DATABASE postgres=# \! ls 1 12891 12896
OID stands for Object Identifier. They are used internally by PostgreSQL as primary keys for various system tables. Objects created in the cluster (tables, databases, and so on) will have OIDs associated with them. The
oid2nameutility is available as a
contribmodule and helps us examine the databases, tables, and related file nodes.
We note a few things here. First, we were not prompted for a password (remember the warning about trust authentication?). Next, we can execute host commands from
psql(more on this in a later chapter). Third, creating/dropping databases are actually similar to creating/deleting directories. PostgreSQL does do quite a bit of internal book-keeping when we create or drop objects in addition to manipulating directories.
In short, we have directories to hold containers of real user created data (tables and tablespaces), directories to hold data about all the data and data structures (metadata), and then directories to hold data about the state of transactions.
[root@MyCentOS ~]# ps -fupostgres
The following screenshot illustrates the processes spawned:
We can see that there is one parent process (
pid1566), which spawned a few child processes.
One important file that gets created when we initialize a database cluster is
postgresql.conf. This file contains a number of critical parameters related to the server processes and resource allocation, memory management, logging, file locations, and so on. Another file is
pg_hba.conf."hba" (which stands for host-based authentication). Changes to these files are necessary to enable client connection to the database from a different machine. Both of these are in the
postmaster.pid file in the same directory is used by
pg_ctl to determine whether the database is running:
[postgres@MyCentOS 9.3]$ pg_ctl status pg_ctl: server is running (PID: 1566)
The contents of the file are as follows:
[postgres@MyCentOS 9.3]$ head -1 postmaster.pid 1566
1566 number in the
pid file is the same as what we got for the parent process when we did a process listing earlier.