Performance tuning Maven-OpenCms builds using PostgreSQL
Having freshly installed Ubuntu 12.04 on my machine, I noticed that building the OpenCms project I am currently working on is a very time-consuming process:
oli@rikit:~/develop/projects/foo$ time mvn clean install > mvn.log 2>&1

real 12m34.748s
user 2m10.132s
sys 0m6.836s
Uh, more than 12 minutes… definitely too much, even for an OpenCms project.
During the install phase of the Maven build, a lot of database write operations are performed: it's a multi-module project containing more than one OpenCms module, and after each OpenCms module has been built, it gets deleted, re-imported and finally published into the OpenCms VFS. For more details on how we do build automation for OpenCms projects with Maven, see here
I searched the web and quickly found the reason for this performance issue: my machine uses an ext4 file system, and the installed PostgreSQL server is version 9.1. That's where PostgreSQL's Write Ahead Log (WAL) configuration settings become interesting. In short, the PostgreSQL server uses synchronous commits by default, which means that on every commit PostgreSQL waits for ext4 to confirm that the transaction's WAL records have been written to permanent WAL storage on disk. Note that the build above spent only about two minutes of CPU time; most of the remaining ten minutes were spent waiting.
One solution is to set the configuration parameter fsync to off in the file postgresql.conf. PostgreSQL then makes no attempt to force database writes to disk, because it never invokes the operating system's fsync() system call. This introduces the risk of data corruption in the event of a power failure or system crash.
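As a minimal sketch, the change amounts to a single line in postgresql.conf. The file location in the comment is an assumption based on the usual Ubuntu packaging layout and may differ on other installations; the server has to re-read its configuration (reload or restart) before the setting takes effect.

```
# /etc/postgresql/9.1/main/postgresql.conf  (assumed location, adjust as needed)

# Default is "on". With "off", PostgreSQL no longer forces writes to disk.
# Only reasonable for a development database that can be rebuilt from scratch.
fsync = off
```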
Another option is switching to asynchronous transaction commits, which means that the PostgreSQL server no longer waits for confirmation that the transaction's WAL records have been written to disk. Instead, it continues as soon as the transaction commit is considered logically complete. This can be achieved by setting the configuration parameter synchronous_commit to off in the file postgresql.conf. This comes at the risk of losing the most recent transactions after a crash, but not of data corruption, as with fsync=off.
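A sketch of the corresponding postgresql.conf entry might look as follows. The file location is again an assumption (typical Ubuntu layout), and the note about the size of the loss window reflects my reading of the PostgreSQL documentation rather than anything measured here.

```
# /etc/postgresql/9.1/main/postgresql.conf  (assumed location, adjust as needed)

# Default is "on". With "off", a commit returns before its WAL records
# are flushed to disk; the WAL writer flushes them shortly afterwards.
synchronous_commit = off

# The documentation puts the window in which committed transactions can
# still be lost at up to three times wal_writer_delay (default: 200ms).
#wal_writer_delay = 200ms
```

Since fsync stays at its default here, the database remains consistent after a crash; only the last few commits may be missing, which is usually acceptable for a build that re-imports everything anyway.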
After setting synchronous_commit=off the build process is much faster:
oli@rikit:~/develop/projects/foo$ time mvn clean install > mvn.log 2>&1

real 1m47.170s
user 2m9.856s
sys 0m6.688s
Applying fsync=off saves a few more seconds:
oli@rikit:~/develop/projects/foo$ time mvn clean install > mvn.log 2>&1

real 1m42.451s
user 2m8.744s
sys 0m6.668s
For more details on PostgreSQL's WAL mechanism and configuration options, have a look at the PostgreSQL documentation: