How to build your own ready-to-use Neo4j image

In our previous Neo4j articles we described how to deploy Neo4j to Kubernetes and how to run backups on the community edition. In production we run a set-up similar to the one described there. After a while, however, we realised that an init container is not a convenient way to preload the jar files needed for the backup and the APOC library, because the files had to be downloaded every time the container restarted. During development this happened quite frequently, but it also happened in production (e.g. when a new version was released). In an emergency on the production system we depended on an external provider keeping these remote resources available: if the provider were temporarily unavailable, the pod would not restart. We therefore looked for an alternative that minimizes dependencies on external resources. The idea is to build an image that already includes all the jars needed.

We achieved this with a separate project that only manages the image for the project-specific database. We used Maven and declared the needed jars as dependencies. This is convenient because a bot such as Renovate can run on the repository, check for new versions automatically and update these dependencies.
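
As a rough illustration, a minimal Renovate configuration for such a repository could be created as follows. This is a sketch, not our actual configuration: it assumes the pom.xml sits in the repository root and relies on Renovate's `config:recommended` preset, which picks up Maven dependencies without further settings.

```shell
# Write a minimal renovate.json into the repository root
# (assumed to be the current directory).
# The file content is an assumption for illustration,
# not taken from the original project.
cat > renovate.json <<'EOF'
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"]
}
EOF
```

With a file like this in place, Renovate would open merge requests whenever a newer version of one of the declared jars is published.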

Our pom.xml includes the dependencies on the jars needed for the backup (newer versions of some of them have been released since).

<dependencies>
    <!-- Neo4j plugins to directly access S3/minio -->
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-core</artifactId>
        <version>1.12.384</version>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-s3</artifactId>
        <version>1.12.384</version>
    </dependency>
    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>httpclient</artifactId>
        <version>4.5.14</version>
    </dependency>
    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>httpcore</artifactId>
        <version>4.4.16</version>
    </dependency>
    <dependency>
        <groupId>joda-time</groupId>
        <artifactId>joda-time</artifactId>
        <version>2.12.2</version>
    </dependency>
</dependencies>

We then wrote a Dockerfile to create our own specific Neo4j image, derived from the original Neo4j base image.

# original base image from Neo4j
FROM neo4j:4.4.16

# create temporary directory, download (only!) the needed plugin jars
# copy these to /var/lib/neo4j/plugins
RUN mkdir ./tmp-plugins-folder/
COPY . ./tmp-plugins-folder/
WORKDIR ./tmp-plugins-folder/
RUN ./mvnw dependency:copy-dependencies -DexcludeTransitive=true
RUN cp -v /var/lib/neo4j/tmp-plugins-folder/target/dependency/*.jar /var/lib/neo4j/plugins/

# cleanup
WORKDIR /var/lib/neo4j/
RUN rm -rf ./tmp-plugins-folder/

# copy pre-delivered APOC lib
RUN cp -v /var/lib/neo4j/labs/apoc-*.jar /var/lib/neo4j/plugins/

This creates a ready-to-use Neo4j image with all backup and APOC jars in place. If needed, you can add further labels to the image in the Dockerfile. Once your database pod references the newly created image, you are up and running.
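
Before pointing the pod at the new image, it can be worth checking that every expected jar really ended up in the plugins directory. The helper below is a minimal sketch of such a check; the function name `missing_plugins` is hypothetical, and it assumes each jar is named after its Maven artifact followed by a version, as `dependency:copy-dependencies` produces by default.

```shell
#!/bin/sh
# Print every expected artifact that has no matching jar in the given directory.
# Usage: missing_plugins <plugins-dir> <artifact-name>...
# (hypothetical helper, not part of Neo4j or Maven)
missing_plugins() {
    dir="$1"
    shift
    for artifact in "$@"; do
        found=no
        # Jars carry a version suffix, e.g. httpcore-4.4.16.jar
        for jar in "$dir"/"$artifact"-*.jar; do
            if [ -e "$jar" ]; then
                found=yes
                break
            fi
        done
        if [ "$found" = no ]; then
            echo "$artifact"
        fi
    done
}
```

Called inside a container started from the new image as `missing_plugins /var/lib/neo4j/plugins aws-java-sdk-core aws-java-sdk-s3 httpclient httpcore joda-time apoc`, it prints nothing when all jars are present and otherwise lists the missing artifacts.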

Continue reading: This article is part 4 of a series of 5 articles on Neo4j. The upcoming article is about presenting data to management with NeoDash.