A centralized log analytics platform can serve as the source of truth for effective operational and cyber incident response.
Our research has found that nearly 92% of small and mid-market businesses don’t have an existing log-based model for infrastructure event monitoring.
In this guide, we explore how you can leverage two key open source tools, Graylog and Ansible, to gain control over what’s happening in your infrastructure.
Check out the GitHub repository for the code samples referenced in the guide below.
Requirements for Deployment
This guide is built on on-premises Docker infrastructure. Log servers are critical assets during incident response, and ensuring a compliant deployment is a prerequisite for getting started.
As a general rule, you should already be familiar with Linux, containerization, and networking technologies, but we’ve explained in depth the reasoning behind each configuration for all developer audiences.
Docker Environment

Docker Docs: Docker provides a containerization platform using atomic image-based deployment for preconfigured resources.
Runtime: While the Docker runtime can be virtualized, a dedicated always-on Unix host with power and network redundancy is strongly recommended.
Common RHEL repositories include Podman, a Docker-compatible runtime; however, the compose scripts may not be fully supported for deployment and are untested with this configuration.
Security
We’ll reference the open source docker/docker-bench-security tool for PMO IT compliance.
The host node should be configured with audit.rules and validated as meeting key security specifications.
Managing Clients with Ansible Playbooks

Ansible Docs: Ansible is a runtime framework for automating multivariate system tasks using standardized operational IT routines defined in Ansible playbooks.
Client nodes should be preconfigured with the Ansible runtime for remote management. For new environments, use the Fedora/CentOS/RHEL compatible ansible.setup.sh script over SSH to configure the client runtime.
By following this guide, you’ll learn how to use docker-compose to set up a multi-container instance of Graylog with MongoDB and Elasticsearch. Here’s what we’ll cover:
- Creating a MongoDb container for Log Storage using Docker Compose
- ElasticSearch for Accelerated Log Analytics
- Deploying Graylog Web Server using Docker Compose
- Configuring Graylog to Accept Syslog Input using Web Interface
- Migrating Systems for Centralized Logging with Ansible
Create the following tree structure with a boilerplate compose.yaml:

    graylog-ansible-playbook
    ├── client
    └── server
        └── compose.yaml
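The tree above can be scaffolded from a shell. This is a minimal sketch, not a script from the repository; the `ROOT` variable is a hypothetical override, and by default the tree is created in a scratch directory so nothing on the host is touched.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical parent directory; defaults to a throwaway scratch directory
root="${ROOT:-$(mktemp -d)}/graylog-ansible-playbook"

# Create both branches of the tree in one call
mkdir -p "${root}/client" "${root}/server"

# Empty boilerplate compose file, filled in over the following sections
touch "${root}/server/compose.yaml"

# List what was created
find "${root}" | sort
```

Set `ROOT` to your project directory to create the structure in place.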
Let’s get started below.
1. Creating a MongoDb container for Log Storage using Docker Compose
In the compose.yaml we’ll use the following header notation for declaring contents, which is useful for printing to the console when calling the $ head command.

    # --------------------------------
    # Infra::Docker::Graylog+MongoDB+ElasticSearch
    # Docs: << docs link >>
    # Github: << github repo >>
    # --------------------------------
Graylog can emit to multiple outputs, but by default it’s optimized for database usage. We’ve selected MongoDB due to its recognized high availability and scalability.
Mongo is provided as an official image, rolled and tested from source on Docker Hub with a CI/CD pipeline. The recommended variant of MongoDB for Graylog is the previous major release due to the scaffolding scripts used.
docker-compose.yaml
    version: '3.3'

    services:
      # MongoDb Container
      mongodb:
        image: mongo:3
        container_name: qone.graylog.mongodb
        restart: always
        volumes:
          - ./mongodb:/data/db
          - ./mongodb/mongo-init.js:/docker-entrypoint-initdb.d/mongo-init.js:ro
        env_file:
          - mongo.env
        networks:
          - d1_graylog
        ports:
          - 9052:27017
        deploy:
          resources:
            limits:
              cpus: '1'
              memory: 1G
Let’s break down each component of the mongodb services section in more detail to understand the reasoning behind these properties.
restart
A restart policy must be configured as a precaution against failures such as I/O errors thrown from a remounting disk. Note that it’s insufficient by itself and should be paired with other approaches such as database and volume cloning (e.g. RAID 1) redundancy.
volumes
For the purposes of this guide, we’ll use an isolated and mounted filesystem ($ cat /etc/hosts) for MongoDB data; it can be any arbitrary subdirectory on the host.
Run the following shell command to prevent accidental deletion of the path: $ sudo chattr +i ./mongodb
Note: Docker security guidance recommends separation of runtime from data storage; however, native Docker image volumes are hosted on the same disk. To define isolation rules in /etc/docker/daemon.json, refer to the Docker docs; otherwise, proceed with the following workaround.
env_file: mongo.env
To manage securables in docker-compose, place the credentials in a separate mongo.env file. This can be excluded globally with **/*.env in .gitignore to prevent leakage of secrets in code repositories.
The env file will contain the root credentials used to reach the MongoDB cluster from external connections outside of Graylog. Replace the following <<placeholders>> accordingly.

    MONGO_INITDB_DATABASE=<<root_db>>
    MONGO_INITDB_ROOT_USERNAME=<<root_username>>
    MONGO_INITDB_ROOT_PASSWORD=<<secure_password>>
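As an illustration of the two steps described here, the sketch below writes a mongo.env that only its owner can read and adds the ignore pattern to .gitignore. It runs in a scratch directory, which is an assumption made for safety; in practice you would run it from the directory holding the compose file and replace the placeholders by hand.

```shell
#!/usr/bin/env bash
set -eu

# Assumption: a scratch directory stands in for the server directory
workdir="$(mktemp -d)"
cd "${workdir}"

# umask 077 inside the subshell makes the new file mode 600 (owner only)
(
  umask 077
  cat > mongo.env <<'EOF'
MONGO_INITDB_DATABASE=<<root_db>>
MONGO_INITDB_ROOT_USERNAME=<<root_username>>
MONGO_INITDB_ROOT_PASSWORD=<<secure_password>>
EOF
)

# Append the ignore pattern only if it is not already present
grep -qxF '**/*.env' .gitignore 2>/dev/null || echo '**/*.env' >> .gitignore

ls -l mongo.env
```

Writing the file under a restrictive umask avoids a window in which the credentials are world-readable before a later chmod.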
mongo-init.js
Since Mongo does not include a default database constructor, you need an init script to generate a custom database for Graylog. Clone mongo-init.js and place it in the root directory where docker-compose.yaml is going to be executed from on the host.
Let’s take a closer look at mongo-init.js, where you’ll need to specify custom credentials for the new graylog database.

    db.createUser(
      {
        user: "<<graylog_user>>",
        pwd: "<<password>>",
        roles: [
          {
            role: "readWrite",
            db: "graylog" // <<graylog_database>>
          }
        ]
      }
    );

These credentials will be used in graylog.conf below.
networks
While Docker creates a default virtual network on the host, a separate isolated instance should be used for namespace coalescing.
The same network must be declared on all services and globally outside the services section in the yaml file. Refer to the Docker Docs: Compose File v3 Reference - Network Driver on selecting the driver type, overlay (swarm) || bridge (default), suitable for your environment.

    services:
      mongodb:
        ...
        networks:
          - d1_graylog
      elasticsearch:
        ...
        networks:
          - d1_graylog
      graylog:
        ...
        networks:
          - d1_graylog

    networks:
      d1_graylog:
        driver: bridge
ports
The default MongoDB port is 27017, which can be referenced under expose or by using host forwarding 0.0.0.0:{{HOST_PORT}} -> 172.0.0.1:27017 {container}.
We’ll forward the port to the host, as it allows accessing the data from services such as a Grafana dashboard or machine learning analytics.
For cross-container services, note that the container port 27017 takes precedence in graylog.conf.

    services:
      mongodb:
        ...
        ports:
          - {{HOST_PORT}}:27017

Exposing container ports to the web is strongly discouraged due to the potentially sensitive nature of the data. In certain circumstances, you may be able to use a reverse proxy or client VPN approach.
deploy
Next, to comply with Docker Audit spec requirements #X.XX, we provision resource limits on each of the container services. Generally, the following limits are sufficient to serve up to 50 clients.
Resource limits prevent runaway consumption, such as request overloads during a Distributed Denial of Service (DDoS) attack, from crashing the host.

    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: '1G'

The same convention with modified values will be used for the graylog and elasticsearch container services.
2. ElasticSearch for Accelerated Log Analytics
ElasticSearch provides cached acceleration of search over data stores and is a required service for the purpose of Graylog deployment.
      # ---------------------------
      # Elasticsearch::Cache
      # Docs: https://www.elastic.co/guide/en/elasticsearch/reference/6.x/docker.html
      # ---------------------------
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.10
        container_name: graylog.escache
        restart: always
        volumes:
          - ./escache:/usr/share/elasticsearch/data
        networks:
          - d1_graylog
        environment:
          - http.host=0.0.0.0
          - http.port=9200
          - transport.tcp.port=9300
          - transport.host=localhost
          - network.host=0.0.0.0
          - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
        ulimits:
          memlock:
            soft: -1
            hard: -1
        deploy:
          resources:
            limits:
              cpus: '2'
              memory: 2G
Exploring a breakdown of these components:
volumes
ElasticSearch requires a local tmp fs directory which can be placed on the same relative volume as MongoDB.
For instances with large clients, use a dedicated flash storage medium and $ ln -s <<source>> ./escache or hardcode the path as necessary.

    volumes:
      - ./escache:/usr/share/elasticsearch/data
ulimits
To learn more about memlock, refer to the SLES Docs - Memlock, which provides a great overview.
environment
Since there are no securables in Elasticsearch, the network configuration is explicitly set. These are default values referenced in the generated graylog.conf; more on that below.
networks
Recall from the MongoDB breakdown that we’re using the same network for this service.
3. Deploying Graylog Web Server using Docker Compose
The graylog image can be found in Docker Hub with extensible configuration. As of this guide, there are three release cadence branches but it is strongly recommended to use the production variant.
      # ---------------------------
      # Graylog::Production
      # Docs: https://hub.docker.com/r/graylog/graylog/
      # ---------------------------
      graylog:
        image: graylog/graylog:3.3
        container_name: graylog.server
        restart: always
        volumes:
          - ./gldata/config:/usr/share/graylog/data/config
        networks:
          - d1_graylog
          - default
        depends_on:
          - mongodb
          - elasticsearch
        ports:
          # Host:Container
          # Graylog Web Interface and REST API
          - 9050:9050
          # Syslog TCP
          - 514:514
          # Syslog UDP
          - 514:514/udp
          # GELF TCP
          - 9051:12201
          # GELF UDP
          - 9051:12201/udp
        deploy:
          resources:
            limits:
              cpus: '2'
              memory: 4G
container_name
Following the convention we’ve been using in this guide, set a developer-friendly container_name.
restart
Docker audit recommends that a detailed restart policy be limited with max_retries.
In case of failure of the database container, an asynchronous approach that waits for the dependency to restart can be padded with a buffer delay such as 15000ms.
However, for always-on log collection servers, an exception can be made to use always on all services.
volumes
Clone the GitHub file config.sh to generate a runtime configuration for Graylog. It includes default values where you can populate the MongoDB connectionString securables.
Once generated with $ ./config.sh, place the edited graylog.conf in the volume path ./gldata/config where the container is going to be deployed from.
graylog.conf
Here are the sections to edit in the generated graylog.conf, along with detailed explanations below:

    password_secret = << 96_char_token >>

    root_username = << admin_user >>
    root_password_sha2 = << shasum >>
    root_email = << consulting@quant.one >>

    # EST: New York / Toronto
    root_timezone = America/Atikokan
Graylog uses a 96-character token for securable key rotation, which can be generated using rotate_key.sh; store the output in password_secret.
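If you want to see what generating such a token involves, the following is a minimal sketch. It is not the repository’s rotate_key.sh; the function name `generate_secret` and the alphanumeric character set are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash

# Draw random bytes from the kernel CSPRNG, keep only alphanumerics,
# and stop after 96 characters.
generate_secret() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 96
}

secret="$(generate_secret)"

# Print the line in the form expected by graylog.conf
echo "password_secret = ${secret}"
```

Treat the output like any other credential and keep it out of version control.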
Set the credentials for the root user that will be used to log in to the web interface. As graylog.conf is unencrypted, generate a SHA-256 hash of your password using shasum.sh and store the hash in root_password_sha2.
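The hashing step can be sketched as follows. This is an illustrative stand-in rather than the repository’s shasum.sh; the function name `hash_password` is an assumption, and the sample password `admin` is used only so the output can be checked.

```shell
#!/usr/bin/env bash

# Print the SHA-256 hex digest of the first argument.
# printf '%s' is used instead of echo so no trailing newline is hashed.
hash_password() {
  printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

# Interactive use (commented out so the sketch runs unattended):
# read -r -s -p "Password: " pw && echo && hash_password "${pw}"

hash_password "admin"
# prints 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
```

Reading the password with `read -s` keeps it out of your shell history, which a command-line argument would not.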
The timezone should be localized to the clients feeding inputs and will be rendered in log dashboards. If you’re managing geo-distributed systems, use the local time for the PMO office.
depends_on
To start the MongoDB and Elasticsearch container services before Graylog, we declare them under depends_on.
Note that depends_on controls start order only and does not wait for a dependency’s entrypoint initialization to complete, so a fault may still occur while Graylog attempts to connect to an uninitialized container.
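One way to check that a dependency is actually accepting connections is to poll its port from the host. The sketch below is an assumption-laden illustration, not part of the repository: the function name `wait_for_port` is hypothetical, and it relies on bash’s built-in /dev/tcp redirection, so it will not work in other shells.

```shell
#!/usr/bin/env bash

# Return 0 once host:port accepts a TCP connection,
# or 1 after the given number of attempts.
# Usage: wait_for_port HOST PORT [ATTEMPTS] [DELAY_SECONDS]
wait_for_port() {
  local host="$1" port="$2" attempts="${3:-30}" delay="${4:-1}"
  local i
  for ((i = 1; i <= attempts; i++)); do
    # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>;
    # the subshell closes the socket again as soon as it exits
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep "${delay}"
  done
  return 1
}

# Example: wait for the MongoDB port forwarded to the host in this guide
# wait_for_port 127.0.0.1 9052 && echo "MongoDB is accepting connections"
```

An open port only shows that the process is listening; it does not prove the init script has finished, so treat it as a coarse readiness signal.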
deploy
Modified values have been applied for streaming data and input workers. Refer to the explanation in the MongoDB section above.
docker-compose.yaml
Here’s the overview of docker-compose.yaml covering the topics mentioned above: services, networks, volumes and compliance.
MongoDb + ElasticSearch + Graylog
    # --------------------------------
    # Infra::Docker::Graylog+MongoDB+ElasticSearch
    # Docs: https://docs.graylog.org/en/3.3/pages/installation/docker.html
    # Github: https://github.com/quantoneinc/Log.Analytics.Graylog.Ansible/
    # --------------------------------

    version: '3.3'

    services:
      # ---------------------------
      # MongoDB::Data
      # Docs: https://hub.docker.com/_/mongo/
      # ---------------------------
      mongodb:
        image: mongo:3
        container_name: graylog.mongodb
        restart: always
        volumes:
          - ./mongodb:/data/db
          - ./mongodb/mongo-init.js:/docker-entrypoint-initdb.d/mongo-init.js:ro
        env_file:
          - mongo.env
        networks:
          - d1_graylog
        ports:
          - 9052:27017
        deploy:
          resources:
            limits:
              cpus: '1'
              memory: 1G

      # ---------------------------
      # Elasticsearch::Cache
      # Docs: https://www.elastic.co/guide/en/elasticsearch/reference/6.x/docker.html
      # ---------------------------
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.10
        container_name: graylog.escache
        restart: always
        volumes:
          - ./escache:/usr/share/elasticsearch/data
        networks:
          - d1_graylog
        environment:
          - http.host=0.0.0.0
          - http.port=9200
          - transport.tcp.port=9300
          - transport.host=localhost
          - network.host=0.0.0.0
          - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
        ulimits:
          memlock:
            soft: -1
            hard: -1
        deploy:
          resources:
            limits:
              cpus: '2'
              memory: 2G

      # ---------------------------
      # Graylog::Production
      # Docs: https://hub.docker.com/r/graylog/graylog/
      # ---------------------------
      graylog:
        image: graylog/graylog:3.3
        container_name: graylog.server
        restart: always
        volumes:
          - ./gldata/config:/usr/share/graylog/data/config
        networks:
          - d1_graylog
          - default
        depends_on:
          - mongodb
          - elasticsearch
        ports:
          # Host:Container
          # Graylog Web Interface and REST API
          - 9050:9050
          # Syslog TCP
          - 514:514
          # Syslog UDP
          - 514:514/udp
          # GELF TCP
          - 9051:12201
          # GELF UDP
          - 9051:12201/udp
        deploy:
          resources:
            limits:
              cpus: '2'
              memory: 4G

    networks:
      d1_graylog:
        driver: bridge
    graylog-ansible-playbook
    └── server
        ├── config.sh
        ├── mongo.env
        ├── mongo-init.js
        ├── docker-compose.yaml
        ├── rotatekey.sh
        └── shasum.sh
With the server configuration completed, go ahead and deploy the infrastructure to your Docker environment using the docker-compose command.

    $ docker-compose -f docker-compose.yaml up

We strongly recommend using a failover cluster with a reverse proxy or load balancing for scalable event handling in large-volume client situations.
The deployment can be adapted for Docker Swarm multi-node clusters by updating the variant spec to version: 3.7 and referencing the compose file specification.
4. Configuring Graylog to Accept Syslog Input using Web Interface
Navigate to the running Graylog endpoint at https://node_ip:graylog_port/ and authenticate using the credentials previously defined in configuration.
Create a new input worker to accept 514/tcp and 514/udp by following the guide below:
<< video >>
Once input connections are established, we can map Unix hosts to push syslog content.
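Before migrating real hosts, it can help to push a single hand-built message at the new input. The sketch below is illustrative only: the function names are hypothetical, the message follows the common BSD syslog (RFC 3164) layout, and sending relies on bash’s /dev/udp redirection. The send call is left commented out because it needs your node address.

```shell
#!/usr/bin/env bash

# Build one syslog line: <PRI>TIMESTAMP HOST TAG: MESSAGE
# PRI = facility * 8 + severity, e.g. user (1) and info (6) give 14
format_syslog() {
  local facility="$1" severity="$2" tag="$3" message="$4"
  local pri=$((facility * 8 + severity))
  printf '<%d>%s %s %s: %s\n' "${pri}" \
    "$(LC_ALL=C date '+%b %e %H:%M:%S')" "$(uname -n)" "${tag}" "${message}"
}

# Send one formatted line over UDP via bash's /dev/udp pseudo-device.
# Usage: send_syslog_udp HOST PORT FACILITY SEVERITY TAG MESSAGE
send_syslog_udp() {
  local host="$1" port="$2"
  shift 2
  format_syslog "$@" > "/dev/udp/${host}/${port}"
}

# Preview the message locally
format_syslog 1 6 graylog-test "hello from $(uname -n)"

# Replace the placeholder with your node address, then uncomment:
# send_syslog_udp <<node_ip>> 514 1 6 graylog-test "hello from $(uname -n)"
```

If the message shows up in the Graylog search view, the input worker and port forwarding are working end to end.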
5. Migrating Systems for Centralized Logging with Ansible
rsyslog is a common Linux systemd service for handling system log events and can emit to any output destination such as Graylog.
We’ll configure it on clients using Ansible by creating an operational routine known as a playbook.
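To give a sense of what such a playbook might contain, the sketch below scaffolds a minimal one from the shell. It is an assumption, not the repository’s configure_rsyslog.yaml: the variable names, the drop-in filename 60-graylog.conf, and the `OUTDIR` override are all hypothetical. It uses the `ansible.builtin.copy` and `ansible.builtin.service` modules, whose fully qualified names assume Ansible 2.10 or later.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical output directory; the guide's tree uses ./client
outdir="${OUTDIR:-$(mktemp -d)}"

# The quoted 'EOF' keeps the {{ ... }} Jinja2 variables literal in the file
cat > "${outdir}/configure_rsyslog.yaml" <<'EOF'
---
- name: Forward syslog to Graylog
  hosts: all
  become: true
  vars:
    graylog_host: "<<node_ip>>"   # assumption: address of the Docker host
    graylog_port: 514
  tasks:
    # @@ forwards over TCP; a single @ would forward over UDP
    - name: Install the rsyslog forwarding rule
      ansible.builtin.copy:
        dest: /etc/rsyslog.d/60-graylog.conf
        content: "*.* @@{{ graylog_host }}:{{ graylog_port }}\n"
        mode: "0644"
      notify: Restart rsyslog
  handlers:
    - name: Restart rsyslog
      ansible.builtin.service:
        name: rsyslog
        state: restarted
EOF

echo "wrote ${outdir}/configure_rsyslog.yaml"
```

A playbook like this would typically be run with `ansible-playbook -i <inventory> configure_rsyslog.yaml`; using a handler means rsyslog restarts only when the rule file actually changes.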
Conclusion
Following the documented approach, you should now have a completed tree structure for both client and server scripting.

    graylog-ansible-playbook
    ├── client
    │   ├── configure_rsyslog.yaml
    │   └── target.env
    └── server
        ├── config.sh
        ├── mongo.env
        ├── mongo-init.js
        ├── docker-compose.yaml
        ├── rotatekey.sh
        └── shasum.sh

Check out the GitHub repository for the completed code samples referenced in the guide.
In a follow-up post we’ll explore using Rundeck and Ansible Playbooks to automate threat incident response. [Subscribe] to stay notified.