Event logging subsystem
1. General description
This subsystem provides centralized storage of technical logs from the Platform and registry components in a unified form, full-text search over those logs, and analytical views built through specialized web interfaces.
2. Subsystem functions
- Saving event logs of the platform applications.
- Saving the OpenShift container orchestration platform event logs.
- Processing event logs and saving them to the search server.
- Visualization of the event log data via web interfaces.
3. Subsystem technical design
The logging subsystem collects information about events from the entire cluster, covering both the Platform and the registries, and stores it in a dedicated Elasticsearch repository.
The Kibana web interface is used for data visualization.
The event logging subsystem aggregates the following types of logs:
- Registry components: Logs of the containers that make up the registry.
- Infrastructure: Logs created by the infrastructure containers running on the OpenShift container orchestration platform. Infrastructure components are containers that run in the openshift*, kube*, or default projects.
- Audit of virtual machines: Logs created by auditd, the auditing system of the OpenShift virtual machines, which are stored in the /var/log/audit/audit.log file, as well as the Kubernetes APIServer and OpenShift APIServer audit logs.
By default, the logging subsystem does not store audit logs in the Elasticsearch repository. If necessary, you can configure it to do so, for example, to view them in Kibana.
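The project name rule for infrastructure logs can be illustrated with a minimal Python sketch. It only restates the three patterns listed above; the function name `is_infrastructure_project` is a hypothetical helper for illustration and is not part of the subsystem.

```python
from fnmatch import fnmatchcase

# Project (namespace) name patterns taken from the description above.
INFRASTRUCTURE_PATTERNS = ("openshift*", "kube*", "default")


def is_infrastructure_project(project: str) -> bool:
    """Return True if logs from this project count as infrastructure logs.

    Hypothetical helper: it applies shell-style, case-sensitive matching
    of the project name against the patterns listed in the document.
    """
    return any(fnmatchcase(project, pattern) for pattern in INFRASTRUCTURE_PATTERNS)
```

Under this sketch, a project such as `openshift-logging` or `kube-system` is treated as infrastructure, while a project whose name matches none of the patterns is not.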
The event logging subsystem consists of the following main components:
- Exporters: Collect and aggregate logs from the Platform components and registries, format them, and send them to the log repository. The current implementation is Fluentd.
- Log Store: Stores the component logs. The current implementation is Elasticsearch, which is optimized for short-term storage.
- Visualization: The user interface used to view logs and dashboards. The current implementation is Kibana.
The diagram shows the components that are part of the Event logging subsystem and their interaction with other subsystems.
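To make the exporter role more concrete, the following Python sketch shows one way a raw container log line could be wrapped into a uniform document before it is sent to the log store. This is an illustration only, not how Fluentd is implemented or configured in the Platform; the function name and all field names (`@timestamp`, `message`, `kubernetes.*`) are assumptions chosen for the example.

```python
from datetime import datetime, timezone


def to_log_document(raw_line: str, namespace: str, pod: str,
                    container: str, received_at: datetime) -> dict:
    """Wrap a raw log line in a uniform, JSON-serializable document.

    Hypothetical structure: the field names are assumptions for this
    sketch and may differ from the documents the subsystem really stores.
    `received_at` is expected to be a timezone-aware datetime.
    """
    return {
        # Normalize the timestamp to UTC so all sources are comparable.
        "@timestamp": received_at.astimezone(timezone.utc).isoformat(),
        # Keep the original text, without the trailing newline.
        "message": raw_line.rstrip("\n"),
        # Metadata that identifies where the line came from.
        "kubernetes": {
            "namespace_name": namespace,
            "pod_name": pod,
            "container_name": container,
        },
    }
```

Attaching the source metadata to every record is what allows logs from many components to be stored and searched in one place.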
4. Subsystem components
Component | Namespace | Deployment | Origin | Repository | Designation |
---|---|---|---|---|---|
Web interface for viewing the Platform’s event logs | | | 3rd-party | | Web interface for accessing, searching and displaying technical event logs in the Platform. |
Platform log repository | | | 3rd-party | | Serves as a repository of logs, where all data collected by exporters is stored. Elasticsearch allows you to quickly and efficiently search and analyze the aggregated data from logs. |
Operator of the event logging subsystem | | | 3rd-party | | Supports configuration, deployment and maintenance of the event logging subsystem in OpenShift. |
Log storage operator | | | 3rd-party | | Supports configuration, deployment and maintenance of the Elasticsearch event log repository in OpenShift. |
5. Kibana dashboards
The dashboards below are manually installed by following the appropriate instructions for administrators.
Dashboard | Technical name | Designation | Link |
---|---|---|---|
Request dashboard | | Provides overview information on the registry operation, in particular the status of external request execution. | Dashboard: github:/epam/edp-ddm-logging/main/dashboards/kibana/request-dashboard.json Setup instructions: Visualization of the request states in Kibana during registry operation |
Event logs dashboard | | Provides consolidated information from all collected event logs. | github:/epam/edp-ddm-logging/main/dashboards/kibana/request-dashboard.json Setup instructions: Working with event logs in the Kibana application interface |
6. Technological stack
The following technologies were used in the design and development of the subsystem:
7. Subsystem quality attributes
7.1. Scalability
The Event logging subsystem is deployed in High Availability mode with several instances of its key components, which allows it to process event logs effectively even when many registries are deployed on the Platform.
7.2. Performance
The Event logging subsystem provides fast full-text search and analysis of event log data thanks to the optimized Elasticsearch repository and the Lucene search syntax.
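As an illustration of full-text search with Lucene syntax, the Python sketch below builds a request body for the Elasticsearch search API using its `query_string` query. The helper name and the field names `message` and `@timestamp` are assumptions made for this example; the actual fields depend on how the logs are indexed.

```python
def build_search_body(lucene_query: str, size: int = 20) -> dict:
    """Build an Elasticsearch search request body from a Lucene query string.

    Hypothetical helper: the `message` and `@timestamp` field names are
    assumptions and may not match the indices used by the subsystem.
    """
    return {
        # Maximum number of matching log records to return.
        "size": size,
        # Newest records first.
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "query_string": {
                # Lucene syntax, for example: message:timeout AND message:error
                "query": lucene_query,
                # Field searched when the query does not name one.
                "default_field": "message",
            }
        },
    }
```

The resulting dictionary can be serialized to JSON and sent to a search endpoint. The same Lucene expression can also be typed into the Kibana search bar when Kibana is set to use Lucene query syntax.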
7.3. Reliability
The Event logging subsystem can operate reliably and continue to log events even if its individual components or virtual machines fail.
7.4. Observability
The Event logging subsystem records detailed information about events in the Platform, which helps Platform administrators identify and resolve failures and issues when the Platform operates in the production environment.