A tool for distributed container monitoring over Kubernetes.
Basic diagram
collector module, which saves the data in a CSV file.

For more information about the collected metrics, please refer to:
/proc filesystem using both psutil Python API and /sys/block/<dev>/stat.
cgroups.
/proc filesystem using psutil Python API.
| Type | Unit | Metric |
|---|---|---|
| CPU | Quantity | Context Switches |
| CPU | Quantity | Interrupts |
| CPU | Quantity | Soft Interrupts |
| CPU | Quantity | Syscalls |
| CPU | Clock Ticks | Times User |
| CPU | Clock Ticks | Times System |
| CPU | Clock Ticks | Times Nice |
| CPU | Clock Ticks | Times Softirq |
| CPU | Clock Ticks | Times IRQ |
| CPU | Clock Ticks | Times IOWait |
| CPU | Clock Ticks | Times Guest |
| CPU | Clock Ticks | Times Guest Nice |
| CPU | Clock Ticks | Times Idle |
| Memory | Quantity | Active (Anon) |
| Memory | Quantity | Inactive (Anon) |
| Memory | Quantity | Inactive (file) |
| Memory | Quantity | Active (file) |
| Memory | Quantity | Mapped Pages |
| Memory | KB | KB Paged In Since Boot (pgpgin) |
| Memory | KB | KB Paged Out Since Boot (pgpgout) |
| Memory | Quantity | Pages Free (pgfree) |
| Memory | Quantity | Page Faults (pgfault) |
| Memory | Quantity | Major Page Faults (pgmajfault) |
| Memory | Quantity | Pages Reused (pgreuse) |
| Disk | Requests | Read I/O |
| Disk | Requests | Read I/O Merged with In-Queue I/O |
| Disk | Sectors | Read Sectors |
| Disk | Milliseconds | Total Wait Time for Read Requests |
| Disk | Requests | Write I/O |
| Disk | Requests | Write I/O Merged with In-Queue I/O |
| Disk | Sectors | Write Sectors |
| Disk | Milliseconds | Total Wait Time for Write Requests |
| Disk | Requests | I/O in Flight |
| Disk | Milliseconds | Total Time This Block Device Has Been Active |
| Disk | Milliseconds | Total Wait Time for All Requests |
| Disk | Requests | Discard I/O Processed |
| Disk | Requests | Discard I/O Processed with In-Queue I/O |
| Disk | Sectors | Discard Sectors |
| Disk | Milliseconds | Total Wait Time for Discard Requests |
| Disk | Requests | Flush I/O Processed |
| Disk | Milliseconds | Total Wait Time for Flush Requests |
| Network | Bytes | Sent |
| Network | Bytes | Received |
| Network | Packets | Sent |
| Network | Packets | Received |
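The disk rows above follow the column order of `/sys/block/<dev>/stat`. As a rough illustration of how such a line could be turned into named counters, here is a minimal sketch. The field names and both function names are chosen for this example and are not Kubemon identifiers; the column order is the one documented by the Linux kernel.

```python
# Field names are illustrative (not Kubemon identifiers); the order follows
# the kernel's documentation for /sys/block/<dev>/stat.
BLOCK_STAT_FIELDS = (
    "read_ios", "read_merges", "read_sectors", "read_ticks_ms",
    "write_ios", "write_merges", "write_sectors", "write_ticks_ms",
    "in_flight", "io_ticks_ms", "time_in_queue_ms",
    "discard_ios", "discard_merges", "discard_sectors", "discard_ticks_ms",
    "flush_ios", "flush_ticks_ms",
)


def parse_block_stat(text):
    """Map the whitespace-separated counters of a stat line to field names.

    Older kernels expose fewer columns (11 or 15), so only the columns
    that are actually present are returned.
    """
    values = [int(v) for v in text.split()]
    return dict(zip(BLOCK_STAT_FIELDS, values))


def read_block_stat(device):
    """Read and parse /sys/block/<device>/stat (Linux only)."""
    with open(f"/sys/block/{device}/stat") as f:
        return parse_block_stat(f.read())
```

On a Linux host, `read_block_stat("sda")` would return the counters for the `sda` device, assuming that device exists.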
| Type | Unit | Metric |
|---|---|---|
| CPU | Clock Ticks | User Time |
| CPU | Clock Ticks | System Time |
| CPU | Clock Ticks | Children User |
| CPU | Clock Ticks | Children System |
| CPU | Clock Ticks | IOWait |
| Memory | Pages | Total Program Size (size) |
| Memory | Pages | Resident Set Size (resident) |
| Memory | Pages | Resident Shared Pages (shared) |
| Memory | Pages | Text (text) |
| Memory | Pages | Data + Stack (data) |
| Disk | Requests | Read |
| Disk | Requests | Write |
| Disk | Bytes | Read |
| Disk | Bytes | Write |
| Disk | Chars | Read |
| Disk | Chars | Write |
| Network | Bytes | Sent |
| Network | Bytes | Received |
| Network | Packets | Sent |
| Network | Packets | Received |
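The memory names in parentheses above (size, resident, shared, text, data) match columns of `/proc/<pid>/statm`, which reports values in pages. Assuming that file is the source, the following sketch shows how its single line could be parsed and how a page count could be converted to bytes. The function names are hypothetical and this is not Kubemon's own code.

```python
import os

# Column order of /proc/<pid>/statm as documented in proc(5).
STATM_FIELDS = ("size", "resident", "shared", "text", "lib", "data", "dt")


def parse_statm(text):
    """Return the statm counters, in pages, keyed by field name."""
    return dict(zip(STATM_FIELDS, (int(v) for v in text.split())))


def pages_to_bytes(pages, page_size=None):
    """Convert a page count to bytes, using the system page size by default."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    return pages * page_size
```

For the current process on Linux, the input line could be read from `/proc/self/statm`.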
| Type | Unit | Metric |
|---|---|---|
| CPU | Clock Ticks | User |
| CPU | Clock Ticks | System |
| CPU | Quantity | Periods |
| CPU | Quantity | Throttled |
| CPU | Clock Ticks | Throttled Time |
| Memory | Pages | Resident Set Size (rss) |
| Memory | Pages | Cached |
| Memory | Pages | Mapped (mapped_file) |
| Memory | Pages | Paged In (pgpgin) |
| Memory | Pages | Paged Out (pgpgout) |
| Memory | Pages | Page Faults (pgfault) |
| Memory | Pages | Major Page Faults (pgmajfault) |
| Memory | Pages | Active (active_anon) |
| Memory | Pages | Inactive (inactive_anon) |
| Memory | Pages | Active File (active_file) |
| Memory | Pages | Inactive File (inactive_file) |
| Memory | Pages | Unevictable |
| Disk | Bytes | Read |
| Disk | Bytes | Write |
| Disk | Bytes | Sync |
| Disk | Bytes | Async |
| Disk | Bytes | Discard |
| Disk | Bytes | Total |
| Network | Bytes | Sent |
| Network | Bytes | Received |
| Network | Packets | Sent |
| Network | Packets | Received |
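The memory keys above (rss, mapped_file, pgpgin, active_anon, and so on) are the ones found in a cgroup v1 `memory.stat` file, which holds one `key value` pair per line. A minimal parser for that format might look like the sketch below. The function name is hypothetical, and the location of the file for a given container depends on the cgroup driver and is not shown here.

```python
def parse_memory_stat(text):
    """Parse 'key value' lines such as those in a cgroup v1 memory.stat file.

    Lines that do not consist of exactly two whitespace-separated tokens
    are skipped.
    """
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats
```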
Before installing Kubemon, make sure Kubernetes and Docker are properly installed in the system.
Download the latest version here: kubemon
Extract the zip file and change into the extracted directory.
Update the nodeName field in kubernetes/04_collector.yaml to the name of your Kubernetes control-plane node.
Apply the Kubernetes objects within kubernetes/:
$ kubectl apply -f kubernetes/
namespace/kubemon created
configmap/kubemon-env created
persistentvolume/kubemon-volume created
persistentvolumeclaim/kubemon-volume-claim created
service/collector created
service/monitor created
pod/collector created
daemonset.apps/kubemon-monitor created
The following subsections describe how to configure and run the data collection process.
Kubemon has a few variables that can be defined by the user. For instance, one of the fields that must be configured before running the tool is NUM_DAEMONS, which sets the number of client instances expected to connect to the collector component. The Kubemon components are configured through environment variables inside the Kubernetes pods.
The configuration file is kubernetes/01_configmap.yaml. In the current version of Kubemon, the ConfigMap lists all the configurable variables. You can update them according to your needs.
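Since the components read their configuration from environment variables, a value such as NUM_DAEMONS could be read and validated as in the sketch below. This only illustrates the pattern: the function name and the fallback value of 1 are assumptions made for the example, and Kubemon's own handling may differ.

```python
import os


def expected_daemons(environ=None, default=1):
    """Return NUM_DAEMONS as a positive integer.

    `default` is a hypothetical fallback used only in this sketch.
    """
    environ = os.environ if environ is None else environ
    value = int(environ.get("NUM_DAEMONS", str(default)))
    if value < 1:
        raise ValueError(f"NUM_DAEMONS must be >= 1, got {value}")
    return value
```

Passing a plain dictionary instead of `os.environ` makes the function easy to check without touching the real environment.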
The collected metrics will be saved in the Kubernetes control-plane node by default, in /mnt/kubemon-data. This setting can be changed in ./kubernetes/02_volumes.yaml by updating the hostPath field.
Example:
# Before
...
hostPath:
  path: "/mnt/kubemon-data"
# After
...
hostPath:
  path: "/home/user/data"
To start collecting, you can either use the CLI or call the API from Python.
Example with the CLI:
$ make cli host=10.0.1.2
Waiting for collector to be alive
Collector is alive!
>>> start test000
Starting 2 daemons and saving data at 10.0.1.2:/home/kubemon/output/data/test000
Example by using the CLI API within Python:
>>> from kubemon.collector import CollectorClient
>>> from kubemon.settings import CLI_PORT
>>>
>>> cc = CollectorClient('10.0.1.2', CLI_PORT)
>>> cc.start('test000')
Starting 2 daemons and saving data at 10.0.1.2:/home/kubemon/output/data/test000
To stop collecting within the CLI:
>>> stop
Stopped collector
Using the API:
...
>>> cc.stop()
Stopped collector
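When scripting a run, it can help to guarantee that `stop()` is called even if something fails in between. The sketch below wraps the `start` and `stop` calls shown above in a context manager. It assumes only that the client object exposes `start(name)` and `stop()`, as `CollectorClient` does in the examples; `collection` and `collect_for` are hypothetical helper names, not part of Kubemon.

```python
import time
from contextlib import contextmanager


@contextmanager
def collection(client, name):
    """Start a collection on entry and always stop it on exit.

    `client` is any object exposing start(name) and stop().
    """
    client.start(name)
    try:
        yield client
    finally:
        client.stop()


def collect_for(client, name, seconds):
    """Collect metrics under `name` for roughly `seconds` seconds."""
    with collection(client, name):
        time.sleep(seconds)
```

With the `cc` client from the earlier example, `collect_for(cc, 'test000', 60)` would collect for about one minute and then stop the monitors.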
You can list all the implemented commands either by typing help at the CLI prompt or by calling the .help() method from the API.
All the commands:
'start': Start collecting metrics from all connected daemons in the collector.
Args:
- Directory name to be saving the data collected. Ex.: start test000
'instances': Lists all the connected monitor instances.
'daemons': Lists all the daemons (hosts) connected.
'stop': Stop all monitors if they're running.
'help': Lists all the available commands.
'alive': Tells if the collector is alive.