dish Monitoring Service
A small & fast monitoring service written in Go: introducing dish
by vxn.
Introduction
dish is a tiny, one-shot executable written in Go. It is meant to help monitor remote endpoints, websites and services with ease when you do not wish to resort to heavier monitoring solutions.
Being a one-shot executable means that in order to monitor the specified services you start dish, it performs the checks, optionally submits alerts, and then it stops. There is no long-running agent or server; this, combined with being a single, small binary, makes dish an easily maintainable and portable solution.
See below for a quick example of running dish.
Installation
Following the brief introduction, let's fetch and install dish!
Four methods are currently available:
- Local Go runtime
- Docker
- Homebrew
- Manual download of the built binary
Local Go runtime (Linux)
The Go runtime is required to build a common Go project. You can find and install it via this link.
The installation (including the build step) is very simple; just type:
# Fetch and install the specific version (after the @ sign)
go install go.vxn.dev/dish/cmd/dish@latest
export PATH=$PATH:~/go/bin
The export command ensures that the shell can find the installed executable.
Docker
Another way of running dish is to fetch an image with the Go runtime and build it inside a container. This process uses a multi-stage build, where the build and release stages are isolated and the final executable is copied from the build stage into the clean release stage.
To use this approach, the Docker runtime/engine has to be installed on the system.
The build process is wrapped in a gnumake target, so in the repository's root just type:
make build
This procedure will build a lightweight image containing just the necessary OS runtime (size varies depending on the base image of choice, e.g. alpine vs. debian) and the dish executable itself.
The image can be used in several ways, for example:
- Using the provided compose stack example from the repo (hardcoded into the compose configuration file)
- Using native docker run
Examples are shown below.
# Run using docker compose stack
make run
# Run using native docker run
docker run --rm \
dish:1.10.4-go1.23 \
-timeout 15 \
${SOURCE_URL}
Homebrew
Simply run the following:
brew install dish
Manual Download
Download the built binary for your OS and architecture from our GitHub repository.
Configuration
dish provides multiple configurable parameters, such as the source of the endpoints to be monitored (also referred to as sockets), which notification channels to use, or whether successful checks should also be reported through these channels.
Socket List
The sockets can be one of the following:
- A generic server exposing a TCP port
- An HTTP/S server or proxy
A simple configuration of a socket to be checked can be seen below:
{
  "sockets": [
    {
      "id": "vxn_dev_https",
      "socket_name": "vxn-dev HTTPS",
      "host_name": "https://vxn.dev",
      "port_tcp": 443,
      "path_http": "/",
      "expected_http_code_array": [200]
    }
  ]
}
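Just for illustration, such an entry maps onto a handful of typed fields. A minimal Go sketch of one socket record might look like the following (the struct and field names are hypothetical, mirroring the JSON keys above, and are not necessarily dish's internal types):

// Socket is a hypothetical struct mirroring one entry of the "sockets" array above.
type Socket struct {
    ID                string `json:"id"`
    SocketName        string `json:"socket_name"`
    HostName          string `json:"host_name"`
    PortTCP           int    `json:"port_tcp"`
    PathHTTP          string `json:"path_http"`
    ExpectedHTTPCodes []int  `json:"expected_http_code_array"`
}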
One way we can tell dish which sockets should be checked is by using a JSON file as the source argument (in the format shown above):
dish /opt/dish/sockets.json
The other way is to provide the same configuration format as a response from a JSON API:
dish myremoteapi.example.xyz/dish/sockets
Flags
There is a plethora of supported flags that alter the behavior of dish. They can, for example, specify which channels to use for notifications or whether caching of sockets fetched from a remote API should be used.
This command will show the list of available flags:
dish -h
An example of the output would look like this:
dish -h
Usage of dish:
  -cache
        a bool, specifies whether to cache the socket list fetched from the remote API source
  -cacheDir string
        a string, specifies the directory used to cache the socket list fetched from the remote API source (default ".cache")
  -cacheTTL uint
        an int, time duration (in minutes) for which the cached list of sockets is valid (default 10)
  -hname string
        a string, name of a custom additional header to be used when fetching and pushing results to the remote API (used mainly for auth purposes)
  -hvalue string
        a string, value of the custom additional header to be used when fetching and pushing results to the remote API (used mainly for auth purposes)
  -machineNotifySuccess
        a bool, specifies whether successful checks with no failures should be reported to machine channels
  -name string
        a string, dish instance name (default "generic-dish")
  -target string
        a string, result update path/URL to pushgateway, plaintext/byte output
  -telegramBotToken string
        a string, Telegram bot private token
  -telegramChatID string
        a string, Telegram chat/channel ID
  -textNotifySuccess
        a bool, specifies whether successful checks with no failures should be reported to text channels
  -timeout uint
        an int, timeout in seconds for http and tcp calls (default 10)
  -updateURL string
        a string, API endpoint URL for pushing results
  -verbose
        a bool, console stdout logging toggle
  -webhookURL string
        a string, URL of webhook endpoint
Usage and Integrations
In this section, some more detailed examples of usage are presented.
Telegram
For alert notifications when one or more checks fail during a run, an integration with the Telegram IM provider is available. The integration presumes that a Telegram group exists and that a registered Telegram Bot is a member of it. To enable the integration, a set of two flags has to be appended to the CLI command:
- telegramBotToken to specify the secret used to identify the Bot (the message is sent on its behalf)
- telegramChatID to specify the chat group where the composed report message is to be sent
An extended example is shown below:
# Load sockets from a sockets.json file and use the Telegram provider for alerting
dish -telegramBotToken "123:AAAbcD_ef" \
-telegramChatID "-123456789" \
/opt/dish/sockets.json
The resulting Telegram notification then looks like this:
Prometheus Pushgateway
Another way to get notified is an integration with Prometheus via Pushgateway. A notification may be delayed a bit, because Prometheus scrapes its targets periodically, usually at intervals of tens of seconds.
# Use a remote JSON API endpoint as the socket source and push the results to Pushgateway
dish -target https://pushgw.example.com/ \
https://api.example.com/dish/sockets
Remote API
Not only can you use your own API endpoint to provide dish with sockets to be checked; you can also tell it to push the check results to your endpoint! This way, you can extend the monitoring functionality in any way you wish.
The remote API integration also supports a custom header (used mostly for authorization) via the -hname and -hvalue flags.
# Use a remote JSON API endpoint as the socket source and push the results to a result endpoint
dish -updateURL https://api.example.com/dish/results \
-hname X-Auth-Key \
-hvalue yourkey \
https://api.example.com/dish/sockets
dish pushes the results to the target API in the following JSON format:
{
  "dish_results": {
    "openttd_TCP": false,
    "text_n0p_cz_https": true,
    "vxn_dev_https": true
  }
}
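To illustrate what a custom integration on the receiving side could look like, here is a minimal sketch of an HTTP handler consuming this payload. The endpoint path, header name and key value are just examples carried over from the command above; none of this is part of dish itself:

package main

import (
    "encoding/json"
    "log"
    "net/http"
)

// resultsPayload mirrors the JSON format dish pushes to the update URL.
type resultsPayload struct {
    DishResults map[string]bool `json:"dish_results"`
}

func main() {
    http.HandleFunc("/dish/results", func(w http.ResponseWriter, r *http.Request) {
        // Optionally verify the custom header dish was configured to send via -hname/-hvalue.
        if r.Header.Get("X-Auth-Key") != "yourkey" {
            http.Error(w, "unauthorized", http.StatusUnauthorized)
            return
        }

        var payload resultsPayload
        if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
            http.Error(w, "bad request", http.StatusBadRequest)
            return
        }

        // React to the results however you wish, e.g. update a status page or store them.
        for socketID, up := range payload.DishResults {
            log.Printf("socket %s is up: %t", socketID, up)
        }
        w.WriteHeader(http.StatusOK)
    })

    log.Fatal(http.ListenAndServe(":8080", nil))
}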
Webhooks
You can also use the -webhookURL flag for a typical webhooks integration:
# Use a remote JSON API endpoint as the socket source and push the results to a webhook URL
dish -webhookURL https://mywebhookurl.xyz \
https://api.example.com/dish/sockets
dish pushes the results to the webhook in the same JSON format as when pushing to a remote API:
{
  "dish_results": {
    "openttd_TCP": false,
    "text_n0p_cz_https": true,
    "vxn_dev_https": true
  }
}
Custom Integrations
What you do with the results is then entirely up to you. We have built multiple internal integrations which rely on dish pushing these results to our own API endpoint. A few of these will be showcased below.
Status Page
Our very own status page loads our public services, their states, and the times of their last checks from our internal API. dish pushes the results of the checks of these services to the same API. This way, if any check fails, it is automatically and immediately reflected on the status page.
dish GUI
dish GUI is our graphical interface built for managing dish sockets stored in our API. It supports the usual CRUD functionality and helps us quickly make any socket visible on or hidden from our status page, set maintenance status, or mute the socket. Muted sockets are ignored by dish when loading the socket list from our API.
dish GUI also connects to our API's real-time events endpoint using SSE. This way, if any of the sockets is reported to the API by dish as being down, we get notified in real time right in the dashboard:
The same applies in the opposite situation, where a previously failed socket comes back up:
Under the hood
In this part, some technical aspects of the project will be explained in more detail.
Socket fetch
To run its checks, dish needs a list of sockets to check. As mentioned in the introduction, either a local JSON file or a remote API endpoint can be used as the source.
If the source argument contains an HTTP/S URL to a remote source, dish will perform an API call to the destination address to fetch the socket list (see Fig. 8).
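A rough sketch of this fetch step could look like the function below. The type and function names are illustrative only and the socket fields are trimmed; dish's actual implementation may differ:

// socketList mirrors (a trimmed-down version of) the JSON socket source format shown earlier.
type socketList struct {
    Sockets []struct {
        ID       string `json:"id"`
        HostName string `json:"host_name"`
        PortTCP  int    `json:"port_tcp"`
    } `json:"sockets"`
}

// fetchSockets downloads the socket list from a remote API source, attaching
// the optional custom header configured via the -hname and -hvalue flags.
func fetchSockets(url, headerName, headerValue string) (*socketList, error) {
    client := &http.Client{Timeout: 10 * time.Second}

    req, err := http.NewRequest(http.MethodGet, url, nil)
    if err != nil {
        return nil, err
    }
    if headerName != "" {
        req.Header.Set(headerName, headerValue)
    }

    resp, err := client.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("unexpected status: %s", resp.Status)
    }

    var list socketList
    if err := json.NewDecoder(resp.Body).Decode(&list); err != nil {
        return nil, err
    }
    return &list, nil
}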
Socket List Caching
dish also supports caching of the configuration pulled from a remote API. This is useful when running frequent periodic checks (e.g. every minute) to ensure your important services and websites are up and running. In these cases, caching prevents hitting your API endpoint frequently and (most often) unnecessarily. If your remote API endpoint goes down, the cached configuration (if present) will be used to keep the checks running until the endpoint is back up, even when the cache is considered expired (an old or expired list of sockets to check is better than none!).
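The idea behind the cache can be sketched roughly as follows. This is simplified illustrative logic (hypothetical function name, plain files, standard library only), not dish's actual code:

// loadSockets prefers a fresh cached copy, then the remote source, and finally
// falls back to the cached copy (even if expired) when the remote source is unreachable.
func loadSockets(cachePath, sourceURL string, ttl time.Duration) ([]byte, error) {
    // Use the cached list if it exists and is still fresh.
    if info, err := os.Stat(cachePath); err == nil && time.Since(info.ModTime()) < ttl {
        return os.ReadFile(cachePath)
    }

    // Otherwise try the remote source and refresh the cache on success.
    resp, err := http.Get(sourceURL)
    if err == nil {
        defer resp.Body.Close()
        if data, readErr := io.ReadAll(resp.Body); readErr == nil {
            _ = os.WriteFile(cachePath, data, 0o644)
            return data, nil
        }
    }

    // The remote source is down: an expired cache is better than no sockets at all.
    return os.ReadFile(cachePath)
}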
Concurrent run
When the socket list is loaded, dish spins off a goroutine per socket, so each socket is checked in its own goroutine:
- TCP sockets are checked directly by dialing the remote host and port combination.
- HTTP/S endpoints are checked via a GET request using dish's HTTP client.
These checks run concurrently (though not necessarily in parallel, due to how concurrency works in Go). This approach shrinks the execution time from several seconds to roughly one second (sometimes even less) on average.
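A simplified sketch of such per-socket checks is shown below. It is illustrative only: dish's real checks also verify the expected HTTP status codes, respect the -timeout flag and report results through channels, and the hosts used here are placeholders:

package main

import (
    "fmt"
    "net"
    "net/http"
    "sync"
    "time"
)

// checkTCP dials the host:port combination within the given timeout.
func checkTCP(host string, port int, timeout time.Duration) bool {
    conn, err := net.DialTimeout("tcp", fmt.Sprintf("%s:%d", host, port), timeout)
    if err != nil {
        return false
    }
    conn.Close()
    return true
}

// checkHTTP sends a GET request and reports whether any response was received.
func checkHTTP(url string, timeout time.Duration) bool {
    client := &http.Client{Timeout: timeout}
    resp, err := client.Get(url)
    if err != nil {
        return false
    }
    resp.Body.Close()
    return true
}

func main() {
    timeout := 10 * time.Second
    var wg sync.WaitGroup

    // HTTP/S endpoint check in its own goroutine.
    wg.Add(1)
    go func() {
        defer wg.Done()
        fmt.Println("https://vxn.dev up:", checkHTTP("https://vxn.dev/", timeout))
    }()

    // Plain TCP socket check in its own goroutine.
    wg.Add(1)
    go func() {
        defer wg.Done()
        fmt.Println("vxn.dev:443 up:", checkTCP("vxn.dev", 443, timeout))
    }()

    wg.Wait()
}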
Comparison Against a Serial Run
If a serial approach were used, where sockets are tested one by one, a problem would arise: whenever a socket times out, the rest of the socket list has to wait until the current check times out (10 seconds by default). This could cause an unnecessarily long execution time, depending on the number of sockets timing out. For example, with the default timeout and 6 sockets that all time out, a single run would take a whole minute.
Check reports
Each goroutine performing a check is assigned a dedicated channel. These channels are then combined into one common channel using the fan-in technique right after the last goroutine is spawned:
func fanInChannels(channels ...chan socket.Result) <-chan socket.Result {
    var wg sync.WaitGroup
    out := make(chan socket.Result)

    // Start a goroutine for each channel
    for _, channel := range channels {
        wg.Add(1)
        go func(ch <-chan socket.Result) {
            defer wg.Done()
            for result := range ch {
                // Forward the result to the output channel
                out <- result
            }
        }(channel)
    }

    // Close the output channel once all workers are done
    go func() {
        wg.Wait()
        close(out)
    }()

    return out
}
This gives the consumer a single source of the socket check reports.
After all checks are performed (either by succeeding, failing or timing out), the common channel is ready to be read from. A message reporting the results of the checks is then prepared. There are two types of report messages: text (for text channels such as Telegram) and machine (for machine integrations such as webhooks or Pushgateway for Prometheus).
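For completeness, draining the fanned-in channel and composing a simple text report could look roughly like the sketch below. The result type and its fields are hypothetical stand-ins for dish's socket.Result, and the output format is not dish's actual report format:

// result is a stand-in for dish's socket.Result; the fields exist only for this sketch.
type result struct {
    SocketID string
    Passed   bool
}

// buildTextReport drains the common results channel and composes a human-readable summary.
func buildTextReport(results <-chan result) string {
    var sb strings.Builder
    failed := 0

    for res := range results {
        status := "OK"
        if !res.Passed {
            status = "FAILED"
            failed++
        }
        sb.WriteString(fmt.Sprintf("%s: %s\n", res.SocketID, status))
    }

    sb.WriteString(fmt.Sprintf("failed checks: %d\n", failed))
    return sb.String()
}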
Conclusion
dish started out as a small learning project. Over time it grew on us thanks to its simplicity, ease of use and maintainability. We have been using it for over 3 years to monitor our services and could not manage without it. We hope you find it as useful as we do.
You can find the source code on our public GitHub profile or just visit the repository directly via this link.