Community post by Victor Chen, Engineer at Nightfall
Introduction
Fluent Bit is an open source log processor designed for higher performance and lower resource consumption than its predecessor, Fluentd. Its data pipeline has six main components:
- Input: Logs from different sources are gathered here, depending on the plugins enabled. Sources include HTTP, TCP, and operating system metrics, among others. An input can be tagged so that it only goes through specific filters and outputs.
- Parser: Parses unstructured data into structured data depending on the configuration.
- Filter: Alters/drops logs, e.g. adding a field, modifying/dropping fields, dropping the entire log, depending on the plugins enabled.
- Buffer: Stores logs either in memory or on the file system before they are flushed to any configured outputs.
- Router: Routes the logs to their designated outputs based on their tags.
- Output: Sends the logs to destinations such as remote services, local file systems, etc.
With the exception of buffer and router, these components work as plugins, and users can choose what they want to enable for their Fluent Bit deployment. Inputs and parsers are already fairly complete in their offerings, as there are only so many data protocols and log formats for Fluent Bit to receive and parse. That is why in this tutorial we will focus on filter and output plugins.
How to run Fluent Bit on your local machine
Follow the steps in https://docs.fluentbit.io/manual/installation/sources to build Fluent Bit from source. After it has been built, run
bin/fluent-bit
to start Fluent Bit locally.
Writing the Plugin
Source code for Fluent Bit plugins lives in the plugins directory, with each plugin in its own folder. This is where the source code of your plugin will go. We will now go over the components of an example output plugin so you know exactly what you need to implement in a Fluent Bit plugin. The example is a simple output plugin that appends an additional key-value pair to every log before sending it to an HTTP server. The source code of the plugin is here.
Required Interfaces to Implement in Plugins
Every Fluent Bit plugin implements a set of callback functions that hold the logic of the plugin. Below is an overview of the callbacks that filter and output plugins have to implement, respectively.
- Filter
  - cb_init: When a plugin is enabled in the config file, Fluent Bit will initialize an instance of the plugin on startup and will call cb_init. This is where you can read in config values for the plugin and initialize any state the plugin will need to function properly.
  - cb_filter: This method is called whenever a log is routed to the filter. This is where you can modify or remove the log before it goes out to its destinations.
  - cb_exit: This method is called when Fluent Bit shuts down. This is where you should free up any resources used by your instance, most likely the state you initialized in cb_init.
- Output
  - cb_init: Same as cb_init in filter plugins.
  - cb_flush: This method is called when a chunk of logs in the buffer is ready to be delivered to its destination. This is where the logic to deliver to the designated destination should live. Note that cb_flush is called on a batch of logs while cb_filter is called on single log events.
  - cb_exit: Same as cb_exit in filter plugins.
Since our example plugin is an output plugin, it implements cb_init, cb_flush, and cb_exit.
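The way the three callbacks share state can be sketched without any Fluent Bit headers. The block below is a minimal stand-in, not Fluent Bit's actual API: struct example_output_plugin, struct example_ctx, the callback signatures and run_lifecycle are all hypothetical names invented for illustration, and the real callbacks take different arguments. It only shows the shape: cb_init allocates a context, cb_flush uses it once per chunk, and cb_exit frees it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a plugin context: state created in cb_init,
 * used in cb_flush, and released in cb_exit. */
struct example_ctx {
    char host[64];
    int port;
    size_t flushed_chunks;
};

/* Hypothetical table of callbacks. Fluent Bit's real registration struct has
 * more fields and different signatures; only the init/flush/exit shape is
 * mirrored here. */
struct example_output_plugin {
    const char *name;
    int (*cb_init)(void **ctx_out, const char *host, int port);
    void (*cb_flush)(void *ctx, const void *data, size_t bytes);
    int (*cb_exit)(void *ctx);
};

static int cb_example_init(void **ctx_out, const char *host, int port)
{
    struct example_ctx *ctx = calloc(1, sizeof(*ctx));
    if (ctx == NULL) {
        return -1;
    }
    snprintf(ctx->host, sizeof(ctx->host), "%s", host);
    ctx->port = port;
    *ctx_out = ctx; /* hand the context back so later callbacks receive it */
    return 0;
}

static void cb_example_flush(void *data_ctx, const void *data, size_t bytes)
{
    struct example_ctx *ctx = data_ctx;
    (void) data; /* a real plugin would deserialize and deliver the chunk */
    printf("flush %zu bytes to %s:%d\n", bytes, ctx->host, ctx->port);
    ctx->flushed_chunks++;
}

static int cb_example_exit(void *data_ctx)
{
    free(data_ctx); /* release everything cb_init allocated */
    return 0;
}

static const struct example_output_plugin out_example = {
    .name = "example",
    .cb_init = cb_example_init,
    .cb_flush = cb_example_flush,
    .cb_exit = cb_example_exit,
};

/* Drives the lifecycle the way a host engine would: one init, one flush per
 * chunk, one exit. Returns the number of chunks flushed, or -1 if init fails. */
static long run_lifecycle(const struct example_output_plugin *p, size_t chunks)
{
    void *ctx = NULL;
    long flushed;
    size_t i;

    if (p->cb_init(&ctx, "127.0.0.1", 8080) != 0) {
        return -1;
    }
    for (i = 0; i < chunks; i++) {
        p->cb_flush(ctx, "chunk", 5);
    }
    flushed = (long) ((struct example_ctx *) ctx)->flushed_chunks;
    p->cb_exit(ctx);
    return flushed;
}
```

Calling run_lifecycle(&out_example, 3) initializes the context once, flushes three chunks through it, and frees it on exit, which is the same ownership pattern the example plugin follows with its own context struct.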
cb_init
This is where we do the setup work for our plugin so it can process logs when they come in later. We first initialize the context for our plugin, a struct that can be used to store state and will be available in cb_flush and cb_exit. Next, we read in configuration values for our plugin and store them inside our context with one of Fluent Bit’s built-in functions: flb_output_config_map_set. After that, we initialize an upstream, a struct that represents a host/endpoint we want to call, for the HTTP server that will receive our logs. Finally, we attach the context to the output plugin instance so that it persists across callbacks.
cb_flush
This function gets called when a chunk of logs is ready to be delivered. Logs that come into this function have been serialized with msgpack, the library Fluent Bit uses to serialize/deserialize data. We need to deserialize them before we can operate on them, so we initialize a msgpack_unpacked struct, a helper for unpacking/deserializing data. msgpack_unpack_next then deserializes the chunk into individual log objects in the form of msgpack_object. For each one, we check whether the log is a map and skip it if it isn’t, as we can only append key-value pairs to map objects.
Since we want to modify our log by adding an additional key-value pair, we initialize a msgpack_sbuffer and a msgpack_packer to help us pack/serialize the modified log. We initialize the new map with its original size + 1 to account for the additional key-value pair. Next, we pack in the original key-value pairs by iterating through the original log. Finally, we pack in our new key-value pair.
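The reason the new map has to be declared with size + 1 before anything else is packed is that a msgpack map starts with a header stating how many pairs follow. The sketch below illustrates this at the byte level with a hand-rolled encoder instead of the msgpack-c functions the plugin uses. It is a simplified illustration under stated assumptions: it only handles the compact "fixmap" (at most 15 pairs) and "fixstr" (at most 31 bytes) encodings, it assumes the input buffer holds exactly one map, and put_fixstr and append_pair_to_fixmap are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes a msgpack fixstr: one header byte (0xa0 | length) followed by the
 * raw bytes. Returns the number of bytes written, or 0 on error. */
static size_t put_fixstr(uint8_t *out, size_t cap, const char *s)
{
    size_t len = strlen(s);

    if (len > 31 || cap < len + 1) {
        return 0;
    }
    out[0] = (uint8_t) (0xa0 | len);
    memcpy(out + 1, s, len);
    return len + 1;
}

/* Re-serializes a fixmap with one extra string key/value pair appended.
 * Returns the length of the new map, or 0 on error. */
static size_t append_pair_to_fixmap(const uint8_t *in, size_t in_len,
                                    const char *key, const char *val,
                                    uint8_t *out, size_t cap)
{
    size_t pairs;
    size_t off;
    size_t n;

    /* A fixmap header byte is 0x80 | pair_count. */
    if (in_len < 1 || (in[0] & 0xf0) != 0x80) {
        return 0;
    }
    pairs = in[0] & 0x0f;
    if (pairs == 15 || cap < in_len) {
        return 0;
    }

    /* 1. New header: original size + 1. */
    out[0] = (uint8_t) (0x80 | (pairs + 1));

    /* 2. Copy the original key/value pairs unchanged. */
    memcpy(out + 1, in + 1, in_len - 1);
    off = in_len;

    /* 3. Pack the additional key and value. */
    n = put_fixstr(out + off, cap - off, key);
    if (n == 0) {
        return 0;
    }
    off += n;
    n = put_fixstr(out + off, cap - off, val);
    if (n == 0) {
        return 0;
    }
    return off + n;
}
```

For example, the four bytes 0x81 0xa1 'a' 0x01 encode the map {"a": 1}. Appending the pair "k" / "v" yields a map whose first byte is 0x82, followed by the original pair and then the new one. In the plugin itself the same three steps are carried out with msgpack_pack_map, msgpack_pack_object and msgpack_pack_str_with_body.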
Now that our new log has been serialized, it is time to send it to our HTTP server. First, we convert the log from msgpack serialization to JSON serialization. Next, we initialize an upstream connection, a struct that represents a connection to that upstream host for a single HTTP request, and initialize an HTTP request using that connection and our JSON request body. After that, we perform the request. Finally, we free up the resources associated with the structs we initialized.
cb_exit
This is where we free up any resources used by our plugin, which in our case would be the upstream struct and the context itself.
Reading in Configuration Values
Configuration values can be read and saved in a plugin’s context automatically using a built-in function called flb_output_config_map_set. You will also need to provide a config map that maps the configuration value to its destination in the context struct, as seen on line 168 in our example.
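The idea behind a config map, a table that pairs each configuration key with the place in the context struct where its value should land, can be sketched in plain C with offsetof. This is an illustration of the concept only: struct config_entry, struct property and config_map_set are hypothetical stand-ins rather than Fluent Bit's own types, the default values are made up, and keys are matched with an exact strcmp. The keys server_host and server_port are the ones the example plugin reads.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Plugin context: the destination for parsed configuration values. */
struct example_ctx {
    const char *server_host;
    int server_port;
};

enum entry_type { ENTRY_STR, ENTRY_INT };

/* Hypothetical, simplified config map entry: which key to read, what to fall
 * back to, and where in the context struct the value belongs. */
struct config_entry {
    enum entry_type type;
    const char *name;
    const char *default_value; /* made-up defaults, for illustration only */
    size_t offset;
};

static const struct config_entry example_config_map[] = {
    { ENTRY_STR, "server_host", "127.0.0.1",
      offsetof(struct example_ctx, server_host) },
    { ENTRY_INT, "server_port", "80",
      offsetof(struct example_ctx, server_port) },
};

/* One key/value line from the [OUTPUT] section of the configuration file. */
struct property {
    const char *key;
    const char *value;
};

static const char *find_property(const struct property *props, size_t count,
                                 const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(props[i].key, name) == 0) {
            return props[i].value;
        }
    }
    return NULL;
}

/* Walks the map and writes each value, or its default, into the context
 * field located at the recorded offset. */
static void config_map_set(const struct config_entry *map, size_t map_len,
                           const struct property *props, size_t prop_count,
                           void *ctx)
{
    size_t i;

    for (i = 0; i < map_len; i++) {
        const char *value = find_property(props, prop_count, map[i].name);
        char *field = (char *) ctx + map[i].offset;

        if (value == NULL) {
            value = map[i].default_value;
        }
        if (map[i].type == ENTRY_STR) {
            *(const char **) field = value;
        }
        else {
            *(int *) field = atoi(value);
        }
    }
}
```

With a single property server_port set to 9880, config_map_set leaves ctx.server_port at 9880 and falls back to the default for ctx.server_host. Keeping the key names, defaults and destinations in one table means cb_init does not need a hand-written parsing branch per option.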
Enabling the Plugin
Now that we have the code for our example output plugin, it is time to build it. Add
option(FLB_OUT_EXAMPLE "Enable example output plugin" Yes)
to CMakeLists.txt in the root Fluent Bit directory. Next add
REGISTER_OUT_PLUGIN("out_example")
to plugins/CMakeLists.txt.
Once that is done, we can generate build files for our plugin by running
cmake -DFLB_OUT_EXAMPLE=On ../
in the fluent-bit/build directory. Finally, run
make
to build Fluent Bit and our example output plugin.
Running the Plugin
Create a configuration file similar to the following:
[INPUT]
name random
interval_Sec 3
[OUTPUT]
name example
match *
server_host <HOST_OF_YOUR_HTTP_SERVER>
server_port <PORT_OF_YOUR_HTTP_SERVER>
This configuration file enables the random input plugin, which generates values and sends them through Fluent Bit’s pipeline, as well as the example output plugin we just built. Next, start Fluent Bit with
bin/fluent-bit -c <PATH_TO_YOUR_CONF_FILE>
Your HTTP server will receive logs similar to the following. Note the additional key-value pair appended to each log, as expected.
2022/03/02 20:57:55 {"rand_value":16255892310423393680,"source":"fluent bit example plugin"}
2022/03/02 20:57:58 {"rand_value":16963087099013002297,"source":"fluent bit example plugin"}
Submitting your Plugin For Review
If you would like to contribute your plugin to Fluent Bit, here is a checklist of what needs to be done in order for it to be accepted:
- Make sure your code adheres to Fluent Bit’s coding style.
- Sign off your commits and format your commit messages as specified in the guidelines.
- Download Valgrind and use it to check for leaks in your plugin with valgrind --leak-check=full bin/fluent-bit -c <PATH_TO_CONFIG_FILE>, then attach the summary to your PR to show that there are no leaks or memory corruption in the plugin. This is what the summary looks like when there are leaks and/or memory corruption errors:
==63179== LEAK SUMMARY:
==63179== definitely lost: 56 bytes in 1 blocks
==63179== indirectly lost: 0 bytes in 0 blocks
==63179== possibly lost: 0 bytes in 0 blocks
==63179== still reachable: 0 bytes in 0 blocks
==63179== suppressed: 0 bytes in 0 blocks
==63179==
==63179== For lists of detected and suppressed errors, rerun with: -s
==63179== ERROR SUMMARY: 94 errors from 12 contexts (suppressed: 0 from 0)
- Add documentation for your plugin by submitting a PR to Fluent Bit’s documentation repository. Useful things to include are a brief introduction to your plugin, details about each configurable parameter, and example outputs/results of running your plugin.
Common Fluent Bit Library Functions/Types
Fluent Bit has a lot of convenience library functions for processing data and performing logic that will most likely be useful in building your plugin. Here is a brief overview of the most common ones.
Data representation (msgpack)
Fluent Bit uses msgpack to serialize and deserialize data.
- Deserialization
  - Types
    - msgpack_object: an object that represents deserialized data.
    - msgpack_unpacked: an unpacker that unpacks/deserializes data into a msgpack_object.
  - Functions
    - msgpack_unpacked_init: initializes a msgpack_unpacked object.
    - msgpack_unpacked_destroy: frees the resources associated with the msgpack_unpacked object.
    - msgpack_unpack_next: deserializes the given data into a msgpack_object.
- Serialization
  - Types
    - msgpack_sbuffer: a buffer to store serialized data.
    - msgpack_packer: a packer that packs/serializes data into a msgpack_sbuffer.
  - Functions
    - msgpack_sbuffer_init: initializes the underlying buffer to store serialized data.
    - msgpack_sbuffer_destroy: frees the resources associated with the msgpack_sbuffer object.
    - msgpack_packer_init: initializes a msgpack_packer that packs data into a msgpack_sbuffer.
    - msgpack_pack_str_with_body: packs/serializes the specified string.
    - msgpack_pack_object: packs/serializes the specified msgpack_object.
    - msgpack_pack_array: packs/serializes an empty array of specified length. The array can be filled using other msgpack_pack functions, e.g. msgpack_pack_str_with_body for strings, msgpack_pack_int for integers, etc.
    - msgpack_pack_map: packs/serializes an empty map of a specified number of key-value pairs. The map can be filled using other msgpack_pack functions, e.g. msgpack_pack_str_with_body for strings, msgpack_pack_int for integers, etc.
Strings
Fluent Bit uses a version of the SDS library for string processing. SDS strings are fully compatible with any function that takes a null-terminated sequence of characters.
- flb_sds_create_len: takes a string and a length, and returns an SDS copy of that string with the specified length. Useful for getting the string representation of msgpack fields.
- flb_sds_create_size: initializes an empty SDS string of the specified size. Useful when used in conjunction with flb_sds_snprintf/flb_sds_printf to format strings.
- flb_sds_destroy: frees the resources associated with the given SDS string.
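SDS strings work with ordinary C string functions because an SDS handle points at the character data itself, with the bookkeeping header stored just before it. The sketch below reproduces that layout in miniature. It is not Fluent Bit's implementation: the mini_sds names and the header fields are hypothetical, chosen only to show the idea and why an explicit length matters for msgpack string fields, which are not null-terminated.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header stored immediately before the character data. */
struct mini_sds_hdr {
    size_t len;   /* bytes in use, excluding the terminator */
    size_t alloc; /* bytes available, excluding the terminator */
    char buf[];   /* character data, always null-terminated */
};

typedef char *mini_sds;

/* Copies exactly len bytes from src and appends a terminator. The source
 * does not need to be null-terminated, as with msgpack string fields. */
static mini_sds mini_sds_create_len(const char *src, size_t len)
{
    struct mini_sds_hdr *hdr = malloc(sizeof(*hdr) + len + 1);

    if (hdr == NULL) {
        return NULL;
    }
    hdr->len = len;
    hdr->alloc = len;
    if (src != NULL) {
        memcpy(hdr->buf, src, len);
    }
    else {
        memset(hdr->buf, 0, len);
    }
    hdr->buf[len] = '\0';
    return hdr->buf; /* callers only ever see the character data */
}

/* Steps back from the character data to the header that precedes it. */
static struct mini_sds_hdr *mini_sds_header(mini_sds s)
{
    return (struct mini_sds_hdr *) (s - offsetof(struct mini_sds_hdr, buf));
}

/* Length lookup without scanning for the terminator. */
static size_t mini_sds_len(mini_sds s)
{
    return mini_sds_header(s)->len;
}

static void mini_sds_destroy(mini_sds s)
{
    if (s != NULL) {
        free(mini_sds_header(s));
    }
}
```

A string created with mini_sds_create_len("fluent bit example", 6) can be passed straight to strcmp or printf as "fluent", while mini_sds_len reads the stored length from the header. The string must be released with the matching destroy function rather than free, since the allocation starts at the header and not at the pointer the caller holds.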
HTTP Client
- flb_upstream_create: creates an upstream, a structure that represents a host/endpoint to call.
- flb_upstream_destroy: frees the resources associated with the upstream structure.
- flb_upstream_conn_get: gets a connection to an upstream host to use for a single HTTP request.
- flb_upstream_conn_release: releases the given upstream connection so it can be reused.
- flb_http_client: creates an HTTP client request with the given upstream connection, host, port and request body.
- flb_http_do: takes in an HTTP client request and performs it.
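To make it concrete what a request built from a host, a port and a JSON body amounts to, here is a rough sketch that formats an HTTP/1.1 POST as text. It does not use or reproduce Fluent Bit's HTTP client: format_json_post is a hypothetical helper, and the exact headers Fluent Bit sends may differ. It only illustrates the pieces of information such a request carries.

```c
#include <stdio.h>
#include <string.h>

/* Formats a minimal HTTP/1.1 POST request with a JSON body into out.
 * Returns the request length, or -1 if the buffer is too small. */
static int format_json_post(char *out, size_t cap, const char *host, int port,
                            const char *uri, const char *body)
{
    int n = snprintf(out, cap,
                     "POST %s HTTP/1.1\r\n"
                     "Host: %s:%d\r\n"
                     "Content-Type: application/json\r\n"
                     "Content-Length: %zu\r\n"
                     "\r\n"
                     "%s",
                     uri, host, port, strlen(body), body);

    /* snprintf reports the length it needed, so n >= cap means truncation. */
    if (n < 0 || (size_t) n >= cap) {
        return -1;
    }
    return n;
}
```

Passing the JSON form of one log, such as {"source":"fluent bit example plugin"}, produces the request line, the Host, Content-Type and Content-Length headers, a blank line, and then the body. In the plugin, flb_http_client and flb_http_do take care of this step and of sending the bytes over the upstream connection.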