HTTP JSON input
This functionality is in beta and is subject to change. The design and code are less mature than official GA features and are being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.
Use the httpjson input to read messages from an HTTP API with JSON payloads.
For example, this input is used to retrieve MISP threat indicators in the Filebeat MISP module.
This input supports retrieval at a configurable interval and pagination.
Example configurations:
filebeat.inputs:
# Fetch your public IP every minute.
- type: httpjson
  url: https://api.ipify.org/?format=json
  interval: 1m
  processors:
    - decode_json_fields:
        fields: [message]
        target: json
filebeat.inputs:
- type: httpjson
  url: http://localhost:9200/_search?scroll=5m
  http_method: POST
  json_objects_array: hits.hits
  pagination:
    extra_body_content:
      scroll: 5m
    id_field: _scroll_id
    req_field: scroll_id
    url: http://localhost:9200/_search/scroll
The input also supports authentication via HTTP headers, an API key, or OAuth2.
Example configurations with authentication:
filebeat.inputs:
- type: httpjson
  http_headers:
    Authorization: 'Basic aGVsbG86d29ybGQ='
  url: http://localhost
filebeat.inputs:
- type: httpjson
  oauth2:
    client.id: 12345678901234567890abcdef
    client.secret: abcdef12345678901234567890
    token_url: http://localhost/oauth2/token
  url: http://localhost
Configuration options
The httpjson input supports the following configuration options plus the Common options described later.
api_key
API key to access the HTTP API. When set, this adds an Authorization header to the HTTP request with this as the value.
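As a minimal sketch of how this option might be set (the URL and the key are placeholders, not values from a real API):

```yaml
filebeat.inputs:
- type: httpjson
  url: http://localhost
  # Placeholder value. It is sent as the value of the Authorization header.
  api_key: 'REPLACE_WITH_YOUR_API_KEY'
```

Because the setting is used as the header value itself, any scheme prefix that the API expects would presumably have to be part of the configured string.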
http_client_timeout
Duration before declaring that the HTTP client connection has timed out. Defaults to 60s. Valid time units are ns, us, ms, s (default), m, h.
http_headers
Additional HTTP headers to set in the requests. The default value is null (no additional headers).
- type: httpjson
  http_headers:
    Authorization: 'Basic aGVsbG86d29ybGQ='
http_method
HTTP method to use when making requests. The options are GET or POST. Defaults to GET.
http_request_body
An optional HTTP POST body. The configuration value must be an object, and it will be encoded to JSON. This is only valid when http_method is POST. Defaults to null (no HTTP body).
- type: httpjson
  http_method: POST
  http_request_body:
    query:
      bool:
        filter:
          term:
            type: authentication
interval
Duration between repeated requests. By default, the interval is 0, which means the input performs a single request and then stops. It may make additional pagination requests in response to the initial request if pagination is enabled.
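For illustration, a polling setup might combine interval with http_client_timeout as sketched here. The endpoint is hypothetical, and the durations are arbitrary choices rather than recommendations:

```yaml
filebeat.inputs:
- type: httpjson
  # Hypothetical endpoint.
  url: http://localhost/api/events
  # Repeat the request every five minutes instead of running once.
  interval: 5m
  # Give up on a single request after 30 seconds (the default is 60s).
  http_client_timeout: 30s
```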
json_objects_array
If the response body contains a JSON object containing an array then this option specifies the key containing that array. Each object in that array will generate an event. This example response contains an array called events that we want to index.
{
  "time": "2020-06-02 23:22:32 UTC",
  "events": [
    {
      "timestamp": "2020-05-02 11:10:03 UTC",
      "event": {
        "category": "authorization"
      },
      "user": {
        "name": "fflintstone"
      }
    },
    {
      "timestamp": "2020-05-05 13:03:11 UTC",
      "event": {
        "category": "authorization"
      },
      "user": {
        "name": "brubble"
      }
    }
  ]
}
The config needs to specify events as the json_objects_array value.
- type: httpjson
  json_objects_array: events
split_events_by
If the response body contains a JSON object containing an array then this option specifies the key containing that array. Each object in that array will generate an event, but will maintain the common fields of the document as well.
{
  "time": "2020-06-02 23:22:32 UTC",
  "user": "Bob",
  "events": [
    {
      "timestamp": "2020-05-02 11:10:03 UTC",
      "event": {
        "category": "authorization"
      }
    },
    {
      "timestamp": "2020-05-05 13:03:11 UTC",
      "event": {
        "category": "authorization"
      }
    }
  ]
}
The config needs to specify events as the split_events_by value.
- type: httpjson
  split_events_by: events
This configuration outputs the following events:
[
  {
    "time": "2020-06-02 23:22:32 UTC",
    "user": "Bob",
    "events": {
      "timestamp": "2020-05-02 11:10:03 UTC",
      "event": {
        "category": "authorization"
      }
    }
  },
  {
    "time": "2020-06-02 23:22:32 UTC",
    "user": "Bob",
    "events": {
      "timestamp": "2020-05-05 13:03:11 UTC",
      "event": {
        "category": "authorization"
      }
    }
  }
]
This option can be used in combination with json_objects_array, in which case the field is looked up inside each element of that array.
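A combined configuration could look like the following sketch. It assumes a hypothetical response whose top-level batches array holds objects that each contain their own events array; both the endpoint and the batches key are made up for illustration:

```yaml
filebeat.inputs:
- type: httpjson
  # Hypothetical endpoint.
  url: http://localhost/api/batches
  # Hypothetical key: first select each element of the "batches" array.
  json_objects_array: batches
  # Then split every selected element on its own "events" array.
  split_events_by: events
```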
no_http_body
Force HTTP requests to be sent with an empty HTTP body. Defaults to false. This option cannot be used with http_request_body, pagination.extra_body_content, or pagination.req_field.
pagination.enabled
The enabled setting can be used to disable the pagination configuration by setting it to false. The default value is true.
Pagination settings are disabled if either enabled is set to false or the pagination section is missing.
pagination.extra_body_content
An object containing additional fields that should be included in the pagination request body. Defaults to null.
- type: httpjson
  pagination.extra_body_content:
    max_items: 500
pagination.header.field_name
The name of the HTTP header in the response that is used for pagination control. The header value will be extracted from the response and used to make the next pagination request. pagination.header.regex_pattern can be used to select a subset of the value.
pagination.header.regex_pattern
The regular expression pattern to use for retrieving the pagination information from the HTTP header field specified above. The first match is used as the value.
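The two header options are typically set together. The sketch below assumes an API that announces the next page in a Link response header of the form <URL>; rel="next". The endpoint, the header name, and the pattern are all assumptions about such an API, and the pattern further assumes that the capture group is what selects the URL:

```yaml
filebeat.inputs:
- type: httpjson
  # Hypothetical endpoint.
  url: http://localhost/api/logs
  pagination:
    header:
      # Assumed response header carrying the next-page link.
      field_name: Link
      # Assumed pattern: capture the URL of the rel="next" entry.
      regex_pattern: '<([^>]+)>; *rel="next"'
```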
pagination.id_field
The name of a field in the JSON response body to use as the pagination ID. The value will be included in the next pagination request under the key specified by the pagination.req_field value.
pagination.req_field
The name of the field to include in the pagination JSON request body containing the pagination ID defined by the pagination.id_field field.
pagination.url
This specifies the URL for sending pagination requests. Defaults to the url value. This is only needed when the pagination requests need to be routed to a different URL.
rate_limit.limit
The name of the HTTP response header that specifies the total rate limit.
rate_limit.remaining
The name of the HTTP response header that specifies the remaining quota of the rate limit.
rate_limit.reset
The name of the HTTP response header that specifies the epoch time when the rate limit will reset.
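As a sketch, the three rate_limit options might be configured together like this. The endpoint and the header names are assumptions; they have to match whatever headers the target API actually returns:

```yaml
filebeat.inputs:
- type: httpjson
  # Hypothetical endpoint.
  url: http://localhost/api/logs
  rate_limit:
    # Assumed header names; check the API's documentation for the real ones.
    limit: X-Rate-Limit-Limit
    remaining: X-Rate-Limit-Remaining
    reset: X-Rate-Limit-Reset
```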
retry.max_attempts
This specifies the maximum number of retries for the retryable HTTP client. Default: 5.
retry.wait_min
This specifies the minimum time to wait before a retry is attempted. Default: 1s.
retry.wait_max
This specifies the maximum time to wait before a retry is attempted. Default: 60s.
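A retry block that tightens the defaults could look like the following sketch. The values are arbitrary illustrations, not recommendations:

```yaml
filebeat.inputs:
- type: httpjson
  url: http://localhost
  retry:
    # Retry at most three times (the default is 5).
    max_attempts: 3
    # Wait at least two seconds before a retry (the default is 1s).
    wait_min: 2s
    # Never wait longer than thirty seconds (the default is 60s).
    wait_max: 30s
```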
ssl
This specifies SSL/TLS configuration. If the ssl section is missing, the host’s CAs are used for HTTPS connections. See SSL for more information.
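For example, an endpoint that presents a certificate signed by an internal CA might be configured roughly as follows. The certificate_authorities setting comes from the general Beats SSL options rather than from this page, and the endpoint and file path are hypothetical:

```yaml
filebeat.inputs:
- type: httpjson
  # Hypothetical HTTPS endpoint.
  url: https://localhost:8443/api/events
  ssl:
    # Hypothetical path to the CA certificate that signed the server certificate.
    certificate_authorities: ["/etc/pki/ca/internal-ca.pem"]
```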
url
The URL of the HTTP API. Required.
oauth2.enabled
The enabled setting can be used to disable the oauth2 configuration by setting it to false. The default value is true.
OAuth2 settings are disabled if either enabled is set to false or the oauth2 section is missing.
oauth2.provider
The provider setting can be used to configure supported oauth2 providers. Each supported provider will require specific settings. It is not set by default. Supported providers are: azure, google.
oauth2.client.id
The client.id setting is used as part of the authentication flow. It is always required except when using google as the provider. Required for providers: default, azure.
oauth2.client.secret
The client.secret setting is used as part of the authentication flow. It is always required except when using google as the provider. Required for providers: default, azure.
oauth2.scopes
The scopes setting defines a list of scopes that will be requested during the oauth2 flow. It is optional for all providers.
oauth2.token_url
The token_url setting specifies the endpoint that will be used to generate the tokens during the oauth2 flow. It is required if no provider is specified.
For the azure provider, either token_url or azure.tenant_id is required.
oauth2.endpoint_params
The endpoint_params setting specifies a set of values that will be sent on each request to the token_url. Each param key can have multiple values. Can be set for all providers except google.
- type: httpjson
  oauth2:
    endpoint_params:
      Param1:
        - ValueA
        - ValueB
      Param2:
        - Value
oauth2.azure.tenant_id
The azure.tenant_id setting is used for authentication when using the azure provider. Because it is used to generate the token_url, the two settings cannot be used together. It is not required.
For information about where to find it, you can refer to https://docs.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal.
oauth2.azure.resource
The azure.resource setting is used to identify the accessed WebAPI resource when using the azure provider. It is not required.
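Putting the Azure settings together, a configuration might look like this sketch. The tenant ID is a placeholder, the resource value is only an illustrative guess at what a target API could be, and the client credentials reuse the dummy values from the earlier example:

```yaml
filebeat.inputs:
- type: httpjson
  oauth2:
    provider: azure
    client.id: 12345678901234567890abcdef
    client.secret: abcdef12345678901234567890
    # Placeholder tenant ID. Set this or token_url, not both.
    azure.tenant_id: 00000000-0000-0000-0000-000000000000
    # Illustrative resource; optional.
    azure.resource: https://management.azure.com/
  url: http://localhost
```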
oauth2.google.credentials_file
The google.credentials_file setting specifies the credentials file for Google.
Only one of the credentials settings can be set at once. If none is provided, loading default credentials from the environment will be attempted via ADC. For more information about how to provide Google credentials, please refer to https://cloud.google.com/docs/authentication.
oauth2.google.credentials_json
The google.credentials_json setting lets you provide your credentials as raw JSON.
Only one of the credentials settings can be set at once. If none is provided, loading default credentials from the environment will be attempted via ADC. For more information about how to provide Google credentials, please refer to https://cloud.google.com/docs/authentication.
oauth2.google.jwt_file
The google.jwt_file setting specifies the JWT Account Key file for Google.
Only one of the credentials settings can be set at once. If none is provided, loading default credentials from the environment will be attempted via ADC. For more information about how to provide Google credentials, please refer to https://cloud.google.com/docs/authentication.
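A Google configuration using a credentials file could be sketched as follows. The file path is hypothetical, and the scope is only an example of a broad Google Cloud scope; the API being queried determines which scopes are actually needed:

```yaml
filebeat.inputs:
- type: httpjson
  oauth2:
    provider: google
    # Hypothetical path. Set only one of the google.* credentials options.
    google.credentials_file: /path/to/credentials.json
    scopes:
      # Example scope; replace with the scopes the target API requires.
      - https://www.googleapis.com/auth/cloud-platform
  url: http://localhost
```

No client.id or client.secret appears here because, as described above, those settings are not required for the google provider.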
Common options
The following configuration options are supported by all inputs.
enabled
Use the enabled option to enable and disable inputs. By default, enabled is set to true.
tags
A list of tags that Filebeat includes in the tags field of each published event. Tags make it easy to select specific events in Kibana or apply conditional filtering in Logstash. These tags will be appended to the list of tags specified in the general configuration.
Example:
filebeat.inputs:
- type: httpjson
  . . .
  tags: ["json"]
fields
Optional fields that you can specify to add additional information to the output. For example, you might add fields that you can use for filtering log data. Fields can be scalar values, arrays, dictionaries, or any nested combination of these. By default, the fields that you specify here will be grouped under a fields sub-dictionary in the output document. To store the custom fields as top-level fields, set the fields_under_root option to true. If a duplicate field is declared in the general configuration, then its value will be overwritten by the value declared here.
filebeat.inputs:
- type: httpjson
  . . .
  fields:
    app_id: query_engine_12
fields_under_root
If this option is set to true, the custom fields are stored as top-level fields in the output document instead of being grouped under a fields sub-dictionary. If the custom field names conflict with other field names added by Filebeat, then the custom fields overwrite the other fields.
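To illustrate, the following sketch reuses the app_id field from the fields example and promotes it to the top level of the event. The URL is a placeholder:

```yaml
filebeat.inputs:
- type: httpjson
  url: http://localhost
  fields:
    app_id: query_engine_12
  # Store app_id at the top level of the event instead of under "fields".
  fields_under_root: true
```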
processors
A list of processors to apply to the input data.
See Processors for information about specifying processors in your config.
pipeline
The Ingest Node pipeline ID to set for the events generated by this input.
The pipeline ID can also be configured in the Elasticsearch output, but this option usually results in simpler configuration files. If the pipeline is configured both in the input and output, the option from the input is used.
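Setting the pipeline on the input might look like this sketch, where the pipeline ID is a hypothetical name for a pipeline that would already have to exist in Elasticsearch:

```yaml
filebeat.inputs:
- type: httpjson
  url: http://localhost
  # Hypothetical ingest pipeline ID.
  pipeline: my_httpjson_pipeline
```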
keep_null
If this option is set to true, fields with null values will be published in the output document. By default, keep_null is set to false.
index
If present, this formatted string overrides the index for events from this input (for elasticsearch outputs), or sets the raw_index field of the event’s metadata (for other outputs). This string can only refer to the agent name and version and the event timestamp; for access to dynamic fields, use output.elasticsearch.index or a processor.
Example value: "%{[agent.name]}-myindex-%{+yyyy.MM.dd}" might expand to "filebeat-myindex-2019.11.01".
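In a full input definition, the example value above would be placed as in this sketch (the URL is a placeholder):

```yaml
filebeat.inputs:
- type: httpjson
  url: http://localhost
  # Quoted so that YAML does not misread the leading % character.
  index: "%{[agent.name]}-myindex-%{+yyyy.MM.dd}"
```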
publisher_pipeline.disable_host
By default, all events contain host.name. This option can be set to true to disable the addition of this field to all events. The default value is false.