{ "uc_ref": "cribl_logstream_pipeline", "uc_vendor": "Cribl", "uc_description": "Monitors Cribl Logstream Pipelines", "uc_category": "cribl_logstream", "uc_earliest": "-5m", "uc_latest": "now", "uc_cron": "*/5 * * * *", "uc_replacements": "", "uc_implementation_comments": "This use case relies on Cribl Logstream metrics indexed in Splunk and monitors pipeline in, out, and dropped event statistics. Anomalies are detected with ML Outliers detection, which flags abnormal incoming or outgoing traffic on a pipeline. Replace the name of the metric index and customize the group as needed.", "uc_metrics": "cribl_logstream.pipeline.in_events,cribl_logstream.pipeline.out_events,cribl_logstream.pipeline.dropped_events,cribl_logstream.pipeline.pct_sent_events,cribl_logstream.pipeline.pct_dropped_events", "uc_search": "| mstats sum(cribl.logstream.pipe.in_events) as pipe_in_events, sum(cribl.logstream.pipe.out_events) as pipe_out_events, sum(cribl.logstream.pipe.dropped_events) as pipe_dropped_events where index=* host=* by group, pipeline\n| foreach pipe_in_events, pipe_out_events, pipe_dropped_events [ eval <<FIELD>> = if(isnum('<<FIELD>>'), round('<<FIELD>>', 0), 0) ]\n\n``` Cribl already has a field called group; rename it so that our own group convention can be applied ```\n| rename group as cribl_group\n\n``` set group and object ```\n| eval group = \"Cribl_Logstream:pipeline_traffic\"\n| eval object = \"pipeline|group:\" . cribl_group . \"|pipeline:\" . pipeline\n| eval object_description = \"Cribl Logstream Pipeline traffic: \" . pipeline . \", group: \" . cribl_group\n\n``` calculate percentage metrics (pct of sent and dropped events) ```\n| eval pct_sent_events = round(pipe_out_events/pipe_in_events*100, 3), pct_dropped_events = round(pipe_dropped_events/pipe_in_events*100, 3)\n\n``` anomalies are detected via ML Outliers detection or a custom threshold; initialize status as healthy ```\n| eval status=1\n\n| eval status_description_short = \"% sent/dropped: \" . round(pct_sent_events, 2) . \" / \" . if(pct_dropped_events=0, 0, round(pct_dropped_events, 2)) . \", in/out/dropped: \" . pipe_in_events . \" / \" . pipe_out_events . \" / \" . pipe_dropped_events\n| eval status_description = case(\nstatus=1, \"Cribl Pipeline: \" . pipeline . \" is healthy, in_events: \" . pipe_in_events . \" / out_events: \" . pipe_out_events . \" / dropped_events: \" . pipe_dropped_events . \", % sent events: \" . pct_sent_events . \", % dropped events: \" . pct_dropped_events,\nstatus=3, \"Cribl Pipeline: \" . pipeline . \" status is unknown or unexpected\"\n)\n\n``` set KPI metrics ```\n| eval metrics = \"{'cribl_logstream.pipeline.in_events': \" . pipe_in_events . \", 'cribl_logstream.pipeline.out_events': \" . pipe_out_events . \", 'cribl_logstream.pipeline.dropped_events': \" . pipe_dropped_events . \", 'cribl_logstream.pipeline.pct_sent_events': \" . pct_sent_events . \", 'cribl_logstream.pipeline.pct_dropped_events':\" . pct_dropped_events . \"}\"\n\n``` set ML Outliers ```\n| eval outliers_metrics = \"{'cribl_logstream.pipeline.in_events': {'alert_lower_breached': 1, 'alert_upper_breached': 0, 'time_factor': '%w%H', 'period_calculation': '-90d'}, 'cribl_logstream.pipeline.out_events': {'alert_lower_breached': 1, 'alert_upper_breached': 0, 'time_factor': '%w%H', 'period_calculation': '-90d'}}\"\n\n``` disruption queue: alert only if the condition persists for more than 15 min to limit the risk of false positives; adjust to your preferences ```\n| eval disruption_min_time_sec = 900\n\n``` alert if inactive for more than 3600 sec ```\n| eval max_sec_inactive=3600" }