Description
Demonstration of how an organization engages with the Selector platform. The video highlights capabilities of the Selector platform including natural language interactions, automatic correlation, leveraging metadata about the network, and remediation using collaboration tools. The operator receives a Smart Alert in Slack, investigates the issue using the Selector portal, and takes remediation actions directly through Slack. The platform’s features, including natural language interactions and automated correlations, streamline the process of diagnosing and resolving network problems.
Video length: 5:57
Transcript
Narrator: In this demo, we are going to show a real example of how an organization typically engages with the Selector platform. It highlights the capabilities of the platform, including natural language interactions, automatic correlation leveraging metadata about the network, and remediation using collaboration tools.
The demo begins with a user getting a Smart Alert in Slack, asking some clarifying questions about the event, and then pivoting to the portal to do more triage and eventually taking remediation action via Slack.
A popular endpoint for the Selector Smart Alert is Slack. As you can see in the Slack alert, we had good context about the likely cause, the device affected, and the downstream effects. At first glance, it appears that an interface flap is causing downstream issues with BGP. After getting the alert, the operator wants to gain some more context about the current status. The first question would be to understand the device state, so the operator asks if the device is still reachable.
As you can see from the response, the device is still reachable, but that doesn’t mean there might not be a service degradation. Next, the operator wants to understand the potential impact. Router 1 and Router 2 appear to be in a degraded state with two KPIs in red for BGP flaps and interface oper status. With this information, the operator wants to do more research using the Selector portal. They click on the link in the alert notification to open a dashboard that is automatically tailored to the devices and events in this notification.
The first widget we’re going to look at is the one for BGP sessions. This specific KPI identifies if a BGP session is flapping. This widget is automatically filtered for the devices in the Smart Alert. As you can see in the Sunburst widget, the peer interface name and peer host tags indicate the failing session is related to a common interface between a pair of routers. By clicking on one of the interfaces on the outer ring, we will see more detail about the BGP flapping.
The timeline heat map gives a visual indication of the health, and the flap rate tells us the volume of the flaps for the time period. Scrolling down, we can see a line plot of all the eBGP peers for that device and additional relevant information such as the admin status of the peer, interface configuration events, and interface uptime.
At this point, we have a clear sense that we have a flapping interface that is, in turn, causing BGP to flap. It appears to be isolated to just this pair of interfaces. So now the operator goes back to the main alert page to learn more. We can see on another Sunburst widget that the interfaces are related to a common circuit and carrier. This will be useful later when we open a ticket with them.