PubSub+ Insights enables operations and applications teams to ensure their event broker service and event mesh infrastructure are available and ready for use by business applications, with centralized, at-a-glance status on the availability of various aspects such as:
- resource usage
- event mesh health
- message flow
- High-Availability (HA) status
- queue, topic endpoint, RDP, and bridge health
- message spool utilization
- capacity utilization
PubSub+ Insights provides a turn-key monitoring service to which you can subscribe. The service helps you build an understanding of your estate. An estate refers to your deployment of event broker services, the messaging and activities that occur, and any capabilities that are part of your
- Keep your applications and event broker services running optimally. Key performance indicators (KPIs), metrics, and events let you understand performance and usage, and proactively detect and address issues in your event-driven architecture (EDA).
- Better understand application behavior and size your event broker services from one place. You can see misconfigurations, understand how events flow between applications, better monitor your applications, and understand the performance of your EDA.
PubSub+ Insights allows you to:
- See visualizations that include account-level dashboards, service-level dashboards, and access to advanced monitoring with PubSub+ Insights dashboards for Datadog. For advanced monitoring, you can also create custom monitors and dashboards to align with your organization's monitoring requirements. For more information, see PubSub+ Insights Visualizations and Dashboards.
- Configure email notifications that are sent when specific events occur in your estate. For more information, see Notifications.
- Access event broker service logs and the rich metrics collected in Insights. For more information, see Logs and Metrics.
- Access PubSub+ Insights monitors for Datadog that leverage best practices and event monitoring expertise. These monitors are used in PubSub+ Insights dashboards and are available for custom dashboards you can create. For more information, see PubSub+ Insights Monitors for Datadog.
The collection and storage of the metrics and logs are handled by Datadog, a third-party centralized monitoring service that's part of the PubSub+ Insights service. PubSub+ Insights uses the metrics, logs, and monitors to provide visualizations as account-level and service-level dashboards, which are available in the Cloud Console. The PubSub+ Insights dashboards that are accessible in Datadog by default provide advanced monitoring capabilities that can be optionally enabled for individual users in the PubSub+ Cloud account.
In the following diagram, you can see the logs, notifications and alerts, and dashboards available.
If you have an existing monitoring system, you may find that the out-of-the-box visualizations and email notifications that come with Insights are complementary to your existing monitoring system.
For a quick walk-through of what PubSub+ Insights offers, see the following video:
Insights takes care of collecting metrics to build visualizations and useful dashboards so that you can best monitor your EDA. These visualizations and dashboards let you see historical and real-time metrics for your event broker services and access system logs and event broker service logs. This can help you better monitor, operate, and plan capacity for your event broker services and event meshes.
Insights has three levels of dashboards that include:
- Account Overview Dashboard
- In the Cloud Console, this high-level dashboard summarizes the important aspects in your account (Workspace).
- This dashboard is available to all users when the account is subscribed to PubSub+ Insights. It also gives you an overview of usage and links to PubSub+ Insights dashboards for Datadog with advanced information that are available when Advanced Monitoring is enabled. For more information, see Using the Account Overview Dashboard.
- Service-Level dashboards for event broker services
- In the Cloud Console, there is a dashboard available in Cluster Manager on the Monitoring tab for each event broker service.
- Data is visible only when the account is subscribed to PubSub+ Insights. This dashboard is available to all users with access to view and edit event broker services in Cluster Manager. From this dashboard, you can see visualizations and graphs for various aspects of the event broker service. For more information, see Using Service-Level Dashboards for Event Broker Services.
- Advanced Monitoring Dashboards
- With PubSub+ Insights, Advanced Monitoring is available when a role is assigned to a user. The set of PubSub+ Insights dashboards for Datadog is available as soon as a user is assigned the Insights Advanced Editor or Insights Advanced Viewer role.
- The PubSub+ Insights dashboards have advanced visualizations that permit the user to see across all event broker services and to dive deeper into all of the information collected from the event broker services.
- The PubSub+ Insights dashboards are accessed from a Datadog account that is provided with a PubSub+ Insights subscription. This account is separate from any existing Datadog accounts that you may have.
- The advanced dashboards offer greater scale, scope, granularity, and interactivity. They let you improve your understanding of various aspects of your EDA, including capacity, message flow trends, and queue and endpoint usage. You can also create custom dashboards to monitor key metrics and manage the KPIs that are critical to your business. Advanced Monitoring provides improved capabilities to detect issues, recover from them, and monitor performance at a granular level. These dashboards leverage Solace's best practices and expertise to monitor event broker services. To access Advanced Monitoring, you must enable it for each user in your account by assigning the user the Insights Advanced Editor or Insights Advanced Viewer role. For more information, see PubSub+ Insights Advanced Monitoring.
The monitors, log files, and metrics collected from event broker services are also used to provide alerts and warnings in the configurable notifications. When enabled, Insights sends email notifications to the people you choose when certain events occur, when key performance indicators are approached, or when thresholds are exceeded. The email is sent from Datadog, the third-party provider. An email for a notification looks like the following:
On your PubSub+ Cloud account, you can configure who receives the email notifications in the Cloud Console as shown below:
For more information about configuring and using notifications in Insights, see Understanding Notifications.
At the heart of Insights are the hundreds of metrics and the many log files (system, command, and event logs) that are collected from event broker services. These metrics and logs are used by the many PubSub+ Insights monitors for Datadog that are provided as part of Insights. The monitors map to key performance indicators and best practices for monitoring event broker services and event-driven architectures. The collection and storage of the metrics and logs are handled by a third-party centralized monitoring service (Datadog).
Though you could use the Syslog Forwarding feature available in our platform to forward all of the logs to your own monitoring system for processing and visualization, that approach won't include the many metrics and monitors that PubSub+ Insights provides. You may find that using PubSub+ Insights provides deeper monitoring capabilities that complement your existing monitoring system.
With Insights, logs and metrics are accessible from a set of PubSub+ Insights dashboards for Datadog. Logs can be viewed for up to 30 days (90 days upon request), and metrics can be searched based on relevant tags.
For example, with Insights, the logs collected from your estate can be viewed directly as shown below from the Estate Overview dashboard.
The collected metrics contain derived and statistical information from the event broker services in your estate. You can further investigate the metrics within Datadog by navigating to Metrics > Summary and then clicking on a metric as illustrated below:
For detailed information about the metrics, see PubSub+ Insights Metrics and Checks.
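To make the idea of searching metrics by tag more concrete, here is a minimal Python sketch. It is not the Datadog API or the PubSub+ Insights implementation. The `MetricSample` type, the `search_by_tags` helper, the metric name, and the tag keys are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple


class MetricSample(NamedTuple):
    """One collected data point (hypothetical shape, for illustration only)."""
    name: str
    value: float
    tags: Dict[str, str]


def search_by_tags(samples: Iterable[MetricSample],
                   wanted: Dict[str, str]) -> List[MetricSample]:
    """Return the samples whose tags contain every key/value pair in `wanted`."""
    return [
        s for s in samples
        if all(s.tags.get(key) == value for key, value in wanted.items())
    ]


if __name__ == "__main__":
    # Made-up metric name and tag keys; real names are listed in the
    # product's metrics reference.
    samples = [
        MetricSample("example.queue.depth", 10.0, {"service": "a", "env": "prod"}),
        MetricSample("example.queue.depth", 5.0, {"service": "b", "env": "prod"}),
    ]
    for sample in search_by_tags(samples, {"service": "a"}):
        print(sample.name, sample.value)
```

Requiring every requested tag to match keeps the search narrow: adding more tags to the filter can only reduce the number of results.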
Along with the metrics and logs, PubSub+ Insights includes pre-created PubSub+ Insights monitors for Datadog. There are over 50 monitors available with Insights, and these monitors provide you with pre-canned best practices for event monitoring. The best practices leverage knowledge from Solace's broad customer base and from subject-matter experts so that you can effectively manage your event-driven architecture.
These are the basic types of monitors available:
- Log-based: These monitors evaluate key logs from the event broker services and are triggered when those specific logs are seen.
- Metric-based: These monitors evaluate metrics collected from the event broker services and are triggered based on thresholds. All of these monitors measure utilization as a percentage of available capacity.
- Status-based: These monitors evaluate status. The status can be a simple indication of whether something is up or down, or it can be derived from other information.
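The threshold logic behind a metric-based monitor can be sketched in a few lines of Python. This is only an illustration of measuring utilization as a percentage of available capacity and comparing it against warning and alert levels. The threshold values (60% and 80%), the function names, and the sample numbers are assumptions, not the values or code used by the PubSub+ Insights monitors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical warning/alert levels, as percentages of capacity."""
    warning_pct: float = 60.0
    alert_pct: float = 80.0


def utilization_pct(used: float, capacity: float) -> float:
    """Return usage as a percentage of available capacity."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * used / capacity


def evaluate(used: float, capacity: float,
             thresholds: Thresholds = Thresholds()) -> str:
    """Classify a sample as 'ok', 'warning', or 'alert'."""
    pct = utilization_pct(used, capacity)
    if pct >= thresholds.alert_pct:
        return "alert"
    if pct >= thresholds.warning_pct:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Made-up sample: 850 MB of a 1000 MB message spool in use (85%).
    print(evaluate(used=850, capacity=1000))  # prints "alert"
```

Expressing the thresholds as percentages rather than absolute values means the same monitor definition can apply to services of different sizes.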
Many of the PubSub+ Insights monitors are pre-configured with thresholds to send alerts and warnings. You can clone the PubSub+ Insights monitors and then change them (e.g., modify the thresholds) to create customized monitors of your own. Solace also provides a template monitor that you can clone and customize to monitor specific information relevant to your business. Users can see the monitors on the Monitors page in their Datadog account when they have the Insights Advanced Editor or Insights Advanced Viewer role assigned, as shown here:
The combination of logs, metrics, notifications, and customizable monitors that comes with PubSub+ Insights provides powerful debugging and troubleshooting tools that can help identify problems you may encounter with your event broker services. With access to over 30 additional event-related logs, you have the tools you need to monitor a wide array of events that may occur with the event broker services in your estate.
For example, suppose you have set up a new event broker service, but your clients appear to be having problems connecting to it. Worse, you only find out after hours or days. You can use your PubSub+ Insights access to Datadog to search the logs. There you may find the SYSTEM_CLIENT_CONNECT_FAIL event in the logs, after the fact.
Clicking on an instance of the event in the log provides information about that instance, allowing you to address the issue. However, with PubSub+ Insights Advanced Monitoring, you can be proactive and create a monitor that catches this type of event, or other events that may affect your organization, as they occur, so that you are informed and can address problems when they arise. PubSub+ Insights Advanced Monitoring offers a range of events you can monitor, divided into three broad categories:
- Client events
- System events
- VPN events
The individual events in each of these categories are detailed in the Event log descriptions.
After seeing this event show up in your logs, you decide to create your own monitor as outlined in Cloning and Customizing the Template Monitor. Your new monitor is customized to trigger whenever the SYSTEM_CLIENT_CONNECT_FAIL event occurs. Instead of waiting and searching through logs when problems are noticed, you (or the people you choose) are now notified immediately so action can be taken in a timely manner.
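Conceptually, a log-based monitor for this scenario matches incoming log lines against the event name and triggers when a match is seen. The following Python sketch illustrates only that idea. It is not how monitors are defined in Datadog or in PubSub+ Insights, and the helper names and sample log lines (including their format) are invented for illustration.

```python
from typing import Iterable, List

EVENT_NAME = "SYSTEM_CLIENT_CONNECT_FAIL"


def find_event(lines: Iterable[str], event_name: str = EVENT_NAME) -> List[str]:
    """Return the log lines that mention the given event name."""
    return [line for line in lines if event_name in line]


def should_notify(lines: Iterable[str], event_name: str = EVENT_NAME) -> bool:
    """A log-based monitor triggers as soon as one matching line is seen."""
    return any(event_name in line for line in lines)


if __name__ == "__main__":
    # Invented sample lines; real event broker log formats will differ.
    sample = [
        "2024-01-01T00:00:00Z INFO SYSTEM_EXAMPLE_EVENT: placeholder",
        "2024-01-01T00:00:05Z WARN SYSTEM_CLIENT_CONNECT_FAIL: placeholder",
    ]
    if should_notify(sample):
        for line in find_event(sample):
            print("notify:", line)
```

In practice you would clone and customize the template monitor rather than write code like this. The sketch only shows why such a monitor can notify you at the moment the event is logged instead of after a manual search.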
With PubSub+ Insights Advanced Monitoring, you can design an array of custom monitors for the events that may affect your organization and its event broker services. You are informed as soon as these events happen and can troubleshoot them as they arise.
Forward PubSub+ Insights Data to Your Own Datadog Account
You can forward the metrics and logs collected by the Datadog monitoring agents that are part of the PubSub+ Insights service to your own Datadog account. For more information, see Forward PubSub+ Insights Data to Your Own Datadog Account (Controlled Availability).