Observability – Phase I

The Challenge

The customer deployed a complex, modern, cloud-based data platform on the Azure stack: Data Lake (ADLS) and SQL for data storage; Azure Data Factory, Databricks, and Azure SQL for data processing; and Power BI for reporting.

Problems to solve

  • Rapid deployment by multiple teams: Different cross-functional teams rapidly deployed multiple data products on the platform, leaving no unified view across them
  • Lack of visibility: The data platform owner could not see enough of platform usage and operations to ensure proper tool usage, allocate resources optimally, and increase platform adoption

The Solution

  1. We architected and deployed an Observability Solution to monitor all major data platform components:
    • Data Processing Engines: Databricks, SQL, ADF (ETL)
    • Data Storage: Data Lake, SQL Databases, file stores
    • User activities / queries
    • The solution uses open standards and APIs (OpenTelemetry), so it can be extended to other engines
  2. We built an Observability Dashboard that shows platform usage, alerts, and trends, including:
    • Data object volumes and usage
    • Processing job details with trends such as errors, processing times, and resource usage
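
As an illustration of the job-level trends listed above, the following minimal sketch groups job runs by processing engine and computes run count, error rate, and mean processing time. The `JobRun` record, its field names, and the `summarize_by_engine` helper are hypothetical; the case study does not describe the actual telemetry schema or how the dashboard computes its figures.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class JobRun:
    # Hypothetical record shape; the real telemetry schema is not described
    # in the case study.
    engine: str        # e.g. "databricks", "adf", "sql"
    job_name: str
    duration_s: float  # wall-clock processing time in seconds
    succeeded: bool


def summarize_by_engine(runs):
    """Group job runs by engine and compute run count, error rate, and mean duration."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run.engine].append(run)

    summary = {}
    for engine, engine_runs in grouped.items():
        failures = sum(1 for r in engine_runs if not r.succeeded)
        summary[engine] = {
            "runs": len(engine_runs),
            "error_rate": failures / len(engine_runs),
            "mean_duration_s": mean(r.duration_s for r in engine_runs),
        }
    return summary


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        JobRun("databricks", "daily_sales_load", 120.0, True),
        JobRun("databricks", "daily_sales_load", 180.0, False),
        JobRun("adf", "copy_to_lake", 60.0, True),
    ]
    for engine, stats in sorted(summarize_by_engine(sample).items()):
        print(engine, stats)
```

A per-engine summary like this could feed a dashboard tile or an alert threshold; in practice the input would come from the collected telemetry rather than an in-memory list.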

Results

  • Improved data quality: Monitoring and anomaly detection in data pipelines to reduce errors
  • Rapid problem resolution: Real-time monitoring of data pipelines to quickly identify and resolve problems, minimize downtime, and ensure data availability within established SLAs
  • Improved operational efficiency: Streamlined data operations that reduce costs and improve resource utilization
  • Optimized resource allocation: Better allocation of resources, including personnel and infrastructure, resulting in cost savings
  • Scalability and growth support: Performance goals maintained as data volumes grow