Creation of GxP Compliant Data Lake

The Challenge

Implementing a modern cloud platform for building and deploying data lakes, and creating two data lakes on top of that platform.

Problems to solve

  • Limited access to data due to the on-premises solution
  • Need to consolidate data from various commercial systems across the manufacturing and supply chain areas
  • Multiple solutions connecting to multiple data sources and therefore duplicating the data streams
  • Very long data onboarding and consumption cycle
  • A mix of different technologies required a larger number of operational teams and high SLAs

The Solution

  1. Create a highly configurable, metadata-driven, scalable, high-uptime, globally distributed data lake to ingest from over 250 different data sources
  2. Actively maintained and extended by C&F:
    • Integrating new data sources
    • Cleansing, curating and refining ingested data
    • Monitoring availability and data quality
    • Integration with Collibra for data governance
    • Supporting the business in daily work

Technologies used:

  • AWS (Kubernetes, Spark, S3, Airflow)
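To illustrate what "metadata-driven" ingestion can mean in practice, here is a minimal sketch in plain Python. It is not the implementation used in this project: the field names, the source names, the `example-lake` bucket and the path layout are all hypothetical. The idea is that each source is described by a metadata record, and one generic routine turns those records into ingestion tasks, so onboarding a new source means adding a record rather than writing a new pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceConfig:
    """One metadata record describing a data source (all fields hypothetical)."""
    name: str
    kind: str       # reader type, e.g. "jdbc" or "file"
    domain: str     # business area, e.g. "manufacturing"
    schedule: str   # cron-style expression


# Reader types this sketch knows about (an assumption for illustration).
SUPPORTED_KINDS = {"jdbc", "file"}


def target_path(cfg: SourceConfig, run_date: str) -> str:
    """Build the landing location for one run; the bucket name is a placeholder."""
    return f"s3://example-lake/raw/{cfg.domain}/{cfg.name}/dt={run_date}/"


def plan_ingestion(sources, run_date):
    """Turn metadata records into a list of ingestion task descriptions."""
    tasks = []
    for cfg in sources:
        if cfg.kind not in SUPPORTED_KINDS:
            raise ValueError(f"unsupported source kind: {cfg.kind}")
        tasks.append({
            "task_id": f"ingest_{cfg.name}",
            "reader": cfg.kind,
            "target": target_path(cfg, run_date),
            "schedule": cfg.schedule,
        })
    return tasks


# Hypothetical metadata: adding a source is just adding a record here.
SOURCES = [
    SourceConfig("erp_orders", "jdbc", "supply_chain", "0 2 * * *"),
    SourceConfig("plant_sensors", "file", "manufacturing", "*/15 * * * *"),
]

if __name__ == "__main__":
    for task in plan_ingestion(SOURCES, "2024-01-01"):
        print(task["task_id"], "->", task["target"])
```

In a real platform the task descriptions would typically be handed to an orchestrator such as Airflow and executed by a processing engine such as Spark; the sketch stops at the planning step to stay self-contained.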

Results

  • Decreased TCO spending on infrastructure by creating a centralized lake
  • Democratization of refined data stored in the Data Lake in Parquet format
  • Faster and cheaper delivery cycle thanks to semi-automated deployments
  • Reduced operational costs by having a single support team across all data in the Data Lake
  • Improved decision-making accuracy by improving timeliness
  • Enabler for autonomous supply chain planning