In today’s data-driven world, a robust and scalable data and integration platform is crucial for making data accessible and extracting its value. One effective way to structure the data architecture is the medallion architecture (as described by Databricks), which organizes data into three layers: bronze, silver and gold. This approach typically begins with raw data (bronze) that is progressively refined, but I advocate a “gold-first” approach that starts with the end user’s needs and use cases. It prioritizes valuable business data and insights while also establishing a strong foundation for advanced analytics, such as AI and machine learning.
Gold-First, Top-Down, API-First… You Name It
By adopting a “gold-first” strategy, you place concrete use cases and business requirements at the center. This resembles other approaches such as API-first or UI-first design, where the final product (or the user experience) guides how the system is built. In a gold-first approach, the aim is to identify which data is critical for business decisions and to bring that data into a gold layer as quickly as possible, so that it is ready for use.
This approach forces the organization to ask questions like: What insights are vital for our business processes? What data needs to be readily available for analysis to achieve our desired outcomes? By focusing on the most valuable data models in the gold layer, the platform not only collects data but also enables actionable insights from day one. Early user adoption becomes easier, helping the entire organization get on board.
It also helps keep the gold data model from ending up as a suboptimal replica of the source system’s API model.
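To make this concrete, here is a minimal sketch of what designing the gold model from the use case, rather than from the source API, could look like. Everything in it is an assumption made for illustration: the business question ("what is our margin per delivered order?"), the `OrderMargin` model, and the source field names `ord_no`, `dlv_dt`, `net_amt` and `cogs_amt` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical gold-layer model, shaped by the business question
# "what is our margin per delivered order?" and not by the source API.
@dataclass(frozen=True)
class OrderMargin:
    order_id: str
    delivered_on: date
    revenue: float
    cost: float

    @property
    def margin(self) -> float:
        return self.revenue - self.cost


def to_gold(raw: dict) -> OrderMargin:
    """Map a raw source payload (hypothetical field names) onto the gold model."""
    return OrderMargin(
        order_id=str(raw["ord_no"]),
        delivered_on=date.fromisoformat(raw["dlv_dt"]),
        revenue=float(raw["net_amt"]),
        cost=float(raw["cogs_amt"]),
    )


# A raw record as it might arrive from a source system. Fields the use case
# does not need (here "client") are simply not carried into the gold model.
raw_order = {
    "ord_no": 4711,
    "dlv_dt": "2024-03-01",
    "net_amt": "1200.50",
    "cogs_amt": "800.25",
    "client": "100",
}

gold = to_gold(raw_order)
print(gold.order_id, gold.delivered_on, gold.margin)
```

The point of the sketch is the direction of the design: the gold model is written down first, and the mapping from the source is the only place that knows the source system’s naming.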
Bridging the Gap Between OT and IT
Another crucial aspect of a modern data and integration platform is bridging the gap between Operational Technology (OT) and Information Technology (IT). OT typically involves systems that monitor and control physical processes, machines and facilities, while IT comprises traditional business systems like ERP (Enterprise Resource Planning), CRM (Customer Relationship Management) and other business applications.
For many organizations, the key to innovation lies in integrating data from OT and IT. For instance, production data from OT systems can provide valuable insights when combined with business data from ERP and CRM systems. By merging production data (such as temperatures, machine performance, and downtime) with business data (orders, costs, revenue, customer satisfaction), companies can identify new optimization opportunities, reduce operating costs, and enhance customer experiences.
A gold-first approach is particularly helpful here, as it requires the organization to define which combinations of OT and IT data are most critical to their business model and ensures that this data is quickly made available in a valuable and user-friendly format.
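As an illustration of combining OT and IT data, the sketch below attributes machine downtime (OT) to the production orders (IT) that were running on the machine at the time. The data, the machine and order identifiers, and the function `downtime_minutes_per_order` are all hypothetical; a real platform would typically do this in its query engine rather than in plain Python.

```python
from datetime import datetime

# Hypothetical OT events: downtime intervals as (machine_id, start, end).
downtime = [
    ("press-1", datetime(2024, 3, 1, 8, 0), datetime(2024, 3, 1, 8, 30)),
    ("press-1", datetime(2024, 3, 1, 13, 0), datetime(2024, 3, 1, 13, 15)),
    ("press-2", datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 9, 45)),
]

# Hypothetical IT records: production orders with machine and time window.
orders = [
    {"order_id": "A100", "machine_id": "press-1",
     "start": datetime(2024, 3, 1, 7, 0), "end": datetime(2024, 3, 1, 12, 0)},
    {"order_id": "A101", "machine_id": "press-1",
     "start": datetime(2024, 3, 1, 12, 0), "end": datetime(2024, 3, 1, 16, 0)},
    {"order_id": "B200", "machine_id": "press-2",
     "start": datetime(2024, 3, 1, 8, 0), "end": datetime(2024, 3, 1, 11, 0)},
]


def downtime_minutes_per_order(orders, downtime):
    """Sum the downtime that overlaps each order's production window."""
    result = {order["order_id"]: 0.0 for order in orders}
    for order in orders:
        for machine_id, start, end in downtime:
            if machine_id != order["machine_id"]:
                continue
            # Overlap between the downtime interval and the order window.
            overlap_start = max(start, order["start"])
            overlap_end = min(end, order["end"])
            if overlap_end > overlap_start:
                minutes = (overlap_end - overlap_start).total_seconds() / 60
                result[order["order_id"]] += minutes
    return result


gold_rows = downtime_minutes_per_order(orders, downtime)
print(gold_rows)
```

A gold table like this answers a business question ("which orders were hit by downtime?") that neither the OT system nor the ERP system can answer on its own.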
Ensuring Completeness: Don’t Exclude Potentially Valuable Data
While focusing on the gold layer is essential, it’s important not to overlook the value of raw or semi-processed data in the bronze and silver layers. Data that initially seems irrelevant for defined use cases may later prove valuable for advanced analytics or machine learning models.
AI and machine learning often require large amounts of data to detect patterns and trends that are not immediately apparent. It is therefore essential to collect and store potentially relevant data even if it is not included in the gold layer right away. In a medallion architecture, you can return to the bronze or silver data when new use cases emerge, without restructuring the entire data platform.
This consideration is especially relevant when deciding which OT data (typically time-series data) to include in the data platform and at what resolution.
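The resolution trade-off can be sketched as follows: the raw readings stay in bronze at full resolution, while a coarser, averaged series is derived for silver. This is only one possible approach, shown under assumptions: the `downsample` helper, the 20-second sample interval and the one-minute bucket are hypothetical choices, and averaging is just one of several reasonable aggregations.

```python
from datetime import datetime, timedelta
from statistics import mean


def downsample(readings, bucket: timedelta):
    """Average (timestamp, value) readings into fixed-width time buckets."""
    buckets = {}
    for ts, value in readings:
        # Floor the timestamp to the start of its bucket.
        offset = (ts - datetime.min) % bucket
        buckets.setdefault(ts - offset, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical bronze data: raw temperature readings every 20 seconds.
bronze = [
    (datetime(2024, 3, 1, 10, 0, 0), 70.0),
    (datetime(2024, 3, 1, 10, 0, 20), 71.0),
    (datetime(2024, 3, 1, 10, 0, 40), 72.0),
    (datetime(2024, 3, 1, 10, 1, 0), 80.0),
    (datetime(2024, 3, 1, 10, 1, 20), 82.0),
]

# Silver keeps one averaged value per minute; bronze is left untouched,
# so a finer resolution can still be derived later if a use case needs it.
silver = downsample(bronze, timedelta(minutes=1))
for start, value in silver:
    print(start.isoformat(), value)
```

Because the downsampling only reads from bronze, the decision about resolution in silver or gold can be revisited later, which is not the case if the raw readings are discarded at ingestion.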
Conclusion
Building a data and integration platform requires careful planning and prioritization of business requirements. With a gold-first approach, the most critical data starts generating value quickly, while the platform still allows for deeper analyses in the long term. This approach is especially powerful when integrating OT and IT, where the combination can lead to significant business value. It is also essential to include potentially valuable data in the platform for future AI and machine learning analysis, so that the organization can grow and adapt as technology advances.
And remember: An incremental approach is smart. Contact us at Incrementi, and we’ll help your business create a future-proof data platform.