Automating Metadata-Driven ETL Processes for Real-Time Business Intelligence in the Finance Sector
Ayu’s Luz
Ladoke Akintola University of Technology Ogbomoso
DOI: 10.63665/ijmlaidse-y1f3a001
Abstract
In finance, timely delivery of information is crucial for decision-making, compliance, and customer satisfaction. Real-time business intelligence requires pipelines that are much faster and more adaptable than traditional ETL systems. This paper proposes using metadata to automate ETL processes so that they keep pace with changing financial data. We show that driving ETL design, transformation logic, and pipeline orchestration from metadata gives a more scalable approach to real-time financial analytics, reduces development time, and improves data quality. A reference architecture is proposed and evaluated on a practical financial use case against traditional ETL pipelines. The results indicate that metadata-driven automation performs effectively in real-time business intelligence scenarios, improving system efficiency and accelerating analytical insights.
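To make the idea of driving transformation logic from metadata concrete, the following is a minimal Python sketch and not the reference architecture evaluated in the paper. The metadata layout, the field names (`trade_id`, `ccy`, `amt`), the feed and table names, and the helper functions are all hypothetical and chosen only for illustration.

```python
from decimal import Decimal

# Hypothetical metadata record; every name below is an illustrative assumption.
PIPELINE_METADATA = {
    "source": "trades_feed",
    "target": "bi_trades",
    "columns": [
        {"from": "trade_id", "to": "trade_id", "transform": "strip"},
        {"from": "ccy", "to": "currency", "transform": "upper"},
        {"from": "amt", "to": "amount", "transform": "decimal"},
    ],
}

# Registry of reusable transformations, looked up by the name stored in metadata.
TRANSFORMS = {
    "strip": lambda v: v.strip(),
    "upper": lambda v: v.strip().upper(),
    "decimal": lambda v: Decimal(v),
}


def apply_metadata(record, metadata):
    """Build one target row from a source record using only the metadata."""
    row = {}
    for col in metadata["columns"]:
        transform = TRANSFORMS[col["transform"]]
        row[col["to"]] = transform(record[col["from"]])
    return row


def run_pipeline(records, metadata):
    """Apply the metadata-defined column mapping to each incoming record."""
    return [apply_metadata(r, metadata) for r in records]


sample = [{"trade_id": " T-1 ", "ccy": "usd", "amt": "1250.50"}]
result = run_pipeline(sample, PIPELINE_METADATA)
```

In a sketch like this, adding a column or changing a mapping means editing the metadata record and leaving the engine code untouched, which is one way the reduction in development time claimed for metadata-driven ETL could arise.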
Keywords
Metadata-Driven ETL, Real-Time Business Intelligence, Financial Data Integration, ETL Automation, Data Pipeline Orchestration, Streaming ETL, Metadata Management, Finance Analytics, Big Data in Finance, Data Governance