Published in AWS in Plain English · Jun 22 · Member-only
2023, the Year of Cloud Resource Cost Optimization
In this article, we will focus on some key methods to effectively optimize your AWS cloud resource costs.
The after-effects of the pandemic are forcing organizations to perform cost-cutting in the digital landscape. In the past few years, organizations have been extremely busy migrating their use cases to the cloud. …
Data Science · 7 min read

Published in Towards Data Science · Feb 5 · Member-only
Use Delta Lake as the Master Data Management (MDM) Source for Downstream Applications
In this article, we will try to understand how the output from the Delta Lake change feed can be used to feed downstream applications.
As per ACID rules, the theory of isolation states that “the intermediate state of any transaction should not affect other transactions”. Almost every modern database has been built to follow this rule. Unfortunately, until recently the same rule could not be effectively implemented in the big data world. …
Delta Lake · 9 min read

Published in AWS in Plain English · Jan 30 · Member-only
Regulatory Compliance (GDPR and CCPA) Using Spark & Delta Lake
In this article, we will try to understand how we can handle regulations such as GDPR and CCPA using Delta Lake.
In the world we live in, the power of data cannot be challenged. This power has helped several companies accelerate their revenue by monetizing and marketing personal data in ways that cross the boundaries of ethical use. In some cases, the power of data has been misused to the extent…
GDPR · 9 min read

Published in Towards Data Science · Jan 23 · Member-only
Handling Slowly Changing Dimensions (SCD) using Delta Tables
Handling the challenge of slowly changing dimensions using the Delta Framework.
For a long time, the Kimball method has been a standard for dimensional data modeling. As per Kimball, “The notion of time pervades every corner of the data warehouse”. What does this mean in the context of data analytics? At a high level, modern analytics can be seen…
AWS · 10 min read

Jun 7, 2022 · Member-only
Just Keep Swimming - Dealing with Stress, Anxiety, and Burnout
This article is an attempt to share my thoughts and experience regarding stress and anxiety management for professionals in the modern era of turbulence.
I have never written on topics outside the realm of science and technology, but there is always a first time for everything. A lot of my readers may already know that I have been training students to achieve higher in areas of big data engineering, data analytics, and…
Stress Management · 7 min read

Published in AWS Tip · Jan 20, 2022 · Member-only
Benchmarking Amazon Redshift (Provisioned Clusters), Amazon Redshift Spectrum, Amazon Redshift Serverless, and Athena
Comparing query performance in the different flavors of Amazon Redshift and Athena.
For a while now, I have been thinking of performing some benchmark tests on Amazon Redshift. As laziness settles in, you tend to ignore a few things, and this was one of them. That changed when I started to read about the new offering from Amazon Redshift. …
AWS · 10 min read

Published in AWS in Plain English · Dec 16, 2021 · Member-only
Load Balancing and High Availability Options for Amazon RDS
A guide on how to make the RDS database highly available and load balance database requests using read and write request splitting.
For readers who know me through my articles and book: in this article, I am going to digress from my usual topics around data engineering, data science, or data analytics. …
Data · 7 min read

Published in AWS in Plain English · Dec 10, 2021 · Member-only
Exploring Computer Vision - Artificial Intelligence Service from AWS - Amazon Rekognition
Understanding some use cases of computer vision using Amazon Rekognition.
What is computer vision? Can computers see like humans? As per Wikipedia, “Computer vision is an interdisciplinary scientific field that deals with how computers can gain high-level understanding from digital images or videos.” Why is computer vision important in today’s world? Technically, computer vision…
Computer Vision · 7 min read

Published in Towards Data Science · Dec 3, 2021 · Member-only
Anomaly Detection in IoT Enabled Smart Battery Management Systems
Understanding the usage of data engineering and machine learning in the electric mobility world.
We are living in the world of electric mobility. Globally, the adoption of electric cars and two-wheelers is rising steeply. Electric mobility devices rely on expensive rechargeable lithium-ion batteries for power. These batteries are integral to the fight against the harmful effects of fossil fuels such as pollution…
Data Science · 8 min read

Published in AWS in Plain English · Oct 22, 2021 · Member-only
Combining the Power of Data Lake and Data Warehouse - Lakehouse Architecture
An article on the evolution of big data architectures and the power of modern Lakehouse architecture.
In this article, we will travel through time and understand the evolution of big data architectures. We will also explore the power of modern Lakehouse architecture in greater detail. I just realized that I have completed 10 years working in the field of big data. …
Big Data · 6 min read