Manoj Kukreja

534 Followers

22 hours ago

Regulatory Compliance (GDPR and CCPA) using Spark & Delta Lake

In this article, we will try to understand how we can handle regulations such as GDPR and CCPA using Delta Lake. — In the world we live in, the power of data cannot be challenged. This power has helped several companies accelerate their revenue by monetizing and marketing personal data in ways that cross the boundaries of ethical use. In some cases, the power of data has been misused to the extent…

GDPR

9 min read



Published in Towards Data Science

·Jan 23

Handling Slowly Changing Dimensions (SCD) using Delta Tables

Handling the challenge of slowly changing dimensions using the Delta Framework — For a long time, the Kimball method has been a standard for dimensional data modeling techniques. As per Kimball, “The notion of time pervades every corner of the data warehouse.” What does this mean in the context of data analytics? At a high level, modern analytics can be seen…

AWS

10 min read



Jun 7, 2022

Just Keep Swimming —Dealing with Stress, Anxiety, and Burnout

This article is an attempt to share my thoughts and experience regarding stress and anxiety management for professionals in the modern era of turbulence. — Never have I ever written on topics outside the realm of science and technology, but there is always a first time for everything. Many of my readers may already know that I have been training students to achieve more in areas of big data engineering, data analytics, and…

Stress Management

7 min read



Published in AWS Tip

·Jan 20, 2022

Benchmarking Amazon Redshift (Provisioned Clusters), Amazon Redshift Spectrum, Amazon Redshift Serverless, and Athena

Comparing query performance in the different flavors of Amazon Redshift and Athena. — For a while now, I have been thinking of performing some benchmark tests on Amazon Redshift. As laziness settles in, you tend to ignore a few things, and this was one of them. That changed when I started to read about the new offering from Amazon Redshift. …

AWS

10 min read



Published in AWS in Plain English

·Dec 16, 2021

Load Balancing and High Availability Options for Amazon RDS

A guide on how to make the RDS database highly available and load balance database requests using read and write request splitting. — For readers who know me through my articles and book: in this article, I am going to digress from my usual topics around data engineering, data science, or data analytics. …

Data

7 min read



Published in AWS in Plain English

·Dec 10, 2021

Exploring Computer Vision — Artificial Intelligence Service from AWS — Amazon Rekognition

Understanding some use cases of computer vision using Amazon Rekognition — What is computer vision? Can computers see like humans? As per Wikipedia, “Computer vision is an interdisciplinary scientific field that deals with how computers can gain high-level understanding from digital images or videos.” Why is computer vision important in today’s world? Technically, computer vision…

Computer Vision

7 min read



Published in Towards Data Science

·Dec 3, 2021

Anomaly Detection in IoT Enabled Smart Battery Management Systems

Understanding the usage of data engineering and machine learning in the electric mobility world — We are living in the world of electric mobility. Globally, the adoption of electric cars and two-wheelers is rising steeply. Electric mobility devices rely on expensive rechargeable lithium-ion batteries for power. These batteries are integral to the fight against the harmful effects of fossil fuels such as pollution…

Data Science

8 min read



Published in AWS in Plain English

·Oct 22, 2021

Combining the Power of Data Lake and Data Warehouse — Lakehouse Architecture

An article on the evolution of big data architectures and the power of modern Lakehouse architecture. — In this article, we will travel through time and understand the evolution of big data architectures. We will also explore the power of modern Lakehouse architecture in greater detail. I just realized that I completed 10 years working in the field of big data. …

Big Data

6 min read



Published in Towards Data Science

·Jan 8, 2021

The Smart Hadoop Administrator

A guide to deploying and administering Hadoop Clusters like a smart administrator — Several years ago, while I was still a developer, I was chosen to be part of the prestigious Data Administration group. It was a dream come true because not many got that chance. …

Data

8 min read



Published in Towards Data Science

·Dec 27, 2020

The Evolution of Big Data Compute Platforms — Past, Now and Later

A journey into the evolution of Big Data Compute Platforms like Hadoop and Spark. Sharing my perspective on where we were, where we are and where we are headed. — Over the past few years I have been part of a large number of Hadoop projects. Back in 2012–2016 the majority of our work was done using on-premises Hadoop infrastructure. The age of on-premises clusters… On a typical project we would take care of every aspect of the Big…

AWS

7 min read



Big Data Engineering, Data Science, Data Lakes, Cloud Computing and IT security specialist.
