+1 (437) 318-6970 / tahakhalid@mail.com

Taha Khalid - Data Engineer
Hi there! I’m Taha Khalid, a Data Engineer passionate about turning complex data into actionable insights. If you’re looking for someone who builds scalable ETL pipelines, real-time data architectures, and BI dashboards that actually drive decisions, I’m your guy! I’ve worked at Sudo Trek and Dot Labs, building and optimizing data workflows with PySpark, AWS, Snowflake, and Power BI, and I’m currently pursuing my Master’s in Computational Science at Laurentian University.
About me!
I began my Master’s in Computational Science in 2025 at Laurentian University, building on my B.Sc. in Metallurgy and Materials Engineering from the University of the Punjab. My focus is on data engineering and developing scalable solutions that optimize workflows and improve analytics. Previously, I worked at Sudo Trek and Dot Labs, where I designed and optimized ETL pipelines, automated data flows, and created interactive BI dashboards.
Outside academics and work, I enjoy reading — two books that have shaped my mindset are Atomic Habits by James Clear and Deep Work by Cal Newport, both of which fuel my focus and consistency. I’m always eager to expand my expertise, particularly in big data, cloud technologies, and advanced analytics, and I aim to contribute to impactful projects that make complex data accessible and useful.


Skills
























Projects
Explore my portfolio of data engineering projects, focused on building efficient ETL pipelines and scalable data solutions with Python, PySpark, AWS, and Snowflake.
ETL Solution for Spotify Data Using AWS and Snowflake
An ETL pipeline was built to streamline the extraction, transformation, and loading of data from the Spotify API into Snowflake, enabling real-time analysis of streaming data. The solution used AWS for data extraction and storage in S3, transformed the data using AWS Lambda with Python, and loaded it into Snowflake via Snowpipe for easy access. Automation was implemented using CloudWatch triggers, allowing the pipeline to run daily without manual intervention.


Real-Time Data Architecture for Change Data Capture, Snowflake
A robust data architecture was implemented to support Change Data Capture (CDC) using streams and to manage Slowly Changing Dimensions (SCD) Type 2, enhancing data management and enabling rapid insights. The solution used Docker on Amazon EC2 for containerization, Apache NiFi for real-time data ingestion to Amazon S3, and Snowpipe for loading data into Snowflake's staging tables. Automated tasks and streams were used for CDC, improving data handling accuracy and efficiency.
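As a rough illustration of the SCD Type 2 idea mentioned in this project, here is a minimal in-memory sketch in plain Python. It is not the Snowflake logic used in the project (that relied on streams and tasks). The function name `apply_scd2` and the fields `customer_id`, `city`, `valid_from`, `valid_to`, and `is_current` are hypothetical names chosen for the example.

```python
from datetime import date


def apply_scd2(dimension, changes, as_of):
    """Apply change records to a dimension table held as a list of dicts.

    Type 2 behaviour: a changed row is never overwritten. The current
    version is closed (valid_to set, is_current cleared) and a new
    version is appended, so the full history is preserved.
    """
    # Index the current version of each key for quick lookup.
    current = {row["customer_id"]: row for row in dimension if row["is_current"]}

    for change in changes:
        key = change["customer_id"]
        existing = current.get(key)

        # Skip records whose tracked attribute has not changed.
        if existing is not None and existing["city"] == change["city"]:
            continue

        # Close the old version instead of updating it in place.
        if existing is not None:
            existing["valid_to"] = as_of
            existing["is_current"] = False

        # Append the new current version.
        new_row = {
            "customer_id": key,
            "city": change["city"],
            "valid_from": as_of,
            "valid_to": None,
            "is_current": True,
        }
        dimension.append(new_row)
        current[key] = new_row

    return dimension


# Hypothetical sample data: one existing customer, one update, one new key.
dimension = [
    {"customer_id": 1, "city": "Lahore", "valid_from": date(2024, 1, 1),
     "valid_to": None, "is_current": True},
]
changes = [
    {"customer_id": 1, "city": "Sudbury"},  # changed attribute
    {"customer_id": 2, "city": "Toronto"},  # new key
]
apply_scd2(dimension, changes, as_of=date(2025, 1, 1))
```

After the call, the original row for customer 1 is kept but marked as closed, and two current rows exist (the new version of customer 1 and the new customer 2). In a warehouse, the same pattern would typically be expressed as a merge against the change records rather than a Python loop.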




Automated ETL Pipeline Using PySpark for Real-Time Analytics, PySpark
Developed an ETL pipeline using Apache Spark, AWS S3, MySQL, and Snowflake, reducing data processing time through automated ingestion, schema validation, and parallel processing, enabling real-time analytics. Optimized data storage by structuring customer data marts in Parquet (S3) and integrating Snowpipe for automated ingestion into Snowflake, enabling faster query execution for business intelligence reporting. Enhanced data integrity by implementing schema validation and error handling, ensuring accurate sales and customer transaction tracking while minimizing data inconsistencies. Automated sales incentive tracking using business logic-driven transformations in Spark, improving sales performance visibility and reducing reporting delays by 50%.
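The schema validation and error handling step in the PySpark pipeline can be sketched in a library-free way as follows. This is a plain-Python stand-in rather than the Spark code from the project. The column names in `EXPECTED_SCHEMA` and the function `validate_records` are assumptions made for illustration.

```python
# Hypothetical schema for a sales record: column name -> expected Python type.
EXPECTED_SCHEMA = {"order_id": int, "customer_id": int, "amount": float}


def validate_records(records, schema=EXPECTED_SCHEMA):
    """Split records into (valid, rejected) according to a simple schema.

    Rejected records are kept together with the reasons they failed,
    so they can be logged or reprocessed instead of silently dropped.
    """
    valid, rejected = [], []
    for record in records:
        problems = []
        for column, expected_type in schema.items():
            if column not in record:
                problems.append(f"missing column: {column}")
            elif not isinstance(record[column], expected_type):
                actual = type(record[column]).__name__
                problems.append(
                    f"{column}: expected {expected_type.__name__}, got {actual}"
                )
        if problems:
            rejected.append({"record": record, "errors": problems})
        else:
            valid.append(record)
    return valid, rejected


# Hypothetical input: one clean row and one row with two problems.
rows = [
    {"order_id": 1, "customer_id": 10, "amount": 99.5},
    {"order_id": 2, "amount": "12.00"},
]
valid, rejected = validate_records(rows)
```

Here the first row passes and the second is rejected for a missing `customer_id` and a string-typed `amount`. Routing bad rows to a separate collection keeps the load step working on clean data while preserving the failures for review.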
Experience
Professional Background
Dot Labs
June 2025 - August 2025
Built and optimized ETL pipelines with SQL and Airflow, resolving pipeline issues, automating data flows, and creating dashboards to drive business decisions.
Previous Role
Sudo Trek
November 2023 - May 2025
Designed and optimized ETL workflows for large-scale data processing. Applied Python, Pandas, and MySQL to ensure accuracy, integrity, and scalability. Delivered high-performance pipelines supporting complex analytics needs.




Contact Me
Let’s connect — happy to discuss projects, learning, or career opportunities.