Project
Developed scalable Azure data engineering solutions to ingest, process, and curate healthcare datasets for downstream analytics, reporting, and operational insights.
- Designed and developed scalable data pipelines on Azure that integrate and process data from multiple sources.
- Implemented and optimized Azure Data Lake solutions for storing and managing large volumes of structured and unstructured data.
- Built ETL workflows in Azure Data Factory (ADF) to automate data extraction, transformation, and loading.
- Developed data processing and analytics solutions using Databricks and PySpark, improving performance and scalability.
- Integrated streaming and batch data processing capabilities to support real-time and historical data analysis.
- Contributed to cloud cost optimization, reducing storage and compute costs while maintaining performance.
Environment
Azure Cloud
Azure Data Lake
Databricks
PySpark
Azure Data Factory
SQL Server
PostgreSQL
Python
Azure DevOps