💡 I Will Help You Design Scalable Data Platforms and Modern Pipelines
Created a month ago in Technology / Software Development

With 7+ years of experience in Data Engineering across startups and enterprises, I specialize in building scalable, cloud-native data platforms using Spark, Airflow, Kafka, dbt, and BigQuery/Snowflake.
I’ve helped startups:
Cut cloud costs by 40%
Implement modern ELT pipelines
Enable ML teams with clean, reliable data
Build real-time dashboards from streaming data
Migrate legacy systems to the cloud (AWS/GCP)
💬 On our call, I’ll help you:
✅ Architect your data platform
✅ Choose the right tools (no vendor fluff)
✅ Fix bottlenecks in your pipelines
✅ Design for scale, compliance, and future AI-readiness
You’ll leave with actionable steps, clear diagrams, and tooling recommendations tailored to your business stage.
Himanshu Jain
India
WORK EXPERIENCE

Senior Data Engineer - Data Infra, Angel One, Bangalore, India (Jan 2024 – Present)
● Architected and deployed real-time data pipelines on Databricks using Apache Kafka and PySpark, enabling scalable processing of UPI, banking, equities, and fixed-deposit datasets for 10M+ customers (sketched below).
● Mentored two data engineering interns, driving knowledge sharing on distributed data systems.
● Administered and enhanced a Consent Management System (CMS), ensuring secure PII handling and compliance while enabling seamless portfolio updates for end users.
● Onboarded and integrated 5+ new Financial Information Providers (FIPs), expanding data coverage by 15% and optimizing ETL workflows to cut data processing time by 20%, accelerating delivery of actionable insights.
● Engineered an automated Data Quality Check (DQC) framework using Great Expectations, validating 700+ tables across 40+ scheduled jobs daily to maintain data integrity, accuracy, and reliability (sketched below).
● Designed and maintained high-volume data pipelines to ingest, transform, and validate gigabytes of KYC data daily, ensuring SEBI compliance for regulatory reporting and loan underwriting.
● Tuned Amazon Redshift queries and PySpark/Apache Flink jobs for a 40% performance improvement, and reduced duplicate-data issues via an efficient Databricks Delta Lake architecture with schema evolution strategies.
● Integrated Python with AWS services (EC2, EMR, Glue, S3, IAM, Athena, SQS) via AWS APIs, alongside Gemini APIs.
(Took a brief career break to manage a temporary parent's health situation.)

Data Engineer III, Walmart Tech Lab, Bangalore, India (Sep 2023 – Dec 2023)
● Achieved a 70% reduction in time complexity and a 50% decrease in cloud costs for data processing on GCP.
● Implemented audit tables for data-integrity checks and incremental data validation.
● Integrated real-time data ingestion with Kafka, Spark, and Scala for agile data processing.

Data Engineer II, Target Corporation, Bangalore, India (May 2022 – Sep 2023)
● Designed, developed, and deployed scalable ETL pipelines using Apache Kafka, Apache Spark (Scala), Hadoop, Hive, SparkSQL, and SQL in an Agile environment, improving data processing speed by 30% and data accuracy by 40%.
● Engineered and maintained a Data Correction Framework to automate anomaly detection and remediation, correcting over 500,000 historical records and reducing data quality errors by 20% through proactive root-cause analysis.
● Led the migration of legacy Hive-based workflows to Apache Spark on Hortonworks Hadoop, optimizing pipeline performance and reducing query latency by 25%.
● Developed a Data Retention and Archival Framework that cut data storage costs by 30% while improving data discoverability and accessibility by 20% for analysts and business stakeholders.

Data Engineer I, Tekion Corporation, Bangalore, India (Jan 2020 – May 2022)
● Designed and deployed 30+ scalable ETL pipelines and 80+ Python APIs using AWS (Boto3, Lambda, API Gateway), Docker, and SQL, accelerating data processing by 50% and improving data accuracy by 30% across cloud-based systems.
● Engineered and optimized 40+ Apache Airflow DAGs to automate ETL workflows, enabling seamless integration with AWS Redshift, PostgreSQL, and enterprise data warehouses (EDWs), and reducing overall pipeline runtime by 50% (sketched below).
● Collaborated cross-functionally in Agile and DevOps environments, driving data governance, CI/CD automation, and Tableau-based analytics, resulting in 20% system-efficiency gains and higher transaction success rates via Razorpay integration.
● Implemented Python-based data quality monitoring and alerting systems, improving data integrity by 35% and supporting GDPR compliance; mentored interns on AWS, SQL, ETL best practices, and Apache Airflow, boosting team productivity by 10%.
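To give a flavor of the real-time pipeline work described in the Angel One role, here is a minimal PySpark Structured Streaming sketch that reads events from Kafka and appends them to a Delta table. The broker address, topic name, schema, and paths are hypothetical placeholders, not the production implementation.

```python
# Minimal sketch: Kafka -> Delta streaming ingestion with PySpark.
# Assumes Databricks (or Spark with the Kafka and Delta packages installed).
# Broker, topic, schema, and paths below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("upi-txn-ingest").getOrCreate()

# Assumed payload schema for an illustrative UPI transaction event.
txn_schema = StructType([
    StructField("txn_id", StringType()),
    StructField("customer_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "upi_transactions")           # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers raw bytes; parse the JSON value into typed columns.
parsed = raw.select(
    from_json(col("value").cast("string"), txn_schema).alias("e")
).select("e.*")

query = (
    parsed.writeStream.format("delta")
    .option("checkpointLocation", "/chk/upi_transactions")  # placeholder path
    .outputMode("append")
    .start("/delta/upi_transactions")                       # placeholder path
)
query.awaitTermination()
```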
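The DQC bullet names Great Expectations; below is a minimal sketch of the idea using the library's older, widely used pandas API. Table, file, and column names are hypothetical; a framework like the one described would loop over a catalog of tables and run such checks on a schedule.

```python
# Minimal data-quality check sketch using Great Expectations' classic pandas API.
# File and column names are hypothetical placeholders.
import great_expectations as ge
import pandas as pd

# Wrap a pandas DataFrame so expectation methods become available.
df = ge.from_pandas(pd.read_parquet("kyc_customers.parquet"))  # placeholder source

# Declare expectations; each call records a check against the dataset.
df.expect_column_values_to_not_be_null("customer_id")
df.expect_column_values_to_be_unique("customer_id")
df.expect_column_values_to_be_between("age", min_value=18, max_value=120)

# Run all declared expectations and fail loudly if any check does not pass.
results = df.validate()
if not results["success"]:
    raise ValueError("Data quality checks failed; inspect results for details.")
```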
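And for the Airflow work in the Tekion role, a minimal DAG sketch showing the orchestration pattern: two Python tasks with an explicit dependency on a daily schedule. The DAG ID, task logic, and target systems are hypothetical placeholders.

```python
# Minimal Airflow 2.x DAG sketch for a daily ETL workflow.
# DAG ID, task bodies, and targets are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: pull source data, e.g. from S3 via boto3.
    print("extracting source data")


def load():
    # Placeholder: load transformed data into Redshift or PostgreSQL.
    print("loading into the warehouse")


with DAG(
    dag_id="daily_etl_example",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_load  # extract must finish before load runs
```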