Lead Developer – Big Data Engineering

Ideal Candidate

 

  • The ideal candidate will have hands-on experience as a Lead Developer using Big Data technologies within the banking and financial services domain, with a proven track record of delivering successful Big Data solutions to business clients in financial services.

 

Requirements

 

  • 5-7 years of experience as a Big Data Developer
  • In-depth knowledge of Big Data technologies - Spark, HDFS, Hive, Kudu, Impala
  • Solid programming experience in Python, Java, Scala, or a comparable programming language
  • Production experience in core Hadoop technologies, including HDFS, Hive, and YARN
  • Strong working knowledge of SQL and the ability to write, debug, and optimize distributed SQL queries
  • Excellent communication skills; previous experience working with internal or external customers
  • Strong analytical abilities; ability to translate business requirements and use cases into a Hadoop solution, including ingestion of many data sources, ETL processing, data access and consumption, and custom analytics
  • Experience working with workflow managers like Airflow, Prefect, Luigi, Oozie
  • Experience working with data governance and security tools such as Apache Sentry, Apache Atlas, Apache Ranger, and Kerberos
  • Experience working with streaming data using technologies such as Kafka and Spark Streaming
  • Strong understanding of big data performance tuning
  • Experience handling a variety of structured and unstructured data formats (Parquet/Delta Lake/Avro/XML/JSON/YAML/CSV/ZIP/XLSX/Text, etc.)
  • Experience working with distributed NoSQL storage such as Elasticsearch and Apache Solr
  • Experience deploying big data pipelines in the cloud, preferably on GCP or AWS
  • Well versed in Software Development Life Cycle (SDLC) methodologies and practices
  • Spark Certification is a huge plus
  • Cloud experience is a must-have, preferably with GCP
  • Contributions to the open-source community are a big plus; Apache committer status is a strong advantage

 

Responsibilities

 

  • Integrate data from a variety of sources (data warehouses, data marts) using on-prem or cloud-based data structures (GCP/AWS); identify new and existing data sources
  • Develop, implement and optimize streaming, data lake, and analytics big data solutions
  • Create and execute testing strategies including unit, integration, and full end-to-end tests of data pipelines
  • Recommend Kudu, HBase, HDFS, or relational databases based on their respective strengths
  • Utilize ETL processes to build data repositories; integrate data into the Hadoop data lake using Sqoop (batch ingest), Kafka (streaming), and Spark, Hive, or Impala (transformation); a minimal illustrative sketch of such a pipeline appears after this list
  • Adapt and learn new technologies in a quickly changing field
  • Be creative; evaluate and recommend big data technologies to solve problems and create solutions
  • Recommend and implement the best tools to ensure optimized data performance; perform data analysis using Spark, Hive, and Impala
  • Work on a variety of internal and open source projects and tools
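
For illustration only: the sketch below shows one way the kind of pipeline described above might look in PySpark Structured Streaming, reading JSON events from a Kafka topic and landing them as Parquet in a data lake path that a Hive external table could point at. The broker address, topic name, schema, and paths are assumptions made for the example, not part of this posting.

# Minimal illustrative sketch (assumed names throughout); requires the
# spark-sql-kafka package on the Spark classpath.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = (
    SparkSession.builder
    .appName("transactions-ingest")   # hypothetical job name
    .getOrCreate()
)

# Assumed schema for events on the hypothetical "transactions" topic.
schema = StructType([
    StructField("account_id", StringType()),
    StructField("amount", StringType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # assumed broker
    .option("subscribe", "transactions")                # assumed topic
    .load()
)

# Kafka delivers the payload as bytes; cast to string and parse the JSON body.
events = (
    raw.select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Append the parsed events as Parquet under an assumed lake path; a Hive
# external table defined over this path would expose the data to Hive/Impala.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "/data/lake/transactions")
    .option("checkpointLocation", "/data/checkpoints/transactions")
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()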

 

Posted Date
2021-03-10 10:37:40
Experience
5-7 years
Primary Skills
Big Data, Spark, HDFS, Hive, Kudu, Impala, Python, Java, Scala, Parquet/Delta Lake/Avro/XML/JSON/YAML/CSV/ZIP/XLSX/Text, Elasticsearch, Apache Solr, SQL
Required Documents
Resume
Contact
bhawya@lorventech.com