Job Description and Qualifications
Air Products’ goal is to be the safest, most diverse, and most profitable industrial gas company in the world, providing excellent service to our customers. Our 4S principles are Safety, Simplicity, Speed, and Self-confidence. Effective use of data and analytics is critical to help the company achieve these goals. Our IT Data and Analytics team is seeking an Analytics Data Engineer to help us build and maintain our Amazon Web Services (AWS) S3 Data Lake.
Nature and Scope
The Analytics Data Engineer is responsible for operationalizing data pipelines that support analytics initiatives for the company. The primary responsibilities include building, managing, and optimizing data flows from sources such as SAP ECC into our S3 data lake and Redshift cluster. The primary skills needed include proficiency with Qlik (Attunity) Replicate, Qlik (Attunity) Compose, AWS Glue, Athena, Redshift, EMR, Hive, Spark, and S3. Experience operating and tuning EMR and/or Redshift clusters is desirable.
The data lake enables consumers such as data scientists and business and IT data analysts to complete advanced analytics projects as well as business reporting. The data engineer is expected to collaborate with data scientists, data analysts, and other data consumers to productionize the data models and algorithms developed by those users, improving the overall efficiency of advanced analytics projects. Additionally, the data engineer is responsible for ensuring that data quality, governance, and data security procedures are met while curating data for the Data Lake. Advanced proficiency programming in PySpark and Python ETL modules is required.
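To give a flavor of the Python ETL work described above: a PySpark job would express this as DataFrame operations on a cluster, but the same cast-validate-curate shape can be sketched in plain Python. All names here (clean_record, load_records, the field names) are invented for illustration and are not from this posting.

```python
# Minimal, self-contained sketch of an ETL transform step: cast raw string
# fields to typed values and drop rows that fail validation.

def clean_record(record: dict) -> dict:
    """Normalize one raw source record before it lands in the curated zone."""
    return {
        "id": int(record["id"]),
        "plant": record["plant"].strip().upper(),
        "qty": float(record["qty"]),
    }

def load_records(raw_rows: list) -> list:
    """Apply the transform to every row, skipping rows that fail validation."""
    curated = []
    for row in raw_rows:
        try:
            curated.append(clean_record(row))
        except (KeyError, ValueError):
            continue  # a production pipeline would route these to a quarantine table
    return curated
```

In a real pipeline the same logic would run as a PySpark UDF or DataFrame expression over S3 data, with the quarantine path feeding the data quality procedures mentioned above.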
Build and maintain data pipelines in support of the enterprise AWS S3 Data Lake.
Contribute to the core design of data architecture, data pipelines, data models and schemas, and implementation plans for the data lake
Enable an innovative approach to data platforms in order to greatly increase the flexibility, scalability, and reliability of IT services at an optimal cost
Work in cross-disciplinary teams to understand enterprise needs and ingest rich data sources.
Work with analytics and data science team members to optimize data platforms to better meet their needs
Maintain the proper infrastructure to support ETL from a variety of sources using SQL, SAP Data Services, and big data technologies
Design ETL processes based on enterprise architecture and custom project needs
Perform design reviews, plan, develop, and resolve technical issues
Work closely with management to prioritize business and information request backlogs
Ensure data governance and data security procedures are followed
Perform data replication with Qlik Replicate and maintain data marts with Qlik Compose
Leverage EMR and Hive to process data mart ETL operations, including inserts, updates, and deletes
Implement data warehouses on platforms such as AWS Redshift
Research, experiment, and utilize leading data and analytics technologies in AWS
Educate and train yourself and others as you evangelize the merits of data and analytics
Be proactive in keeping your skills fresh
Generate new ideas, never say or think “that’s not my job.”
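The replication and data mart responsibilities above center on applying a stream of change records (the inserts, updates, and deletes a CDC tool such as Qlik Replicate captures) to a target table. A minimal sketch of that apply step, with an invented change-record shape that is illustrative only and not Qlik's actual output format:

```python
# Hypothetical sketch of applying ordered CDC change records to an in-memory
# "data mart" table keyed by primary key. On EMR/Hive this would be a merge
# or insert-overwrite over partitions rather than a dict.

def apply_changes(table: dict, changes: list) -> dict:
    """Apply change records in order; last write for a key wins."""
    for change in changes:
        op, key = change["op"], change["key"]
        if op in ("insert", "update"):
            table[key] = change["row"]   # upsert semantics
        elif op == "delete":
            table.pop(key, None)         # idempotent delete
    return table
```

Processing the stream in commit order is what keeps the mart consistent with the source; out-of-order application would let a stale update resurrect a deleted row.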
4-year college degree required; Bachelor's degree in an Information Technology field or related technical discipline preferred
4+ years of experience as a Python, PySpark, Scala, or Java software developer building scalable real-time streaming ETL applications and data warehouses
Proficient with AWS and its tools (S3, Glue, EMR, Athena, Redshift)
Proficient programming experience with Python, PySpark, and Hive
Experienced in maintaining infrastructure as code using Terraform or CloudFormation
Advanced understanding of both SQL and NoSQL technologies such as MongoDB or DocumentDB
Solid understanding of data warehouse design patterns and best practices
Experience in working with and processing large data sets in a time-sensitive environment while minimizing errors
Hands-on experience working with big data technologies (Hadoop, Hive, Spark, Kafka)
Hands-on experience working with Qlik (Attunity) Replicate and Qlik (Attunity) Compose
Ability to develop test plans and stress test platforms
Demonstrated strength in process development, process adherence, and process improvement
Experience with complex job scheduling
Effective analytical, conceptual, and problem-solving skills
Must be organized, disciplined, and task/goal oriented
Able to prioritize and coordinate work through interpretation of high-level goals and strategy
Effective team player with a positive attitude
Strong oral and written English language communications skills
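One of the data warehouse design patterns the qualifications above refer to is the slowly changing dimension. A minimal Type 2 sketch, in which a changed dimension row is versioned rather than overwritten; all field names (natural_key, is_current, and so on) are invented for illustration:

```python
# Hypothetical Type 2 slowly-changing-dimension upsert: expire the current
# version of a changed row and append a new version, preserving history.

def scd2_upsert(history: list, incoming: dict, as_of: str) -> list:
    """Close the current row for this key if its attributes changed, then append."""
    for row in history:
        if row["natural_key"] == incoming["natural_key"] and row["is_current"]:
            if row["attrs"] == incoming["attrs"]:
                return history           # unchanged: keep the current version
            row["is_current"] = False    # expire the old version
            row["valid_to"] = as_of
    history.append({
        "natural_key": incoming["natural_key"],
        "attrs": incoming["attrs"],
        "valid_from": as_of,
        "valid_to": None,
        "is_current": True,
    })
    return history
```

On Redshift this pattern is typically implemented as an UPDATE to expire rows followed by an INSERT of new versions inside one transaction, so reports can join facts to the dimension version that was current at the time of the transaction.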