In the role of Data Engineer, you will be responsible for defining, designing, and delivering cloud-native, high-throughput, scalable, and distributed data-centric products and services. The role focuses on creating a value chain that addresses the challenges of acquiring large, complex data and evaluating, distilling, and analyzing it. This individual will be responsible for developing robust, scalable, and maintainable data systems using a combination of in-house tools and standard technologies.
Career Level - IC4
Responsibilities:
- Lead the development and implementation of data-centric products, applying data engineering concepts, data architecture design, and systems performance tuning.
- Design and implement high-throughput distributed data pipelines, real-time data analytics processing, interactive dashboards, and ML/AI-based services.
- Partner with product and engineering stakeholders to identify software requirements and translate them into a technical design and implementation plan.
- Develop and maintain technical documentation, including architecture diagrams, design specifications, and system diagrams.
- Work with development teams to ensure software projects are delivered on time, within budget, and to the required quality standards.
- Provide guidance and mentorship to junior developers.
- Stay up to date with industry trends and developments in software architecture and development practices.
Qualifications:
- 8+ years of experience designing and implementing data architectures, including data lakes, data warehouses, and distributed and real-time data processing systems.
- Experience with modern data stack: data ingress/egress, ETL/ELT, DataOps, Apache Spark, Kafka, Flink, NiFi
- Experience with SQL and NoSQL databases and data warehouse solutions: Oracle ADW, MySQL, MongoDB, Cassandra, etc.
- Knowledge of data-at-scale processing tools: Oracle Datalake, Databricks, Cloudera, etc.
- Demonstrated ability in building and deploying software applications on one or more public cloud providers such as OCI, AWS, Azure, GCP, or equivalent.
- Hands-on programming skills in Python, Java, SQL, and PL/SQL.
- Experience with data modeling concepts and tools: Data Modeler, dbt, Apache Avro, Parquet.
- Experience with DevOps practices involving containers and Kubernetes, CI/CD, and canary deployments.
- Experience with microservice architecture patterns, including but not limited to API gateways and event-driven and reactive architectures.
- Proficiency with data analytics tools and experience creating data visualizations, dashboards, and reports: Oracle Business Intelligence (BI), Tableau, Power BI, Looker, Grafana, D3.js, Plotly.
- Conceptual knowledge and practical experience with statistical analysis and machine learning frameworks: Scikit-learn, TensorFlow, PyTorch.
- Proficiency in using Jupyter Notebooks or similar environments for model development.
- Experience with MLOps practices for managing the lifecycle of ML models including versioning, deployment, monitoring, and governance: Oracle AI Service, Oracle Data Science Services, MLflow, Kubeflow, Apache Airflow, SageMaker or equivalent.
- Conceptual knowledge and experience with generative AI models, techniques, and tools: Large Language Models (LLMs), vector databases, LangChain, LlamaIndex, Hugging Face Transformers.
- Excellent communication skills to convey technical concepts to non-technical stakeholders.
- Strong collaboration skills to work with data scientists, analysts, and business stakeholders.
- Proven leadership skills in managing and mentoring a team of data engineers.
- Ability to influence and drive cross-functional teams towards a unified data strategy.
- Strong analytical and problem-solving skills to address complex data challenges.
- Ability to translate business requirements into data solutions.