-
Develop the data infrastructure and related services for real-time and streaming data analytics
-
Deploy automation frameworks to achieve the highest levels of engineering quality
-
Assist in data exploration, feature engineering, model training, testing, and deployment at scale
-
Develop quick prototypes and demonstrations to showcase key data insights
-
Design and develop time-series machine learning and statistical models for anomaly detection, forecasting, pattern identification, data aggregation and transformation at scale
-
Evaluate and deploy the developed models in distributed and non-distributed environments
-
Collaborate and communicate effectively with business and technical teams to deliver strong results
-
Experience handling massive data pipelines using messaging systems such as Kafka (preferred), SQS, RabbitMQ, or ActiveMQ
-
Experience building streaming analytics and real-time processing systems using Confluent Streams, Spark Streaming, or similar
-
Ability to learn and use new technologies quickly
-
Excellent understanding of data structures, algorithms and distributed systems
-
Experience with and understanding of traditional relational, NoSQL, and columnar databases such as Oracle, MySQL, PostgreSQL, Cassandra, DynamoDB, Redshift, and Vertica
-
Experience with and understanding of high-performance real-time analytics databases such as Druid
-
2+ years of experience developing highly distributed systems using cloud and open-source technologies
-
4+ years of strong experience in one or more of the following programming languages: Python (preferred), Java, Scala, C++
-
Great team player with excellent written and verbal communication skills
-
At least an undergraduate engineering degree (postgraduate preferred) in Computer Science or a related technical discipline such as IT or ECE from a reputed institution: a Master's degree with 2+ years of experience or a Bachelor's degree with 4+ years of experience.