• 8+ years of experience designing, building, and deploying highly scalable, high-performance AI/ML systems, with a strong foundation in software engineering (Java/Python) and a solid understanding of object-oriented design and data structures.
• Hands-on experience managing the full machine learning lifecycle, including data preprocessing, feature engineering, model training, evaluation, and deployment, following best practices for reproducibility and scalability.
• Strong experience implementing MLOps practices, including automated training, testing, and deployment pipelines using tools like Jenkins or GitHub Actions, integrated into CI/CD workflows.
• Proficient in Python-based ML ecosystems, with hands-on experience in frameworks such as TensorFlow, PyTorch, and scikit-learn for building and deploying models in production environments.
• Experience working with large-scale data processing, messaging, and streaming platforms such as Kafka, RabbitMQ, or Google Pub/Sub to enable real-time and batch ML pipelines.
• Deep expertise in distributed systems and data infrastructure, including working with large datasets, NoSQL databases such as Apache Cassandra, and data warehouses such as BigQuery.
• Experience designing, implementing, and optimizing search, recommendation, or ranking systems using tools like Elasticsearch or Apache Solr.
• Hands-on experience with containerization and orchestration technologies, especially Docker (Kubernetes is a plus), for deploying and scaling ML workloads.
• Strong experience building and maintaining CI/CD pipelines for ML systems, focusing on automation, reliability, model versioning, and fast delivery.
• Experience designing and managing infrastructure on Google Cloud Platform (GCP), including services like BigQuery, Cloud Functions, Cloud Run, Dataflow, and ML-specific services.
• Familiarity with model monitoring, observability, and performance optimization techniques, including data drift detection, logging, and alerting for production ML systems.
Preferred / Nice-to-Have Skills
• Strong communication and collaboration skills; ability to work cross-functionally with data scientists, engineers, QA, and product teams, proactively driving ML adoption and automation initiatives.
• Working knowledge of data visualization and frontend technologies (JavaScript, React, Node.js) to support model interpretability and dashboards.
• Hands-on experience with feature stores, experiment tracking, and model versioning tools (e.g., MLflow, Vertex AI).
• Familiarity with caching solutions such as Memcached or Redis, and with optimizing low-latency ML inference systems.
• Experience using AI-assisted development tools (e.g., GitHub Copilot) to improve productivity and automate repetitive ML/DevOps tasks.