Original listing text, shown exactly as published by the company.
What will your responsibilities be?
- Developing and applying models and algorithms.
- Training, validating, and optimizing Machine Learning and Deep Learning models.
- Integrating Generative AI and LLMs into various solutions.
- Documentation, versioning, and maintenance of models.
- Developing, maintaining, and optimizing high-quality, reliable, and robust data pipelines.
- Extraction, cleaning, and validation of large datasets.
- Data interpretation to uncover solutions and business opportunities.
- Data analytics using Business Intelligence tools such as Apache Superset.
Technical Requirements
Data Science & Machine Learning
- Solid foundation in mathematics and statistics.
- Expert knowledge of machine learning models and analytical/mathematical modeling.
- Statistical knowledge of Time Series and forecasting evaluation metrics.
- Advanced proficiency in Python (OOP best practices, testing, Pandas, PySpark, NumPy,
Scikit-learn, LightGBM / XGBoost / CatBoost, Matplotlib, TensorFlow, Hugging Face).
- Graph analysis and algorithm development.
Generative AI
- Knowledge of the state-of-the-art in Language Models and GenAI.
- Agentic AI Architecture: agent loops, tools, MCPs, and Python agentic libraries.
- Ability to deploy agents, build RAG systems, and apply extraction techniques using
visual/computer vision models.
Data Engineering & Big Data
- Strong mastery of SQL and query optimization.
- Knowledge of the Big Data ecosystem (Spark, Glue, Redshift, etc.).
- Airflow: Ability to build DAGs to orchestrate data flows.
- Experience handling Big Data file formats and tools such as Parquet and DuckDB.
- Pipeline design and ETL architecture (scheduling, backfill management, fault recovery,
etc.).
Infrastructure & Operations (MLOps)
- Knowledge of AWS and basic data infrastructure (S3, Redshift, Bedrock, EC2, EKS, ECR).
- Docker: Containerization for service deployment.
- Kubernetes: Ability to operate and interact with clusters.
- Scalable architectures for ML operations.
- Proficiency with Git version control.
- Experience putting models into production.
- Continuous monitoring and model retraining for live operations.
Business Intelligence
- Knowledge of data visualization tools (Apache Superset, Power BI).
- Dashboarding: Ability to create dashboards for business stakeholders.
What do we offer?
- 100% Remote work.
- Flexible schedule: Daily stand-up at 08:30 AM with core hours until 02:00 PM; you
manage the rest of your day.
- Competitive salary.
- Great working environment.