A remote Data & ML role at 3Pillar.
How Sydicom helps: for candidates in India, we read this listing’s requirements, tune your CV and cover letter to the keywords its ATS (Lever) scans for, then help you apply.
Original listing text, shown exactly as published by the company.
Data Pipeline Engineering: Build, test, and maintain production pipelines (batch & real-time) on Snowflake, PySpark, Delta Lake, and Kafka.
Implement data quality checks, schema validation, and alerting at every pipeline stage.
Migrate legacy ETL/DWH to cloud-native AWS/Azure architectures with measurable latency and cost improvements.
Maintain CI/CD pipelines: automated testing, deployment, rollback, and IaC (Terraform, GitHub Actions).
RAG, Vector & Retrieval Infrastructure: Build end-to-end retrieval infrastructure: document ingestion, embedding pipelines, vector store management (Pinecone, FAISS, ChromaDB, OpenSearch), and hybrid retrieval layers.
Implement chunking, metadata filtering, and re-ranking, tuning for precision, recall, and latency.
Maintain data freshness and index consistency; instrument with context relevance and faithfulness metrics.
Semantic Layer & Knowledge Infrastructure: Implement and maintain business entity mappings, ontologies, and knowledge graphs (Neo4j) per Architect design.
Build and version the feature store and semantic data contracts serving both ML models and LLM applications.
Manage metadata, data lineage, and audit trail instrumentation across the platform.
ML/LLMOps Pipeline Support: Build ML data infrastructure: training curation, feature engineering, MLflow experiment tracking, dataset versioning.
Support LLM fine-tuning workflows — corpus curation, quality filtering, dataset formatting.
Implement automated evaluation pipelines: factual accuracy, hallucination detection, regression tracking.
Maintain production monitoring dashboards for pipeline health, model metrics, and alerting.
Agentic Data Infrastructure: Build and maintain data APIs, tool schemas, and memory/state stores that autonomous agents depend on.
Implement agent observability: capture inputs, retrieved context, tool calls, reasoning traces, and outputs.
Maintain text-to-SQL layers, semantic query interfaces, and context APIs for conversational AI consumers.
Governance, Security & Data Quality: Implement RBAC, attribute-based access, PII detection/masking, data classification, and audit logging.
Enforce data contracts and schema governance with automated breaking-change detection and versioned migrations.
Build data quality monitoring (completeness, freshness, consistency) with automated alerting and root-cause tooling.
Support compliance readiness: audit trails, data provenance, and regulatory documentation.
aligned engineering.
Secondary Skills: LangChain, LlamaIndex, LLM APIs (OpenAI, Bedrock, Claude, HuggingFace), Pinecone, FAISS, ChromaDB, OpenSearch, MLflow, FastAPI, Neo4j, LangGraph, prompt engineering, RLHF dataset prep, LLM fine-tuning workflows
At 3Pillar, we create an environment where people can do their best work while maintaining a healthy work-life balance.
Our culture is guided by four core values: Collaboration, Outperform, Respect, and Evolve—the principles that shape how we work, grow, and succeed together.
3Pillar
3Pillar Global is a digital product development company that partners with businesses to build innovative software solutions. They specialize in custom software development, product strategy, and user experience design to help clients accelerate their digital transformation.