Experience
Changelog from my journey
ML Engineer · AI Platform
AILY Labs
May 2024 – Present
Barcelona, Spain
- Maintain and improve the org's shared ML/AI platform on Kubernetes (EKS), eliminating per-team infrastructure duplication and cutting model deployment lead time with shared GitHub Actions CI/CD workflows.
- Engineered a standardised FastAPI model-serving framework adopted org-wide: factory pattern, request/response middleware, health checks, Datadog APM tracing, and multi-tenant MLflow model loading.
- Authored and maintain an internal scikit-learn-compatible ML utilities library covering MRMR/SHAP feature selection, Optuna hyperparameter tuning, statistical drift detection, and MLflow lifecycle management.
- Designed a Knowledge Graph platform from greenfield to production on Neo4j: NLP-driven entity extraction, matching and merging pipelines, GraphRAG, LLM exploration agents, Pydantic data models.
- Maintain a shared GenAI library (embeddings, LLMs, vector DBs, Langfuse integration) and a unified data access layer enabling 10+ services to share a single tested data surface.
- Delivered production LLM and agentic systems: hybrid + vector RAG on OpenSearch, real-time ReAct agents via PydanticAI + MCP servers, a unified LLM gateway (OpenAI, Bedrock) with quota management.
- Productionising a LoRA model for serving on GPU node pools, culminating in a purpose-built autocomplete model served via the vLLM inference engine on Kubernetes.
- Built an OpenSearch data platform end-to-end: Textract LAYOUT + Anthropic contextual retrieval, retrieval evals, cluster health and indexing pressure monitoring, cluster infra management with Terraform.
- Designed and shipped an MCP server exposing OpenSearch to LLM agents via a PydanticAI query agent with a semantic index catalog, DSL validator, and inline Bedrock vector injection, giving agentic RAG a safe, validated query surface.
- Contributed to a pull-based lakehouse query orchestrator (DuckDB/DuckLake over Redis + Kubernetes) for agentic services: pod family sizing, gradient-based proactive scaling, crash-safe inflight recovery, and a tail-latency reduction from ~1s to ~50ms.
- Contributed to a shared semantic layer (metadata and context layer) that gives agents and services a unified catalog API over tables, indexes, and skills, with tenant-aware metadata resolution and a dbt/YAML publish pipeline into Postgres.
Kubernetes · FastAPI · MLflow · Neo4j · PydanticAI · RAG · LLMs · OpenTelemetry · Python
Data Scientist · Projects Officer
IMF
Nov 2023 – Apr 2024 · Apr 2025 – Jun 2025
Remote
- Quantified causal effects of IMF interventions on member-state conflict under a structural causal framework.
- Built a RAG pipeline over MONA policy documents to accelerate evidence retrieval for economists.
- Designed a Human-in-the-Loop annotation pipeline (Label Studio + Few-Shot Learning) that reduced manual labelling effort while maintaining research-grade label quality.
RAG · Causal Inference · NLP · Python · Label Studio
Data Analyst
Gameloft
Jul 2023 – Dec 2023
Barcelona, Spain
- Shipped funnel dashboards tracking CTR, DAU and RPU across titles.
- Designed A/B testing frameworks for seasonal campaigns that measurably lifted player retention and monetisation metrics.
A/B Testing · SQL · Dashboards · Analytics
Graduate Researcher
Novartis & Barcelona School of Economics
Apr 2023 – Jul 2023
Barcelona, Spain
- Implemented a novel inductive GraphSAGE variant (GNN/GCN/GAT) to learn embeddings from Knowledge Graphs extracted from earnings-call transcripts.
- Used the embeddings to estimate cumulative abnormal returns in the 30-day post-event window.
GNN · GraphSAGE · Knowledge Graphs · PyTorch · NLP
Data Engineer
IQVIA
Apr 2021 – Aug 2022
Kerala, India
- Replaced a legacy Informatica pipeline with Spark SQL-based automated report generation, cutting processing time significantly.
- Built ELT pipelines in Snowflake integrating pharmaceutical sales data with SAP data in a star schema.
- Implemented GDPR-compliant data anonymisation for healthcare professionals (HCPs) and onboarded new market data models in Reltio MDM.
Snowflake · Spark · Informatica · SQL · Reltio
Education
MSc Data Science
Barcelona School of Economics
2022 – 2023
GPA 8.62 / 10
Causal Inference · Bayesian Statistics · Deep Learning · NLP / LLMs · Graph Neural Networks · Probabilistic Inference
B.Tech Mechanical Engineering
Mar Athanasius College of Engineering, Kerala
2017 – 2021
CGPA 9.13 / 10