Senior AI/ML Engineer · Agentic AI · LLMs · Healthcare

Building agentic & generative AI
that actually moves healthcare metrics.

I design and deploy enterprise-grade AI systems that automate audits, surface clinical insights, and make healthcare operations measurable – from multi-agent LLM platforms to cloud-native MLOps across GCP, AWS, and Azure.

Experience: 7.5+ years · AI/ML · Agentic AI · Healthcare
Clinical audit automation: 65%↓ manual time · 12k+ episodes audited monthly
Model impact: 82% precision · TVT & trauma eligibility pipeline
Vineet Srivastava · Agentic AI & LLM Engineer · Healthcare
Shipping production AI
Current: Delivery Consultant – Senior AI/ML Engineer @ AWS
Focus: Agentic AI · LLMs · MLOps · Healthcare / IoT
Multi-agent orchestration · Graph RAG · Vertex AI · Bedrock · Databricks
Trusted by teams at
AWS · HCA Healthcare · Dana-Farber / Harvard · Qualcomm · Dexcom · Capgemini · L&T Technology Services · Mirafra Technologies · Databricks Generative AI accreditation

About

I work at the intersection of agentic AI, LLMs, and healthcare operations – with a bias toward measurable, production outcomes.

I’m a Senior AI/ML Engineer with 7.5+ years of hands‑on experience building and shipping data‑intensive systems – from agentic AI audit platforms over clinical notes to IoT telemetry pipelines on AWS and Azure.

Recent work spans:

  • Designing multi‑agent A2A workflows on Vertex AI + Neo4j that automatically audit guideline adherence and revenue cycle gaps across 100k+ clinical notes per month.
  • Building LLM‑driven TVT & trauma pipelines that process both structured data and scanned PDFs via OCR / vision LLMs, pushing eligibility detection precision from 65% to 82%.
  • Architecting NLP2SQL agentic systems to query heterogeneous healthcare assets on AWS using Bedrock, AgentCore and custom agents.
Chicago, IL
US healthcare, oncology, mental health, IoT / wireless
Quick snapshot
AWS · GCP · Azure · Vertex AI · Bedrock · MedLM · LangGraph · LangChain · Neo4j · Vector DBs · Terraform · Kubeflow · MLflow
Education
M.S. Business Analytics · UIC
B.Tech ECE · VIT University
What I’m good at
  • Turning messy clinical workflows into robust agentic pipelines
  • Production‑grade MLOps / LLMOps
  • Responsible AI & hallucination detection
  • Explaining tradeoffs to both clinicians and CTOs

Skills

Strong depth in agentic / LLM systems, plus full‑stack MLOps across major clouds for healthcare and IoT workloads.

Agent Developer Kit (ADK) · Multi‑agent orchestration (A2A) · MCP (Model Context Protocol) · LangGraph · LangChain agents · Graph RAG (Neo4j) · RAG over vector DBs (FAISS, Chroma, Pinecone) · Gemini, GPT‑4/3.5, Claude Sonnet 4 · MedLM & domain LLMs · Fine‑tuning · PEFT · LoRA · RLHF, SFT, DPO/GRPO · Prompt design · CoT · Retrieval‑guided prompts
Vertex AI (Pipelines, Run, Feature Store) · AWS Bedrock · AgentCore · SageMaker · Azure WebApps · Data Factory · Synapse · Terraform · CloudFormation · Kubernetes · GKE · Docker · GitHub Actions · Cloud Build · Databricks Workflows & MLflow · Kafka · Data lakes · ETL pipelines · Monitoring · drift checks · evaluation pipelines
RCM analytics & audit automation · Clinical guideline compliance · FHIR / HL7 integration patterns · Oncology · mental health · trauma care · OCR + LLM for scanned documents · Entity extraction · pre‑population pipelines
Python · SQL · R · PyTorch · TensorFlow · PySpark · Pandas · NumPy · FastAPI · Flask · Streamlit · Embedded C · OOP

I care less about “framework of the week” and more about maintainable, observable systems that are easy for teams to operate at scale.

100k+ notes / month: LLM‑driven audit pipelines over clinical documentation with explainable outputs and graph‑backed evidence trails.
Vertex AI · Gemini · Neo4j · ADK
65%↓ manual review: Automated clinical pathway & RCM compliance checks with multi‑agent workflows and structured augmentation.
Agentic AI · A2A orchestration
8%↑ F1 in mental health: Fine‑tuned LLM stack and Graph RAG for mental health note classification & summarization.
Databricks · Flan‑T5 · BERT
22%↑ device throughput: IoT telemetry pipeline on AWS to surface Bluetooth failures and proactively flag faulty wearables.
SageMaker · FastText · AWS data lake

Experience

I’ve seen AI in the wild – across large hospital systems, cancer institutes, IoT / wireless, and now cloud consulting.

Delivery Consultant – Senior AI/ML Engineer
Amazon Web Services (AWS) · Multiple domains
Oct 2025 – Present
  • Building end‑to‑end NLP‑to‑SQL agentic AI solutions for hospitals and healthcare asset tracking using AWS Bedrock, AgentCore, and custom multi‑agent patterns.
  • Focused on operationally safe agents – guardrails, evaluation, and observability baked in from day one.
AWS Bedrock · AgentCore · Multi‑agent · Healthcare assets
ML Engineer II (Senior AI/ML Engineer)
HCA Healthcare · Healthcare
Jan 2024 – Oct 2025
  • Co‑architected an agentic AI audit platform on Vertex AI + Cloud Run + Neo4j, orchestrating multi‑agent workflows for clinical guideline and RCM compliance, processing 100k+ notes/month.
  • Led Terraform‑based IaC, CI/CD, and autoscaling; automated audit for 12k+ patient episodes monthly and cut manual review time by 65%.
  • Designed TVT & trauma eligibility solution with BERT‑based pipelines and OCR, improving precision from 65% → 82% and speeding inference by 40% via GPU‑backed Vertex AI pipelines.
  • Built a generic hallucination detection framework combining semantic similarity, fuzzy matching, and LLM‑as‑judge to give humans fast, explainable quality checks on LLM outputs.
Agentic AI · A2A · Gemini · MedLM · Graph RAG · Neo4j · Terraform · GCP
AIOps Intern
Dana‑Farber Cancer Institute (Harvard Medical School) · Mental health
May 2023 – Aug 2023
  • Fine‑tuned ALBERT, ClinicalBERT, GPT‑3.5/4, and LLaMA2 on 300k+ labeled notes to detect mental disorders.
  • Implemented Graph RAG with knowledge graph DB for contextual retrieval, improving diagnostic and treatment recommendation relevance by 25%.
  • Orchestrated the full LLMOps stack with Databricks Workflows and MLflow; used PEFT / LoRA so that fine‑tuning a 1GB Flan‑T5 model yielded a ~30MB adapter without losing performance.
Graph RAG · Databricks · ClinicalBERT · Flan‑T5
Senior Engineer
Qualcomm · IoT / Wireless
Nov 2021 – Jul 2022
  • Designed a full AWS‑hosted pipeline to predict Bluetooth connection failures for wearables; classification models achieved 84% AUC and identified 1000+ faulty devices.
SageMaker · PySpark · Data lake · FastText
Consultant / Senior Engineer / Engineer
Capgemini (Dexcom) · Mirafra · L&T Technology Services
Aug 2016 – Oct 2021
  • Built analytics framework for Dexcom’s CGM BLE signal‑loss events, deployed on Azure WebApps with CI/CD, improving device reliability via proactive detection.
  • Led IoT ML pipeline for smart‑building battery prediction (AWS EC2/Kafka/MongoDB), cutting installation cost by 30%.
  • Developed LSTM‑based indoor localization model, improving accuracy by 20% on noisy RSSI data.
Azure · WebApps · Kafka · MongoDB · LSTM · Forecasting
Bottom line: I’m not a “demo” person. I ship auditable, monitored AI services that plug into existing hospital workflows and IoT stacks – and I stay close to the metrics they move.
Hallucination framework: Generic, pluggable evaluation layer combining semantic similarity, fuzzy matching, and LLM‑as‑judge for safer agentic systems.
Reusable across use‑cases
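To make the layering concrete, here is a minimal sketch of how such a check could be wired together. It is not the production framework: the function names, the `Verdict` fields, and the 0.6 threshold are hypothetical, token overlap stands in for embedding-based semantic similarity, and the LLM judge is just a pluggable callable.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Optional


def _tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_similarity(claim, evidence):
    """Jaccard token overlap; a cheap stand-in for embedding similarity (assumption)."""
    a, b = _tokens(claim), _tokens(evidence)
    return len(a & b) / len(a | b) if a and b else 0.0


def fuzzy_similarity(claim, evidence):
    """Character-level fuzzy match ratio in [0, 1] from the standard library."""
    return SequenceMatcher(None, claim.lower(), evidence.lower()).ratio()


@dataclass
class Verdict:
    supported: bool
    semantic: float
    fuzzy: float
    judge: Optional[bool]  # None when the judge was not consulted
    reason: str


def check_claim(claim: str, evidence: str,
                judge: Optional[Callable[[str, str], bool]] = None,
                threshold: float = 0.6) -> Verdict:
    """Run cheap checks first; escalate to the (hypothetical) LLM judge only if they fail."""
    semantic = overlap_similarity(claim, evidence)
    fuzzy = fuzzy_similarity(claim, evidence)
    if max(semantic, fuzzy) >= threshold:
        return Verdict(True, semantic, fuzzy, None, "matched source text")
    if judge is not None:
        decision = bool(judge(claim, evidence))
        return Verdict(decision, semantic, fuzzy, decision, "escalated to LLM-as-judge")
    return Verdict(False, semantic, fuzzy, None, "no support above threshold")
```

Running the cheap checks first keeps the judge call for the ambiguous cases, and returning the individual scores alongside the decision is what lets a human reviewer see why an output was flagged.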
Procedural keyword engine: High‑speed bedside procedure detection (Aho‑Corasick + LLM context + NegEx) with ~95% accuracy at BigQuery scale.
Enabled rapid audit queries
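As an illustration of the matching and negation steps only, the sketch below pairs a small pure‑Python Aho‑Corasick automaton with a NegEx‑style look‑back for negation cues. It is a simplified stand‑in, not the production engine: the function names, the tiny trigger list, and the 40‑character window are assumptions, and the LLM context step is left out.

```python
import re
from collections import deque

# Tiny illustrative subset of negation cues; real NegEx trigger lists are much larger.
NEGATION = re.compile(r"\b(no|not|denies|without|declined)\b")


def build_automaton(keywords):
    """Build Aho-Corasick tables: trie transitions, failure links, and outputs."""
    goto, fail, out = [{}], [0], [[]]
    for word in keywords:
        state = 0
        for ch in word.lower():
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word.lower())
    queue = deque(goto[0].values())  # depth-1 states fail back to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]  # inherit matches ending at the suffix
    return goto, fail, out


def find_keywords(text, automaton):
    """Return (start_index, keyword) for every match in one pass over the text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text.lower()):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits


def detect_procedures(note, keywords, window=40):
    """Flag each keyword hit as negated if a cue precedes it in the same sentence."""
    lowered = note.lower()
    results = []
    for start, word in find_keywords(note, build_automaton(keywords)):
        context = lowered[max(0, start - window):start].split(".")[-1]
        results.append({"procedure": word, "start": start,
                        "negated": bool(NEGATION.search(context))})
    return results
```

For example, `detect_procedures("Central line placed at bedside. No intubation was performed.", ["central line", "intubation"])` reports the central line as affirmed and the intubation as negated. The automaton scans each note once regardless of how many keywords are loaded, which is the property that matters for large note volumes.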

Selected Work

A few concrete systems that show how I think about architecture, safety, and measurable business value.

Clinical Audit Agentic Platform
HCA Healthcare · Production

Multi‑agent system on Vertex AI + Neo4j that reads clinical notes, enriches with structured data, and runs graph‑based checks against guidelines / RCM rules. Outputs explainable audit reports with evidence trails.

→ 100k+ notes/month · 12k+ episodes audited · 65% reduction in manual review · higher documentation accuracy.

Gemini · MedLM · Graph RAG · Neo4j · Cloud Run · Terraform · Agent Developer Kit (ADK)
What I owned: Architecture, agent design, prompt/data flows, IaC, CI/CD, and alignment with compliance teams.
Key differentiator: Explains every model decision with linked evidence in the graph – auditors can challenge, not blindly trust.
TVT & Trauma Eligibility Pipeline
HCA Healthcare · Production

Hybrid BERT + OCR / vision‑LLM stack that reads medical histories, scanned PDFs and structured EMR data to identify candidates for transcatheter valve therapy and trauma care.

→ Precision lifted from 65% → 82%, pipeline speed up by 40%, integrated into existing enterprise workflow.

Vertex AI · Kubeflow · BERT · custom spaCy models · OCR · multimodal LLMs · Cloud Run · CI/CD
What I owned: Model selection & tuning, RAG design, GPU‑based deployment, and performance / precision trade‑offs with clinicians.
Key differentiator: Handles both unstructured notes and low‑quality scans in the same pipeline with clear confidence signals.
Mental Health Graph RAG
Dana‑Farber / Harvard · Research → LLMOps

Graph‑based RAG over 300k+ mental‑health notes: nodes encode symptoms, diagnoses, and treatments; LLMs retrieve context via graph traversals for more grounded diagnosis suggestions and summaries.

→ ~25% boost in relevance of treatment recommendations and +8% F1 over existing models.

ClinicalBERT · Flan‑T5 · Graph DB · RAG · Databricks · MLflow · PEFT · LoRA
What I owned: Pipeline design, RAG strategy, and full Databricks workflow from staging to production with experiment tracking.
Key differentiator: Graph RAG preserved nuanced comorbidities that vanilla vector search kept losing.
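The retrieval idea can be sketched with an in‑memory graph. This is a toy illustration under stated assumptions, not the project's pipeline: the graph contents are invented sample data, the relation names and function names are hypothetical, and a real deployment would query a graph database rather than a Python dict.

```python
from collections import deque

# Invented toy graph: node -> list of (relation, neighbor). Not clinical guidance.
GRAPH = {
    "insomnia": [
        ("symptom_of", "major depressive disorder"),
        ("symptom_of", "generalized anxiety disorder"),
    ],
    "major depressive disorder": [
        ("treated_with", "cognitive behavioral therapy"),
        ("comorbid_with", "generalized anxiety disorder"),
    ],
    "generalized anxiety disorder": [
        ("treated_with", "cognitive behavioral therapy"),
    ],
}


def retrieve_context(graph, seeds, max_hops=2):
    """Breadth-first traversal from seed entities, collecting
    (source, relation, target) triples reachable within max_hops."""
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    triples = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for relation, neighbor in graph.get(node, []):
            triples.append((node, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return triples


def build_prompt(question, triples):
    """Render retrieved triples as grounding facts for an LLM prompt."""
    facts = "\n".join(f"- {s} --{r}--> {t}" for s, r, t in triples)
    return f"Use only these graph facts:\n{facts}\n\nQuestion: {question}"
```

Starting from `"insomnia"` with two hops, the traversal returns the `comorbid_with` edge between the two diagnoses as an explicit fact, which is the kind of relationship a flat similarity search over text chunks has no direct way to represent.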
IoT Wearables Failure Prediction
Qualcomm · Production

AWS‑native pipeline aggregating BLE telemetry to flag devices likely to drop connections in the field. Outputs fed into QA and support to pull or patch hardware proactively.

→ AUC ~84%, 1000+ faulty devices identified, production efficiency up by 22%.

SageMaker Studio · Athena · Glue · Lambda · FastText embeddings
What I owned: End‑to‑end architecture, feature engineering, and model deployment integrated into existing AWS data lake.
Key differentiator: Gave hardware teams concrete failure signatures instead of vague “it’s flaky” reports.
How I like to work

My default approach for any AI initiative:

  • Start from the target metric (throughput, accuracy, cost, risk) – not the model.
  • Prototype quickly with honest evaluation; kill bad ideas early.
  • Design for observability: traces, guardrails, and business‑level dashboards.
  • Make outputs easy for non‑technical stakeholders to challenge and override.

If you’re looking to take an AI project from “demo deck” to “quietly running in production every day”, that’s where I’m useful.

Publications & IP

Peer‑reviewed work, patents, and media coverage around healthcare AI, passive sensing, and enterprise AI adoption.

MindWatch: Passive Detection of Mental Disorders from Wearables & Behavior Signals
Preprint · 2023 · Multimodal sensing + ML for mental‑health insights

Research on passive mental‑health monitoring using wearable and behavioral signals, focusing on predictive modeling and early‑warning systems.

JMIR AI – Clinical LLM Work
JMIR AI · 2025 · Large language models for mental‑health/clinical workflows

Work on using LLMs safely in clinical contexts – evaluation, bias, and responsible deployment patterns.

Systems & Methods for Context‑Aware Notifications
US Patent · US20200039784A1

Patent on delivering context‑aware notifications using sensor data and adaptive logic – early work that informs how I think about context for today’s agents.

Writing & Thought Leadership
Medium · AI, healthcare, MLOps

Regular writing on pragmatic AI in healthcare, agentic architectures, and lessons from production systems.

Certification
Databricks Accredited Generative AI Fundamentals
Issued 2024 · Expires 2026

Reinforces my work on LLM / RAG systems with Databricks – used heavily for mental‑health and clinical pipelines.

Featured by industry media

My journey from healthcare AI to AWS – and how enterprises can move from pilots to autonomous AI systems – has been covered by industry outlets.

Contact

If you’re serious about production AI in healthcare or IoT, I’m happy to dig into specifics – architecture, risks, and what it’ll actually take.

Best fit: roles where you need someone to own end‑to‑end AI systems – from idea, to data pipelines, to LLM/agent design, to deployment and monitoring – and communicate clearly with both clinicians and engineering leadership.

Prefer short, concrete messages: the problem, current stack, constraints, and what “success” would look like.

Logistics

Location: Chicago, IL · US Central time zone.
Work setup: Open to remote, hybrid, and short on‑site engagements for critical phases (architecture, deployment, or executive workshops).

Ideal conversations
  • Hospitals or payors planning serious agentic / LLM adoption.
  • Teams that are tired of pilots and want production‑grade MLOps / LLMOps.
  • IoT players needing predictive maintenance / telemetry‑driven insights.