Venkat185/README.md

Hi, I'm Venkata Ratnam Vasabathula

AI/ML Engineer • LLM Systems • Agentic AI • MLOps



About Me

Senior AI/ML Engineer focused on architecting and delivering production-grade LLM systems, RAG pipelines, and multi-agent orchestration frameworks at enterprise scale. I work end-to-end across the AI platform stack — from vector-backed retrieval and LLM observability to distributed MLOps infrastructure on Azure, GCP, and AWS.

  • Building agentic AI workflows with LangChain, LangGraph, and the MCP ecosystem
  • Designing RAG pipelines with hybrid semantic search across Pinecone, Weaviate, FAISS, and ChromaDB
  • Fine-tuning and deploying foundation models (GPT-4o, Claude, LLaMA, Mistral, Gemini) in regulated, high-availability environments
  • Driving LLM observability, evaluation, and responsible AI practices across ML organizations
  • Reach me at venkat.vasabathula@gmail.com

What I Work On

  • LLM Systems & Generative AI — Production RAG, prompt engineering, fine-tuning (LoRA/PEFT), Chain-of-Thought reasoning, multi-modal pipelines.
  • Agentic Frameworks — MCP client/server architectures, LangGraph orchestration, tool-using agents integrated with enterprise data sources.
  • MLOps & Platform Engineering — Model versioning, A/B testing, evaluation pipelines, CI/CD for ML, distributed inference serving.
  • AI Observability — LangSmith tracing, drift detection, latency/cost monitoring, model confidence and quality metrics.

Tech Stack

Languages
Python SQL JavaScript

LLMs & Generative AI
OpenAI Anthropic LLaMA Mistral Gemini

Frameworks & Libraries
LangChain LangGraph LangSmith Hugging Face PyTorch TensorFlow scikit-learn FastAPI

Vector & Data Stores
Pinecone Weaviate FAISS ChromaDB PostgreSQL Redis MongoDB Kafka Spark

Cloud & Infrastructure
Azure AWS GCP Docker Kubernetes Terraform

MLOps & Observability
MLflow GitHub Actions Jenkins Prometheus Grafana


Selected Engineering Highlights

Highlights from my professional experience (company names omitted).

  • Architected a production-grade agentic MCP client–server framework using LangChain and LangGraph, integrating 5+ enterprise data sources and reducing Tier-1 support ticket volume by 20%.
  • Engineered an enterprise RAG pipeline on Azure AI Search with FAISS / Pinecone / Weaviate, automated ingestion, and real-time KB sync — cutting query resolution time by 34%.
  • Built an end-to-end LLM observability framework with LangSmith and distributed tracing for confidence, latency, and drift — reducing model degradation by 25%.
  • Owned automated model evaluation pipelines benchmarking 40+ configurations across GPT-4, LLaMA, and other open-source LLMs, achieving a 42% performance improvement.
  • Delivered scalable multi-model inference serving on Azure (FastAPI + Docker + Kubernetes) with caching, batching, and model versioning — sub-2s latency and 28% lower API cost.
  • Built distributed Spark + Airflow pipelines on Databricks processing 100+ TB of data, and mentored engineers on AI architecture and LLM integration patterns.
  • Earlier in my career: fine-tuned LLMs with LoRA/PEFT, deployed cloud-native inference on GCP/AWS (38% lower latency), and led MLOps migrations achieving a 99% uptime SLA.

Featured Project

Production AI Services Platform
Python, FastAPI, LangGraph, LangChain, Pinecone, Docker, Kubernetes, Azure
Cloud-native AI services platform with agentic workflows, vector-backed RAG, hybrid semantic search, and full observability. Serves 200K+ documents with sub-2-second responses and 99.5% uptime. CI/CD on Azure via GitHub Actions; model quantization reduced serving latency by 31%.



Education & Certifications

  • M.S. Computer Science — California State University, Channel Islands
    Coursework: Large Language Models, Neural Networks, NLP, Machine Learning, Distributed Systems
  • Microsoft Certified: Azure AI Engineer Associate
  • AWS Certified Solutions Architect – Associate
  • Cisco – Data Analytics Essentials

Achievements

  • CSUCI Plot-A-Thon 2024 — 1st place, data visualization & analysis

Open to collaborating on production AI/ML systems, LLM platforms, and agentic AI research.

Pinned

  1. Agent_Ariya (Public)

    Upload a CSV, ask a question in plain English, get interactive charts and insights in seconds. Full-stack AI app: FastAPI + React + GPT-4o + E2B sandbox.

    Python

  2. rag-pipeline (Public)

    Python

  3. Uber_Data_Analytics (Public)

    Jupyter Notebook

  4. New-york-city-Airbnb_Dashboard (Public)

  5. Yelp_Data-analysis (Public)