Available for opportunities · Dublin, Ireland

Hi, I'm Kanishk Kapoor

MSc Computing (Data Analytics) at Dublin City University · Building production LLM pipelines, agentic AI systems, and real-time data infrastructure. Currently interning at Medicidiom; previously at IBM.

30+ Projects · 2 Internships · 5M+ Records Processed · 1:1 Expected MSc
About Me

Who I Am

A passionate technologist bridging the gap between cutting-edge AI research and production-grade engineering.

I'm an AI Developer & Data Engineer currently pursuing my MSc Computing (Data Analytics) at Dublin City University, expecting a 1:1 (First Class Honours).

I specialise in building LLM-powered agentic systems, production ML pipelines, and real-time data infrastructure using tools such as the OpenAI API, LangChain, Apache Kafka, Azure, and AWS. I've shipped 30+ projects spanning FinTech, healthcare, energy, and logistics.

Currently interning at Medicidiom (Spain, Remote), where I build production AI automation workflows and document-intelligence pipelines that have processed 1,000+ documents and improved data accuracy by roughly 25%.

Interests

Agentic AI Systems · Large Language Models · Real-time Data Pipelines · Cloud Architecture · Business Intelligence · ML Research · Gaming · Travelling

MSc Computing (Data Analytics)

Dublin City University · 1:1 Expected · 2025–Present

B.Tech Computer Science

UPES · CGPA 8.7/10 · 2020–2024

AI Automation Intern

Medicidiom, Spain (Remote) · Feb 2026–Present

Based in Dublin, Ireland

Open to remote & on-site opportunities

Quick Facts

Dublin, Ireland
Originally from India
Python Expert
Fuelled by coffee

Impact by the Numbers

Real results from real production systems

30+

Projects Shipped

Spanning AI, Data Engineering, ML & Full Stack

5M+

Records Processed

Across ML pipelines and data engineering projects

~45%

Manual Effort Reduced

At Medicidiom via AI automation workflows

99%+

Pipeline Uptime

Production data pipelines at Medicidiom

22%

Accuracy Improvement

ML models at IBM for threat detection

1,000+

Documents Processed

Via LLM-powered intelligence pipelines

Technical Skills

My Toolkit

A comprehensive stack spanning AI, data engineering, cloud, and full-stack development.

AI & LLMs

OpenAI API (GPT-4) · HuggingFace Transformers · LangChain · MCP Agents · Agentic AI · Prompt Engineering · DistilBART / BERT · RAG Pipelines

ML & Deep Learning

Scikit-learn · XGBoost · LSTM (TensorFlow/Keras) · SARIMA / TBATS / ETS · SHAP · Random Forest · Logistic Regression · A/B Testing

Languages

Python (Expert) · SQL · JavaScript · Java · C · R · TypeScript · Bash

Data & Pipelines

Apache Kafka · Apache Spark · Apache Airflow · Snowflake · Delta Lake · ETL / ELT · Pandas / NumPy · PostgreSQL / MongoDB

Cloud & DevOps

Azure (ADF, Databricks) · AWS (S3, Lambda) · Docker · FastAPI · CI/CD · Git / GitHub · REST APIs · Linux

Visualisation & BI

Power BI (DAX) · Streamlit · Snowsight · Matplotlib / Seaborn · Plotly · Excel (Power Query) · MERN Stack · React

Core Language Proficiency

Python: 95%
SQL: 88%
R: 82%
JavaScript: 75%
Java: 70%
Bash / Linux: 72%
Experience & Education

My Journey

From classrooms to production systems at IBM and beyond.

Work Experience

AI Automation & Operations Intern

Medicidiom · Spain (Remote) · Feb 2026 – Present
  • Architected LLM-powered document-intelligence pipelines (Python + OpenAI API) processing 1,000+ documents — improving data accuracy ~25% and cutting manual review by 35%
  • Built agentic AI automation workflows eliminating ~45% of manual effort and reducing analytics turnaround by 30%
  • Maintained 99%+ uptime on production pipelines while reducing latency by ~20%
  • Created Power BI dashboards surfacing live operational KPIs, reducing ad-hoc reporting requests by ~40%
Python · OpenAI API · LLMs · Power BI · Agentic AI · FastAPI

Cybersecurity & Data Analysis Intern

IBM · India · Jun 2023 – Sep 2023
  • Applied ML classification models (Python, Scikit-learn) to millions of security records — improved detection accuracy by 22% and reduced false positives by 15%
  • Built and evaluated multiple model architectures on multi-year datasets
  • Improved outbreak forecasting accuracy by 18% through systematic experimentation
  • Delivered analytical findings to senior analysts to directly inform remediation decisions
Python · Scikit-learn · ML · SQL · Security Analytics · Forecasting

Education

MSc Computing (Data Analytics)

Dublin City University

2025 – Present · 1:1 Expected

B.Tech Computer Science

University of Petroleum & Energy Studies

2020 – 2024 · CGPA: 8.7/10

Certifications

Google Data Analytics Professional Certificate

2024

Forecasting in Business — Deakin University

2024

Data Analytics for Investment

2024

Projects

What I've Built

30+ projects spanning AI, data engineering, machine learning, and full-stack development.

Pinned

Product Analytics MCP / LLM Agent

Agentic AI system that lets users query product analytics in plain English. A Model Context Protocol (MCP) server with a natural-language interface removes the need for manual SQL or direct BI tool access.

AI / LLM
Python · OpenAI API · MCP · Agentic AI · +2

Dual-model fallback

News Intelligence Dashboard

Real-time data pipeline: news API → dual-model summarisation (OpenAI GPT-4 + HuggingFace DistilBART fallback) → Streamlit dashboard with smart API rate-limit handling.

AI / LLM
Python · GPT-4 · HuggingFace · Streamlit · +2
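
The dual-model fallback with rate-limit handling described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: `RateLimited`, `summarise_with_fallback`, and the `primary` / `fallback` callables are hypothetical names standing in for a GPT-4 client and a local DistilBART summariser, and the exponential backoff is one possible reading of "smart API rate-limit handling".

```python
import time
from typing import Callable


class RateLimited(Exception):
    """Hypothetical error a primary-model client raises when throttled."""


def summarise_with_fallback(
    text: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    max_retries: int = 2,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[str, str]:
    """Return (summary, source), where source is 'primary' or 'fallback'."""
    for attempt in range(max_retries + 1):
        try:
            return primary(text), "primary"
        except RateLimited:
            # Back off exponentially before retrying; no wait after the last attempt.
            if attempt < max_retries:
                sleep(base_delay * 2 ** attempt)
        except Exception:
            # Any other primary failure goes straight to the fallback model.
            break
    # Primary is exhausted or broken: use the secondary summariser instead.
    return fallback(text), "fallback"
```

Passing `sleep` as a parameter keeps the retry logic testable without real delays, and returning the source alongside the summary lets a dashboard show which model produced each result.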
Multi-layer Medallion

Project Aeroflow — Real-Time Pipeline

End-to-end real-time airline delay data pipeline: FastAPI producer → Apache Kafka → Azure Event Hubs → Databricks PySpark (Bronze/Silver/Gold) → Snowflake → live Snowsight KPI dashboards.

Data Engineering
Python · Kafka · Azure · Databricks · +2

E-Commerce Sales Pipeline

Real-time order streaming pipeline: FastAPI event source → Kafka → Spark stream processing → structured JSON in AWS S3 with Airflow orchestration and Docker containerisation.

Data Engineering
Kafka · Spark · AWS S3 · Airflow · +2

5M+ records

European Water Quality ML Model

Research-grade ML pipeline on 5M+ European environmental records. Combines spatio-temporal feature engineering with gradient boosting to predict nitrate/phosphate pollution risk across 4 water body types.

Machine Learning
Python · XGBoost · Scikit-learn · Pandas · +1

Transaction Fraud Detection

FinTech fraud detection on 18K+ transactions, combining feature engineering, statistical validation (ANOVA, Mann-Whitney U), and XGBoost with deliberate class-imbalance handling.

Machine Learning
Python · XGBoost · ANOVA · Scikit-learn · +1
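
The description above does not say which imbalance technique was used, so the following is only one common option, shown as an assumption: weighting the rare fraud class by the ratio of negative to positive samples, which is the value XGBoost's `scale_pos_weight` parameter is typically set to. The helper name `positive_class_weight` and the example label counts are hypothetical.

```python
from collections import Counter
from typing import Iterable


def positive_class_weight(labels: Iterable[int]) -> float:
    """Ratio of negative (0) to positive (1) labels.

    Assumption: binary labels where 1 marks a fraudulent transaction.
    The result is the kind of value usually passed to XGBoost as
    `scale_pos_weight` so that rare positives count for more in training.
    """
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive samples; cannot compute a class weight")
    return counts[0] / counts[1]


# Illustrative data only: 990 legitimate and 10 fraudulent transactions.
example_labels = [0] * 990 + [1] * 10
weight = positive_class_weight(example_labels)
```

With the illustrative labels, each fraud case is weighted 99 times more heavily than a legitimate one, which would then be supplied when constructing the classifier.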
Get in Touch

Let's Connect

Open to internships, graduate roles, and exciting project collaborations. Let's build something great together!

Say Hello

Whether you're a recruiter, a fellow developer, or someone with an interesting project — I'd love to hear from you. I typically respond within 24 hours.

Email

kanishkkapoor15@gmail.com

Phone

+353 899 595 536

GitHub

kanishkkapoor15

LinkedIn

in/kanishkapoor

Location

Dublin, Ireland 🇮🇪
