Sid Valecha

Machine Learning Engineer and Data Scientist

I am a recent graduate of the University of Wisconsin–Madison, passionate about the intersection of AI and music and about exploring AI as a tool in the creative process.
Currently working on Sonorus Labs.

Experiences

73 Strings

June 2024 - August 2024

AI/ML Engineering Intern

New York, New York, United States

  • Built and deployed LLM-based financial document parsers end-to-end, reducing processing time by 30% and improving extraction accuracy for private-equity analytics.
  • Developed, trained, and productionized a chart detection and classification model, improving precision to 93% (+12%) and recall to 94.4% (+11%). Achieved 92.2% mAP and tuned hyperparameters to reduce false positives and improve model reliability in production.

United Nations (OICT)

May 2024 - June 2024

AI/ML and Full Stack Intern

Remote

  • Built and deployed a Retrieval-Augmented Generation (RAG) system for a UN research portal, enabling context-aware querying over large collections of unstructured documents.
  • Designed and implemented document ingestion, preprocessing, and embedding pipelines, supporting scalable retrieval and low-latency inference in production.
  • Implemented research repository file-handling and storage workflows to support AI-driven document processing, contributing beyond core ML responsibilities and learning production systems on the job.

Skills & Technologies

Key technologies, programming, and analytics skills.

Programming Languages

Python, Java, R, SQL, JavaScript

Data Science & ML

Pandas, NumPy, SciPy, Scikit-learn, XGBoost, TensorFlow, PyTorch, Matplotlib, Seaborn, Hugging Face, OpenCV, Ultralytics YOLO

AI/ML & NLP Technologies

Natural Language Processing, LangChain, Ollama Embeddings, Google GenAI Embeddings, Prompt Engineering, MLOps, LLMs

Web & App Development

HTML, CSS, JavaScript, Flask, RESTful APIs, Streamlit

Data Engineering & Databases

MongoDB, MySQL, PostgreSQL, Snowflake, Elasticsearch, FAISS, Airbyte

DevOps & Infrastructure

Git, Docker, Kubernetes, AWS EC2, GCP, Jenkins

Analytics & Visualization

Tableau, Microsoft Excel, Kibana

Project Management & Collaboration

Jira, Trello, Postman

Featured Projects

Some of my data science projects showcasing my skills in machine learning, data analysis, and visualization.

Chord Probability Engine

A chord progression probability explorer built with N-gram models. Analyzes chord sequences from the Chordonomicon dataset and provides an interactive web interface for exploring chord transition probabilities using unigram, bigram, and trigram models with interpolation.
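
The interpolation idea can be illustrated with a minimal sketch. This is not the project's actual code: the toy progressions, the function names `train` and `interpolated_prob`, and the weights in `lambdas` are all assumptions chosen for illustration. It mixes unigram, bigram, and trigram estimates of the next chord into a single probability.

```python
from collections import Counter

def train(progressions):
    """Count n-grams and their contexts over a list of chord sequences."""
    uni = Counter()
    bi, bi_ctx = Counter(), Counter()
    tri, tri_ctx = Counter(), Counter()
    for chords in progressions:
        uni.update(chords)
        for a, b in zip(chords, chords[1:]):
            bi[(a, b)] += 1
            bi_ctx[a] += 1          # times `a` was followed by any chord
        for a, b, c in zip(chords, chords[1:], chords[2:]):
            tri[(a, b, c)] += 1
            tri_ctx[(a, b)] += 1    # times `a, b` was followed by any chord
    return uni, bi, bi_ctx, tri, tri_ctx

def interpolated_prob(nxt, prev2, prev1, counts, lambdas=(0.2, 0.3, 0.5)):
    """Estimate P(nxt | prev2, prev1) as a weighted mix of three models.

    `lambdas` are illustrative weights (unigram, bigram, trigram) that
    sum to 1; in practice they would be tuned on held-out data.
    """
    uni, bi, bi_ctx, tri, tri_ctx = counts
    l1, l2, l3 = lambdas
    total = sum(uni.values())
    p_uni = uni[nxt] / total if total else 0.0
    p_bi = bi[(prev1, nxt)] / bi_ctx[prev1] if bi_ctx[prev1] else 0.0
    ctx = (prev2, prev1)
    p_tri = tri[(prev2, prev1, nxt)] / tri_ctx[ctx] if tri_ctx[ctx] else 0.0
    return l1 * p_uni + l2 * p_bi + l3 * p_tri

# Toy data standing in for a real chord dataset (assumption).
progressions = [
    ["C", "G", "Am", "F"],
    ["C", "G", "F", "C"],
    ["Am", "F", "C", "G"],
]
counts = train(progressions)

# Rank candidate next chords after the context C -> G.
vocab = sorted(counts[0])
ranking = sorted(
    ((interpolated_prob(ch, "C", "G", counts), ch) for ch in vocab),
    reverse=True,
)
for prob, chord in ranking:
    print(f"{chord}: {prob:.3f}")
```

Because the unigram term is always nonzero for chords seen in training, a chord never observed after a given context still receives a small probability instead of zero.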

Python, Machine Learning, N-gram Models, Markov Chains, Web Development

Student Financial Outcomes

Analyzed student loan debt and earnings outcomes by field of study, comparing STEM vs non-STEM fields using College Scorecard data.

R, ggplot2, Statistical Analysis, Data Visualization

Tic-Tac-Toe AI using the Minimax Algorithm

Implementation of an unbeatable Tic-Tac-Toe AI using the minimax algorithm with optimal game strategy.
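
As a rough illustration of the technique, here is a minimal minimax sketch for Tic-Tac-Toe. It is not the project's code: the board representation (a list of nine cells holding "X", "O", or " ") and the function names are assumptions made for this example. Scores are from X's point of view: +1 for an X win, -1 for an O win, 0 for a draw.

```python
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
    (0, 4, 8), (2, 4, 6),              # diagonals
]

def winner(board):
    """Return "X" or "O" if a line is complete, otherwise None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Return (score, move) for `player` assuming both sides play optimally.

    X maximizes the score and O minimizes it. `move` is None when the
    game is already over.
    """
    won = winner(board)
    if won == "X":
        return 1, None
    if won == "O":
        return -1, None
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0, None  # full board with no winner: draw

    maximizing = player == "X"
    best_score, best_move = (-2, None) if maximizing else (2, None)
    other = "O" if maximizing else "X"
    for move in moves:
        board[move] = player             # try the move
        score, _ = minimax(board, other)
        board[move] = " "                # undo it
        if (maximizing and score > best_score) or (
            not maximizing and score < best_score
        ):
            best_score, best_move = score, move
    return best_score, best_move

# X holds cells 0 and 1, so completing the top row at cell 2 wins.
board = list("XX OO    ")
print(minimax(board, "X"))  # → (1, 2)
```

The full game tree is small enough to search exhaustively, which is why no depth limit or evaluation heuristic is needed here; alpha-beta pruning would reduce the work but does not change the result.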

Python, Algorithm, Game Theory, Minimax, Recursion, Adversarial Search, Tree Search, AI

Neural Network from Scratch using NumPy

Building a fully functional neural network from scratch using only NumPy, implementing forward propagation and backpropagation.
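
A minimal sketch of what forward propagation and backpropagation look like in plain NumPy follows. It is an illustration under stated assumptions, not the project's code: the XOR task, the layer sizes, the tanh and sigmoid activations, the learning rate, and the name `train_xor` are all chosen for this example.

```python
import numpy as np

# XOR inputs and targets: a small problem that needs a hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def train_xor(steps=2000, lr=0.5, hidden=8, seed=0):
    """Train a 2-hidden-1 network with full-batch gradient descent.

    Returns the per-step loss history and a `predict` function.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 1.0, (2, hidden))
    b1 = np.zeros((1, hidden))
    W2 = rng.normal(0.0, 1.0, (hidden, 1))
    b2 = np.zeros((1, 1))

    def forward(inputs):
        h = np.tanh(inputs @ W1 + b1)                 # hidden activations
        p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))      # sigmoid output
        return h, p

    losses = []
    n = len(X)
    for _ in range(steps):
        # Forward pass and binary cross-entropy loss.
        h, p = forward(X)
        eps = 1e-12
        loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        losses.append(float(loss))

        # Backward pass: chain rule from the output to the first layer.
        dz2 = (p - y) / n                 # gradient at the output pre-activation
        dW2 = h.T @ dz2
        db2 = dz2.sum(axis=0, keepdims=True)
        dh = dz2 @ W2.T
        dz1 = dh * (1.0 - h ** 2)         # derivative of tanh
        dW1 = X.T @ dz1
        db1 = dz1.sum(axis=0, keepdims=True)

        # Gradient descent update (in place).
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2

    def predict(inputs):
        return forward(inputs)[1]

    return losses, predict

losses, predict = train_xor()
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print(np.round(predict(X), 2).ravel())
```

Pairing a sigmoid output with cross-entropy loss makes the output-layer gradient reduce to `p - y`, which keeps the backward pass short.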

Python, NumPy, Deep Learning, Neural Networks

Shiny App for Visualising Milwaukee Bucks Performance

Interactive R Shiny application for visualizing and analyzing Milwaukee Bucks team performance statistics and metrics.

R, Shiny, Data Visualization, Sports Analytics

Shiny App for Video Game Sales Analysis

Interactive dashboard built with R Shiny for analyzing video game sales data with dynamic visualizations and filters.

R, Shiny, Data Visualization, Analytics

Library Management System using SQL and Docker

Full-stack library management system with SQL database backend, containerized with Docker for easy deployment.

SQL, Docker, Database, Backend

Education

University of Wisconsin-Madison

Bachelor of Science in Data Science, May 2025

Certificate in Computer Sciences

  • Passionate about data analysis, machine learning, and software development
  • Relevant coursework: Machine Learning, Statistical Computing, Database Systems, Data Management, Data Visualization, Data Ethics and Policy, Linear Algebra, Database Design, Artificial Intelligence, Data Science Modelling, Computer Engineering

Get In Touch

Whether you have a question or just want to say hi, feel free to reach out!

Email

sidvalecha4@gmail.com

Phone

+1 (848) 219-6614

Location

Madison, WI