Tech Musings – Shubham Gupta

Curriculum Learning 🤝 DSPy: Optimization

Optimizing DSPy programs for ConvFinQA

Sep 14, 2025

1 min

Curriculum Learning 🤝 DSPy: Modelling

Curriculum design for ConvFinQA, paving the way to program optimization

Sep 1, 2025

1 min

Curriculum Learning 🤝 DSPy: Exploration

Mapping ConvFinQA and crafting a curriculum for financial QA

Aug 31, 2025

10 min

Cracking the Anthropic CTF

Intersection of steganography x neural networks, with API credits as a reward!

Aug 5, 2025

15 min

GPUs go brrr with Mojo: Algorithms

Moar GPU puzzles with slide-n-sum pooling, tile-flipping convs & warp-speed scans

Jul 20, 2025

56 min

GPUs go brrr with Mojo: Fundamentals

Learning GPU programming fundamentals through hands-on Mojo implementations

Jul 6, 2025

16 min

BERT + BM25 = BISON

A new framework for information retrieval from documents

Aug 31, 2020

7 min

LongFormer

Transformers for loooong documents

May 11, 2020

9 min

TagLM

Bidirectional LM embeddings for sequence tagging

Apr 23, 2020

3 min

T5’s Closed-Book Exam

Measure the amount of information stored in a model

Apr 21, 2020

4 min

Attention is all you need

A new architecture, the Transformer, based solely on attention mechanisms. Gets rid of recurrent and convolutional networks completely.

Apr 20, 2020

5 min

REALM: Retrieval-Augmented Language Model Pre-Training

A better Q&A system based on knowledge retrieval

Mar 14, 2020

7 min

Bayesian Golf Putting Model

Are you the next Tiger Woods?

Mar 12, 2020

6 min

Yo!

First post

Jan 14, 2020

1 min