Update: India and therapeutics
September 8, 2025
I thought I’d write an update about what I’ve been up to. I graduated in May, and in July, I decided to move back home to India, to join PopVax. I’m excited to work on hard technical problems for the purpose of advancing human health, and plan to spend the next few years helping build the engine which can bring therapeutics from idea to market as fast as physically possible.

If you haven’t heard from me in a long while: after spending all of high school building web apps, I found myself bored doing the same thing in college. After my first two years at Berkeley, I was convinced I wanted to work in biology. The field was technically interesting: it is nanotechnology that is real, can be engineered, and can be scaled to planetary proportions. The field was also unambiguously good: the same technology can be used to build therapeutics which can make life better for every human being out there.
I’ve been thinking about what to work on in biology for a while. Last summer, I worked on ML for protein dynamics, and seriously considered doing a PhD to work on this problem. I spent a semester trying to start an institute in India to accelerate basic biology research, and came very close to doing this full-time. For two semesters, I was in the wet lab, building high-throughput droplet-based assays intended to generate the dataset I thought we’d need to make progress on the “virtual cell” problem, and almost spun out a company to work on this (with a very talented scientist at Berkeley). Each of these taught me a lot about the field, but I chose not to work on any of them.
Machine learning has been making very real progress in biology, making it much easier to build several classes of therapeutics. At the same time, several useful data generation engines in biology are well past the scale at which you can train models on them. However, most people in the field seem to be excited to “solve” biology with machine learning. I think we do not know what this means, are unaware of the shape of the data it will take to achieve this, and have no clue how we can actually generate this data.
Instead, I want to build models which exist to help me find the best version of a single drug that I have scoped out, not models that help me get mediocre “hits” on thousands of different targets. A single drug can save millions of lives, and bring in over $20b/year of revenue. It is fully worth it to spend $1m to train a “niche” model that could help you engineer the best version of even one of these drugs, and more useful to do so than to spend $100m building some form of an incrementally better general “foundation” model in biology.
But even once you do see your scientific hypothesis play out, and use models to get some efficacy in preliminary assays, shipping a therapeutic poses major scientific and logistical challenges. There are several axes to optimize your drug on (stability, developability, delivery, off-target effects), for which, again, we can collect specific data and machine learning can help. But there are also problems in testing your therapeutic in vivo, manufacturing doses quickly, and building clinical trial infrastructure to reach people as fast as possible, all while attempting to reach the inflection point of bringing your molecule to the first patient. Building a machine that can do this repeatedly is challenging enough that we very rarely see true pharmaceutical companies taking products from bench to approval. Instead we see biotechs which hope to demonstrate proof of principle, and then sell off their assets, if not their whole company.
The world I aspire to is one in which novel therapeutic ideas can be taken to the clinic for under $2m, and within 2 years. We are far off from this, but if we somehow manage to get there, I think we will see an explosion in novel therapeutics not unlike the explosion we saw in startups in the mid-2000s when Y Combinator began funding hundreds of ideas for cheap.
Unfortunately, building all this infrastructure for therapeutics has become incredibly expensive in capital, labor, and time. Moreover, America has fallen so far behind on capital efficiency that many, if not most, new biotech companies have an exec team in Boston but run all wet-lab operations in China. This cannot be how we’re going to get rapid iteration and improvement in human health.
My friend Soham Sankaran started PopVax in 2021 with the intention of developing truly novel drugs in India. India exports ~$30b of pharmaceuticals every year, but almost none of this revenue is from novel medicines. Instead, Indian pharma companies make generics and biosimilars, copying therapeutics developed outside the country once the patents protecting them have expired. Soham started PopVax because he thought Hyderabad was to biotech in 2021 what Shenzhen was to electronics in 2010. The company started a little over 3 years ago, and has built an 80-person R&D and manufacturing team which is close to taking its first product, a broadly protective COVID mRNA vaccine, to the clinic in 2026, all for about $12m.
PopVax will build many more therapeutics end-to-end: from developing early data in the lab, to working with patients in the clinic, and eventually bringing drugs to market. I think the rapid timelines, scaled data generation, and capital efficiency we need can be reached here in India. In addition to our work in vaccines, we are collecting billions of data points to train several different task-specific models which will be instrumental in building new cancer therapeutics and in developing a fully novel modality for autoimmune diseases, both of which we hope to bring to the clinic by 2027.
These are exciting times and things are moving fast here in Hyderabad! If you're interested in helping us build therapeutics for millions of people, please reach out — samarth@popvax.com :)