Dithered Tree

TO THE STUDENTS

If all goes well, I’ll leave Brandeis University with a PhD in computer science this summer. Here are my parting thoughts for our remaining students, divided into non-technical and technical sections.

I write this out of self-interest; students after me will represent our university in years to come, and it’s in my best interest for them to be as effective as possible.

NON-TECHNICAL

  • Students often lack crucial skills. Technical students may lack communication skills, or social butterflies may lack depth. Some lack both. This doesn’t really matter, in my opinion, if they are willing to learn. Just like neural networks, they may be all over the place initially, but students who reflect on your feedback will (eventually) be worth more than those who never update their parameters. Don’t give up on those who listen and try; they often just need the right kind of mentorship.

  • Some students might be eager to learn, but have a poor approach to learning. It’s often easy to correct; if you ask how they approach a problem and why they’re approaching it that way, you’ll unearth the faults in their process. Most of the time, you don’t need to correct them—you can often describe how you’d approach their problem, and, perhaps after a few more failed attempts, they’ll try it your way, all on their own. This worked on me, anyway.

  • The packaging of a message matters more than its content. I can pretty much tell anyone anything if I phrase it with thought, respect, and tact. This takes time, effort, and lots of practice. If someone gets defensive and reacts negatively to your feedback, you likely could have phrased it better.

  • Effective communication is as important for your career as a top-tier publication, perhaps more so. If you can’t describe why your work is significant, it probably doesn’t matter what your results are. Conversely, if your results are less stellar than you hoped, you can still sell them to a reader with the right discussion and analysis.

  • To communicate well, you need to read and write—a lot. More than you think. Some resources that have helped me:

  • Insight comes from many sources. Read widely, and read weird things. I find computing history to be particularly illuminating, as it shows how technology changed over time.

  • Know your peers. This may be your last chance to become friends with a group of PhDs, most of whom will join different companies, so this is a prime time to expand your network. Plus, the journey’s more bearable with company.

  • Care about your work. If you don’t care about your project, nobody will. Learn about the area you work in; learn about the technology you’re using; learn about why things are the way they are. Only two types of people have a chance at success anymore—those with passion, and those who crave status. I prefer those with passion.

  • If you don’t care about your work, make yourself care—life is better this way.

  • Take walks.

  • Learning yields an exponential return on investment. The bigger your knowledge graph, the easier it is to add edges and nodes. Reading papers in a field you know well can be effortless.

  • If you are still in the game, you haven’t lost. Don’t give in to despair.

  • Be reliable. Respond to messages and fulfill your commitments, without needing a reminder. It’s nuts how many people can’t manage this.

  • Don’t burn bridges; people can change.

  • Be kind, but not soft.

  • Hold yourself and those around you to a realistic but high standard. A community aspiring to excellence can flourish, but those who see themselves as part of a sinking ship are destined to drown.

  • Each lab should have at least one student who is friendly with the rest of the department. Students often suffer in silence, and a well-connected social graph is critical for detecting and addressing serious problems.

TECHNICAL

AI

  • Keep (relatively) up to date with the latest developments, even if they aren’t relevant to your current project; they may prove critical for your next one.

  • Understand the purpose of every layer in a foundation model, every stage of training, and how inputs and outputs are represented and sampled. Don’t skip these.

  • Learn about the memory hierarchy of a GPU. The original FlashAttention paper should make it clear why.

  • Learn why reinforcement learning is a pain; LLMs provide a good starting point.

  • Understand the intuition behind the PPO algorithm.

  • Read seminal papers, like the original “Attention Is All You Need” paper and the ResNet paper. They have insight and perspective that blogs and YouTube videos lack.
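
To make the point about how outputs are represented and sampled concrete, here is a minimal sketch of temperature and top-k sampling over a vector of logits. It uses only the Python standard library; the function names, the four-token vocabulary, and the scores are hypothetical, chosen purely for illustration, and real models work on far larger vocabularies with tensor libraries.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_top_k(logits, k, temperature=1.0, rng=random):
    """Sample a token id, considering only the k highest-scoring entries."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k largest logits, best first.
    top_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Renormalize over the survivors only, then draw one of them.
    probs = softmax([logits[i] for i in top_ids], temperature)
    return rng.choices(top_ids, weights=probs, k=1)[0]


# Hypothetical scores for a four-token vocabulary.
logits = [2.0, 1.0, 0.1, -1.0]
print(softmax(logits))                   # baseline distribution
print(softmax(logits, temperature=0.5))  # sharper: more mass on token 0
print(sample_top_k(logits, k=2, rng=random.Random(0)))  # always token 0 or 1
```

Passing a seeded `random.Random` keeps a run reproducible, which helps when comparing sampling settings side by side.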

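One way to build intuition for PPO is to look at its clipped surrogate objective for a single sample. The sketch below is a simplification under stated assumptions: it ignores the value-function and entropy terms, the function name is hypothetical, and the clip range of 0.2 is an illustrative default rather than a recommendation.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Per-sample PPO-style objective, to be maximized.

    ratio     : pi_new(a|s) / pi_old(a|s), how far the policy has moved
    advantage : how much better the action was than expected
    epsilon   : clip range (0.2 is an illustrative choice)
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Taking the minimum gives a pessimistic bound: the policy gets no extra
    # credit for moving further than the clip range in the helpful direction,
    # but it is still fully penalized for moving in the harmful direction.
    return min(ratio * advantage, clipped_ratio * advantage)


# Walk through the four sign combinations of "moved up/down" and "good/bad action".
for ratio, advantage in [(1.5, 1.0), (0.5, 1.0), (0.5, -1.0), (1.5, -1.0)]:
    print(ratio, advantage, clipped_surrogate(ratio, advantage))
```

With a positive advantage, raising the ratio past 1 + epsilon earns nothing more; with a negative advantage, lowering it past 1 - epsilon earns nothing more. That asymmetry is what discourages overly large policy updates.
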
NOT AI

  • Learn your tools, whether it’s Vim, Bash, PyTorch, VS Code, or even ChatGPT. They have many useful features that you likely aren’t aware of, Vim and Bash especially.

  • Test your code using specific input/output examples, especially the mathematical sections, which may produce incorrect data instead of crashing. Seriously, test your code.

  • You should probably learn the basics of tmux, especially if you use the cluster.

  • Equations are machines, and we can understand them from an intuitive, functional point of view—each component does something. They don’t need to be scary.

  • If you are confused by something, ask clarifying questions until you aren’t.

  • An hour well-spent with pen and paper can save days of debugging.

  • Keep a lab notebook, and date your entries—you won’t remember what you did two months from now. I just use a single text file.

  • Computing is an end, not a means.

  • Don’t use Windows.
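
As an illustration of the advice to test with specific input/output examples, here is a minimal sketch in plain Python. The function and its test are hypothetical; the point is that dividing by n instead of n - 1 would never crash, it would only return quietly wrong numbers, and a hand-computed case catches that immediately.

```python
import math


def sample_variance(values):
    """Unbiased sample variance: squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def test_sample_variance():
    # Hand-computed: the mean is 5, squared deviations sum to 32, so 32 / 7.
    assert math.isclose(sample_variance([2, 4, 4, 4, 5, 5, 7, 9]), 32 / 7)
    # Identical values have zero spread.
    assert sample_variance([3.0, 3.0, 3.0]) == 0.0
    # Two points: deviations are +1 and -1 around the mean 1, so 2 / 1.
    assert math.isclose(sample_variance([0.0, 2.0]), 2.0)
    # Shifting every value must not change the variance.
    assert math.isclose(sample_variance([1, 2, 3]), sample_variance([101, 102, 103]))


test_sample_variance()
print("all checks passed")
```

A mix of hand-computed values and invariants (such as shift invariance) tends to catch more mistakes than either kind of check alone.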