Tag: AI

Typing Speed Might Matter Now

With tools like Claude Code and Codex, I think typing speed is starting to matter more, and I haven’t seen anyone mention this yet.

Before AI, typing speed didn’t matter much for programming, because everyone needed to take time to think through problems, come up with a good solution, and debug. These activities were not typically bound by typing speed. A fast typer might code up a solution faster than a slow typer, but if the slow typer had a better thought process, they could still implement working code in a shorter amount of time.

2026-03-24

/blog/typing-speed-might-matter-now/ jarbus

Low-Rank Factorizations are Indirect Encodings for Deep Neuroevolution

My latest paper is available on arXiv: Low-Rank Factorizations are Indirect Encodings for Deep Neuroevolution.

The general idea is that we can search for stronger neural networks in a gradient-free fashion by restricting the search to low-rank networks. We show that it works well for language modeling and reinforcement learning tasks. It’s essentially a crossover between the following papers:

I’ll be presenting it virtually at the Neuroevolution@Work workshop at GECCO 2025.
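As a rough illustration of the general idea (not the method from the paper), here is a minimal sketch of gradient-free search over a low-rank factorization. Everything in it is an assumption made for the example: the toy linear-regression fitness, the simple keep-if-better mutation loop, and the names `low_rank_weight`, `fitness`, and `evolve`. The genome is a pair of thin factors, and the full weight matrix only exists after decoding, which is what makes the encoding indirect.

```python
import numpy as np


def low_rank_weight(a, b):
    """Decode the genome (two thin factors) into a full weight matrix."""
    return a @ b


def fitness(weight, inputs, targets):
    """Negative mean squared error of a linear layer; higher is better."""
    return -float(np.mean((inputs @ weight - targets) ** 2))


def evolve(inputs, targets, rank=2, generations=200, sigma=0.05, seed=0):
    """Hill-climb over the factors with Gaussian mutations, no gradients."""
    rng = np.random.default_rng(seed)
    n_in, n_out = inputs.shape[1], targets.shape[1]

    # The search space has rank * (n_in + n_out) parameters
    # instead of the n_in * n_out parameters of the full matrix.
    a = rng.normal(scale=0.1, size=(n_in, rank))
    b = rng.normal(scale=0.1, size=(rank, n_out))
    best = fitness(low_rank_weight(a, b), inputs, targets)
    history = [best]

    for _ in range(generations):
        # Mutate both factors, then keep the child only if it is no worse.
        cand_a = a + sigma * rng.normal(size=a.shape)
        cand_b = b + sigma * rng.normal(size=b.shape)
        score = fitness(low_rank_weight(cand_a, cand_b), inputs, targets)
        if score >= best:
            a, b, best = cand_a, cand_b, score
        history.append(best)

    return a, b, history
```

Calling `evolve(x, y, rank=2)` on a small regression problem returns the two factors and the best fitness seen after each generation. A real neuroevolution setup would use a population and a proper network, but the decoding step stays the same.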

2025-05-29

/blog/neuroevo-lora/ jarbus

From REINFORCE to R1: an Abridged Genealogy of Reinforcement Learning

Starting from REINFORCE, the original policy gradient algorithm, we will trace the evolution of policy gradient methods to the Group Relative Policy Optimization algorithm used to train DeepSeek-R1.

This post ignores the LLM side of things, less-related developments in RL, and most of the equations used for these algorithms, but captures the essence and intuition of the RL timeline without wasting your time. This is all self-study, so feel free to send me any corrections or suggestions.

2025-02-21

/blog/from-reinforce-to-r1-an-abridged-genealogy/ jarbus

The Penultimate Wave of AI

I don’t think R1 will get us to artificial superintelligence, but whatever comes next probably will.

We are reaching a familiar bottleneck in AI. Previously, humans had to manually hardcode the patterns that AI could recognize. With deep learning, machines began to learn patterns on their own, without human assistance. With (relatively) expensive humans out of the loop, we threw machines at the world’s data until they began to talk, code, and paint. Many people believed this would be sufficient to reach artificial superintelligence, but it wasn’t.

2025-01-28

/blog/the-penultimate-wave-of-ai/ jarbus

Originality in the Age of AI

It used to be good enough just to copy others. Now, with AI in the hands of billions, there’s little value in copying.

For instance, take programming. Five years ago, building apps, websites, or games required a non-trivial amount of skill, and getting your first project off the ground was an accomplishment. Now, AI can generate most starter projects in hours, if not minutes. I think this decimates the reward, both internal and external, of actually completing the first few projects.

2024-10-05

/blog/upwards-pressure-on-originality/ jarbus

Emergent Trade and Tolerated Theft Using Multi-Agent Reinforcement Learning

I’ve been an author on a few papers before, but I recently published the first research project where I was responsible for most of the work and direction. It’s in the first 2024 issue of the journal Artificial Life, which you can find here. You can find a non-paywalled version here. Below, I recount the chronology of the project and summarize our findings.

We explore the conditions under which trade can emerge between four deep reinforcement learning agents that pick up and put down resources in a 2D foraging environment. Agents are rewarded for having both resources once, but the resources are distributed far apart from each other. To maximize reward, agents need to split up the work: agent 1 goes to resource A, agent 2 goes to resource B, and so on. They then meet to exchange resources, since meeting halfway gets them the most of each resource in the shortest amount of time.

2024-02-04

/blog/emergent-trade/ jarbus

AI Index

An ever-expanding list of concepts in the field of AI to give myself and others an easy reference. Each item in the list contains a short, rudimentary definition I’ve written, as well as a link to a resource that can explain it better.

Ablation Study: Removing some parts of a machine learning model to measure impact on performance

Advantage Function: The difference between the Q-value of a state-action pair and the value of that state. Useful for determining how much better or worse an action is than the policy’s expected behavior in that state.
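To make the advantage definition concrete, here is a small sketch for a single state with a discrete set of actions. It assumes the state value is the policy-weighted average of the Q-values, V(s) = Σ π(a|s) Q(s, a); the function name `advantage` and the example numbers are made up for illustration.

```python
def advantage(q_values, action_probs):
    """Return A(s, a) = Q(s, a) - V(s) for every action in one state.

    q_values[i] is Q(s, a_i) and action_probs[i] is the policy's
    probability of choosing a_i in state s.
    """
    # V(s) is the expected Q-value under the policy.
    state_value = sum(p * q for p, q in zip(action_probs, q_values))
    return [q - state_value for q in q_values]


# Three actions; the policy prefers the first (and worst) one.
print(advantage([1.0, 2.0, 3.0], [0.5, 0.25, 0.25]))
# prints [-0.75, 0.25, 1.25]
```

A positive advantage means the action is better than what the policy does on average in that state, and a negative one means it is worse.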

2021-03-19

/blog/ai-index/ jarbus

Tesla and False Advertising in AI

Here’s the problem with advertising AI-based technology that doesn’t exist:

You cannot promise anything about your product.

We’ve all seen AI advertised to the masses that doesn’t work as advertised; just look at any voice-to-text system. When I got my Apple Watch, I hoped to use it to respond to messages without getting distracted by my phone. I quickly realized that wasn’t a viable solution: I had to repeat my message multiple times per text to get the correct dictation.

2020-07-22

/blog/tesla-and-false-advertising-in-ai/ jarbus