A Crucial Particle Physics Computer Program Risks Obsolescence

Recently, I watched a fellow particle physicist talk about a calculation he had pushed to a new height of precision. His tool? A 1980s-era computer program called FORM.

Particle physicists use some of the longest equations in all of science. To look for signs of new elementary particles in collisions at the Large Hadron Collider, for example, they draw thousands of pictures called Feynman diagrams that depict possible collision outcomes, each one encoding a complicated formula that can be millions of terms long. Summing formulas like these with pen and paper is impossible; even adding them with computers is a challenge. The algebra rules we learn in school are fast enough for homework, but for particle physics they are woefully inefficient.

Programs called computer algebra systems strive to handle these tasks. And if you want to solve the biggest equations in the world, for 33 years one program has stood out: FORM.

Developed by the Dutch particle physicist Jos Vermaseren, FORM is a key part of the infrastructure of particle physics, necessary for the hardest calculations. However, as with surprisingly many essential pieces of digital infrastructure, FORM’s maintenance rests largely on one person: Vermaseren himself. And at 73, he has begun to step back from FORM development. Due to the incentive structure of academia, which prizes published papers, not software tools, no successor has emerged. If the situation does not change, particle physics may be forced to slow down dramatically.

FORM got its start in the mid-1980s, when the role of computers was changing rapidly. Its predecessor, a program called Schoonschip, created by Martinus Veltman, was released as a specialized chip that you plugged into the side of an Atari computer. Vermaseren wanted to make a more accessible program that could be downloaded by universities around the world. He began to program it in the computer language FORTRAN, which stands for Formula Translation. The name FORM was a riff on that. (He later switched to a programming language called C.) Vermaseren released his software in 1989. By the early ’90s, over 200 institutions around the world had downloaded it, and the number kept climbing.

Since 2000, a particle physics paper that cites FORM has been published every few days, on average. “Most of the [high-precision] results that our group obtained in the past 20 years were heavily based on FORM code,” said Thomas Gehrmann, a professor at the University of Zurich.

Some of FORM’s popularity came from specialized algorithms that were built up over the years, such as a trick for quickly multiplying certain pieces of a Feynman diagram, and a procedure for rearranging equations to have as few multiplications and additions as possible. But FORM’s oldest and most powerful advantage is how it handles memory.
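To give a feel for what "as few multiplications and additions as possible" means, here is a minimal sketch of one textbook rearrangement of this kind, Horner's rule for polynomials. It is an illustration only: the function names are made up for this example, and nothing here is taken from FORM's own code.

```python
def naive_eval(coeffs, x):
    """Evaluate coeffs[0] + coeffs[1]*x + coeffs[2]*x**2 + ... term by term.

    Each power of x is rebuilt from scratch, and every multiplication is counted.
    """
    total, mults = 0, 0
    for k, c in enumerate(coeffs):
        term = c
        for _ in range(k):
            term *= x
            mults += 1
        total += term
    return total, mults


def horner_eval(coeffs, x):
    """Evaluate the same polynomial rewritten as c0 + x*(c1 + x*(c2 + ...))."""
    result, mults = coeffs[-1], 0
    for c in reversed(coeffs[:-1]):
        result = result * x + c
        mults += 1
    return result, mults


# 5 + 4x + 3x^2 + 2x^3 at x = 2
print(naive_eval([5, 4, 3, 2], 2))   # (41, 6): six multiplications
print(horner_eval([5, 4, 3, 2], 2))  # (41, 3): three multiplications
```

The saving looks small for four terms, but the naive count grows with the square of the polynomial's degree while the rearranged form grows only linearly.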

Just as humans have two types of memory, short-term and long-term, computers have two types: main and external. Main memory—your computer’s RAM—is easy to access on the fly but limited in size. External memory devices like hard disks and solid-state drives hold much more information but are slower. To solve a long equation, you need to store it in main memory so you can easily work with it.

In the ’80s, both types of memory were limited. “FORM was built in a time when there was almost no memory, and also no disk space—basically there was nothing,” said Ben Ruijl, a former student of Vermaseren’s and FORM developer who is now a postdoctoral researcher at the Swiss Federal Institute of Technology Zurich. This posed a challenge: Equations were too long for main memory to handle. To calculate one, your operating system needed to treat your hard disk as if it were also main memory. The operating system, not knowing how big to expect your equation to be, would store the data in a collection of “pages” on the hard disk, frequently switching between them as different pieces were needed—an inefficient process called swapping.

This xkcd comic illustrates the situation well.

Illustration: xkcd.com

FORM bypasses swapping and uses its own technique. When you work with an equation in FORM, the program assigns each term a fixed amount of space on the hard disk. This technique lets the software more easily keep track of where the pieces of an equation are. It also makes it easy to bring those pieces back to main memory when they are needed without accessing the rest.
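The idea of giving each term a fixed amount of disk space can be sketched in a few lines. This is a toy model under stated assumptions, not FORM's actual storage format: the record layout (one integer coefficient plus three exponents) and the helper names `write_terms` and `read_term` are hypothetical, chosen only to show why fixed-size slots make a single term easy to find.

```python
import os
import struct
import tempfile

# Hypothetical layout for one term: a 64-bit coefficient and the
# exponents of x, y and z. "<" disables padding, so every record
# is exactly 8 + 3 * 4 = 20 bytes.
RECORD = struct.Struct("<q3i")


def write_terms(path, terms):
    """Write each (coeff, (ex, ey, ez)) term to disk as one fixed-size record."""
    with open(path, "wb") as f:
        for coeff, exps in terms:
            f.write(RECORD.pack(coeff, *exps))


def read_term(path, index):
    """Fetch term number `index` by seeking straight to its byte offset."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)  # no need to read the terms before it
        coeff, *exps = RECORD.unpack(f.read(RECORD.size))
    return coeff, tuple(exps)


# 3*x^2*z - 5*y + 7*x*y*z, stored on disk rather than in main memory
demo_path = os.path.join(tempfile.mkdtemp(), "expression.bin")
write_terms(demo_path, [(3, (2, 0, 1)), (-5, (0, 1, 0)), (7, (1, 1, 1))])
print(read_term(demo_path, 2))  # (7, (1, 1, 1))
```

Because every record has the same size, the position of any term is just its index times the record size, so the program can pull one piece into main memory without touching the rest.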

Memory has grown since FORM’s early days, from 128 kilobytes of RAM in the Atari 130XE in 1985 to 128 gigabytes of RAM in my souped-up desktop, a millionfold improvement. But the tricks Vermaseren developed remain crucial. As particle physicists pore over petabytes of data from the Large Hadron Collider to search for evidence of new particles, their need for precision grows, and with it the length of their equations.