Artificial intelligence gains an intuition for hunting exoplanets

For the first time, astrophysicists and computer scientists have partnered to find planets outside our solar system with artificial intelligence. One of these planets is the eighth found around its star, making that system the first beyond our own known to have eight planets.

Chris Shallue, a senior software engineer at Google Brain, Google’s artificial intelligence research hub, had just read about NASA’s Kepler mission, which focused on finding planets outside our solar system. But there was a problem.

Chris Shallue. Credit: GitHub.

“The Kepler mission had generated so much data that it was impossible for humans to search it for themselves,” Shallue said. “My ears pricked up because I’m someone who deals with large data sets all the time. With neural networks becoming popular only recently, I kind of assumed that probably nobody had tried to apply a neural network to this data before.”

A quick Google search turned up nothing of note. So Shallue reached out to Andrew Vanderburg, now an astrophysicist at the University of Texas at Austin, who was at the time a Harvard graduate student specializing in finding exoplanets.

Andrew Vanderburg. Credit: Harvard University.

“I was very interested,” Vanderburg said. “I had never really done machine learning before, but I knew it had huge potential, that it was the new hot thing. Eventually, we decided on this project to classify light curves into planets and not planets.”

First, here’s how the search for exoplanets usually works.

Planets go around other stars the same way we go around our Sun. If we’re looking at a star very far away and just happen to be at the right angle–looking at its system edge-on–then we see a slight dip in the intensity of that star’s light when a planet passes in front of the star.

Finding exoplanets with the transit method. Credit: Google.

You can think of it sort of like a partial solar eclipse… very partial. Since this is happening trillions of miles away, the effect is so small that it can only be seen with a powerful telescope.
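
To put a rough number on how small that dip is: if the star is treated as a uniformly bright disk, the fraction of light a planet blocks is approximately the square of the ratio of the two radii. The short Python sketch below works that out for an Earth-size and a Jupiter-size planet crossing a Sun-size star. The radii are standard round values and the function name is only illustrative; real analyses also account for effects such as a star appearing dimmer toward its edge.

```python
# Fraction of starlight blocked while a planet crosses the face of its star.
# Assumes a uniformly bright stellar disk and a planet fully in front of it.

SUN_RADIUS_KM = 695_700.0      # nominal solar radius
EARTH_RADIUS_KM = 6_371.0      # mean Earth radius
JUPITER_RADIUS_KM = 69_911.0   # mean Jupiter radius


def transit_depth(planet_radius_km: float, star_radius_km: float) -> float:
    """Return the blocked fraction: (planet radius / star radius) squared."""
    if planet_radius_km <= 0 or star_radius_km <= 0:
        raise ValueError("radii must be positive")
    return (planet_radius_km / star_radius_km) ** 2


earth_depth = transit_depth(EARTH_RADIUS_KM, SUN_RADIUS_KM)
jupiter_depth = transit_depth(JUPITER_RADIUS_KM, SUN_RADIUS_KM)
print(f"Earth-size planet, Sun-size star: {earth_depth * 1e6:.0f} parts per million")
print(f"Jupiter-size planet, Sun-size star: {jupiter_depth * 100:.2f} percent")
```

Under these assumptions an Earth-size planet dims a Sun-size star by less than a hundredth of a percent, while a Jupiter-size planet blocks about one percent.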

In 2009, NASA deployed a space-based telescope named Kepler to scan for stars with these distinct dimming patterns. Even from the beginning, software was required to sift through the mountains of data.

“It’s basically a huge automated test,” Vanderburg said. “It asks the question, ‘Could there be a planet here?’ lots and lots of times. So it asks, ‘Could there be a planet with a period of 1.0001 days?’ Then it asks, ‘What about 1.0002 days?’”

If the observed signal matches one of these predicted orbits above a certain threshold, it is logged for further analysis. This is known as a brute-force method: a straightforward but time-intensive trial-and-error approach.
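
The idea of asking the same question at one trial period after another can be sketched in a few lines of Python. This is a toy version run on made-up data, not the software the Kepler team used: the star, the planet, the noise level, and every name and constant here are assumptions chosen for illustration, and the period grid is far coarser than the 0.0001-day steps Vanderburg describes.

```python
import random

SAMPLES_PER_DAY = 20   # one brightness measurement every 0.05 days (assumed)
SPAN_DAYS = 30         # length of the made-up observation
TRUE_PERIOD = 3.0      # hidden orbital period, in days
TRUE_EPOCH = 1.0       # midpoint of the first transit, in days
DURATION = 0.12        # how long each transit lasts, in days
DEPTH = 0.01           # fraction of starlight blocked during a transit


def in_transit(t, period, epoch):
    """True if time t falls within half a duration of a predicted transit midpoint."""
    offset = (t - epoch + period / 2) % period - period / 2
    return abs(offset) <= DURATION / 2


def make_light_curve(seed=42):
    """Synthetic star brightness: 1.0 plus small noise, dimmed during transits."""
    rng = random.Random(seed)
    times = [n / SAMPLES_PER_DAY for n in range(SPAN_DAYS * SAMPLES_PER_DAY)]
    flux = []
    for t in times:
        brightness = 1.0 + rng.gauss(0.0, 0.001)
        if in_transit(t, TRUE_PERIOD, TRUE_EPOCH):
            brightness -= DEPTH
        flux.append(brightness)
    return times, flux


def brute_force_search(times, flux, trial_periods):
    """Try every (period, epoch) pair and keep the one with the deepest average dip."""
    best_score, best_period, best_epoch = 0.0, None, None
    for period in trial_periods:
        for j in range(round(period * SAMPLES_PER_DAY)):
            epoch = j / SAMPLES_PER_DAY
            window = [f for t, f in zip(times, flux) if in_transit(t, period, epoch)]
            if not window:
                continue
            score = 1.0 - sum(window) / len(window)  # dip below the baseline of 1.0
            if score > best_score:
                best_score, best_period, best_epoch = score, period, epoch
    return best_score, best_period, best_epoch


times, flux = make_light_curve()
trial_periods = [i / 20 for i in range(50, 71)]  # 2.5 to 3.5 days in 0.05-day steps
score, period, epoch = brute_force_search(times, flux, trial_periods)
print(f"best period: {period} days, first transit near day {epoch}, dip: {score:.4f}")
```

Even this small example tests more than a thousand period-and-timing combinations against 600 brightness measurements, which hints at why the real search over years of data for each star is so time-intensive.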

During the four-year mission, Kepler looked at roughly 150,000 stars and flagged 30,000 signals as potential candidates for further review. Humans, aided by computer programs, pored over that smaller dataset and found that more than 2,000 of those signals corresponded to actual exoplanets.

The Kepler spacecraft. Credit: NASA.

Astrophysicists are very cautious when looking through this data; this final batch represented the clearest, strongest signals of exoplanets transiting in front of their stars.

“False positives are an ever-present reality you have to deal with when you search for exoplanets,” Vanderburg said. It’s not uncommon for scientists, especially inexperienced and overly excited graduate students, to see planets where there are none.

“I know that I had this problem as well,” Vanderburg said. “When I first started trying to find planets in the Kepler data, I found a lot of false positives.”

It took Vanderburg years to develop an intuition for classifying these tricky signals. Shallue wondered if an artificially intelligent program could learn that same intuition and pick out even fainter signals the humans had missed.

“What about the signals that are slightly weaker?” Shallue mused. “Maybe there’s some real gems hiding in there.”

So, Shallue and Vanderburg wrote a machine learning algorithm for identifying exoplanets from Kepler’s starlight data. To train it, they fed the algorithm examples of exoplanets that had already been correctly identified. The software then wrote its own rules for classifying exoplanets.

The machine learning algorithm learns from past data and makes its own classifying rules. Credit: Google.
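
What it means for software to work out its own rules from labeled examples can be shown with a very small model. The sketch below trains a logistic regression, a much simpler stand-in for the neural network Shallue had in mind, on synthetic light curves that either contain a central dip ("planet") or only noise ("not planet"). The data, the model choice, and all names and settings are assumptions made for illustration and say nothing about how the researchers' own classifier was built.

```python
import math
import random

N_BINS = 11  # brightness samples per folded light curve (assumed)


def make_curve(rng, is_planet):
    """Baseline-subtracted curve: noise everywhere, plus a central dip for planets."""
    curve = [rng.gauss(0.0, 0.2) for _ in range(N_BINS)]
    if is_planet:
        depth = rng.uniform(0.5, 1.5)
        for i in (4, 5, 6):
            curve[i] -= depth
    return curve


def make_dataset(rng, size):
    """Alternate planet (1) and not-planet (0) examples."""
    labels = [i % 2 for i in range(size)]
    curves = [make_curve(rng, label == 1) for label in labels]
    return curves, labels


def predict(weights, bias, curve):
    """Probability that the curve shows a planet, according to the model."""
    z = bias + sum(w * x for w, x in zip(weights, curve))
    z = max(-30.0, min(30.0, z))  # keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(curves, labels, steps=300, learning_rate=0.5):
    """Plain gradient descent: nudge the weights to reduce mistakes on the examples."""
    weights, bias = [0.0] * N_BINS, 0.0
    n = len(curves)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * N_BINS, 0.0
        for curve, label in zip(curves, labels):
            error = predict(weights, bias, curve) - label
            grad_b += error
            for i, x in enumerate(curve):
                grad_w[i] += error * x
        bias -= learning_rate * grad_b / n
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


rng = random.Random(0)
train_curves, train_labels = make_dataset(rng, 200)
test_curves, test_labels = make_dataset(rng, 200)  # examples the model never saw
weights, bias = train(train_curves, train_labels)

hits = sum((predict(weights, bias, c) > 0.5) == (y == 1)
           for c, y in zip(test_curves, test_labels))
agreement = hits / len(test_labels)
print(f"agreement with held-out labels: {agreement:.0%}")
```

Nobody tells the model that the middle of the curve matters. It starts with every weight at zero and ends up leaning on the central bins only because that is where the labeled examples differ.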

After just 90 minutes of training, the machine learning algorithm had developed an intuition rivaling any astrophysicist’s: its classifications matched the scientists’ 96% of the time. This is a much more elegant and efficient process than the brute-force check currently in use.

Perhaps it is even better than people. “Maybe some of the human labels were wrong!” Shallue laughed.

So far, the duo has used the algorithm to sort through more than 600 of the weaker signals, with some exciting findings. The machine learning software picked out a new planet in each of two planetary systems that had previously been combed through by humans.

Grabbing all the headlines was Kepler 90i, whose discovery brings its system to eight planets, making it the first system outside our own known to have as many planets as ours. Kepler 90i, slightly larger than Earth, sits so close to its star that its surface is likely hotter than 800 degrees Fahrenheit, and it completes an orbit in a dizzying 14.4 days.

Rendition of the Kepler 90 system. Credit: Google.

“Kepler 90i is the one that gets all the attention because it’s the eight-planet system,” Vanderburg said. “But, to be honest, I was more fond of Kepler 80g.”

That’s because Kepler 80g, the other exoplanet the duo discovered, is part of what’s called a resonance chain. In a resonance chain, the planets of a system settle into orbits whose periods are neat ratios of one another–for example, an inner planet of such a system might go around its star three times in the same time it takes a planet farther out to complete two orbits.
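
One simple way to spot such neat ratios is to divide each planet’s orbital period by that of its inner neighbor and look for the closest small-number fraction. The Python sketch below does this for three invented periods. They are not measurements of any real system, and the function name and the cap on the denominator are assumptions.

```python
from fractions import Fraction

# Hypothetical orbital periods in days, innermost planet first (not real measurements).
periods_days = [2.0, 3.01, 4.0]


def nearest_simple_ratio(inner_period, outer_period, max_denominator=5):
    """Closest fraction with a small denominator to outer_period / inner_period."""
    return Fraction(outer_period / inner_period).limit_denominator(max_denominator)


for inner, outer in zip(periods_days, periods_days[1:]):
    ratio = nearest_simple_ratio(inner, outer)
    # A ratio of 3:2 means the inner planet completes three orbits
    # in the time the outer planet completes two.
    print(f"{outer} d / {inner} d = {outer / inner:.3f}, "
          f"close to {ratio.numerator}:{ratio.denominator}")
```

With these made-up periods the first pair sits close to 3:2 and the second close to 4:3.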

In the Kepler 80 system, four planets were already known to be in just such a resonance chain. These chains are like finely tuned cosmic harmonies, and this one gave the researchers a special way to confirm that what their algorithm had found truly was a new exoplanet.

“If you’re in this special configuration, you can make extremely strong predictions about what the orbital periods for any additional planets in that system are,” Vanderburg said. “When I started playing around with that math, I realized that if there was another planet in the Kepler 80 system and if it was in this resonance chain, the predicted orbital period was less than two minutes different than the orbital period that we totally independently measured.”

Considering that the orbital period of Kepler 80g is also around 14 days, agreement between observation and theory to within a couple of minutes is remarkably precise.

“That’s like measuring the height of a person to better than the width of a human hair… having only met their siblings,” Vanderburg said.
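
The arithmetic behind that comparison can be checked in a few lines. The sketch below takes the rounded figures quoted above (a period of about 14 days and a mismatch of two minutes, treated as an upper bound) and applies the same fractional precision to a person’s height. The 1.7-meter height is an assumption added for the analogy.

```python
MINUTES_PER_DAY = 24 * 60

period_days = 14.0        # "around 14 days", as stated above
mismatch_minutes = 2.0    # "less than two minutes", used here as an upper bound
person_height_m = 1.7     # assumed height for the analogy

fractional_mismatch = mismatch_minutes / (period_days * MINUTES_PER_DAY)
height_equivalent_mm = person_height_m * 1000 * fractional_mismatch

print(f"mismatch: about 1 part in {1 / fractional_mismatch:,.0f}")
print(f"same precision on a {person_height_m} m person: {height_equivalent_mm:.2f} mm")
```

With these rounded inputs the mismatch comes to about one part in 10,000, which on a person of that height is under a fifth of a millimeter, the scale on which hair thickness is usually quoted.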

Another testament to the skill of the machine learning classifier is its ability to discern a specific, particularly tricky false positive called a secondary eclipse.

Imagine a system where, instead of one star with a bunch of orbiting planets, there are two stars orbiting each other. If one star is smaller and dimmer than the other, its passage in front of its brighter companion would also block out some light and cause a dip in the starlight intensity reminiscent of an exoplanet transit.

The secondary eclipse occurs when the bigger star passes in front of the smaller star, which also dims the light, but much less so. This very small dip can get lost in the noise of the data, causing the system to be misclassified as having a planet.

This kind of false positive is especially difficult to discern and requires a healthy dose of experience and intuition–but the machine learning algorithm was able to pick out even this.
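
To see why the secondary eclipse is both a giveaway and easy to miss, consider the hand-written check sketched below. It is not what the machine learning model does internally. It builds two synthetic folded light curves, one for a planet and one for a pair of stars, and measures the average brightness half an orbit after the main dip, which is where a secondary eclipse falls if the orbit is circular. The depths, noise level, threshold, and all names are assumptions.

```python
import random

N_POINTS = 2000      # samples across one folded orbit (assumed)
HALF_WIDTH = 0.02    # half-width of each dip, as a fraction of the orbit
NOISE = 0.0005       # scatter of each brightness sample
THRESHOLD = 0.0005   # average dip at phase 0.5 that counts as suspicious


def near(phase, center):
    """True if phase is within HALF_WIDTH of center, wrapping around the orbit."""
    return abs((phase - center + 0.5) % 1.0 - 0.5) <= HALF_WIDTH


def folded_curve(rng, primary_depth, secondary_depth):
    """Brightness versus orbital phase with dips at phase 0 and phase 0.5."""
    phases = [n / N_POINTS for n in range(N_POINTS)]
    flux = []
    for p in phases:
        value = 1.0 + rng.gauss(0.0, NOISE)
        if near(p, 0.0):
            value -= primary_depth
        elif near(p, 0.5):
            value -= secondary_depth
        flux.append(value)
    return phases, flux


def mean_near(phases, flux, center):
    """Average brightness of the samples close to the given phase."""
    picked = [f for p, f in zip(phases, flux) if near(p, center)]
    return sum(picked) / len(picked)


def measured_secondary_depth(phases, flux):
    """Average dip at phase 0.5 relative to the quiet stretches at 0.25 and 0.75."""
    baseline = (mean_near(phases, flux, 0.25) + mean_near(phases, flux, 0.75)) / 2
    return baseline - mean_near(phases, flux, 0.5)


rng = random.Random(1)
planet = folded_curve(rng, primary_depth=0.01, secondary_depth=0.0)
binary = folded_curve(rng, primary_depth=0.01, secondary_depth=0.001)

for name, (phases, flux) in (("planet", planet), ("binary", binary)):
    depth = measured_secondary_depth(phases, flux)
    verdict = "secondary eclipse suspected" if depth > THRESHOLD else "no secondary eclipse seen"
    print(f"{name}: dip at phase 0.5 = {depth:.5f} -> {verdict}")
```

In this setup the secondary dip is only twice the scatter of a single measurement, so it is hard to see point by point, yet averaging the samples around phase 0.5 makes it stand out.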

The machine learning algorithm is able to pick out, in green, the hard-to-detect secondary eclipse false positive. Credit: Harvard University.

“It’s hard for a human to see, but the model has very clearly picked it out,” Shallue said. “That’s my favorite one, that the model is picking up on the subtleties.”

It’s this ability to learn from experience and pick up an intuition for solving a problem that makes machine learning such a powerful tool for use in science.

As we are able to collect more and more data in our experiments, “we need automated ways to examine that data,” Shallue said. He is excited about other Google projects working to integrate machine learning into the pipeline of the scientific method, from the molecular modeling of pharmaceuticals to the analysis of fusion energy experiments.

Right now, it seems like it is astronomy’s turn, and the pair is excited to keep crunching through the Kepler data with their new machine learning algorithm.

“I think that machine learning is really about to make waves in astronomy,” Vanderburg said.

This raises the question: if artificial intelligence discovers something really intelligent on another planet, will the machines decide we’re not smart enough to handle the news?

Banner image credit: JPL, modified by author.
