12/08/2021, jeff

how to read a scientific paper all the way through

i. the intro

last time, we focused on how scientific papers are typically built, as well as how to extract the surface level information from them. this time, we'll focus on engaging more completely with a paper, thinking more critically about what it's trying to say, and picking up threads to follow from there.

like before, i picked a paper to read alongside this piece. the paper is titled 'direct imaging of a cold jovian exoplanet in orbit around the sun-like star gj504', written by kuzuhara and a small army of scientists in 2013. this article is available as a free pdf from its publication, so i encourage you to download it there. additionally, i'd be happy to send you a copy of my marked-up version, if that would be a helpful reference - just get in touch and i'll email it to you. i'll go through my reading process first, then we'll apply it to this article.

before we start reading the paper, there are a few things that we can look at to contextualize it within its field of study. the first is where it was published, the astrophysical journal. this is an open access publication that has been publishing several volumes a year for decades. as of writing, they have 922 volumes going back past 1995. this is a long-running journal, and is affiliated with a couple of leading scientific organizations, which means it is most likely a reputable source.

the next thing to take a peek at is the number of times the article has been cited by peers in other papers. the more often a paper is cited, the more likely the information inside is viewed as valuable by the community. the paper we're reading has been cited 191 times, which is a shit ton. i feel like most good papers will get a few dozen. this is definitely an outlier, and makes me wonder if i accidentally stumbled upon some sort of foundational work of the last ten years.

finally, while this isn't as important an indicator, the sheer number of authors on this paper lends a little credibility to it as a rigorous scientific work, or at least as one recognized by the scientific community. there is not nearly enough time or money in modern academia for this many people to waste on a bogus project.

that's kind of a lot to do before even getting to the paper, but these kinds of things are really important to understanding how much you can trust the authors and the methodology in the paper, especially if you are not an expert in whatever you're reading.

ii. the method

starting the same way we did last time, we'll try to pull as much relevant information as we can from the title, the abstract, and the conclusion. once we've got a handle on the gist of the article, we can dive a little deeper. what i like to do is make a list of the major takeaways as well as any questions i have. that way, i have both a reference to look back at and a framework for how i read through the rest of the paper.

and from there, it becomes a bit of a free-for-all. most of the time, i'll head right back to the introduction and dig through it for context. i usually find answers to at least one or two of my questions in the introduction, so i've found that it's a good place for me to start.

continuing on through the paper, i'll usually do an incredibly light pass over the methods and results sections. these sections are very dry, and are tough to get through without the requisite education or experience. maybe i'll read the first couple paragraphs, or take a peek at the figures and tables, but ultimately i try not to get bogged down here. by now, the authors should have made clear what they're doing and how they're doing it. if you need details, this is the place to get them.

before long, i'll find myself at the discussion section, where all the good stuff happens. my graduate advisor would always say that observations belonged in the results section, and interpretations belonged in the discussion section. i think that's a great way to set expectations about what you'll find in each. along with the introduction, the discussion is where i find the answers to most of my questions, so i'll read through this section a little more carefully.

this is how i'll approach articles most of the time, but you can do this bit in any order that suits you. if you're building a model, you might want to head right to the methods, where the assumptions and conditions of the authors' model would be laid out. if you got through the abstract and conclusion and are still having trouble understanding the authors' point, you might head right to the discussion to find it. use a list of takeaways and questions to guide you through the paper. if you're not sure what to ask, that's fine too! the questions just help you stay on task as you wade through dense writing or arcane results tables.

iii. kuzuhara et al., 2013

okay, this is already getting quite long, so let's quickly go through the paper. as always, i started with the title and the abstract. the title is very straightforward, and lets me know that the authors have directly imaged a planet orbiting a star similar to our sun. the abstract builds on this, claiming that the planet is significant in that it is older and colder than other directly imaged planets, and that this is important because it reduces uncertainties in the formation models that allow us to identify age and temperature. i'm a bit skeptical, as they estimate the age of the planet to be 160 million years, which does not seem much older than the 50 million year old planets they are comparing against. either way, i head to the conclusion to try and get a bit more information.

i love this conclusion. the authors write one or two sentences about how they calculated the age of the planet, then put down seven! bullet! points! summarizing their interpretations. this is how people should be writing conclusions. big takeaways for me here were the authors backing up their 'this planet is special' claims with numbers, as well as referencing the methodologies they used for calculating the age. they also mention observing the planet for one year.

so here's what i've got so far, from the title, abstract, and conclusion:

some takeaways:

  1. gj504 is a sun-like star, and gj504b is a jupiter-like planet that orbits it
  2. gj504b is the oldest exoplanet to be imaged, making observations of it more reliable than those of younger planets
  3. the planet's age has been estimated using a combination of gyrochronology and chromospheric activity
  4. studying colder, less cloudy exoplanets offers greater insight into their atmospheric composition

and some questions:

  1. what is the seeds direct-imaging survey, and how does it observe exoplanets?
  2. is one year a sufficient exoplanet data collection period?
  3. what are gyrochronology and chromospheric activity, and how do they relate to age?
  4. what is the basis for gj504b being old enough to be less dependent on model uncertainty?
  5. what do the authors mean in conclusion point 3 by 'mass near the solar mass'?
  6. what do the authors mean in conclusion point 6 by 'novel parameter space of atmospheres'?

from here, i moved up into the introduction, which offered a wealth of answers to my questions. seeds uses a ground-based telescope to measure the exoplanet's position and luminosity. these observations, combined with the age of the host star, can be used to estimate the planet's mass. there are two models that are typically used to estimate mass with this data, and they will consistently give the same result if the planet is older than 100 million years. in the first few paragraphs, questions 1 and 4 have already been answered. the rest of the introduction lays out the structure of the paper, which is nice too.

after that, i read the very beginning and very end of the stellar properties section, skipping four pages of detail on different methods for determining the age of a solar system. if i'm interested, i'll go back and read it, but gyrochronology and chromospheric activity are described in the first paragraph, answering question 3, so i've got all i need for now. i give the observations section the same treatment, reading the first paragraph for some juicy tidbits on the telescope specs (and the first formal definition of seeds, which is coming way too late), before hopping five pages ahead, past the results section, stopping only to interpret the figures and charts on the way. a lot of it is proof that gj504b is a planet, and that its age estimates are consistent. cool, but i'm not interested in getting in the weeds about that right now.

finally, we reach the discussion. the authors demonstrate how the light observed coming from gj504b differs from that of other observed exoplanets, and relate it to atmospheric makeup, answering question 6. there's more interesting reading on planetary formation methods and how they stack up against the observations of gj504b, and a quick blurb on where studies can go from here, and that's a wrap on the paper.

iv. reading further and wrap-up

looking at my question list, i got explicit answers for four of my six questions, and the other two i'm able to wave away with a 'probably' (question 2) and an 'i guess that's just a clarifying descriptor' (question 5). but let's say i wasn't satisfied with those answers, or i wanted to learn more. what do i do in that case? well, at the end of the article is a long list of references cited in the paper. we can use these to do more reading on the topic, picking up on threads we found particularly interesting.

let's take question 2 as an example: is one year a sufficient amount of time to collect data on an exoplanet? because these observations form the basis of the argument, this is a really good question that helps us understand whether the authors did their due diligence or whether they rushed their observations. the answer may be in the text, but i also know that the authors repeatedly mention other directly-observed exoplanets. so, heading back to the introduction, i find this line in the first few paragraphs: 'previously imaged exoplanets are all younger than 50 myr (marois et al. 2008; lagrange et al. 2010; carson et al. 2013).'

the authors and dates in parentheses refer to papers or books that back up whatever claim the author is making. the article information can be found in that references section. heading back down there, we find three articles:

the information in these references is quite sparse, unfortunately, but the pdf has direct links embedded in the text that you can follow. with this information, we can look at other surveys of directly-imaged exoplanets and see for how long those studies made similar observations. even without being astronomers, we can compare these papers to get an idea of whether the method is rigorous or not.

i know this was a lot to get through, but hopefully this sheds some light on what should be a much easier process. the more papers you read, the easier this process gets, so don't be discouraged if it's hard at first. and as always, please feel free to get in touch with me if you have any questions or comments.