Saturday, January 9, 2010

The Joys Of Working With Data!

One great thing about my situation is that I get to do both theory and data analysis.  I am allowed to try to work out predictions of different inflation scenarios and am then told to take the publicly available WMAP data and go looking.

However, working with data is such a pain!!!  Raw data isn't ready for you to make your specific measurement without a headache.

I would like to illustrate this with a basic example.  Many of you are familiar with one of the most important results from CMB research, the power spectrum.  (Seen at the upper right.)

I was told: "As a first step, take the raw maps, use HEALPix, and try to reproduce WMAP's power spectrum above."

"Okay, easy enough!" I thought.  And here is the result:

You can imagine how depressed I felt seeing that for the first time.  A postdoc reminded me that the data contains the galactic plane (as seen to the right) and is in different units than the plot above.  Remove the plane and correct for the units, and it should look better.

After I spent a couple days figuring out how to do this properly, I replotted the data and it came out looking like this:

Okay, much better, but it still looks bad.  By this time it was the weekend and the postdocs weren't around, so I decided to email the first author of the WMAP power spectrum paper himself to figure out what was going on. (Probably not the wisest thing, but it worked.)

He was nice, and I found out I still needed to:
  1. Correct for the "beam" of the satellite.
  2. Correct for the noise of the satellite.
  3. Make further messy corrections to compensate for having removed the galactic plane.
  4. Check my code for bugs.
  5. etc...
After all this I finally got a plot looking like theirs, with all the peaks and features in their correct places.  (So, I can say I have verified WMAP's results.  Nothing fishy they're hiding!)
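
For the curious, here is a rough sketch of what those corrections look like in code. This is not my actual pipeline and not WMAP's method, just a toy version under some loud assumptions: the beam is a simple Gaussian, the noise spectrum is already known, and the missing galactic plane is compensated for by simply dividing by the fraction of sky that is left (the real correction is much messier). All the function and parameter names are made up for illustration.

```python
import math


def gaussian_beam_window(lmax, fwhm_arcmin):
    """Window function B_l of an idealized Gaussian beam, for l = 0..lmax."""
    # Convert the full width at half maximum to a Gaussian sigma in radians.
    sigma = math.radians(fwhm_arcmin / 60.0) / math.sqrt(8.0 * math.log(2.0))
    return [math.exp(-0.5 * l * (l + 1) * sigma ** 2) for l in range(lmax + 1)]


def correct_pseudo_cl(pseudo_cl, f_sky, beam, noise_cl, unit_factor=1.0):
    """Rough estimate of the sky C_l from a spectrum measured on a masked map.

    pseudo_cl   : spectrum measured on the masked map
    f_sky       : fraction of the sky left unmasked (0 < f_sky <= 1)
    beam        : beam window B_l (assumed Gaussian in this sketch)
    noise_cl    : full-sky noise power spectrum, assumed known
    unit_factor : factor applied to map units, e.g. 1e3 for mK -> uK
    """
    corrected = []
    for cl, bl, nl in zip(pseudo_cl, beam, noise_cl):
        cl_full_sky = cl / f_sky        # crude compensation for the mask
        cl_signal = cl_full_sky - nl    # remove the noise bias
        cl_sky = cl_signal / bl ** 2    # undo the beam smoothing
        # A power spectrum is quadratic in the map, so the factor is squared.
        corrected.append(cl_sky * unit_factor ** 2)
    return corrected


def to_plotted_spectrum(cl):
    """Convert C_l to l(l+1) C_l / (2 pi), the quantity usually plotted."""
    return [l * (l + 1) * c / (2.0 * math.pi) for l, c in enumerate(cl)]
```

You would feed `correct_pseudo_cl` the spectrum measured from the masked map and then pass the result through `to_plotted_spectrum` before comparing with the published plot. Skipping any one of these steps gives a curve with the wrong amplitude or the wrong shape at small scales, which is roughly what my first attempts looked like.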

Now, that is how good working with data is when you correspond with people who have already made the same measurement.

My, how messy things have been now that I am doing new things, with new problems nobody else on earth has taken the time to figure out how to correct for.

Let's just say, analyzing data is a messy, stressful headache, and I salute people who pioneer analyzing new data sets with all the "unknown unknowns" that rear their ugly heads.


  1. I just had an experience (as in, I just got out of the meeting) similar to that. In my models I have to put the system into hydrostatic equilibrium, which in almost any class is presented as a simple task. First take the conservation of momentum equation, make some assumptions (i.e. static), and then all you have to do is solve for the pressure. In most textbooks it is presented as a straightforward and simple problem.

    It is a simple problem until you actually have to do something useful; then it is a headache. I was meeting with my advisor and expressing my thoughts, and his initial response was, "Oh it's simple, you just..." Then he started thinking about it and ended up with, "Hmmm. How WOULD you do that?" Then I didn't feel so bad for not figuring it out over the weekend. (Fortunately our particular problem turns out to be linear, so it can be solved through simple addition, but again that is a special case.)

    So, as frequently happens, most problems in physics, especially those that involve actual data, start out with "Oh it's simple, you just..." and end up with "Hmmm. I don't know how to do that... I will have to think about it."

  2. Hello, there is something about a petition which could interest you in the top right-hand corner of my blog.

