Don Backer, 1943 – 2010

Don Backer

Don in 2007 (picture from Colby Gutierrez-Kraybill)

On Sunday I received some terrible news: that Don Backer, a professor in the UC Berkeley Astronomy Department, had collapsed outside his home and died, probably due to a heart attack. Compared to many of the old hands in the department, I only knew Don very briefly; but in the relatively short time that I’ve been at Berkeley, he became someone that I really looked up to and had tremendous respect for, both as a scientist and as a person. For someone his age (which wasn’t young, but was definitely far younger than it should have been) he was amazingly active on all fronts: pushing genuinely exciting research, performing valuable service for the department and the radio community, and being a great mentor to a younger generation of radio astronomers, of which I’m fortunate to count myself as a member. I’ll miss him a lot.

(The Nature blog published a short obituary for Don here.)

Posted in Uncategorized | Leave a comment

Qual Reference: Radio Transient Surveys

I’ve already tabulated a bunch of surveys in the scientific merit post, but here I’ll add some extra information on a few of the key ones. I’ve attempted to compute quantities as uniformly as possible, but this can be pretty difficult. Carets denote quantities I’ve estimated indirectly myself, rather than gotten fairly directly from one of the references.

Quantity                    AGCTS    Hyman+ GMRT   Becker+ VLA   Bower+ VLA
Total time (h)              ~200^    66            ~200^         315^
FWHM (deg)                  1.13     2             0.17^         0.17, 0.1^
Freq (GHz)                  3.09     0.235         4.86          5, 8.4
Snapshot RMS (mJy/bm)       ~35^     3-10          0.3^          0.04-0.05
Snapshot duration (min)     ~5^      138           ~1.5^         ~20
2-epoch eff. area (sq.deg)  854      69            23.2          30.97

Still not quite sure that I’m computing the 2-epoch effective area correctly, and I’m definitely sure that I’m being sloppy since that metric is technically a function of the detection limit. I’m basically taking N_snapshot * A_snapshot for that assessment.
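As a rough sketch of that N_snapshot * A_snapshot bookkeeping: the code below assumes each snapshot covers a circular patch out to the primary-beam FWHM, and the function names are mine, purely for illustration. It ignores the dependence on detection limit, so it inherits the same sloppiness.

```python
import math

def snapshot_area_sq_deg(fwhm_deg):
    """Area of a circular field out to the FWHM, in square degrees."""
    return math.pi * (fwhm_deg / 2.0) ** 2

def two_epoch_effective_area(n_snapshots, fwhm_deg):
    """Crude effective area: number of snapshots times area per snapshot.

    Assumption: no correction for the detection limit or for the
    sensitivity falling off across the beam.
    """
    return n_snapshots * snapshot_area_sq_deg(fwhm_deg)

# An ATA pointing with FWHM = 1.13 deg covers very nearly 1 sq. deg.
print(round(snapshot_area_sq_deg(1.13), 2))
```

With this definition the effective area is simply proportional to the number of snapshots, so the hard part is deciding what counts as a snapshot for each survey.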

(Hand-written table in not-ATA notebook #1 p. 73.)


Qual: Feasibility Conclusions

Based on the legwork I’ve done, what arguments can we make about the feasibility of my thesis and how should I modify my plans to make them more realistic?

Emphasize gamma rays in Cyg X-3 work

Even though I keep on getting emails about X-ray observations of Cyg X-3, I shouldn’t lose sight of the fact that the key novelty for our Cyg X-3 work is the recent detection of it in gamma rays by AGILE and Fermi (Tavani+ 2009, Fermi LAT Collaboration 2009). We’ll have near-continuous monitoring of X-3 from Fermi’s all-sky mapping activities, though I need to get a better handle on what that dataset will look like. I also need to think about what kind of radio/gamma variability correlations we might find. Geoff suggests that previous work on blazars should provide a good starting point: say, things building on Maraschi+ 1992. For instance, the Fermi team was only able to constrain the delay between radio/gamma emission peaks as 5 ± 7 days. Can we do better?

Emphasize gal90 as part of AGCTS

We’re collecting data for the Cyg X-3 project in exactly the same way as we are for the AGCTS and we should make sure to emphasize that these data can be used for transient searching as well. The gal90 project data add to the AGCTS dataset:

  • 5 pointings (including Kepler field)
  • 164 scans at 3.14 GHz
  • 66 hours of data
  • Mean duration of 0.40 hr
  • Median duration of 0.325 hr

Which is nothing to sneeze at.

Explore Metrics for AGCTS

From my attempts to compare the AGCTS and other existing surveys, and my attempts to apply existing measurements of transient rates to the expected AGCTS dataset, it’s become pretty clear that the transient parameter space is too large to allow one or two simple, definitive comparisons. I need to think about a few metrics for transient searches and rates that will demonstrate most clearly the way in which AGCTS is different than existing surveys and in which existing transient rate measurements apply (or fail to apply) to the AGCTS dataset.

The context for this point is that I had failed to appreciate how severely the move to higher frequencies affected the survey speed of the AGCTS. From my initial computations, I had expected it to work out to be clearly superior to the Hyman+ survey; now it’s pretty much inferior. Because of the different chunk of parameter space that we’re carving out, I think we still have interesting things to say about radio transients, but that perspective is definitely a shift in how I think about the AGCTS project.

Reference Posts

There are a few smaller topics that I should write reference posts on:

  • Characteristics of relevant radio transient surveys in the literature
  • Transient rate measurements in the literature
  • Summary of blazar gamma/radio correlation research and its potential applicability to Cyg X-3.

Qual: Feasibility Legwork

(Second deadline: also blown, but no slippage relative to the previous one.)

Can I reasonably expect to accomplish what I’m setting out to accomplish? This post will present some of the quantitative legwork needed to answer that question.

Cyg X-3

There are two key feasibility issues for the Cyg X-3 project in my view:

  1. Can we successfully process the data?
  2. Will the dataset allow us to do interesting science?

1. We can successfully image the data:

An ATA continuum map of the Cygnus X-3 region (3.14 GHz, 80 MHz BW).

It’s a complicated region, but we can clearly get something decent out, and we can detect Cyg X-3 nicely.

But the actual dataset that we’re looking to create is a time-series of flux measurements for Cyg X-3. Can we obtain sufficiently precise and high-cadence fluxes? This is, in a sense, a moving-goalpost problem: we can always scale back our goals for precision and/or cadence, and sooner or later they’ll be achieved. It might bode ill for the well-posedness of the scientific problem that there isn’t any particular cutoff at which we think it’ll become impossible to obtain interesting results, though if we have to scale back so much that our dataset will be very similar to existing radio datasets, that’s a bad sign.

From analysis of some of the images that I’ve made of the Cygnus and GC regions, it appears that I can expect to achieve a “noise factor” (NF) of 10 mJy sqrt(hr) / bm for one correlator — that is, a 1-hr integration should give me an RMS noise of 10 mJy/bm, a 10-minute integration about 25 mJy/bm, etc. Still considering a single correlator, this is about twice what I see on long integrations on a calibrator, which involve much less crowded fields but are quite possibly dynamic-range-limited (DR ~ 2000-3000 in the images I was looking at). This is also about 15 times the naive expectation you get based on the performance of the individual antpols (ignoring that data get flagged, that we don’t get all baselines, etc.) This number is definitely conservative, and could be improved by improvements in the analysis pipeline, calibration, array performance, etc. I actually haven’t done a whole lot of dual-correlator work, so discounting a little bit you’d estimate having a dual-correlator NF of 8 or so, and with other analysis improvements, an NF of 5 seems quite achievable.

But let’s proceed with the current figure of NF = 10. Since Cyg X-3 is about 100 mJy, that means that if we chunk our data into 10-minute intervals we get ~4-sigma detections in each chunk, which seems good. This also yields about 30 samples per 4.8-hr orbit, which is also good.
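The arithmetic behind those numbers can be sketched as follows. The only assumption beyond the figures quoted above is that the noise scales as one over the square root of the integration time; the variable names are mine.

```python
import math

def rms_mjy_per_beam(noise_factor, integration_hr):
    """RMS noise for an integration, assuming noise ~ 1/sqrt(time)."""
    return noise_factor / math.sqrt(integration_hr)

NF = 10.0               # mJy sqrt(hr) / bm, single correlator
FLUX_MJY = 100.0        # approximate Cyg X-3 flux
ORBIT_HR = 4.8          # orbital period
CHUNK_HR = 10.0 / 60.0  # 10-minute chunks

rms = rms_mjy_per_beam(NF, CHUNK_HR)
significance = FLUX_MJY / rms
samples_per_orbit = ORBIT_HR / CHUNK_HR

print(f"RMS per chunk: {rms:.1f} mJy/bm")
print(f"Detection significance: {significance:.1f} sigma")
print(f"Samples per orbit: {samples_per_orbit:.0f}")
```

Swapping in NF = 5 doubles the per-chunk significance, which is the payoff from the pipeline improvements mentioned above.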

The main limiting factor here is unknown systematics in the data. From UV modeling work, I know they’re in there, and they seem to be present at levels comparable to the brightness of Cyg X-3. (This is what I get if I time average over all baselines and pols and look at the resulting amplitudes, which seem as if they might have some continuity to them.) This could be a very big deal, but it’s hard to assess — maybe I’ll figure out how to make the fit residuals go away entirely, maybe I’ll have to figure out a way to live with them. I postpone deeper consideration of this to the next feasibility post.

2. Then there’s the question of whether our dataset will actually yield any scientifically interesting results. This is the trickiest one because it’s really hard to answer before we actually have nice radio and X-ray lightcurves to look at. To provide some grist for consideration, here’s a simple graphic showing spans of ATA observations, X-ray observations, and outbursts by Cyg X-3. It’s not at all legible in its current form but it’s good enough to show a few important points.

Don't bother squinting.

The X axis is time, with the vertical lines spanning the entire plot showing the beginnings of months, with the leftmost line denoting January 1 2010. The top row of shorter marks shows selected ATA observations of Cyg X-3: ones at useful 3 GHz in red, ones at frustrating 1.4 GHz in purple. I’m only showing observing runs that either have 1) a lot of time spent on X-3 or 2) close proximity to an X-ray observation. We have runs every few days throughout this entire timespan, though not all of them have significant time on X-3. The second row shows X-ray observations: INTEGRAL in green, RXTE in sea-foam-ish. The third row shows a period of outburst that Cyg X-3 went through. If you zoom in, you can see that we have 5 epochs of (near) simultaneous radio/X-ray observations, even if you don’t count the lengthy epochs at 1.4/2.01 GHz in December 2009. Two of those are during the general outburst period. The backing data for this plot, with precise timing information canonicalized to MJD, are in /cosmic1/pkwill/ata/cygx3/extra-obs.txt.

I think this is encouraging for the notion that if we can reduce the data, we’ll have interesting comparisons to make.

AGCTS

There are three key feasibility issues for the AGCTS:

  1. Can we successfully process the data?
  2. Can we expect to find transients or put interesting constraints on their rates?
  3. Will this be an improvement on existing work?

1. As with Cyg X-3, we can image the data. This is made “interesting” with the summer 2010 AGCTS dataset because we have very consistent hour angle coverage for all of the fields, which allows nice interepoch comparisons but makes for lousy images. We hope to be working in the UV domain as much as possible, so hopefully this will turn out to be a good thing overall.

Compared to Cyg X-3, the only additional difficulty that I can think of with imaging the GC is that Sgr A* is much brighter than anything in the Cygnus field. (Cyg A typically comes in at 40 mJy.) In one sample dataset it came in at ~45 Jy, and the image had a dynamic range of … ~400. That’s certainly a challenge. Hopefully, we can get some dedicated observations with good hour angle coverage, use those to develop a high-quality model, and be able to get much better results than we do now with short observations for which Sgr A* is significant.

2/3. Talking about expected transient rates and comparing to previous work are closely connected, because there are a lot of different ways to assess transient rates and different surveys probe different aspects of these rates, so you end up with more-or-less a 1:1 map between a particular rate measurement and a particular survey in the literature.

Ignoring all of the data at 1.43/2.01 GHz, there are currently (2010 Jul 16) 35 AGCTS epochs after 2010 Apr 28, when we switched frequencies. In that dataset we currently have 688 source scans at 3.14 GHz for a total of 115 hours of data, with a presumably comparable number at 3.04 GHz. The effective area of each scan is almost precisely 1 sq.deg out to FWHM. The mean scan duration is 604s and the median scan duration is 420s, leading to expected RMS fluxes of 25 or 30 mJy/bm, respectively, for a single correlator. To some extent, that’s the capsule summary of the survey: 1 sq.deg x 115 hr x ~28 mJy/bm.
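As a quick consistency check on that capsule summary, here is a sketch that recomputes the total time and the expected RMS values from the scan statistics. It assumes the single-correlator noise factor of 10 mJy sqrt(hr)/bm from the Cyg X-3 discussion and 1/sqrt(time) noise scaling.

```python
import math

NF = 10.0  # mJy sqrt(hr) / bm, single-correlator noise factor (assumed)

def expected_rms(duration_s, noise_factor=NF):
    """Expected RMS (mJy/bm) for a scan of the given length in seconds."""
    return noise_factor / math.sqrt(duration_s / 3600.0)

n_scans = 688
mean_s, median_s = 604.0, 420.0

total_hr = n_scans * mean_s / 3600.0
print(f"Total time: {total_hr:.0f} hr")
print(f"RMS at mean duration: {expected_rms(mean_s):.0f} mJy/bm")
print(f"RMS at median duration: {expected_rms(median_s):.0f} mJy/bm")
```

The scan count times the mean duration recovers the 115 hours, and the two RMS values land within about 1 mJy/bm of the rounded 25 and 30 quoted above.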

We can also consider the survey speed of various instruments used for radio transient surveys. The relevant ones here are ATA/3.14 GHz (me), GMRT/0.235GHz (Hyman+), VLA/0.33GHz (Hyman+), and VLA/4.86 GHz (Becker+). A figure of merit for the survey speed of a particular radio instrument is:

\textrm{FOM} = \Omega \left(\frac{A_\textrm{eff}}{T_\textrm{sys}}\right)^2 \propto \left(\frac{\textrm{FWHM} \times N D^2}{T_\textrm{sys}}\right)^2

Using deg, m, and K, I get ATA/3.14: 284, GMRT/0.235: 278784, VLA/0.33: 65373, VLA/4.86: 2822. Sadly these are not typos. What these numbers come down to is that the ATA gets murdered in effective area. (Not-ATA notebook #1 p. 64.)
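For reference, the proportional form of the figure of merit is easy to evaluate. The sketch below uses made-up inputs for two hypothetical arrays (they are not the notebook values, and I have not tried to reproduce the numbers above); only ratios between instruments mean anything, since constant factors are dropped.

```python
def survey_speed_fom(fwhm_deg, n_dishes, dish_diameter_m, t_sys_k):
    """Relative survey-speed figure of merit: (FWHM * N * D^2 / T_sys)^2."""
    return (fwhm_deg * n_dishes * dish_diameter_m ** 2 / t_sys_k) ** 2

# Hypothetical, illustrative inputs only: many small dishes vs. fewer big ones
small_dishes = survey_speed_fom(fwhm_deg=1.0, n_dishes=40,
                                dish_diameter_m=6.0, t_sys_k=100.0)
big_dishes = survey_speed_fom(fwhm_deg=0.2, n_dishes=27,
                              dish_diameter_m=25.0, t_sys_k=50.0)
print(small_dishes, big_dishes, big_dishes / small_dishes)
```

The D^4 dependence is what hurts: in this form, the wider field of view of small dishes only partly offsets the loss in collecting area.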

There are several ways to make up for this. One is time-on-target: AGCTS has 115 hours so far and is on track for perhaps 200, assuming that we accomplish nothing with the ~200 additional hours of 1.43/2.01 GHz data. The 2008 Hyman+ paper reported about 66 hours of observing. The CORNISH survey, on an intermediate version of which the Becker+ work is based, was allocated 400h; it appears that the Becker+ work was performed with about half of the data in, so call that 200h. We beat Hyman+ on time but not by nearly a sufficient margin to make up for our inferior effective area.

But AGCTS is exploring a different region of parameter space than Hyman+. We’ll be sensitive to brighter, and thus rarer, events. Our total area covered is much larger than Hyman+, giving us an edge for things that in particular are rare in an areal sense. (There’s also rarity in a duty-cycle sense.) We’re comparable to the Becker+ work in area, and we have less time, and our sensitivity is much poorer … but at least we’re focusing on the GC while they’re looking elsewhere in the plane, and their work in fact indicates that the areal transient rate goes up nontrivially as you get closer to the GC.

Our cadence is more rapid than that of Hyman+ or Becker+. We’ll revisit the same area once every few days in most cases, whereas Hyman+ operates on a monthly basis. So our hope is that there will be bright, brief transients that last on ~week timescales.

There’s also the question of spectral index. Obviously, relative to Hyman+ we’ll be more sensitive to flat/inverted-spectrum sources, while relative to Becker+ we’ll be more sensitive to normal/steep-spectrum sources. Hyman+ reports steep spectral indices for some of their events, alpha ~ -2, but those results seem a bit sketchy. (I think they report alpha ~ -6 for one event.) Becker+ see spectral indices ranging from -3 to 2 with a median of ~-1, which is encouraging. If the Hyman+ sources really do have somewhat steep spectra, we will basically be unable to see them.
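To put a number on "basically unable to see them", here is a small sketch that extrapolates a flux density between frequencies. It assumes a pure power law with the convention S ~ nu^alpha (so alpha ~ -2 is steep, as in the text), and the 100 mJy source at 0.33 GHz is a hypothetical example, not a measured event.

```python
def scale_flux(flux_mjy, freq_from_ghz, freq_to_ghz, alpha):
    """Extrapolate a flux density assuming a power law S ~ nu**alpha."""
    return flux_mjy * (freq_to_ghz / freq_from_ghz) ** alpha

# Hypothetical 100 mJy source at 0.33 GHz, extrapolated to 3.14 GHz
for alpha in (-2.0, -1.0, 0.0):
    print(alpha, round(scale_flux(100.0, 0.33, 3.14, alpha), 1))
```

For alpha = -2 the source drops to roughly 1 mJy at 3.14 GHz, far below a per-scan RMS of tens of mJy/bm, while a flat-spectrum source keeps its full flux.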

I think the positive spin on all this is that we’ll be carving out a very novel chunk of parameter space. We have an intermediate frequency, a high cadence, a large area, and low sensitivity. If there are bright, flat-spectrum transients that last about a week, we’ll own them.

This conclusion also implies that it’s hard to confidently draw conclusions from published transient rates. Despite the large difference in observing frequency, the most directly comparable survey is probably that of Hyman et al. They have found several sources in ~100 hours of observing (adding in older VLA observations that found the Burper and their first source) with typical fluxes of ~100 mJy. We’ll have somewhat more time and somewhat worse sensitivity, but have a fairly different cadence and a very different frequency. Whether we’d expect to find something depends strongly on the population of transient sources, so, conversely, whether we find anything will provide useful information on the population of transient sources.


Figure of the Day: The Nature of Computing Power Increases

Far too few people in the community seem to be ready to act upon what this figure is telling us:

Moore's Law phase space

From Barsdell et al. 2010, arxiv:1007.1660v1.



Qual: Scientific Merit

(First deadline: blown. I ended up taking on a few more topics in this post than I originally envisioned.)

For my thesis, I’m observing the dynamic radio sky. The centerpiece project is a search for radio transients towards the Galactic Center, the ATA Galactic Center Transient Survey (AGCTS). A second major component is monitoring of Cygnus X-3. As we find interesting things in both of these datasets, we’ll pursue them in focused, smaller-bore scientific projects.

This post is about whether these are interesting projects to pursue. To be honest, the two pieces are pretty independent of each other, so I’ll treat them independently.

AGCTS

Is a search for galactic radio transients scientifically compelling? The abstract argument is that systematic surveys of the variability of the radio sky (as opposed to its constant component) are very new, and every time a new way of probing the sky has been developed, significant discoveries have been made.

It’s important to pinpoint what’s new: surveys of the radio sky for variability, rather than monitoring of particular radio sources for variability. Due to the technological limitations of radio astronomy, the former are just becoming feasible, but the latter has been around for a while, often in much more sophisticated form than it has for other wavebands. (E.g., pulsars.) As best I can see, we should think of this distinction as being about a vastly increased ability to discover, rather than monitor, radio-variable sources. So, would it be exciting to find more of these sources? Here are some of the kinds of objects that we might find, with the galactic ones italicized:

  • Pulsars
  • RRATs (if worth distinguishing from pulsars)
  • Flare stars
  • Brown dwarfs
  • Non-variable sources subject to interstellar scintillation (could be either galactic or XG, but likelier to be XG since depends on chance alignments and there are more XG radio sources by areal density — I think this line of argument is valid)
  • ESEs (distinct from above?)
  • RSNe
  • (O)GRBAs
  • XRBs
  • AGN
  • Masers (also XG; not relevant here since we’re working in the continuum)
  • Objects of unknown nature that have been discovered in radio transient/variable surveys or serendipitously (there are suggestions that at least some of these are galactic: Becker+ 2010, though they don’t consider scintillation; the Burper; Galactic Center transients; etc.)

I think that list speaks for itself. Now, even if there are many interesting radio-variable sources, it’s not necessarily worthwhile to hunt for them by surveys of radio variability, especially if we consider source classes in isolation. Assessing this worthwhileness blocks on my looking into feasibility — though, of course, a radio variability survey is the only path to discovering radio variables of an unknown nature.

AGCTS Novelty

Another aspect of “interestingness” is whether the AGCTS is novel and competitive with other ongoing efforts. We certainly hope that the answer is yes: the ATA is supposed to be a unique instrument for performing surveys of this kind. The more detailed answer depends upon consideration of the competition:

  • Hyman+ VLA/GMRT GC 330 MHz: 2002, 2003, 2005 (Burper), 2009
  • Becker+ GP VLA archival: 2010
  • Langston+ NRAO Galactic Plane A survey: 2000, website
  • Bower+ VLA archival: 2007
  • Carilli, Ivison, Frail Lockman Hole search: 2003
  • Levinson+ FIRST/NVSS comparison: 2002, 2006
  • Frail+ GRBA search: 2003
  • Kida+ drift scans: 2008, among others
  • Gregory & Taylor drift scans: 1981, 1986
  • McLaughlin+ Parkes multibeam piggyback: 2006, 2010a, 2010b
  • Cordes Arecibo pulsar survey: 2006

The last two are concerned with fast transients, which is a pretty different region of parameter space than what I’m interested in. All of the others except for the first three are concerned with extragalactic sources, whereas I’m interested in the galactic population. The NRAO GPA survey never published any transient results. That leaves the Hyman and Becker works. I’m working on comparing the expected AGCTS results with these two projects — this gets a little hairy, and ties in with feasibility questions, so I’m OK with this aspect needing a bit more investigation.

Cyg X-3

Are microquasars interesting? Definitely: they should be able to teach us a lot about full-size quasars. (See e.g. Mirabel & Rodriguez 1998 for a short version of the case and Mirabel & Rodriguez 1999 for a review article that also goes into justifications.) Fundamentally, while much of the physics seems to be analogous, the timescales are much, much shorter and the spatial resolution is much, much better. And there’s definitely a lot that we still don’t understand about accretion disks and relativistic jets.

Is it interesting to monitor the radio variability of a microquasar? Well, empirically, yes, because people are doing it.

  • Cyg X-3 is being monitored with the OVRO 40m by Fermi LAT Collaboration members (mentioned in FLC 2009, no direct ref I can find)
  • Ditto with AMI (e.g. Pooley 2006)
  • And the RATAN-600 (mentioned in Tavani+ 2009, no direct ref)

How is our project different than these? The OVRO monitoring seems to generate one flux measurement every few days. The AMI monitoring has bursts where they get fluxes every ~hour but generally is at the same cadence. RATAN seems to have a 2-day cadence. We, on the other hand, are aiming to get fluxes on a 5-to-10-minute cadence. At least some of these measurements are simultaneous with X-ray observations.

Will this different approach yield novel results? This also starts getting a bit tricky to address, and starts getting into the “feasibility” domain. Szostek and Zdziarski 2007 argue that there won’t be orbital modulation of Cyg X-3’s emission, but not because of scattering effects, so shorter-timescale variability would still be possible due to e.g. jet variability. Hjalmarsdotter+ 2004 says that periods between 0.01 and 1000 days were searched for in a Ryle telescope run around 2002 Dec 22-23 and that none were found, with no word on what the variability may have looked like. (Given the lower limit to their period search, they were probably sampling every 0.005 d ~= 7 min.) This is a little worrisome, but given the skimpiness of that reference and the others above, it looks like no one has really sat down and looked at the radio variability of Cyg X-3 on timescales of less than a day; for instance, there’s no word as to what the Cyg X-3 radio variability looked like just as a simple time series. We’ll also have simultaneous and near-simultaneous X-ray data, so I think there’s ample opportunity for us to find something interesting that others haven’t. Unfortunately, before the reduced data are in hand, it’s hard to predict what, if anything, we might find.

Spinoffs

Another point in favor of the projects I’m proposing is that they have good potential for spinoffs. Firstly and most obviously, if we find anything interesting, we can follow it up and try to do some science with the particular results. Secondly, we’ll generate static sky maps as byproducts, and those should be interesting in their own right. We expect to have 200 MHz of bandwidth centered on 3.09 GHz, so there should be some leverage on spectral indices. We should also land in a nice medium-resolution point, with larger area covered than high-resolution surveys but much better resolution than previous radio surveys covering a similar area. With the Cyg X-3 project, while the main goal is to get good Cyg X-3 lightcurves, we should be able to search for transients using the AGCTS pipeline pretty easily. The Cygnus region is very crowded so it’s a good place to look for galactic transients too.


Public Talk, July 14

I’m giving another rendition of my radio astronomy talk, to the Sonoma County Astronomical Society, on Wednesday, July 14th, at 7:30 PM. The talk will take place at the Proctor Terrace School in Santa Rosa. The abstract is once again identical to that of September’s edition. See you then!


Qual Planning: Timeline

[P]lans are useless, but planning is indispensable. -- Dwight Eisenhower

My qualifying exam is coming up, and I’ve started thinking about how I’m going to prepare for it. This post will develop a rough timeline for my preparations up until the exam on September 8th.

There are two components to the examination:

  • A talk presenting a plan for my thesis project
  • Questions on the talk and the background science

The talk has to do only a few things, in a very schematic sense:

  • Summarize the work that I’ve done
  • Explain what I plan to do in the future
  • Justify the scientific merit of the overall project
  • Justify the feasibility of the future plans
  • Present a rough, plausible schedule for the work I’ll do until graduation

So, what preparation will I need to do? Obviously, there will be some brushing up on the background physics for the question phase. The talk is the more interesting challenge. Now, of course, Geoff and I already have a pretty good idea of what work I’ll do for my thesis. But, to be honest, that “pretty good idea” doesn’t rest on the firmest foundation: I still haven’t sat down and thought through the justifications (regarding both merit and feasibility) for my project. It seems to me that the first order of business should be to fix that. (Of course part of the intention of the qual exam is kicking students into doing exactly these exercises.) Once I’ve collected my thoughts there, I’ll hopefully be able to work out the plan of attack in a bit more detail than I have before.

Based on this line of thinking, here’s an ordered list of steps for the talk preparation, with time estimates. I’ve tried to give (very approximate) 1-sigma error bars on the estimates, assuming that it’s more likely that something will take much longer to complete than expected rather than much shorter. I doubt the simple linear ordering will hold in practice.

  1. Research and think through scientific merit case. Product: write up a summarizing blog post and put together some references. I think I have a pretty good handle on this, so I think it should take 1 +/- 0.5 solid days of work.
  2. Research specifics regarding feasibility. Given that the plans ought to be malleable, I’m trying to think of this as research to help me answer the question “What can I reasonably expect to achieve in my projects?” Products: summarizing blog post, references, key numbers / equations. This will take some time: 3 +3/-1 days.
  3. Digest my feasibility results, reassess plans. Product: summarizing blog post. This will also take some time: 3 +3/-1 days.
  4. Review previous work, think about how it fits in. Shouldn’t be hard. Product: blog post with references. Time: 0.5 +/- 0.5 days.
  5. Work out a timeline. Should also be straightforward. Product: timeline. Time: 0.5 +/- 0.5 days.

Then there’s studying and various logistical things to do. Here’s a first-cut schedule of milestones, taking into account various non-qual obligations in my calendar, some expected delays, sleep, etc:

  • Today, now – Publish post outlining prep timeline, working out main tasks.
  • July 9 (Friday!) – Publish post summarizing scientific merit work.
  • July 16 – Publish post summarizing feasibility background work.
  • July 27 – Publish post summarizing deeper feasibility thoughts, their effect on plans.
  • July 29 – Publish post summarizing what I’ll discuss about my previous work in the talk, relevant issues to bone up on.
  • August 2 – Publish post with example graduation timeline.
  • August 6 – Publish post with rough plan for studying.
  • August 20 – Finish first-draft slides and script for talk.
  • August 23 – Have assembled materials for pre-exam meetings with committee members
  • August 27 – Give practice talk
  • September 8 – Take exam

Obviously, I’m under no illusions that this schedule will be rigidly met. But I will try to stick to it. If my time estimates happen to all be perfect, I have 27 days of work to do, and about 60 days in which to do it, which seems like the right ratio to me, so hopefully the sequencing and spacing are reasonable. If I start slipping the schedule early, I think that will be genuine cause for concern. (I suspect that I could do a good-enough qual with less prep effort, but as a matter of pride and personal development I’ll aim for a very good qual.)

Also obviously, I don’t need to demonstrate my preparation by writing about it on this blog, but I find that writing entries such as this one helps me clarify my thoughts (as is indeed the case now), and I think I’ll be motivated to avoid blowing deadlines on the posts, arbitrary and insignificant though they may be.

Well, I think I’ve dealt with everything I wanted to do for the first milestone. I feel like my plan is a good one, and I feel good knowing that I have it in hand, even if it will surely evolve. It’s easy to go overboard with meta-procedural stuff, and it did take me several solid hours to work out this stuff today, but I feel pretty confident that this, and future posts on the topic, will have been worthwhile.


Too Many Scientists, Revisited

I thought I’d flag that the article about the oversupply of US PhDs that appeared in Scientific American in April and that I discussed has resurfaced in expanded form (same author, Beryl Lieff Benderly) in the Miller-McCune online magazine. The basic thrust is essentially the same as before: the crisis in scientific careers is that there aren’t sufficient (or, perhaps, worthy) career opportunities for newly minted PhDs, not that there are insufficient numbers of potential American scientists.

I still agree that the thesis is correct, though this article in particular seems to endorse the idea that it’s really rough to be a graduate student or postdoc, which is only true in a very limited sense. Unlike the previous iteration, this article also argues that US science and math scores are generally strong, and rising, at the K-12 and college levels. I’m not familiar with the numbers, and I could believe that this is true, but having seen average Berkeley students try to work with fractions, even if it is true, we need to do better.

As a side note, I’ve never heard of this Miller-McCune thing; it seems to be a relatively new organization funded by essentially one wealthy person. Given that, the website is pretty professional-looking and there’s a lot of content. Generally seems to be an earnest endeavor.


“Read past end of mask file” Notes

More boring reference material! This time, certain ATA datasets are rejected by MIRIAD with a “Read past end of mask file” error. These datasets are almost entirely good, and fixable, so it’s a shame to discard them.

Example. See

 /ataarchive/2010/03/11/gcs/d1cb41d2/s000

Diagnosis. As one might guess: there are more visibility records than there are flags. So far, it seems to be the case that only the flags for the very final UV record are missing. This appears to be caused by killing the datacatcher instead of leaving it to shut down of its own volition.

Detection. Run itemize on the dataset:

itemize: CVS Version 1.6, 2009/10/27 02:55:56 UTC

 npol     = 4
 obstype  = mixed-auto-cross
 nwcorr   = 0
 ncorr    = 8897536
 vislen   = 36202896
 history    (text data, 10834 elements)
 flags      (integer data, 286985 elements)
 vartable   (text data, 430 elements)
 c_tstop    (text data, 38 elements)
 c_instr    (text data, 30 elements)
 visdata    (binary data, 36202896 elements)

Each dword “element” in the flags record has 31 bits used for flag information, except for the very last one, which may be padded out with junk data. So, let X = <ncorr> – <# flag elements> * 31. If X is between 0 and -30, inclusively, we have a valid dataset. If X is between 994 and 1024, inclusively, we have an invalid dataset that is missing its final record (assuming 1024 channels of course). If neither of these holds, we have something weird going on.

Instead of using “itemize” to get the number of flag elements, you can use the size of the flags file. There are 31 flag bits per four bytes, plus a header dword, so the term <# flag elements> * 31 becomes 31 * (<flags file size in bytes> / 4 - 1).
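The check is easy to script. What follows is only a minimal sketch of the arithmetic described here, not the logic of ataflagfix.c; the function names are made up for illustration, and it assumes that the element count reported by itemize excludes the header dword.

```python
FLAG_BITS_PER_DWORD = 31  # usable flag bits in each 4-byte flags element


def flag_elements_from_size(size_bytes):
    """Flag elements implied by the flags file size (assumes one header dword)."""
    return size_bytes // 4 - 1


def classify_flags(ncorr, n_flag_elements, nchan=1024):
    """Classify a dataset from its ncorr value and its number of flag elements."""
    x = ncorr - n_flag_elements * FLAG_BITS_PER_DWORD
    pad = FLAG_BITS_PER_DWORD - 1  # the last element may hold up to 30 junk bits
    if -pad <= x <= 0:
        return "valid"
    if nchan - pad <= x <= nchan:
        return "missing-final-record"
    return "unknown"


# Values taken from the itemize output above: X works out to 1001.
print(classify_flags(8897536, 286985))
```

For the example dataset this reports a missing final record, which matches the MIRIAD error.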

Correction. I wrote a program to do it. See

viskit/demo/ataflagfix.c

Scintillation Notes

I gave myself a crash course on interstellar scintillation (ISS) today. (Sounds fancier than “twinkling”.) For posterity, here are some quick notes of the key results I found.

First of all, scintillation is twinkling. For my work, it’s in the radio and caused by plasma in the ISM. The Narayan paper (below) was a very helpful reference for booting up.

The basic model is an infinitely distant point source with incoming plane waves. There’s a zero-thickness phase screen at distance D defined by its phase change as a function of position phi(x,y). We’re observing at wavelength lambda. The formal expression for the effect of the scattering screen is the Fresnel-Kirchhoff integral (cf Narayan eq 2.1):

\psi(X,Y) = \frac{e^{-i\pi/2}}{2\pi r_F^2}\int\int \exp\left[i \phi(x,y) + i \frac{(x-X)^2 + (y - Y)^2}{2 r_F^2}\right] dx\, dy

Here r_F is the Fresnel length, which can easily be interpreted as an angle at the distance of the screen:

r_F = \sqrt{\lambda D / 2 \pi}, \qquad \theta_F = r_F / D = \sqrt{\lambda / 2 \pi D} \ll 1 \textrm{ for this to work}

The Fresnel length is the transverse displacement on the scattering screen that changes the arrival phase of the incoming wave by roughly a radian due to path length differences. There’s a right triangle with adjacent of length D and opposite of length r_F, and you’re thinking about the phase change of the hypotenuse versus the adjacent. Assuming a gentle (or absent) phase screen, the dominant contribution to the integral comes from within r_F, where the phase isn’t changing much. As you get farther out, phase variations become rapid and signals tend to interfere and cancel.
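As a sketch of that triangle argument, assuming a transverse offset r much smaller than D (a small-angle expansion, not a formula quoted from Narayan), the extra path along the hypotenuse gives a phase of

\Delta\phi(r) = \frac{2\pi}{\lambda}\left(\sqrt{D^2 + r^2} - D\right) \approx \frac{2\pi}{\lambda}\frac{r^2}{2D} = \frac{r^2}{2 r_F^2}

This is the quadratic term in the exponent of the integral, and it reaches half a radian at r = r_F.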

For ISS and other natural cases, the phase screen has some length scale r_d on which it affects phases by ~1 radian. The usual assumption is that the phase structure function is Kolmogorov-distributed

D_\phi(x,y) = \langle[\phi(x'+x,y'+y) - \phi(x',y')]^2\rangle_{x',y'} = (r / r_d)^{5/3}, \quad r = \sqrt{x^2 + y^2}

but the precise distribution doesn’t seem to matter so long as D_phi increases with r and there’s some characteristic scale r_d.

There are two main scintillation cases: weak and strong.

If r_d > r_F, we have weak scintillation. As mentioned above, most of the contribution to the received flux comes from within r_F, and the phase screen is nearly constant over that scale. The fluctuations will mostly be on this spatial scale, and so if the scattering screen is moving transversely at velocity v, the characteristic timescale will be r_F / v. A point source will broaden to the Fresnel angular scale,

\theta_s \approx \theta_F \approx \sqrt{\frac{\lambda}{D}}

If r_d < r_F, we have strong scintillation. The incoming phase is stirred up even within the Fresnel patch, so r_F basically is irrelevant. The “size” of a scatterer is the phase coherency length, so the typical angular scale is

\theta_s = \frac{\lambda}{r_d}

This is what point sources get scatter-broadened into, and it’s bigger than theta_F. (Smaller scatterer -> more diffraction.)

Strong scintillation has contributions from two components. First of all, there’s the variation on those r_d scales — this is “diffractive strong scintillation”. A source of angular size

\theta_d > \frac{r_d}{D}

has larger angular extent than the phase coherency scale and so crazy things start happening (cf Narayan). Note that this is not the same as theta_s: it is much smaller. The fluctuation timescale for a pointlike source here with transverse motion is r_d / v. Narayan says this effect is narrowband.

There’s also “refractive strong scintillation”. This is because a point source has that angular size theta_s, which backprojects to

r_r = \theta_s D = r_F^2 / r_d \gg r_d

and so is sensitive to variations on that lengthscale in the scattering screen. There’s a consequent timescale of r_r / v. Narayan says this effect is broadband.
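To get a feel for the numbers, here is a rough sketch that evaluates these scales. The function names and the input values in the usage example (r_d, v) are placeholders chosen for illustration, not measurements. It takes theta_s = r_r / D in the strong case, which equals lambda / r_d only up to a factor of 2 pi.

```python
import math

PC_M = 3.0857e16   # metres per parsec
C_M_S = 2.998e8    # speed of light in m/s
RAD_TO_UAS = math.degrees(1.0) * 3600.0 * 1e6  # radians to microarcseconds


def fresnel_scale(freq_hz, dist_m):
    """Fresnel length r_F = sqrt(lambda D / 2 pi), in metres."""
    lam = C_M_S / freq_hz
    return math.sqrt(lam * dist_m / (2.0 * math.pi))


def scintillation_scales(freq_hz, dist_m, r_d_m, v_m_s):
    """Return the regime, angular broadening (rad), and timescales (s)."""
    r_f = fresnel_scale(freq_hz, dist_m)
    if r_d_m > r_f:
        # Weak case: everything happens on the Fresnel scale.
        return {"regime": "weak",
                "theta_s": r_f / dist_m,
                "timescales": {"fresnel": r_f / v_m_s}}
    # Strong case: diffractive scale r_d, refractive scale r_r = r_F^2 / r_d.
    r_r = r_f ** 2 / r_d_m
    return {"regime": "strong",
            "theta_s": r_r / dist_m,
            "timescales": {"diffractive": r_d_m / v_m_s,
                           "refractive": r_r / v_m_s}}


# Illustrative inputs only: 3 GHz, screen at 100 pc, r_d = 1e9 m, v = 50 km/s.
result = scintillation_scales(3e9, 100 * PC_M, 1e9, 5e4)
print(result["regime"], result["theta_s"] * RAD_TO_UAS, "microarcsec")
```

With these inputs the Fresnel angle comes out at about 15 microarcseconds.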

Takeaways

  • Scintillation tends to the weak case as the frequency gets higher.
  • Scintillation damps as source gets larger. If weak scintillation, nu ~ few GHz, D ~ 100 pc, the angular size upper limit is ~10s of microarcsec. (This limit can be used to calculate brightness temperature lower limits.)
  • You see annual variation in the amount of scintillation for some sources if the transverse v of the screen is not >> the Earth orbital velocity.

References


“Does the U.S. Produce Too Many Scientists?”

Hot on the heels of Nature’s Astro2010 Decadal Review mini-panel not discussing career issues, I was pointed to an article in Scientific American that does just that: “Does the U.S. Produce Too Many Scientists?” by Beryl Lieff Benderly.

The article’s response to its titular question is a strong “yes”, in contrast to its assertion (not backed up too strongly, but certainly believable) that most people you’d ask would say “no”. The juicy counterintuitive conclusion is obtained by analyzing the question in a relatively narrow way: how many PhDs do we mint or import compared to the number of faculty positions there are? Of course, as also detailed in the whitepaper I helped write, the answer is “a lot more”. If every PhD ought to obtain a faculty position (which is a big “if”), then yes, it’s pretty clear that we do indeed produce too many scientists.

Benderly touches a bit on the “if” and mentions a few of the career-path issues that are also raised in the whitepaper. The very comparison of “number of PhDs” versus “number of faculty spots” is indicative of how narrow the thinking typically is when it comes to career trajectories for a PhD.

The article also focuses on “science” as an activity only performed by PhDs in research labs. Obviously there are issues with this but I imagine that Benderly was well aware of them and simply needed to keep the article somewhat focused. That being said, there are a few sidenotes touching on issues beyond the research-class world that seem a bit off to me. For instance, discussing the quality of American primary and secondary education:

Raising America’s average scores on international comparisons is, therefore, not a matter of repairing a broken educational system that performs poorly overall, as many critiques suggest, but rather of improving the performance of the children at the bottom, overwhelmingly from low-income families and racial and ethnic minorities. This discrepancy, of course, is a vital national need and responsibility, but it does not reflect an overall insufficient supply of able science students.

I guess I agree with the second sentence — we can get away with a lot since we have a large population. But the first sentence seems completely misguided to me. Privileged students are always going to perform just fine given a system that isn’t pathologically flawed  — the fact that our system fails underprivileged students precisely shows that it’s broken.

Then the article closes with:

The real crisis in American science education is not young Americans’ inability to learn, or the schools’ inability to teach, but a distorted job market’s inability to provide them careers worthy of their abilities.

I like the emphasis at the ending here on the career options for PhDs, and yes, the article does focus on PhDs, but really? “The real crisis in American science education”? I’m pretty sure the real crisis in American science education is that most Americans have a shockingly poor grasp of the basic facts that science has taught us about the natural world and a profound ignorance regarding how and why the scientific process is successful. The real crisis in American science education is that many science teachers (bless their souls!) are not qualified to teach the subjects they teach. The real crisis in American science education is that schools don’t have enough money for the necessary educational materials and many students don’t have enough time and stability outside of school to study. Those are the real crises.


Decadal Review Tea-Leaf-Reading

I was a little surprised to find out from Casey Law that Nature recently ran a news piece about the astronomy Decadal Review process — and they didn’t just report on it, they actually got a group of prominent astronomers together for dinner to discuss their impressions of how well the review process works and what its outcome might be this time around. Unfortunately, the article’s behind a paywall, so you probably can’t read it, but here’s a link to it if you have a university Internet connection.

I’m not the most gung-ho SKA supporter on the planet, but I was surprised to see how low it ranked in the Nature group’s list of priorities. As far as I can tell it was never mentioned during the hour-long panel discussion (paywalled transcript link) and the group ranked it last out of seven projects that they’d recommend for funding. Ouch.

You can rationalize a bit by saying that the SKA is more than a decade off, and the group even talked about how major projects have recently taken two iterations of the Decadal Review to get significant momentum behind them, while this is the first go-round for the SKA. Still, maybe this is a sign that the SKA community needs to get more aggressive about its PR. On the more personal front, this is a little scary for hopes that the Decadal Review report will push funding for ATA expansion as preliminary SKA work.

Another thing that disappointed me was that the group didn’t spend any time discussing “state of the profession” issues like the ones discussed in the whitepaper I helped write. This didn’t surprise me at all, but I still wish more people out there (especially the leaders of the community) would be more engaged in thinking about, well, the state of the profession. We can certainly continue to limp along as we have for a long time, but I feel like there’s tremendous room to improve the system, if only a bit more money, time and effort would be put in that direction.


A Nice Null Result

OkCupid is a dating site that takes a pleasantly data-driven approach to the online dating game. If you read their blog, you run into a lot of interesting and surprising facts that the OkCupid staff have pulled out of their databases.

Like many dating sites, OkCupid gives its users a list of questions to answer about themselves. For each question, however, it also lets you specify what your ideal partner’s response would be, and how important that response is to you. (You might not always want your partner to have the same response as you — for instance, “Are you sexually dominant?”) This lets OkCupid rate the compatibility between two members according to their personal standards, not just according to what the employees at OkCupid think makes for a good couple.

You can then do interesting aggregate statistics by breaking the users down into groups and then looking at the average compatibility between different groups. In this post on the OkCupid blog, they did various breakdowns and visualized the results on grids like the one below. Each group has its own row and column, and the intersection of a row and column gives the average compatibility between the two groups. A greener color means above-average compatibility, and a redder color means below-average compatibility. For instance, here are the compatibilities of racial groups:

A dating compatibility graphic

Dating compatibility between racial groups, taken from blog.okcupid.com

This is somewhat heartening: in theory, people of all races should be able to get along pretty well in relationships. (Unsurprisingly, in practice, this is untrue. Different racial pairings on OkCupid reply to each other’s messages at rates that vary wildly from what their compatibility scores would imply, indicating that people’s personal attitudes affect things strongly. See the blog post for more info.)

Anyway, here’s that null result that I referenced in the title:

A dating compatibility graphic

Dating compatibility, taken from blog.okcupid.com

From a sample of 500,000 users, 144 comparisons, all within 0.5% of the mean value. That’s a null hypothesis that I can get behind.

(As a side note, the fact that OkCupid lets people rate the personal importance of others’ answers to various questions allows them to easily discover which questions are the most effective for testing compatibility. Apparently “How often do you shower?” is one of the best.)


For Twitter or for Worse

Well, I’ve started using Twitter under the username pkgw, and I’ve been kind of enjoying it. Right now my timeline is kind of a grab-bag of mostly personal items with some work-related things; I haven’t seen any obvious examples of good ways to keep the two somewhat separate. (As a sidenote, “timeline” seems like sort of an odd word choice to me compared to “feed” or something like that.)

The main thing that I don’t like about Twitter is that its associated vocabulary is so infantile; I feel humiliated whenever I have to say the word “tweet” aloud. But “blog” has started to feel like a real word, so maybe “tweet” isn’t far behind.

The other thing is that, of course, Twitter is a closed-source web service and I have no idea what their user data policies are. Like it or not, most people who microblog use Twitter, and the social aspects of the service are important, so I’m willing to put up with that. Unlike my email or this blog, I don’t think that I’d be brokenhearted to lose all of my tweets (they are mostly ephemera) so this doesn’t bother me as much as it might for other services.


Broadband Spectra Paper Out!

After two and a half years, my paper on the FIR/radio correlation and broadband spectra of galaxies with the ATA has been accepted to the Astrophysical Journal and posted to arxiv! If you’ve been wondering what I’ve been doing for the past 30 months, you can find out here.


The Python Standard Libraries: Not Very Good

Thomas Vander Stichele, whom I have never met but who works in some of the areas of Linux that I used to be involved with, writes about some ugly code he found in the Python standard libraries, and says

I usually tend to think of Python as the discerning gentleman’s programming language: well-behaved, well-documented, people take care of the code written. I like the batteries-included approach and assume that the battery code in the standard library is well-written…

I have to say that I have no idea what he’s talking about. The Python standard libraries are terrible. Everywhere you look there are inconsistent APIs and coding styles, redundant or missing functionality, widely ranging quality in the documentation, and poorly-engineered solutions. Consider the description of the subprocess module:

… This module intends to replace several other, older modules and functions like:

  • os.system
  • os.spawn*
  • os.popen*
  • popen2.*
  • commands.*

That is, the Python standard libraries contain at least four different APIs for invoking subprograms, and I have to say that I still don’t think the latest iteration is great. The os module, which provides core functionality, just exposes the standard POSIX API without making any effort to map it into the language helpfully or to abstract it very well for non-POSIX systems. Google indicates that urllib is pretty widely hated. StringIO vs cStringIO, pickle vs. cPickle, about a dozen different database and XML modules … it’s a mess.
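For concreteness, here is a minimal sketch of what invoking a subprogram looks like with the subprocess module. The child command is arbitrary and chosen only so that the snippet runs anywhere a Python interpreter is available.

```python
import subprocess
import sys

# Run a child Python interpreter and capture its standard output.
# Passing the command as a list avoids going through a shell, which is
# one of the things the older os.system interface could not do.
output = subprocess.check_output([sys.executable, "-c", "print('hello')"])
print(output.decode().strip())
```

The older os.system and os.popen* calls take a single command string instead, so the same task can be written in several mutually inconsistent ways.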

Now, of course, this is all just saying that the Python standard libraries are what they are: they were written by many different people in many different styles at different times, and they were developed quickly to get functionality out there so people could use it. I think that is very Pythonic: Get Stuff Done and if not everything is perfect, so be it. But I have to say, it’d be nice if the standard libraries were better. They’re bread-and-butter APIs, and it would, you know, be nice if they were carefully and thoughtfully designed. The fact that they aren’t turns out not to be a deal-breaker, but it remains unfortunate.


Status Update

Emotions that RFI currently inspires in Peter:

  • sadness

SKAremongering

A while back an interesting paper appeared on astro-ph, the astrophysics preprint server: “Large Instrument Development for Radio Astronomy”, by J. R. Fisher and others who appear to be radio engineers at the National Radio Astronomy Observatory. It’s a whitepaper submitted to the Astro2010 decadal review and I think it’s fair to summarize it as a shot at the Square Kilometer Array concept.

The whitepaper doesn’t explicitly name the SKA but that’s clearly what it’s about. The basic argument is of the “let’s not be hasty” form — it takes time to develop new technologies, combining multiple scientific goals in one observatory is difficult, the costs of complex designs can quickly get out of hand, and so on. The SKA concept, which envisions serious progress in many areas of radio astronomical engineering and pretty much aspires to be the Greatest Radio Telescope For Everyone Evar as well as By Far The Most Expensive Radio Telescope Evar certainly calls for lots of new technologies and a complex design.

I’m sympathetic to pretty much all of their arguments. Building the largest radio telescope ever with new, incompletely-understood technology would, I think, be a really bad idea. From what I’ve seen, “all-in-wonder” designs for any technological system are usually a red flag — flexibility almost always comes at the cost of clean, simple, and correct functionality for any particular purpose. World-class hardware and software engineering is hard and takes time.

That being said, their arguments are the same ones that are always used to discourage innovations. “Oh, it’s risky, it’ll take a long time, we understand the existing stuff so much better.” These arguments are often valid, but technology wouldn’t be much fun if they were always heeded. Of course, “fun” shouldn’t be the operative word when you’re talking about a multi-billion-dollar investment. But in the case of the SKA, there’s no way to build the telescope without requiring some innovation. In the terminology of the whitepaper, it’s just a question of how much risk you’ve retired before you start building it.

In the particular case of the SKA, I’m not sure what I think. One thing is the fact that there’s a long development path leading to the SKA itself — pathfinders and prototypes and precursors, oh my. A lot of work is already happening to build and test the kinds of systems that would be involved in the SKA. It’s not as if ground is going to be broken on the final thing before a detailed design has been worked out and thought through. There won’t be any prototype fully-functioning observatories with thousands of antennas, but I think the basic issues involved in scaling up a large array to a huge array are low-risk: a lot of the requirements are parallelizable, and if you understand a good-sized batch of antennas well, you can understand how a much larger batch is going to behave.

On the other hand, a project as big as the SKA tends to develop its own inertia. If, after another decade of work, there’s trouble on all of the engineering fronts, no one’s going to want to (or even be able to?) just abort the whole project. For small R&D projects, you can say, “OK, it didn’t work, that’s good to know,” but when you’re planning to spend a few billion dollars, the plug just doesn’t get pulled. And in that case, you could be spending lots of money on a telescope that will be merely OK when that money could have been spent on several projects that would have each been great.

Maybe it’s best to think of the SKA like a space mission. Space missions are expensive and risky, so you always see that they deploy pretty unexciting telescope technologies; they get their science leverage from whatever application-specific advantage is provided by being in space. The SKA is also expensive and risky since the upfront capital investment will be so large. So in all probability it will likewise deploy technology that will be boring by the time SKA construction starts; this is OK since the SKA gets its science leverage by being really huge. In this case, the question is, is it possible to deploy boring technology in the SKA model? I think so, since the key pieces are the antennas, feeds, and communications links — those are the things that you really, really don’t want to have to replace en masse. And those are eminently testable in smaller configurations. So hooray, the SKA will work out fine! Good thing we figured that out.


Public Talk, October 21

I’m giving my radio astronomy talk yet again, to the San Francisco Amateur Astronomers on Wednesday, October 21st, at 7:30 PM at the Randall Museum. The abstract is identical to that of September’s talk. If you’ll be in town, you should come on by!
