

Exploring $z\sim5$ Lyman Break Galaxies in the HELP Virtual Observatory

In this notebook we use the HELP VO system to explore the properties of a sample of $z\sim5$ Lyman break galaxy candidates ($r$-dropouts) in the ELAIS-N1 field. Using the VO we search for $r$-dropout sources in the Hyper Suprime-Cam (HSC) data, following the selection criteria applied by Ono et al. (2017).

For this sample of $z\sim5$ candidates we then explore their far-infrared properties using the HELP XID+ deconfused photometry, and use the HELP photometric redshift posteriors to construct a cleaner sample of IR-luminous LBG candidates.

This notebook assumes you have PyVO installed and working. If this is your first experience of the HELP VO service, you may want to start with the other example notebooks to familiarise yourself with the steps.

In [1]:
!pip install pyvo
Collecting pyvo
  Downloading (804kB)
     |████████████████████████████████| 808kB 6.9MB/s 
Requirement already satisfied: astropy in /usr/local/lib/python3.7/dist-packages (from pyvo) (4.2)
Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from pyvo) (2.23.0)
Collecting mimeparse
Requirement already satisfied: pyerfa in /usr/local/lib/python3.7/dist-packages (from astropy->pyvo) (1.7.2)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.7/dist-packages (from astropy->pyvo) (1.19.5)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->pyvo) (3.0.4)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->pyvo) (2.10)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->pyvo) (1.24.3)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->pyvo) (2020.12.5)
Building wheels for collected packages: pyvo, mimeparse
  Building wheel for pyvo ( ... done
  Created wheel for pyvo: filename=pyvo-1.1-cp37-none-any.whl size=801573 sha256=974e8360ae43fc38428655f761034b1a41d750df0cdc8d339e461bc2c430007e
  Stored in directory: /root/.cache/pip/wheels/d9/00/df/656aac56938f1c83dfcb361346f74101ce1f8c849fc93b18dc
  Building wheel for mimeparse ( ... done
  Created wheel for mimeparse: filename=mimeparse-0.1.3-cp37-none-any.whl size=3864 sha256=e71d29dddf79e3f0a53cd2dc2c5c126e3be26d8f32e31f60593f3ab3649ef120
  Stored in directory: /root/.cache/pip/wheels/54/ca/c7/3db47cc5c748286db22a7fab43ccf985903d2b9ca119de16ab
Successfully built pyvo mimeparse
Installing collected packages: mimeparse, pyvo
Successfully installed mimeparse-0.1.3 pyvo-1.1
In [2]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import h5py
import pyvo as vo
from IPython import display

Here we define our main query, selecting all our columns of interest and joining the main datatable with the depth maps so that we only include sources from within the HSC optical footprint:

In [4]:
# Main ADQL query, stored as a string so the selection criteria can be appended later
query = """
SELECT top 1000000
 t.help_id, t.hp_idx, t.ra, t.dec, t.redshift, t.zspec,
 t.m_ap_suprime_g, t.m_ap_suprime_r, t.m_ap_suprime_i, t.m_ap_suprime_z, t.m_ap_suprime_y,
 t.merr_ap_suprime_g, t.merr_ap_suprime_r, t.merr_ap_suprime_i, t.merr_ap_suprime_z, t.merr_ap_suprime_y,
 t.m_ap_irac_i1, t.merr_ap_irac_i1, t.m_ap_irac_i2, t.merr_ap_irac_i2,
 t.f_spire_250, t.f_spire_350, t.f_spire_500,
 t.ferr_spire_250, t.ferr_spire_350, t.ferr_spire_500,
 t.cigale_sfr, t.cigale_dustlumin, t.cigale_mstar,
 t.cigale_sfr_err, t.cigale_dustlumin_err, t.cigale_mstar_err,
 c.ferr_suprime_z_mean, c.ferr_irac_i2_mean
FROM herschelhelp.main AS t
JOIN depth.main AS c
ON (t.field = 'ELAIS-N1') and t.hp_idx = c.hp_idx_o_13 and c.ferr_suprime_i_mean is NOT NULL
"""

Now we define the additional colour criteria that will be added to select the dropout candidates (dropout_criteria) or a small subset of the full catalog (background_criteria):

In [5]:
# Colour and signal-to-noise cuts for the r-dropout selection
dropout_criteria = """
    ((t.m_ap_suprime_r - t.m_ap_suprime_i) > 1.2) AND
    ((t.m_ap_suprime_i - t.m_ap_suprime_z) < 0.7) AND
    ((t.m_ap_suprime_r - t.m_ap_suprime_i) > (1.5*(t.m_ap_suprime_i - t.m_ap_suprime_z) + 1.1)) AND
    (abs(t.f_ap_suprime_g/t.ferr_ap_suprime_g) < 2) AND
    (abs(t.f_ap_suprime_z/t.ferr_ap_suprime_z) > 5) AND
    (abs(t.f_ap_suprime_y/t.ferr_ap_suprime_y) > 4)
"""

# Small range of HEALPix indices used to draw a background subset of the full catalogue
background_criteria = """
    t.hp_idx BETWEEN 188733368 AND 188763368
"""

# Optional SPIRE signal-to-noise criterion (not used):
#    ((abs(t.f_spire_250/t.ferr_spire_250) > 2) OR (abs(t.f_spire_350/t.ferr_spire_350) > 2) OR (abs(t.f_spire_500/t.ferr_spire_500) > 2))
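
The same colour cuts can also be applied locally to magnitudes that have already been downloaded, which can be handy for checking the selection box before sending a query. The sketch below is a minimal NumPy version of the three colour conditions only. The function name `is_r_dropout` and the example magnitudes are illustrative assumptions rather than part of the HELP tools, and the signal-to-noise conditions are not included.

```python
import numpy as np

def is_r_dropout(m_r, m_i, m_z):
    """Boolean mask for the three r-dropout colour cuts (hypothetical helper)."""
    m_r, m_i, m_z = (np.asarray(m, dtype=float) for m in (m_r, m_i, m_z))
    r_i = m_r - m_i  # r - i colour
    i_z = m_i - m_z  # i - z colour
    return (r_i > 1.2) & (i_z < 0.7) & (r_i > 1.5 * i_z + 1.1)

# Made-up magnitudes for three sources (illustrative values only)
m_r = np.array([26.5, 25.0, 26.3])
m_i = np.array([25.0, 24.0, 25.0])
m_z = np.array([24.9, 23.9, 24.5])
mask = is_r_dropout(m_r, m_i, m_z)
```

Only the first of the three made-up sources passes all three cuts: the second is not red enough in $r - i$, and the third falls below the diagonal boundary.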

Executing the queries and loading into astropy tables...

In [6]:
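# Then we execute the queries, appending each set of criteria to the main query string as a WHERE clause
service = vo.dal.TAPService("")  # set this to the HELP VO TAP service URL
result_set = service.search(query + " WHERE " + dropout_criteria)
reference_set = service.search(query + " WHERE " + background_criteria)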

hsc_r_drops = result_set.to_table()
hsc_background = reference_set.to_table()

The forced photometry of XID+ may return many measurements that are consistent with zero, so we'll set our threshold for FIR detection as $2\sigma$ or more in at least two of the SPIRE bands.

In [7]:
fir_det = (((hsc_r_drops['f_spire_250']/hsc_r_drops['ferr_spire_250'] > 2.).astype('int') + 
           (hsc_r_drops['f_spire_350']/hsc_r_drops['ferr_spire_350'] > 2.).astype('int') +
           (hsc_r_drops['f_spire_500']/hsc_r_drops['ferr_spire_500'] > 2.).astype('int')) >= 2)
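
The "at least two of three bands" rule generalises to any number of bands by counting how many signal-to-noise ratios exceed the threshold. The following is a minimal sketch on made-up fluxes. The helper name `n_band_detection` and its parameters are assumptions introduced for illustration, not part of XID+ or the HELP catalogue.

```python
import numpy as np

def n_band_detection(fluxes, errors, snr_min=2.0, n_min=2):
    """Flag sources with S/N > snr_min in at least n_min bands.

    fluxes, errors: arrays of shape (n_bands, n_sources).
    Missing values (NaN) never count as detections.
    """
    fluxes = np.asarray(fluxes, dtype=float)
    errors = np.asarray(errors, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        detected = (fluxes / errors) > snr_min
    return detected.sum(axis=0) >= n_min

# Made-up fluxes for three sources in three bands (illustrative values only)
flux = np.array([[10.0, 1.0, 9.0],
                 [8.0, 1.0, 1.0],
                 [1.0, 1.0, np.nan]])
err = np.full_like(flux, 2.0)
fir_mask = n_band_detection(flux, err)
```

Here only the first source exceeds $2\sigma$ in two bands. The third has a single significant band plus a missing measurement, so it is not counted as detected.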

$r$-dropouts in ELAIS-N1

Let's plot the candidates in the $r$-dropout colour space alongside our background subset of all galaxies to illustrate where everything lies.

In [8]:
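Fig, Ax = plt.subplots(1,1, figsize=(5,5))

# Background subset of all HSC sources
Ax.plot(hsc_background['m_ap_suprime_i']-hsc_background['m_ap_suprime_z'],
        hsc_background['m_ap_suprime_r']-hsc_background['m_ap_suprime_i'], ',', alpha=0.1, label='All HSC')

# All r-dropout candidates
Ax.plot(hsc_r_drops['m_ap_suprime_i']-hsc_r_drops['m_ap_suprime_z'],
        hsc_r_drops['m_ap_suprime_r']-hsc_r_drops['m_ap_suprime_i'],
        'o', ms=4, color='indianred', alpha=0.2, label=r'All $r$-drops')

# r-dropout candidates that are also detected by SPIRE
Ax.plot(hsc_r_drops['m_ap_suprime_i'][fir_det]-hsc_r_drops['m_ap_suprime_z'][fir_det],
        hsc_r_drops['m_ap_suprime_r'][fir_det]-hsc_r_drops['m_ap_suprime_i'][fir_det],
        'o', ms=8, color='firebrick', label=r'FIR Detected $r$-drops')

Leg = Ax.legend(loc='lower right')

Ax.set_xlabel(r'$i_{\rm{HSC}} - z_{\rm{HSC}}$', size=12)
Ax.set_ylabel(r'$r_{\rm{HSC}} - i_{\rm{HSC}}$', size=12)
Ax.set_xlim([-0.5, 2])
Ax.set_ylim([-1, 4])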
(-1.0, 4.0)

What do the photometric redshift posteriors of our candidates look like?

Now we load the HELP photo-$z$ posteriors to see the estimated redshifts for these candidates taking into account all of the available photometry (including additional near- and mid-infrared where available).

The posterior files are large, but because they are HDF5 files they do not need to be loaded into memory in full. Here we assume that the relevant file has been downloaded from HeDAM and is stored locally (alternatively, you can download it straight to Google Colab). We load the posteriors for only our $z\sim5$ candidate sources.

In [10]:
# get hdf5 photoz file
--2021-03-16 11:47:26--
Resolving (
Connecting to (||:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8086987844 (7.5G) [application/x-hdf]
Saving to: ‘pz_hb_en1.hdf’

pz_hb_en1.hdf       100%[===================>]   7.53G  24.6MB/s    in 5m 13s  

2021-03-16 11:52:39 (24.7 MB/s) - ‘pz_hb_en1.hdf’ saved [8086987844/8086987844]

In [49]:
pz_hdf = h5py.File('pz_hb_en1.hdf', 'r') # Update to relevant local path
pz_help_id = pz_hdf['help_id'][()]

#decode id from bytes_ to string
pz_help_id = np.array([i.decode('UTF-8') for i in pz_help_id])
matches = list(np.array([np.where(idx == pz_help_id) for idx in hsc_r_drops['help_id']]).squeeze())
hsc_r_drops_pz = np.array([pz_hdf['pz'][idx] for idx in matches])
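
The `np.where` comparison above scans the full ID array once per candidate, which can become slow for large samples. One possible alternative, sketched here on toy arrays, sorts the catalogue IDs once and then looks up each candidate with a binary search. The helper name `match_ids` is an assumption for illustration, and the sketch assumes every ID appears exactly once in the catalogue.

```python
import numpy as np

def match_ids(wanted, catalogue):
    """Return, for each entry of `wanted`, its index in `catalogue`.

    Assumes IDs in `catalogue` are unique. Raises KeyError if any
    requested ID is missing.
    """
    wanted = np.asarray(wanted)
    catalogue = np.asarray(catalogue)
    order = np.argsort(catalogue)                       # sort once
    pos = np.searchsorted(catalogue, wanted, sorter=order)
    pos = np.clip(pos, 0, len(catalogue) - 1)           # guard the upper edge
    idx = order[pos]
    if not np.all(catalogue[idx] == wanted):
        raise KeyError("some requested IDs are missing from the catalogue")
    return idx

# Toy IDs (illustrative values only)
catalogue_ids = np.array(["c", "a", "d", "b"])
wanted_ids = np.array(["b", "c"])
idx = match_ids(wanted_ids, catalogue_ids)
```

The returned indices could then be used in place of `matches` when reading rows from the posterior array.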

For illustrative purposes, let's stack the posteriors (and re-normalise) for our different subsamples to see the overall redshift distributions.

In [51]:
pz_all = np.nansum(hsc_r_drops_pz, 0) # Sum all posteriors
pz_all /= np.trapz(pz_all, pz_hdf['zgrid']) # Re-normalise

pz_fir = np.nansum(hsc_r_drops_pz[fir_det], 0) # Sum SPIRE detected posteriors
pz_fir /= np.trapz(pz_fir, pz_hdf['zgrid']) # Re-normalise
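
To make the stacking step concrete, here is a minimal sketch on two synthetic Gaussian posteriors. The redshift grid, the Gaussian parameters, and the helper name `stack_posteriors` are all assumptions for illustration and do not reproduce the HELP `zgrid` or real posteriors.

```python
import numpy as np

# np.trapz was renamed np.trapezoid in NumPy 2.0, so use whichever is available
_trapezoid = getattr(np, "trapezoid", None) or np.trapz

def stack_posteriors(pz, zgrid):
    """Sum posteriors over sources (ignoring NaNs) and normalise to unit area."""
    stacked = np.nansum(np.asarray(pz, dtype=float), axis=0)
    return stacked / _trapezoid(stacked, zgrid)

def gaussian(z, mu, sigma):
    """Unnormalised Gaussian used as a stand-in posterior."""
    return np.exp(-0.5 * ((z - mu) / sigma) ** 2)

# Synthetic grid and two toy posteriors: one low-z, one high-z
zgrid = np.linspace(0.0, 7.0, 701)
pz = np.vstack([gaussian(zgrid, 1.0, 0.2), gaussian(zgrid, 5.0, 0.3)])
stacked = stack_posteriors(pz, zgrid)
area = _trapezoid(stacked, zgrid)
```

After re-normalisation the stacked curve integrates to one over the grid, so stacks built from subsamples of different sizes can be compared directly on the same axes.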
In [52]:
Fig, Ax = plt.subplots(1, 1, figsize=(6,4))
Ax.plot(pz_hdf['zgrid'], pz_all, lw=3, label=r'All $r$-drops')
Ax.plot(pz_hdf['zgrid'], pz_fir, lw=3, label=r'FIR Detected $r$-drops')

Leg = Ax.legend(loc='upper right')
Ax.set_xlim([0, 7])
Ax.set_title(r'ELAIS-N1 HSC $r$-dropouts - Stacked $P(z)$')
Ax.set_xlabel('z', size=12)
Ax.set_ylabel(r'$P(z)$', size=12)
Text(0, 0.5, '$P(z)$')

As we can see above, only $\sim50\%$ of the sources are predicted to be at high redshift. This is consistent with the contamination estimates presented in Ono et al. (2017). As expected, the high-redshift peak is centred at $z\sim5$ but has a very broad distribution.

The posterior distribution for the FIR-detected sources is clearly different, with the majority being consistent with low-redshift galaxies (i.e. confusion between the Balmer and Lyman breaks). There are still, however, clearly some candidates that are consistent with being at high redshift. Let's find out how many of the dropout sources have redshift posteriors consistent with the source being at $z > 3$, defined here as having more than 95% of the integrated posterior probability above that redshift.

In [53]:
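# Select sources with > 95% of their posterior above zgrid[462], the grid point used here for z = 3
pz_gtr3 = (np.trapz(hsc_r_drops_pz[:,462:], pz_hdf['zgrid'][462:], axis=1) > 0.95)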
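
The hard-coded index in the cell above depends on the layout of the redshift grid. A sketch of an alternative that looks up the starting index from the redshift value itself is given below. The grid and posteriors are synthetic, and the helper name `prob_above` is an assumption for illustration rather than part of the HELP products.

```python
import numpy as np

# np.trapz was renamed np.trapezoid in NumPy 2.0, so use whichever is available
_trapezoid = getattr(np, "trapezoid", None) or np.trapz

def prob_above(pz, zgrid, z_min):
    """Integrated posterior probability at z >= z_min for each source.

    pz: array of shape (n_sources, n_z), each row normalised to unit area.
    zgrid: increasing redshift grid of length n_z.
    """
    pz = np.atleast_2d(np.asarray(pz, dtype=float))
    zgrid = np.asarray(zgrid, dtype=float)
    start = np.searchsorted(zgrid, z_min)  # first grid point with z >= z_min
    return _trapezoid(pz[:, start:], zgrid[start:], axis=1)

# Synthetic grid and three toy Gaussian posteriors centred at z = 5, 1 and 3
zgrid = np.linspace(0.0, 7.0, 701)
centres = np.array([5.0, 1.0, 3.0])
pz = np.exp(-0.5 * ((zgrid[None, :] - centres[:, None]) / 0.3) ** 2)
pz /= _trapezoid(pz, zgrid, axis=1)[:, None]  # normalise each row

p_high = prob_above(pz, zgrid, 3.0)
selected = p_high > 0.95
```

With these toy inputs only the posterior centred at $z = 5$ passes the 95% requirement. The one centred exactly at $z = 3$ has roughly half of its probability above the threshold and is rejected.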

How many SPIRE detected sources remain after enforcing a stricter photo-$z$ requirement?

In [54]:
(pz_gtr3 * fir_det).sum()
In [56]:
pz_stack_gtr3 = np.nansum(hsc_r_drops_pz[pz_gtr3], 0) # Sum posteriors of the z > 3 candidates
pz_stack_gtr3 /= np.trapz(pz_stack_gtr3, pz_hdf['zgrid']) # Re-normalise

pz_fir_gtr3 = np.nansum(hsc_r_drops_pz[fir_det*pz_gtr3], 0) # Sum SPIRE detected posteriors
pz_fir_gtr3 /= np.trapz(pz_fir_gtr3, pz_hdf['zgrid']) # Re-normalise

Fig, Ax = plt.subplots(1, 1, figsize=(6,4))
Ax.plot(pz_hdf['zgrid'], pz_stack_gtr3, lw=3, label=r'$z > 3$ $r$-drops')
Ax.plot(pz_hdf['zgrid'], pz_fir_gtr3, lw=3, label=r'FIR Detected $z > 3$ $r$-drops')

Leg = Ax.legend(loc='upper left')
Ax.set_xlim([0, 7])
Ax.set_title(r'ELAIS-N1 HSC $r$-dropouts - Stacked $P(z)$')
Ax.set_xlabel('z', size=12)
Ax.set_ylabel(r'$P(z)$', size=12)
Text(0, 0.5, '$P(z)$')


Author: Kenneth Duncan

The Herschel Extragalactic Legacy Project (HELP) is a European Commission Research Executive Agency funded project under the SP1-Cooperation, Collaborative project, Small or medium-scale focused research project, FP7-SPACE-2013-1 scheme, Grant Agreement Number 607254.