# Early-type galaxies in the SDSS. II. Correlations between observables

###### Abstract

A magnitude limited sample of nearly 9000 early-type galaxies, in the redshift range , was selected from the Sloan Digital Sky Survey using morphological and spectral criteria. The sample was used to study how early-type galaxy observables, including luminosity $L$, effective radius $R_o$, surface brightness $I_o$, color, and velocity dispersion $\sigma$, are correlated with one another. Measurement biases are understood with mock catalogs which reproduce all of the observed scaling relations and their dependences on fitting technique. At any given redshift, the intrinsic distributions of luminosities, sizes and velocity dispersions in our sample are all approximately Gaussian. A maximum likelihood analysis shows that , , and in the band. In addition, the mass-to-light ratio within the effective radius scales as or , and galaxies with larger effective masses have smaller effective densities: . These relations are approximately the same in the , and bands. Relative to the population at the median redshift in the sample, galaxies at lower and higher redshifts have evolved only little, with more evolution in the bluer bands. The luminosity function is consistent with weak passive luminosity evolution and a formation time of about 9 Gyrs ago.

^{1} University of Chicago, Astronomy & Astrophysics Center, 5640 S. Ellis Ave., Chicago, IL 60637

^{2} Department of Physics, Carnegie Mellon University, Pittsburgh, PA 15213

^{3} Fermi National Accelerator Laboratory, P.O. Box 500, Batavia, IL 60510

^{4} Department of Physics and Astronomy, University of Pittsburgh, Pittsburgh, PA 15260

^{5} Steward Observatory, University of Arizona, 933 N. Cherry Ave., Tucson, AZ 85721

^{6} Department of Astronomy, University of California at Berkeley, 601 Campbell Hall, Berkeley, CA 94720

^{7} Princeton University Observatory, Princeton, NJ 08544

^{8} Hubble Fellow

^{9} Department of Physics, New York University, 4 Washington Place, New York, NY 10003

^{10} Department of Physics & Astronomy, The Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218-2686

^{11} Apache Point Observatory, 2001 Apache Point Road, P.O. Box 59, Sunspot, NM 88349-0059

^{12} Yale University, P. O. Box 208101, New Haven, CT 06520

^{13} Universidad de Chile, Casilla 36-D, Santiago, Chile

^{14} Department of Physics of Complex Systems, Eötvös University, Budapest, H-1117 Hungary

^{15} Institute of Astronomy, School of Science, University of Tokyo, Mitaka, Tokyo 181-0015, Japan

^{16} Research Center for the Early Universe, School of Science, University of Tokyo, Tokyo 113-0033, Japan

^{17} Institute for Cosmic Ray Research, University of Tokyo, Kashiwa 277-8582, Japan

^{18} Institute for Advanced Study, Olden Lane, Princeton, NJ 08540

^{19} U.S. Naval Observatory, 3450 Massachusetts Ave., NW, Washington, DC 20392-5420

^{20} Department of Physics, University of Michigan, 500 East University, Ann Arbor, MI 48109

^{21} Department of Astronomy, University of Tokyo, Tokyo 113-0033, Japan

^{22} Department of Astronomy and Astrophysics, The Pennsylvania State University, University Park, PA 16802

## 1 Introduction

This is the second of four papers in which the properties of early-type galaxies, in the redshift range are studied. Paper I (Bernardi et al. 2003a) describes how the sample was selected from the SDSS database. The sample is essentially magnitude limited, and the galaxies in it span a wide range of environments. Each galaxy in the sample has measured values of luminosity $L$, effective radius $R_o$ and surface brightness $I_o$ in four bands ($g^*$, $r^*$, $i^*$ and $z^*$), a velocity dispersion $\sigma$, a redshift, and an estimate of the local density.

Section 2 of the present paper shows that the luminosity function of the galaxies in our sample, when expressed as a function of absolute magnitude, is well described by a Gaussian form, and that the luminosities in the population as a whole appear to be evolving passively. Section 3 studies the distribution of (the logarithm of) velocity dispersion, size, surface-brightness, effective mass and effective density at fixed luminosity; all of these are quite well described by Gaussian forms, suggesting that the intrinsic distributions of log(size) and log(velocity dispersion) are, like the distribution of log(luminosity), approximately Gaussian. Maximum-likelihood estimates of these and other correlations, which include the Faber-Jackson relation, the mass-to-light ratio, the Kormendy relation and a mass–density relation are presented in Section 4. Appendix A describes a method for generating accurate mock complete and magnitude-limited galaxy catalogs, which are useful for assessing the relative importance of evolution and selection effects. The procedure used to estimate errors on our results is discussed in Appendix B.

Paper III (Bernardi et al. 2003b) of this series places special emphasis on the Fundamental Plane relation between size, surface brightness and velocity dispersion. It shows how the FP depends on waveband, color, redshift and environment. Paper IV (Bernardi et al. 2003c) uses the colors and spectra of these galaxies to provide information about the chemical evolution of the early-type population.

Except where stated otherwise, we write the Hubble constant as $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$, and we perform our analysis in a cosmological world model with , where $\Omega_M$ and $\Omega_\Lambda$ are the present-day scaled densities of matter and cosmological constant. In such a model, the age of the Universe at the present time is Gyr. For comparison, an Einstein-de Sitter model has $(\Omega_M, \Omega_\Lambda) = (1, 0)$ and $t_0 = 6.52\,h^{-1}$ Gyr. We frequently use the notation as a reminder that we have set . Also, we will frequently be interested in the logarithms of physical quantities. Our convention is to set $R \equiv \log_{10} R_o$ and $V \equiv \log_{10}\sigma$, where $R_o$ and $\sigma$ are effective radii in kpc and velocity dispersions in km s$^{-1}$, respectively.
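The ages of the two kinds of world model mentioned above can be checked with a few lines of arithmetic. The following is a minimal sketch for spatially flat models only; the function name and the example input $\Omega_M = 0.3$ are illustrative assumptions and are not taken from this text.

```python
import math

# 1/H0 for H0 = 100 h km/s/Mpc, in Gyr, so ages come out in units of h^-1 Gyr.
HUBBLE_TIME_GYR = 977.8 / 100.0

def age_flat_universe(omega_m):
    """Present age, in h^-1 Gyr, of a flat model with omega_lambda = 1 - omega_m."""
    if omega_m >= 1.0:
        # Einstein-de Sitter limit: t0 = 2 / (3 H0)
        return (2.0 / 3.0) * HUBBLE_TIME_GYR
    omega_l = 1.0 - omega_m
    # Closed-form age of a flat matter + cosmological-constant model
    factor = (2.0 / (3.0 * math.sqrt(omega_l))) * math.asinh(math.sqrt(omega_l / omega_m))
    return factor * HUBBLE_TIME_GYR

age_eds = age_flat_universe(1.0)      # Einstein-de Sitter
age_lcdm = age_flat_universe(0.3)     # assumed example value of omega_m
print(age_eds, age_lcdm)
```

A flat model with a cosmological constant is older than the Einstein-de Sitter model at the same $H_0$, which is why the choice of world model matters when formation times are quoted in Gyr.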

## 2 The luminosity function

Our sample is magnitude limited (Table 1 of Paper I gives the magnitude limits in the different bands). Therefore, we measure the luminosity function of the galaxies in our sample using two techniques. The first uses volume limited catalogs, and the second uses a maximum likelihood procedure (Sandage, Tammann & Yahil 1979; Efstathiou, Ellis & Peterson 1988).

In the first method, we divide our parent catalog into many volume limited subsamples; this was possible because the parent catalog is so large. When doing this, we must decide what size volumes to choose. We would like our volumes to be as large as possible so that each volume represents a fair sample of the Universe. On the other hand, the volumes must not be so large that evolution effects are important. In addition, because our catalog is cut at the bright as well as the faint end, large-volume subsamples span only a small range in luminosities. Therefore, we are forced to compromise: we have chosen to make the volumes about $\Delta z = 0.04$ thick, because $120\,h^{-1}$ Mpc is larger than the largest structures seen in numerical simulations of the cold dark matter family of models (e.g., Colberg et al. 2000). The catalogs are extracted from regions which cover a very wide angle on the sky, so the actual volume of any given volume limited catalog is considerably larger than $(120\,h^{-1}\,{\rm Mpc})^3$. Therefore, this choice should provide volumes which are large enough in at least two of the three coordinate directions that they represent fair samples, but not so large in the redshift direction that the range in luminosities in any given catalog is small, or that evolution effects are washed out.

The volume-limited subsamples are constructed as follows. First, we specify the boundaries in redshift of the catalog: $z_{\min}$ and $z_{\max}$. In the context of a world model, these redshift limits, when combined with the angular size of the catalog, can be used to compute a volume. This volume depends on and the world model: as our fiducial model we set and . (Our results hardly change if we use an Einstein-de Sitter model instead.) We then compute the K-corrected limiting luminosities $L_{\max}(z_{\min})$ and $L_{\min}(z_{\max})$ given the apparent magnitude limits, the redshift limits, and the assumed cosmology. A galaxy is included in the volume limited subsample if $z_{\min} \le z \le z_{\max}$ and $L_{\min}(z_{\max}) \le L \le L_{\max}(z_{\min})$. The luminosity function for the volume limited subsample is obtained by counting the number of galaxies in a luminosity bin and dividing by the volume of the subsample.
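The construction just described can be sketched in a few lines. This is only an illustration under loud assumptions: the distance modulus uses the crude Hubble law $d = cz/H_0$ with no K-correction, and the toy catalog, the magnitude limits (14.5 and 17.6) and all function names are hypothetical. The sketch also divides by the bin width so that the result is a density per magnitude.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def distance_modulus(z, h=1.0):
    """Crude distance modulus from the Hubble law d = c z / H0 (illustrative only)."""
    d_mpc = C_KM_S * z / (100.0 * h)
    return 5.0 * math.log10(d_mpc) + 25.0

def volume_limited(galaxies, z_min, z_max, m_bright, m_faint, dist_mod=distance_modulus):
    """Keep galaxies that pass both apparent-magnitude cuts everywhere in [z_min, z_max].

    galaxies: list of dicts with absolute magnitude "M" and redshift "z".
    """
    # Faintest absolute magnitude still visible at the far edge of the shell.
    M_faint_lim = m_faint - dist_mod(z_max)
    # Brightest absolute magnitude not removed by the bright cut at the near edge.
    M_bright_lim = m_bright - dist_mod(z_min)
    return [g for g in galaxies
            if z_min <= g["z"] <= z_max and M_bright_lim <= g["M"] <= M_faint_lim]

def luminosity_function(sample, volume, edges):
    """Counts in each magnitude bin divided by (volume * bin width)."""
    phi = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        n = sum(1 for g in sample if lo <= g["M"] < hi)
        phi.append(n / (volume * (hi - lo)))
    return phi

# Hypothetical toy catalog, for illustration only.
catalog = [
    {"M": -20.0, "z": 0.05},  # kept
    {"M": -19.0, "z": 0.05},  # too faint to be seen at z_max
    {"M": -21.5, "z": 0.06},  # too bright: would hit the bright cut at z_min
    {"M": -20.5, "z": 0.10},  # outside the redshift shell
]
subsample = volume_limited(catalog, 0.04, 0.08, m_bright=14.5, m_faint=17.6)
phi = luminosity_function(subsample, volume=1.0e6, edges=[-21.0, -20.5, -20.0, -19.5])
```

Because the sample is cut at both the bright and faint ends, the allowed magnitude window narrows as the shell is made thicker, which is the compromise discussed in the text.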

The top panels in Figure 1 show the result of doing this in the and bands. Stars, circles, diamonds, triangles, squares and crosses show measurements in volume limited catalogs which have $z_{\min} = 0.04$, 0.08, 0.12, 0.16, 0.20, and 0.24 and $z_{\max} = z_{\min} + 0.04$. Each subsample contains more than five hundred galaxies, except for the two most distant, which each contain about one hundred. As one would expect, the nearby volumes provide the faint end of $\phi(M)$, and the more distant volumes show the bright end. The extent to which the different volume limited catalogs all trace out the same curve is a measure of how little the luminosity function at low and high redshifts differs from that at the median redshift.

The bottom panels in Figure 1 show that, in fact, the galaxies in our data exhibit a small amount of evolution: at fixed comoving density, the higher redshift population is slightly brighter than that at lower redshifts. Although volume-limited catalogs provide model-independent measures of this evolution, the test is most sensitive when a large range of luminosities can be probed at two different redshifts. Because the SDSS catalogs are cut at both the faint and the bright ends, our test for evolution is severely limited. Nevertheless, the small trends we see are both statistically significant and qualitatively consistent with what one expects of a passively evolving population. (Note that our sample contains only early-type galaxies. Blanton et al. 2001 study the luminosity function in an SDSS sample which contains all galaxy types, but they ignore evolution effects. Since late-type galaxies are expected to evolve more rapidly than early-types, it is important to redo Blanton et al.’s analysis after allowing for evolution.)

Before we draw more quantitative conclusions, notice that a bell-like Gaussian shape would provide a reasonable description of the luminosity function. Although early-type galaxies are expected to have red colors, our sample was not selected using any color information. It is reassuring, therefore, that the Gaussian shape we find here also provides a good fit to the luminosity function of the redder objects in the SDSS parent catalog (see the curves for the two reddest galaxy bins in Fig. 14 of Blanton et al. 2001). A Gaussian form also provides a reasonable description of the luminosity function of early-type galaxies in the CNOC2 survey (Lin et al. 1999, even though they actually fit a Schechter function to their measurements). The 2dFGRS galaxies classified as being of Type 1 by Madgwick et al. (2002) should be similar to early-types. Their Type 1’s extend to considerably fainter absolute magnitudes than our sample and the shape of the luminosity function they report is quite different from ours. This is probably because the population of early-type galaxies at faint absolute magnitudes is quite different from the brighter ones (e.g., Sandage & Perelmuter 1990). In any case, their Schechter function fits underestimate the number density of luminous Type 1 galaxies; a Gaussian tail would provide a significantly better fit.

Given that the Gaussian form provides a good description of our data, we use the maximum-likelihood method outlined by Sandage, Tammann & Yahil (1979) to estimate the parameters of the best-fitting luminosity function. For magnitude limited samples which are small and shallow, this is the method of choice. For a sample such as ours, which spans a sufficiently wide range in redshifts that evolution effects might be important, the method requires a model for the evolution. We parametrize the luminosity evolution similarly to Lin et al. (1999). That is to say, if we were solving only for the luminosity function, then the likelihood function we maximize would be

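$$\mathcal{L} = \prod_i \frac{\phi(M_i, z_i\,|\,Q, M_*, \sigma_M)}{S(z_i\,|\,Q, M_*, \sigma_M)}, \tag{1}$$

where

$$\phi(M_i, z_i\,|\,Q, M_*, \sigma_M) = \frac{\phi_*}{\sqrt{2\pi\sigma_M^2}}\,\exp\left(-\frac{[M_i - M_* + Q z_i]^2}{2\sigma_M^2}\right), \qquad S(z_i\,|\,Q, M_*, \sigma_M) = \int_{M_{\min}(z_i)}^{M_{\max}(z_i)} dM\,\phi(M, z_i\,|\,Q, M_*, \sigma_M).$$

Here $M_{\min}(z_i)$ and $M_{\max}(z_i)$ denote the minimum and maximum absolute magnitudes at $z_i$ which satisfy the apparent magnitude limits of the survey, and $i$ runs over all the galaxies in the catalog. (At small $z$, this parametrization of the evolution in absolute magnitude implies that the luminosity evolves as $L_*(z)/L_*(0) \approx (1+z)^q$, with $q = Q\ln(10)/2.5$.) Note that, in assuming that only $M_*$ evolves, this model assumes that there is no differential evolution in luminosities, i.e., that luminous and not so luminous galaxies evolve similarly.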
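A likelihood of this kind is straightforward to evaluate numerically. The sketch below assumes a Gaussian luminosity function whose mean brightens linearly with redshift, $M_*(z) = M_* - Qz$; the function names and the toy data are hypothetical, and the normalization of the luminosity function cancels in the ratio so it does not appear.

```python
import math

def gauss_cdf(x):
    """Cumulative distribution of a unit Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def log_likelihood(m_star, sigma_m, q_evol, galaxies):
    """Sum of ln[phi(M_i, z_i) / S(z_i)] over galaxies.

    galaxies: list of (M, z, M_min, M_max), where M_min and M_max are the
    brightest and faintest absolute magnitudes observable at that redshift.
    """
    ln_like = 0.0
    for mag, z, m_min, m_max in galaxies:
        mean = m_star - q_evol * z          # evolving mean magnitude
        ln_phi = (-0.5 * ((mag - mean) / sigma_m) ** 2
                  - math.log(sigma_m * math.sqrt(2.0 * math.pi)))
        # Fraction of the Gaussian that falls inside the observable window
        selection = (gauss_cdf((m_max - mean) / sigma_m)
                     - gauss_cdf((m_min - mean) / sigma_m))
        ln_like += ln_phi - math.log(selection)
    return ln_like

def evolution_exponent(q_evol):
    """Exponent q in L(z)/L(0) ~ (1+z)^q at small z, for M_*(z) = M_* - Q z."""
    return 0.4 * math.log(10.0) * q_evol

# Hypothetical toy data: three galaxies at z = 0.1 with the same magnitude window.
toy = [(-21.0, 0.1, -23.0, -19.0), (-20.5, 0.1, -23.0, -19.0), (-21.5, 0.1, -23.0, -19.0)]
good = log_likelihood(-20.9, 0.5, 1.0, toy)
bad = log_likelihood(-19.9, 0.5, 1.0, toy)
```

In practice the three parameters would be passed to a numerical optimizer; the toy comparison only shows that parameters centred on the data score a higher likelihood than offset ones.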

Figure 2 shows the result of estimating the luminosity function in this way in the $g^*$, $r^*$, $i^*$ and $z^*$ bands. Later in this paper, we will solve simultaneously for the joint distribution of luminosity, size and velocity dispersion; it is the parameters which describe the luminosity function of this joint solution which are shown in Fig. 2. The dashed lines in each panel show the Gaussian shape of the luminosity function at redshift . For comparison, the symbols show the measurements in the same volume limited catalogs as before, except that now we have subtracted the maximum likelihood estimate of the luminosity evolution from the absolute magnitudes before plotting them. If the model for the evolution is accurate, then the different symbols should all trace out the same smooth dashed curve.

The comoving number density of the galaxies in this sample is Mpc in all four bands. Because the different bands have different apparent magnitude limits, and they were fit independently of each other, it is reassuring that the same value of works for all the bands. For similar reasons, it is reassuring that the best-fit values of imply rest-frame colors at of , , and , which are close to those of the models which we used to compute our K-corrections (Appendix A of Paper I), even though no a priori constraint was imposed on what these rest-frame colors should be.

The histograms in each of the four panels of Figure 3 show the number of galaxies observed as a function of redshift in the four bands. The peak in the number counts at is also present in the full SDSS sample, which includes late-types, and, perhaps more surprisingly, an overdensity at this same redshift is also present in the 2dF Galaxy Redshift Survey. (The second bump at is also present in the 2dFGRS counts.) The solid curves show what we expect to see for the evolving Gaussian function fits; the curves provide a reasonably good fit to the observed counts, although they slightly overestimate the numbers at high redshift in the redder wavebands. For comparison, the dashed curves show what is expected if the luminosities do not evolve and the no-evolution luminosity function is given by the one at the median redshift (i.e., a Gaussian with mean ). Although the fit to the high-redshift tail is slightly better, this no-evolution model cannot explain the trends shown in the bottom panel of Figure 1. Moreover, a Bruzual & Charlot (2003) passive evolution model with a formation time of 9 Gyrs ago predicts that the rest-frame luminosities at redshift should be brighter than those at by 0.3, 0.26, 0.24, and 0.21 mags in $g^*$, $r^*$, $i^*$ and $z^*$, respectively, which is not far off from what we estimate.

The bottom panels in Figure 1 suggest two possible reasons why our model of pure luminosity evolution overestimates at higher . One possibility is that the comoving number densities are decreasing slightly with redshift. A small amount of density evolution is not unexpected, because early-type galaxy morphologies may evolve (van Dokkum & Franx 2001), and our sample is selected on the basis of a fixed morphology. If we allow a small amount of density as well as luminosity evolution, and we use with , as suggested by the results of Lin et al. (1999), then the resulting curves are also well fit by the dashed curves. A second possibility follows from the fact that we only observe the most luminous part of the higher redshift population. If the most luminous galaxies at any given time are also the oldest, then one might expect the bright end of the luminosity function to evolve less rapidly than the fainter end. The curvature seen in the bottom panel of Figure 1 suggests that although the evolution of the fainter objects in our sample (which we only see out to low redshifts) is consistent with formation times of 9 Gyrs ago, the brighter objects are not. Models of differential evolution in the luminosities also predict distributions which are in better agreement with the observed counts at high redshift. Since the evolution of the luminosity function is small, we prefer to wait until we are able to make more accurate K-corrections before accounting for either of these other possibilities more carefully. Therefore, in what follows, we will continue to use the model with pure luminosity evolution.

Repeating the exercise described above but for an Einstein-de Sitter model yields qualitatively similar results, although the actual values of and are slightly different. At face value, the fact that we see so little evolution in the luminosities argues for a relatively high formation redshift: the Bruzual & Charlot (2003) models indicate that Gyrs.

## 3 Observed correlations: Distributions at fixed luminosity

This Section presents scatter plots between different observables and luminosity. This is done because, except for a cut at small velocity dispersions, our sample was selected by luminosity alone. This means that the distributions of an observable $X$ at fixed luminosity are not biased by the selection cut (e.g., Schechter 1980). The distribution of $X$ at fixed $L$ is shown to be reasonably well described by a Gaussian for all the choices of $X$ we consider. This simplifies the maximum likelihood analysis described in Section 4 which we use to estimate a number of observed correlations (it is also used in Paper III to estimate the parameters of the Fundamental Plane).

The best way to think of any absolute magnitude $M$ versus $X$ scatter plot is to imagine that, at fixed absolute magnitude $M$, there is a distribution of $X$ values. The scatter plot then shows the joint distribution

$$\phi(M, X\,|\,z)\,dM\,dX = dM\,\phi(M\,|\,z)\;p(X\,|\,M, z)\,dX, \tag{2}$$

where $\phi(M, X\,|\,z)$ denotes the density of galaxies with $X$ and $M$ at $z$, and $\phi(M\,|\,z)$ is the luminosity function at $z$ which we computed in Section 2. One of the results of this section is to show that the shape of $p(X\,|\,M, z)$ is simple for most of the relations of interest.

The mean value of $X$ at fixed $M$ is independent of the fact that our catalogs are magnitude limited. Therefore, we estimate the parameters of linear relations of the form:

$$\langle X\,|\,M\rangle = \frac{-0.4\,(M - M_*)}{S} + X_*, \tag{3}$$

where $M$ is the absolute magnitude and $X$ is the observable (for example, we will study $X = \log_{10}\sigma$ and $X = \log_{10}R_o$, among others). For each volume limited catalog, we fit for the slope $S$ and zero-point $X_*$ of the linear relation. If there really were a linear relation between $X$ and $M$, and neither $X$ nor $M$ evolved, then the slopes and zero-points of the different volume limited catalogs would be the same.

To illustrate, the different symbols in Figure 4 show , the Faber–Jackson relation (Faber & Jackson 1976), in our dataset. Most datasets in the literature are consistent with the scaling , approximately independent of waveband. For example, Forbes & Ponman (1999), using a compilation of data from Prugniel & Simien (1996) report in the B-band. At longer wavelengths Pahre et al. (1998) report in the K-band, with a scatter of 0.93 mag.

Stars, circles, diamonds, triangles, squares and crosses show the relation measured in volume limited catalogs of successively higher redshift (redshift limits are the same as in Figure 1). The galaxies in each subsample were further divided into two equal-sized parts based on luminosity. The symbols with error bars show the mean for each of these small bins in , and the rms spread around it (note that the error on the mean is smaller than the size of the symbols in all but the highest redshift catalogs). The solid line shows the maximum-likelihood estimate of the slope of this relation at , which we describe in Section 4. Comparison with this line shows that the higher redshift population is slightly brighter. The slope of this line is shown in the top of each panel: , approximately, in all the bands, consistent with the literature. The zero point, however, is different; at fixed luminosity, the objects in our sample have velocity dispersions which are smaller than those reported in the literature by about .

We have enough data that we can actually do more than simply measure the mean at fixed ; we can also compute the distribution around the mean. If we do this for each catalog, then we obtain distributions which are approximately Gaussian in shape, with dispersions which depend on the range of luminosities which are in the subsample. Rather than showing these, we created a composite catalog by stacking together the galaxies from the nonoverlapping volume limited catalogs, and we then divided the composite catalog into five equal sized bins in luminosity. The histograms in the bottom of the plot show the shapes of the distribution of velocities in the different luminosity bins. Except for the lowest and highest redshift catalogs for which the statistics are poorest, the different distributions have almost the same shape; only the mean changes.

One might have worried that the similarity of the distributions is a signature that they are dominated by measurement error. This is not the case: the typical measurement error is about a factor of two smaller than the rms of any of these distributions. If we assume that the measurement errors are Gaussian-distributed, then the distributions we see should be the true distribution broadened by the Gaussian from the measurement errors. The fact that the observed distributions are well approximated by Gaussians suggests that the true intrinsic distributions are also Gaussian. The fact that the width of the intrinsic distribution is approximately independent of considerably simplifies the maximum likelihood analysis presented in the next section.
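The argument above rests on Gaussian errors adding in quadrature to the intrinsic scatter. A minimal sketch of that subtraction follows; the function name and the numerical scatter are hypothetical, and only the factor-of-two ratio is taken from the text.

```python
import math

def intrinsic_rms(observed_rms, measurement_error):
    """Remove a Gaussian measurement error from an observed rms, in quadrature."""
    if measurement_error >= observed_rms:
        raise ValueError("measurement error exceeds the observed scatter")
    return math.sqrt(observed_rms ** 2 - measurement_error ** 2)

# Hypothetical observed scatter, with an error half as large as the observed rms.
observed = 0.10
fraction_intrinsic = intrinsic_rms(observed, 0.5 * observed) / observed
print(fraction_intrinsic)
```

With errors half the size of the observed rms, most of the observed width is intrinsic, so the observed distributions cannot be dominated by measurement error.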

It is well known that color is strongly correlated with velocity dispersion (Paper IV of this series shows the color relation in our sample). One consequence of this is that residuals from the relation shown in Figure 4 correlate strongly with color: at fixed magnitude, the redder galaxies have the highest velocity dispersions. In addition, as a whole, the reddest galaxies populate the high part of the relation. Forbes & Ponman (1999) reported that residuals from the Faber–Jackson relation correlate with age. If color is an indicator of age and/or metallicity, then our finding is qualitatively consistent with theirs: the typical age/metallicity varies along the Faber–Jackson relation.

A similar study of the relation between the luminosities and sizes of galaxies is shown in Figure 5. Schade et al. (1997) find in the B band, whereas, at longer wavelengths, Pahre et al. (1998) find with an rms of 0.88 mag. This suggests that the relation depends on wavelength. We find in , but in the other bands. The distribution is also reasonably well fit by a Gaussian, with a mean which increases with luminosity, and a dispersion which is approximately independent of . The rms around the mean is about one and a half times larger than the rms around the mean relation. We argue in Paper IV that the color–magnitude and color–size relations are a consequence of the color–$\sigma$ correlation. If this is correct, then residuals from the color–$\sigma$ relation should not correlate with size or magnitude. We have checked that this is correct, although we have not included a plot showing this explicitly.

There is an interesting correlation between the residuals of the Faber–Jackson and relations. At fixed luminosity, galaxies which are larger than the mean tend to have smaller velocity dispersions. This is shown in Figure 6, which plots the residuals from the relation versus the residuals from the relation. The short dashed lines show the forward and inverse fits to this scatter plot. The long-dashed line in between the other two shows , where denotes the residual from the mean relation at fixed , and denotes the rms of this residual. The anti-correlation is approximately the same for all .

This suggests that a plot of $L$ versus some combination of $R_o$ and $\sigma$ should have considerably less scatter than either of the two individual relations. To illustrate, Figure 7 shows the distribution of the combination $R_o\sigma^2$ at fixed $L$. The scatter is significantly reduced, making the mean trend of increasing $R_o\sigma^2$ with increasing $L$ quite clean. (The combination of observables for which the scatter is minimized is discussed in Section 4.) This particular combination defines an effective mass: $M_o \equiv 2R_o\sigma^2/G$. In slightly more convenient units, this mass is

$$\frac{M_o}{10^{10}\,h^{-1}M_\odot} = 0.465\left(\frac{R_o}{h^{-1}\,{\rm kpc}}\right)\left(\frac{\sigma}{100\ {\rm km\,s^{-1}}}\right)^2. \tag{4}$$

(Because many of our galaxies are not spherical, some of their support must come from rotation, and so ignoring rotation as we are doing is likely to mis-estimate the true mass. See Bender, Burstein & Faber 1992 for one way to account for this. This quantity will also mis-estimate the mass if some of the support comes from anisotropic velocity dispersions.)
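The conversion of a size and velocity dispersion into an effective mass can be checked numerically. This sketch assumes the effective mass is $2R_o\sigma^2/G$ (the factor of 2 is an assumption of the sketch), takes $G$ in units of kpc (km/s)$^2$ per solar mass, and uses a hypothetical function name; the example values $R = 0.490$ and $V = 2.200$ are one row of Table 1.

```python
import math

G_KPC = 4.30091e-6  # Newton's constant in kpc (km/s)^2 / M_sun

def effective_mass(r_kpc, sigma_kms):
    """Assumed definition M_o = 2 R_o sigma^2 / G, in solar masses.

    If r_kpc is given in h^-1 kpc, the mass is in h^-1 M_sun.
    """
    return 2.0 * r_kpc * sigma_kms ** 2 / G_KPC

# Mass of a 1 kpc, 100 km/s object in units of 10^10 M_sun
unit_mass = effective_mass(1.0, 100.0) / 1.0e10

# Mass for log10(R_o) = 0.490 and log10(sigma) = 2.200
log_mass = math.log10(effective_mass(10 ** 0.490, 10 ** 2.200))
print(unit_mass, log_mass)
```

As the caveat above notes, a mass of this kind ignores rotation and velocity anisotropy, so it is an order-of-magnitude effective quantity rather than a dynamical model.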

Fiducial values of the effective mass-to-light ratio can be obtained by inserting the maximum likelihood values from Table 1 into this relation. This yields (we used the parameters for the band, for which ). The corresponding total absolute magnitude is . The luminosity of the sun in is mags, so . The luminosity within the effective radius is half this value, so that the effective mass-to-light ratio within the effective radius of an object is times that of the sun. Figure 7 shows that the effective mass-to-light ratio depends on luminosity:

(5) |

At larger radii, the luminosity can double at most, whereas, if the galaxy is embedded in a dark matter halo, the mass at large radii may continue to increase. For this reason one might expect the mass-to-light ratios to be significantly larger at larger radii.

Since the ratio above is the mass-to-light ratio at the radius which encloses half the light, it is tempting to associate it with the mass-to-light ratio at the half mass radius. Because both the numerator and the denominator are projected quantities, this is incorrect. For example, if the mass-to-light ratio is independent of distance from the galaxy center, then the three dimensional half-mass radius is about 30% larger than the projected half-light radius (e.g. Hernquist 1990). If the velocity dispersion does not change substantially over the range in radii which contribute light, then a fairer estimate of the mass-to-light ratio within the half-mass radius would be about 30% larger than the value given above.

We can define an effective density by setting , with , then

(6) |

Figure 8 shows that this effective density decreases with increasing luminosity, although the scatter in densities at fixed luminosity is quite large ( dex). Inserting mean values for and yields

(7) |

Such a trend is qualitatively similar to that seen in numerical simulations of dissipationless gravitational clustering: the central densities of virialized halos in such simulations are smaller in the more massive halos (Navarro, Frenk & White 1997).
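An effective density of the kind discussed in this section can be sketched as follows. Both ingredients are assumptions of this sketch rather than statements from the text: the mass is taken to be $2R_o\sigma^2/G$ spread over a sphere of radius $R_o$, and the result is expressed in units of the critical density $3H_0^2/(8\pi G)$. Function names are hypothetical.

```python
import math

G_KPC = 4.30091e-6   # Newton's constant in kpc (km/s)^2 / M_sun
H0_KPC = 0.1         # 100 h km/s/Mpc expressed in h km/s/kpc

def critical_density():
    """rho_crit = 3 H0^2 / (8 pi G), in h^2 M_sun / kpc^3."""
    return 3.0 * H0_KPC ** 2 / (8.0 * math.pi * G_KPC)

def effective_density(r_kpc, sigma_kms):
    """Mean density inside r_kpc (in h^-1 kpc), in units of the critical density."""
    mass = 2.0 * r_kpc * sigma_kms ** 2 / G_KPC      # assumed effective mass
    volume = 4.0 * math.pi * r_kpc ** 3 / 3.0
    return mass / volume / critical_density()

# Under these assumptions the density reduces to (2 sigma / (H0 R_o))^2.
delta_small = effective_density(1.0, 100.0)
delta_large = effective_density(2.0, 100.0)
print(delta_small, delta_large)
```

At fixed velocity dispersion the density falls as $R_o^{-2}$, so larger objects are less dense, in the same qualitative sense as the trend described above.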

Figure 9 shows a final relation at fixed luminosity: the surface-brightness relation. In such a plot, luminosity evolution moves objects upwards and to the right (larger luminosities and surface brightnesses at high redshift), so that the higher redshift population should be obviously displaced from the zero-redshift relation. The plot shows the distribution of and after subtracting the maximum likelihood estimate of the evolution from both quantities. The solid line shows the maximum likelihood value of the slope of this relation. This differs slightly from the scaling Sandage & Perelmuter (1990) find for giant galaxies with , although the scatter around the mean relation of mags is similar. This relation is considerably broader than any of the others we have studied so far, which may account for some of the difference. However, a careful inspection of the figure suggests that the relation is becoming shallower at high redshift; whether or not this is a signature of differential evolution in the luminosities is the subject of work in progress.

## 4 A parametric maximum-likelihood analysis

Section 2 showed that, after accounting for the fact that the SDSS sample is magnitude-limited, the distribution of $M$ is quite well described by a Gaussian. In principle, by extending the Efstathiou, Ellis & Peterson (1988) method (along the lines described by Sodré & Lahav 1993) we could derive non-parametric maximum-likelihood estimates of the three-dimensional distribution of $M$, $R$ and $V$. The virtue of this approach is that it accounts for the fact that the observed sample is magnitude-limited, that there is also a cut at small velocity dispersions, and that there are correlated measurement errors associated with the luminosities, sizes and velocity dispersions. Once the shape of the three-dimensional distribution has been estimated, it is straightforward to obtain estimates of the various correlations with luminosity we studied in the previous section. This is the subject of Section 4.2. However, the real benefit of the maximum likelihood analysis is that it also yields estimates of correlations between observables which do not include luminosity; some examples of these are shown in Section 4.3.

We chose not to make a non-parametric estimate of the joint distribution because just ten bins in each of $M$, $R$ and $V$ yields $10^3$ free parameters to be determined from $\sim 9000$ galaxies. Moreover, Section 3 showed that, in each of the SDSS wavebands, the distributions of $R$ and $V$ at fixed absolute magnitude are quite well described by Gaussian forms. Therefore, the joint distribution of early-type galaxy luminosities, sizes, and velocity dispersions should be well described by a tri-variate Gaussian distribution in the variables $M$, $R$ and $V$. Saglia et al. (2001) describe a maximum likelihood analysis of early-type galaxy correlations in which they assume that a tri-variate Gaussian is a reasonable description of their data: we have the luxury of knowing that this is indeed a reasonable description of our dataset. Thus, we have a simple parametrization of the joint distribution for which, in each waveband, nine numbers suffice to describe the statistical properties of our sample: three mean values, $M_*$, $R_*$ and $V_*$, three dispersions, $\sigma_M$, $\sigma_R$ and $\sigma_V$, and three pairwise correlations, $\rho_{RM}$, $\rho_{VM}$ and $\rho_{RV}$.

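In addition, we will also allow for the possibility that the luminosities are evolving, which adds a tenth parameter, $Q$, to be estimated from the sample. The maximum likelihood technique allows us to estimate these ten numbers as follows. We define the likelihood function

$$\mathcal{L} = \prod_i \frac{\phi(\mathbf{X}_i, \mathbf{C}, \mathbf{E}_i)}{S(z_i)}, \qquad \phi(\mathbf{X}_i, \mathbf{C}, \mathbf{E}_i) = \frac{\phi_*}{(2\pi)^{3/2}\,|\mathbf{C} + \mathbf{E}_i|^{1/2}}\,\exp\left(-\frac{1}{2}\,\mathbf{X}_i^{T}\,[\mathbf{C} + \mathbf{E}_i]^{-1}\,\mathbf{X}_i\right), \tag{8}$$

with $\mathbf{X}_i = (M_i - M_* + Q z_i,\; R_i - R_*,\; V_i - V_*)$ and

$$\mathbf{C} = \begin{pmatrix} \sigma_M^2 & \sigma_R\sigma_M\rho_{RM} & \sigma_V\sigma_M\rho_{VM} \\ \sigma_R\sigma_M\rho_{RM} & \sigma_R^2 & \sigma_R\sigma_V\rho_{RV} \\ \sigma_V\sigma_M\rho_{VM} & \sigma_R\sigma_V\rho_{RV} & \sigma_V^2 \end{pmatrix}.$$

Similarly to when we discussed the luminosity function, $S(z_i)$ is defined by integrating over the range of absolute magnitudes, velocities and sizes at $z_i$ which make it into the catalog. Here $\mathbf{X}$ is the vector of the observables, and $\mathbf{E}$ describes the errors in the measurements.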
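The Gaussian part of such a likelihood can be sketched without any external libraries. The code below builds a covariance matrix from three dispersions and three correlation coefficients, adds a per-object error matrix, and evaluates the log of a trivariate Gaussian. The selection normalization $S(z_i)$ is deliberately omitted because it requires a numerical integral over the observable region; all function names are hypothetical, and the correlation coefficients in the example are made-up values, not results from this paper.

```python
import math

def covariance(sig_m, sig_r, sig_v, rho_rm, rho_vm, rho_rv):
    """3x3 covariance matrix in the order (M, R, V)."""
    return [
        [sig_m ** 2,            rho_rm * sig_r * sig_m, rho_vm * sig_v * sig_m],
        [rho_rm * sig_r * sig_m, sig_r ** 2,            rho_rv * sig_r * sig_v],
        [rho_vm * sig_v * sig_m, rho_rv * sig_r * sig_v, sig_v ** 2],
    ]

def det3(a):
    """Determinant of a 3x3 matrix."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def inv3(a):
    """Inverse of a 3x3 matrix via the transposed cofactor matrix."""
    d = det3(a)
    cof = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            rows = [k for k in range(3) if k != i]
            cols = [k for k in range(3) if k != j]
            minor = (a[rows[0]][cols[0]] * a[rows[1]][cols[1]]
                     - a[rows[0]][cols[1]] * a[rows[1]][cols[0]])
            cof[i][j] = (-1) ** (i + j) * minor
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]

def ln_gauss3(x, cov, err):
    """ln of a zero-mean trivariate Gaussian with covariance cov + err, evaluated at x."""
    total = [[cov[i][j] + err[i][j] for j in range(3)] for i in range(3)]
    inv = inv3(total)
    quad = sum(x[i] * inv[i][j] * x[j] for i in range(3) for j in range(3))
    return -0.5 * quad - 0.5 * math.log(det3(total)) - 1.5 * math.log(2.0 * math.pi)

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ZERO = [[0.0] * 3 for _ in range(3)]

# Dispersions resemble Table 1; the three correlation coefficients are HYPOTHETICAL.
example_cov = covariance(0.841, 0.241, 0.111, -0.8, -0.7, 0.543)
```

Adding the error matrix to the intrinsic covariance before inverting is what lets the fit return intrinsic, rather than error-broadened, dispersions and correlations.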

Appendix D of Paper I describes how the elements of the error matrix were obtained. Briefly, the error in the absolute magnitude assumes that there are no errors in the redshift or the K-correction, so all the error comes from the error on the apparent magnitude ; the error on the circularly averaged radius is given by adding the error on the angular length of the longer axis to those which come from the error on the axis ratio . We assume that the errors in are neither correlated with those in nor with those in the absolute magnitude. However, because both and come from the same fitting procedure, the errors in and are correlated. Finally, we assume that errors in magnitudes are not correlated with those in velocity dispersion, so is set to zero, and that errors in size and velocity dispersion are only weakly correlated because of the aperture correction we apply.

Band | Q | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
5825 | 0.844 | 0.520 | 0.254 | 2.197 | 0.113 | 0.536 | 1.15 | ||||
8228 | 0.841 | 0.490 | 0.241 | 2.200 | 0.111 | 0.543 | 0.85 | ||||
8022 | 0.851 | 0.465 | 0.241 | 2.201 | 0.110 | 0.542 | 0.75 | ||||
7914 | 0.845 | 0.450 | 0.241 | 2.200 | 0.110 | 0.543 | 0.60 |

The covariance matrix contains six of the ten free parameters we are seeking. It is these parameters, along with the three mean values, , and , and the evolution parameter which are varied until the likelihood is maximized. The maximum-likelihood estimates of these parameters in each band are given in Table 1. Notice that although the luminosity and size distributions differ from band to band, the velocity distributions do not. This is reassuring, because the intrinsic distribution of velocity dispersions, estimated from the spectra, should not depend on the band in which the photometric measurements were made. As an additional test, we also computed maximum-likelihood estimates of the covariance matrices of the bivariate Gaussians for the pairs and . These estimates of, e.g., and were similar to those in Table 1.

The remainder of this paper uses to estimate various pairwise correlations. In Paper III, we transform the covariance matrix into one which describes the Fundamental Plane variables of size, surface brightness and velocity dispersion.

### 4.1 The intrinsic distributions of sizes and velocity dispersions

Before we present maximum likelihood estimates of various correlations, it is worth remarking that because the trivariate Gaussian is a good description of the data, our results indicate that, in addition to the intrinsic distribution of absolute magnitudes, the intrinsic distributions of (the logarithms of) early-type galaxy sizes and velocity dispersions are also well fit by Gaussian forms. The means and dispersions of these Gaussians are given by and in Table 1. Note that the width of the distribution of is about half that of . This is consistent with earlier work (e.g., it is one of the motivations for the -space parametrization of Bender, Burstein & Faber 1992).

### 4.2 Correlations with luminosity

As we describe below, appropriate combinations of the coefficients in Table 1 provide maximum likelihood estimates of various linear regressions between pairs of observables which are often studied; these are summarized in Table 2. Plots comparing some of these linear regressions with the maximum likelihood estimates are shown in Section 3.

In the Gaussian model, the mean of at fixed is

(9) |

where the second equality defines , for ease of comparison with equation (3). The dispersion around this mean is

(10) |

Inserting the values in Table 1 into these expressions for and provides the maximum likelihood estimate of the slope and thickness of this relation. These are shown in the second column of Table 2, and the fit itself is shown in Figure 4. The errors we quote on the slopes of this, and the other relations in the Table, were obtained using subsamples as described in Appendix B. Note that the errors we find in this way are comparable to those sometimes quoted in the literature, even though each of the subsamples we selected is an order of magnitude larger than any sample available in the literature.
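The conditional mean and scatter of a bivariate Gaussian can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the input numbers are illustrative placeholders rather than values from Table 1, and the conversion to a power-law index assumes absolute magnitudes defined so that $\log_{10}(L/L_*) = -0.4\,(M - M_*)$.

```python
import math


def conditional_relation(sigma_x, sigma_y, rho):
    """Slope and scatter of the mean of y at fixed x for a bivariate Gaussian.

    <y|x> - y_* = slope * (x - x_*), with slope = rho * sigma_y / sigma_x,
    and the scatter around this mean is sigma_y * sqrt(1 - rho^2).
    """
    slope = rho * sigma_y / sigma_x
    scatter = sigma_y * math.sqrt(1.0 - rho ** 2)
    return slope, scatter


def power_law_index(slope_vs_magnitude):
    """Index S in L ~ y^S, given the slope of log10(y) against magnitude.

    Assumes log10(L/L_*) = -0.4 (M - M_*), so S = -0.4 / slope.
    """
    return -0.4 / slope_vs_magnitude


# Illustrative (hypothetical) inputs: dispersion in magnitude, dispersion in
# log velocity dispersion, and their correlation coefficient.
slope, scatter = conditional_relation(sigma_x=0.8, sigma_y=0.1, rho=-0.8)
index = power_law_index(slope)
print(f"slope = {slope:.3f}, scatter = {scatter:.3f}, index = {index:.2f}")
```

The same two functions apply to the size-luminosity relation by passing the dispersion in log size and the size-magnitude correlation coefficient instead.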

Band | ||||||
---|---|---|---|---|---|---|
4.00 | 1.50 | 0.86 | ||||
3.91 | 1.58 | 0.87 | ||||
3.95 | 1.59 | 0.88 | ||||
3.92 | 1.58 | 0.88 |

The mean size at fixed absolute luminosity , and the dispersion around this mean, are obtained by replacing all ’s with ’s in equation (9). The third column in Table 2 gives the maximum likelihood value of the slope , of the size-at-fixed-luminosity relation in the four bands. This fit is shown in Figure 5.

Similarly, one can show that the slopes of the mean -mass and -density relations shown in Figures 7 and 8 are and . These are the fourth and fifth columns of Table 2. The dispersions around these mean mass- and density- relations can be written in terms of the elements of , though we have not included the expressions here. Even though these relations are made from linear combinations of and , they may be tighter than either the or relations because the correlation coefficients , and are different from zero.

The surface brightnesses of the galaxies in our sample are defined by , so the dispersion in is . The mean surface brightness at fixed luminosity is obtained by replacing all s with s in the equations (9) and (10) above. This means that we need , which we can write in terms of , and . The sixth column in Table 2 gives the slope of the surface brightness at fixed luminosity relation, , in the four bands. These fits are shown in Figure 9.

### 4.3 Inverse relations and other correlations

So far, we have shown that the maximum likelihood analysis provides estimates of correlations which are in good agreement with quantities which can also be estimated by a more straightforward regression technique. However, with the coefficients of the correlation matrix in hand, it is straightforward to obtain estimates of correlations which, because of selection effects, cannot be reliably estimated using simple regressions. For example, the mean luminosity given the velocity dispersion is

(11) |

with dispersion (compare equations 9 and 10). Inserting the coefficients in Table 1 yields . Similarly, one can show that and in .

We can also study correlations which do not involve luminosity. The best studied of these is the Kormendy (1977) relation: the surface brightnesses of early-type galaxies decrease with increasing effective radius. The mean size at fixed surface brightness in our sample is

(12) |

where can be written in terms of , and , and the final equality defines . The seventh column in Table 2 gives the slope of this relation in the four bands. For comparison, Kormendy (1977) found that in the B-band, and Pahre et al. (1998) find in the K-band.

For the reasons described in Section 3, when presented with a magnitude limited catalog, correlations at fixed luminosity are useful because they are unbiased by the selection. When luminosity is not one of the variables then forward and inverse correlations may be equally interesting, and equally biased. For example, in the Kormendy (1977) relation, may be just as interesting as . The slopes of the two relations are, of course, simply related to each other. In fact, it may be preferable to study the relations which are defined by the principal axes of the ellipse in space which the galaxies populate. The directions of these axes are obtained by computing the eigenvalues and eigenvectors of the covariance matrix associated with the sizes and surface brightnesses. To illustrate, the eigenvalues of the covariance matrix associated with the Kormendy relation are

where we have set . The eigenvalues give the dispersions along and perpendicular to the major axis of the ellipse. The long axis of the ellipse describes the mean relation, , where

With obvious changes of variables, analogous expressions can be derived for all the correlations presented earlier, although we do not show them here.
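The principal-axis calculation for a pair of observables can be sketched with a standard eigen-decomposition. This is only an illustration under assumed inputs: the function name is hypothetical, and the two dispersions and the correlation coefficient stand for whichever pair of variables is being studied.

```python
import numpy as np


def principal_axes(sigma_x, sigma_y, rho):
    """Dispersions along the major and minor axes, and the major-axis slope.

    Builds the 2x2 covariance matrix of (x, y) and diagonalizes it.
    """
    cov = np.array([
        [sigma_x ** 2, rho * sigma_x * sigma_y],
        [rho * sigma_x * sigma_y, sigma_y ** 2],
    ])
    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, -1]          # eigenvector of the largest eigenvalue
    # dy/dx along the long axis; not defined if the axis is exactly vertical.
    slope = major[1] / major[0]
    return np.sqrt(eigvals[-1]), np.sqrt(eigvals[0]), slope


disp_major, disp_minor, slope = principal_axes(1.0, 1.0, 0.5)
print(disp_major, disp_minor, slope)
```

The ratio of eigenvector components is insensitive to the arbitrary sign of the eigenvector, which is why the slope is computed that way.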

The Kormendy relation in our sample is shown in Figure 10. The dashed lines show forward and inverse fits to the data: i.e., the mean size at fixed surface brightness, and the mean surface brightness at fixed size. The parameters of the fits are affected by the magnitude limit of the catalog. To estimate the effect of the magnitude limit cut on this relation, we compute the direct and inverse fits to the Kormendy relation in the simulated complete and magnitude-limited samples we describe in Appendix A. The dotted line in Figure 10 shows the direct fit to the magnitude limited simulations (it can hardly be distinguished from the fit to the data).

In comparison, the maximum-likelihood estimate of the true direct relation provides a very good description of the relation in the complete simulations in which there is no magnitude limit: it is shown as the solid line. Notice that the dashed and dotted lines have approximately the same slope as the solid line: the magnitude limit hardly affects the slope, although it changes the zero-point dramatically. At fixed surface brightness, the typical is significantly larger in the magnitude limited sample than in the complete sample. This happens because lines of constant luminosity run downwards and to the right with slope , so that changes in luminosity act approximately perpendicular to the relation.

This shows that although linear regression fits to the data provide good estimates of the true slope of the Kormendy relation, they provide bad estimates of the true zero-point. In comparison, the maximum-likelihood technique, which accounts for the selection on apparent magnitudes, is able to estimate the slope and the zero-point correctly.

Another interesting correlation is that between the effective mass and density defined in equations (4) and (6). A little algebra shows that . Figure 11 shows forward and inverse fits to this relation. The characteristic density of halos seen in numerical simulations of hierarchical clustering scales with halo mass: (Bullock et al. 2001), which is qualitatively similar to the scaling of effective density with effective mass in our sample. The scatter in characteristic densities at fixed halo mass, dex, is also rather similar to the scatter in effective densities at fixed mass. These coincidences may provide important clues to how early-type galaxies formed.

Band | |||
---|---|---|---|
0.76 | 1.94 | 0.063 | |
0.79 | 1.93 | 0.058 | |
0.82 | 1.89 | 0.054 | |
0.81 | 1.90 | 0.054 |

In contrast to the Faber–Jackson, radius–luminosity, Kormendy, and mass–density relations, the relations between luminosity and mass and luminosity and density involve three variables. Is there some combination of these variables which provides the least scatter? The eigenvectors and eigenvalues of the matrix give the directions of the principal axes of the ellipsoid in space which the early-type galaxies populate. One of the eigenvalues of is considerably smaller than the others, suggesting that the galaxies populate a two-dimensional plane in space. The eigenvectors show that the plane is viewed edge-on in the projection

(13) |

where , and were given in Table 1, and the coefficients and , and the thickness of the plane in this projection, , are given in Table 3. Section 3 shows that a scatter plot of luminosity versus mass is considerably tighter than plots of versus or . The eigenvectors of show that this is because the versus projection is actually quite close to the edge-on projection. It is interesting that this plane is only about 10% thicker than the Fundamental Plane relation between , and which is the subject of Paper III.
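Finding the edge-on projection from a three-variable covariance matrix can be sketched as below. This is a hedged illustration rather than the procedure used in the paper: the function name, the ordering of the three variables, and the input numbers are all assumptions. The eigenvector belonging to the smallest eigenvalue is normal to the plane, and the square root of that eigenvalue is the thickness of the plane.

```python
import numpy as np


def plane_from_covariance(sigmas, rho_12, rho_13, rho_23):
    """Return (unit normal to the best-fit plane, thickness of the plane).

    sigmas holds the three dispersions; rho_ij are the pairwise correlation
    coefficients of variables i and j.
    """
    s1, s2, s3 = sigmas
    cov = np.array([
        [s1 * s1, rho_12 * s1 * s2, rho_13 * s1 * s3],
        [rho_12 * s1 * s2, s2 * s2, rho_23 * s2 * s3],
        [rho_13 * s1 * s3, rho_23 * s2 * s3, s3 * s3],
    ])
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    normal = eigvecs[:, 0]                   # direction of least scatter
    thickness = np.sqrt(eigvals[0])
    return normal, thickness


# Illustrative (hypothetical) inputs, not values from Table 1.
normal, thickness = plane_from_covariance((1.0, 1.0, 1.0), 0.0, 0.7, 0.7)
print(normal, thickness)
```

The overall sign of the normal is arbitrary; only the ratios of its components define the combination of variables that is viewed edge-on.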

## 5 Discussion and conclusions

We have studied the properties of early-type galaxies over the redshift range using photometric (the , , and bands) and spectroscopic observations. The intrinsic distributions of luminosity, velocity dispersion and half-light radius of the galaxies in our sample are each well described by Gaussians in absolute magnitude, , and .

A maximum likelihood analysis of the joint distribution of luminosities, sizes and velocity dispersions suggests that the population at higher redshifts is slightly brighter than the population nearby, and that the change with redshift is faster in the shorter wavebands: If , then , and in , and . This evolution is sufficiently weak that, relative to their values at the median redshift () of our sample, the sizes, surface brightnesses and velocity dispersions of the early-type galaxy population at lower and higher redshifts have evolved little. The fact that we see so little evolution in the luminosities argues for a relatively high formation redshift: Bruzual & Charlot (2003) single burst stellar population synthesis models indicate that Gyrs. This is consistent with the model we use to make K-corrections in Paper I, and is also consistent with the formation time estimates based on the Fundamental Plane in Paper III, and galaxy colors and spectral line indices in Paper IV.

We find that and (see Table 2 for the exact coefficients, and Figures 4 and 5 for the fits). Galaxies which are slightly larger than expected (given their luminosity) have smaller velocity dispersions than expected (Figure 6), as one would predict if galaxies are in virial equilibrium.

A plot of luminosity versus effective mass is substantially tighter than either the or the relations. It has a slope which is slightly shallower than unity. In particular, on scales of a few kiloparsecs, , approximately independent of waveband (Figure 7). This complements recent SDSS weak-lensing analyses (McKay et al. 2001) which suggest that mass is linearly proportional to luminosity in these same wavebands, but on scales which are two orders of magnitude larger (kpc). Together, these two measurements of the mass-to-light ratio can be used to provide a constraint on the density profiles of dark matter halos.

A plot of luminosity versus effective density shows that (Figure 8). Moreover, a maximum likelihood analysis suggests that the more massive galaxies are less dense: (Figure 11). This is qualitatively similar to a trend seen in numerical simulations of hierarchical clustering: more massive halos tend to be less centrally concentrated (Navarro, Frenk & White 1997). This coincidence may provide an important clue to how early-type galaxies formed.

The Kormendy relation between size and surface brightness has approximately the same slope in all four SDSS bands (Figure 10). Our maximum likelihood analysis, and measurements made in mock catalogs which reproduce all the observed scalings of the dataset (a procedure for generating such catalogs is described in Appendix A), show that the zero-point of this relation is strongly affected by the magnitude limit of the sample (Section 4.3).

Funding for the creation and distribution of the SDSS Archive has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Aeronautics and Space Administration, the National Science Foundation, the U.S. Department of Energy, the Japanese Monbukagakusho, and the Max Planck Society. The SDSS Web site is http://www.sdss.org/.

The SDSS is managed by the Astrophysical Research Consortium (ARC) for the Participating Institutions. The Participating Institutions are The University of Chicago, Fermilab, the Institute for Advanced Study, the Japan Participation Group, The Johns Hopkins University, Los Alamos National Laboratory, the Max-Planck-Institute for Astronomy (MPIA), the Max-Planck-Institute for Astrophysics (MPA), New Mexico State University, University of Pittsburgh, Princeton University, the United States Naval Observatory, and the University of Washington.

## References

- Bender, R., Burstein, D., & Faber, S. M. 1992, ApJ, 399, 462
- Bernardi, M., Sheth, R. K., Annis, J., et al. 2003a, AJ, in press
- Bernardi, M., Sheth, R. K., Annis, J., et al. 2003b, AJ, in press
- Bernardi, M., Sheth, R. K., Annis, J., et al. 2003c, AJ, in press
- Blanton, M. R., Dalcanton, J., Eisenstein, D., et al. 2001, AJ, 121, 2358
- Bruzual, G., & Charlot, S. 2003, in preparation
- Bullock, J. S., Kolatt, T. S., Sigad, Y., Somerville, R. S., Kravtsov, A. V., Klypin, A. A., Primack, J. R., & Dekel, A. 2001, MNRAS, 321, 559
- Colberg, J. M., White, S. D. M., Yoshida, N., MacFarland, T., Jenkins, A., Frenk, C. S., Pearce, F. R., Evrard, A. E., Couchman, H. M. P., Efstathiou, G., Peacock, J., & Thomas, P. (The Virgo Consortium) 2000, MNRAS, 319, 209
- Efstathiou, G., Ellis, R. S., & Peterson, B. S. 1988, MNRAS, 232, 431
- Faber, S. M., & Jackson, R. 1976, ApJ, 204, 668
- Forbes, D. A., & Ponman, T. J. 1999, MNRAS, 309, 623
- Hernquist, L. 1990, ApJ, 356, 359
- Kormendy, J. 1977, ApJ, 218, 333
- Lin, H., Yee, H. K. C., Carlberg, R. G., Morris, S. L., Sawicki, M., Patton, D. R., Wirth, G., & Shepherd, C. W. 1999, ApJ, 518, 533
- Madgwick, D. S., Lahav, O., Baldry, I. K., et al. 2002, MNRAS, 333, 133
- McKay, T. A., Sheldon, E. S., Racusin, J., et al. 2001, ApJ, submitted, astro-ph/0108013
- Navarro, J. F., Frenk, C. S., & White, S. D. M. 1997, ApJ, 490, 493
- Pahre, M. A., Djorgovski, S. G., & de Carvalho, R. R. 1998, AJ, 116, 1591
- Prugniel, P., & Simien, F. 1996, A&A, 309, 749
- Saglia, R. P., Colless, M., Burstein, D., Davies, R. L., McMahan, R. K., & Wegner, G. 2001, MNRAS, 324, 389
- Sandage, A., Tammann, G. A., & Yahil, A. 1979, ApJ, 232, 352
- Sandage, A., & Perelmuter, 1990, ApJ, 361, 1
- Schade, D., Barrientos, L. F., & López-Cruz, O. 1997, ApJL, L17
- Schechter, P. L. 1980, AJ, 85, 801
- Sodré, L., Jr., & Lahav, O. 1993, MNRAS, 260, 285
- van Dokkum, P. G., & Franx, M. 2001, ApJ, 553, 90

## Appendix A Simulating a complete sample

This Appendix describes how to use our knowledge of the covariance matrix to simulate mock galaxy samples which have the same correlated observables as the data. We use these mock samples to estimate the effect of the magnitude limit on the relations measured in the main text.

The observed parameters , and of each galaxy in our sample are drawn from a distribution, say, , where is the absolute magnitude, and . We show in Section 3 that , where is the luminosity function at redshift , and the distribution of and at fixed luminosity is, to a good approximation, a bivariate Gaussian. The maximum likelihood estimates of the parameters of the luminosity function and of the bivariate distribution at fixed luminosity can be obtained from Table 1.

To make the simulations we must assume that, when extrapolated down to luminosities which we do not observe, these relations remain accurate. Assuming this is the case, we draw from the Gaussian distribution that we found was a good fit to (Section 2). We then draw from the Gaussian distribution with mean and dispersion . Finally, we draw from a Gaussian distribution with mean and variance which accounts for the correlations with both and . In practice we draw three zero mean unit variance Gaussian random numbers: , , and , and then set

Because each simulated galaxy is assigned a luminosity and size, its surface brightness is also fixed: constant.
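The sequential draw described above (magnitude first, then size at fixed magnitude, then velocity dispersion at fixed magnitude and size) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function and argument names are hypothetical, and the coefficients are simply the Cholesky factors of the correlation matrix, which reproduce the conditional means and dispersions of a trivariate Gaussian.

```python
import numpy as np


def draw_mock_galaxies(n_gal, means, sigmas, rho_rm, rho_vm, rho_rv, rng):
    """Draw correlated (magnitude, log size, log velocity dispersion) triples.

    means and sigmas are (M, R, V) triples; rho_rm, rho_vm and rho_rv are the
    pairwise correlation coefficients.
    """
    m_star, r_star, v_star = means
    sig_m, sig_r, sig_v = sigmas
    a = np.sqrt(1.0 - rho_rm ** 2)
    b = (rho_rv - rho_rm * rho_vm) / a
    c2 = 1.0 - rho_vm ** 2 - b ** 2
    if c2 <= 0.0:
        raise ValueError("correlation coefficients are not mutually consistent")
    # Three independent zero-mean, unit-variance Gaussian variates per galaxy.
    g0, g1, g2 = rng.standard_normal((3, n_gal))
    mag = m_star + sig_m * g0
    # Size: mean at fixed magnitude plus scatter sigma_R * sqrt(1 - rho_RM^2).
    size = r_star + sig_r * (rho_rm * g0 + a * g1)
    # Velocity dispersion: mean at fixed magnitude and size plus residual scatter.
    vel = v_star + sig_v * (rho_vm * g0 + b * g1 + np.sqrt(c2) * g2)
    return mag, size, vel


# Illustrative (hypothetical) parameter values, not those of Table 1.
rng = np.random.default_rng(42)
mag, size, vel = draw_mock_galaxies(
    10000, (-21.0, 0.5, 2.2), (0.8, 0.25, 0.11), -0.85, -0.75, 0.5, rng)
```

The guard on `c2` catches sets of correlation coefficients for which no valid trivariate Gaussian exists.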

If we generate a catalog in , then we can also generate colors using the parameters given in Table 1 of Paper IV. Specifically, generate a Gaussian variate , and then set , where , and are defined analogously to , and above. Inserting the values from Table 1 of Paper IV shows that , and : the mean color is determined by the velocity dispersion and not by the absolute magnitude.

Passive evolution of the luminosities and colors is incorporated by adding the required redshift-dependent shift to the absolute magnitudes and colors after the sizes and velocity dispersions have been generated.

This complete catalog can be used to simulate a magnitude limited catalog if we assign each mock galaxy a redshift, assuming a world model and homogeneity. Let and denote the apparent magnitude limits of the observed sample. Let denote the absolute magnitude of the most luminous galaxy we expect to see in our catalog. Because the luminosity function cuts off exponentially at the bright end, we can estimate this by setting . This means that the most distant object which can conceivably make it into the magnitude limited catalog lies at a luminosity distance of about , from which the maximum redshift can be determined. If the comoving number density of mock galaxies is to be independent of redshift, we must assign redshifts as follows. Draw a random variate distributed uniformly between zero and one, and set . The redshift can be obtained by inverting the relation. The apparent magnitude of this mock galaxy is , where is the K-correction. If , then this galaxy would have been observed; add it to the subset of galaxies from the complete catalog which would have been observed in the magnitude limited catalog.
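The redshift assignment and magnitude cut can be sketched as below. Several choices here are assumptions made only for illustration: a flat world model with a cosmological constant, the particular values of the Hubble constant and matter density, a K-correction that defaults to zero, and all function names. The key step is that a uniform variate sets the cube of the comoving distance, which gives a comoving number density independent of redshift.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def distance_grid(z_max, h0=70.0, omega_m=0.3, n_grid=4096):
    """Tabulate comoving distance (Mpc) out to z_max for an assumed flat model."""
    z = np.linspace(0.0, z_max, n_grid)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)
    steps = 0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(z)  # trapezoid rule
    d_com = (C_KM_S / h0) * np.concatenate(([0.0], np.cumsum(steps)))
    return z, d_com


def draw_redshifts(n_gal, z_grid, d_grid, rng):
    """Redshifts for objects distributed uniformly in comoving volume."""
    u = 1.0 - rng.uniform(size=n_gal)        # uniform on (0, 1]
    d_com = d_grid[-1] * np.cbrt(u)          # d^3 = u * d_max^3
    return np.interp(d_com, d_grid, z_grid)  # invert the distance-redshift relation


def apparent_magnitude(abs_mag, z, z_grid, d_grid, k_correction=None):
    """m = M + 5 log10(d_L / 10 pc) + K(z), with d_L in Mpc."""
    d_lum = (1.0 + z) * np.interp(z, z_grid, d_grid)
    k = 0.0 if k_correction is None else k_correction(z)
    return abs_mag + 5.0 * np.log10(d_lum) + 25.0 + k


def in_magnitude_limits(abs_mag, z, m_min, m_max, z_grid, d_grid, k_correction=None):
    """Boolean mask of mock galaxies that would enter the magnitude limited catalog."""
    m = apparent_magnitude(abs_mag, z, z_grid, d_grid, k_correction)
    return (m >= m_min) & (m <= m_max)
```

A typical use would tabulate the grid once out to the maximum redshift, draw one redshift per mock galaxy, and keep only the galaxies for which the mask is true.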

If our simulated catalogs are accurate, then plots of magnitude, size, surface-brightness and velocity dispersion versus redshift made using our magnitude-limited subset should look very similar to the SDSS dataset shown in Figure 12 of Paper I. In addition, in the simulated magnitude limited subset should be similar to that in Figure 3. Furthermore, any correlations between observables in the magnitude limited subset should be just like those in the actual SDSS dataset. If they are, then one has good reason to assume that similar correlations measured in the complete, rather than the magnitude-limited simulation, represent the true correlations between the parameters of SDSS galaxies, corrected for selection effects. In this way, the simulations allow one to estimate the impact that the magnitude-limited selection has when estimating correlations between early-type galaxy observables.

We have verified that our simulated magnitude limited catalogs have similar distributions to those observed, and the simulated and versus plots show the same selection cuts at low velocities and sizes as do the observed data. The distributions of apparent magnitudes, angular sizes, and velocity dispersions in the magnitude limited simulations are very similar to those in the real data. The simulated parameters also show the same correlations at fixed luminosity as the data. Maximum likelihood analysis on the simulations produces an estimate of the covariance matrix which is similar to that of the data. Therefore, we are confident that our simulated complete catalogs have correlations between luminosity, size, and velocity dispersion which are similar to the data.

## Appendix B Composite volume-limited catalogs

Our parent sample is magnitude limited; unless accounted for, this will introduce a bias into a number of correlations we study in this series of papers. For this reason, we often present results measured in a few volume limited subsamples. Because of the cuts at both the faint and the bright ends of the catalog, each volume-limited subsample used in the main text spans only a small range in luminosity. However, because the luminosities of the galaxies in our sample show little or no evolution relative to the values at the median redshift of the sample, we can extend this range in any of three ways.

One method is to construct a composite volume-limited catalog by stacking together smaller volume-limited subsamples which are adjacent in redshift and in luminosity, but which do not overlap at all. Let denote the volume of the th subsample, and let denote the number of galaxies in it. A conservative approach is to randomly choose the galaxies in with probability proportional to min, where min denotes the volume of the smallest of the subsamples. This has the disadvantage of removing much of the data, but, because our data set is so large, we can afford this luxury. A more cavalier approach is to choose all the galaxies in the largest , all the galaxies in the other , and to generate a set of additional galaxies by randomly choosing one of the galaxies in , adding to each of its observed parameters a Gaussian random variate with dispersion given by the quoted observational error, and repeating this times. A final possibility is to weight all the galaxies in (even those which were not in the volume limited subsample) by the inverse of the volume in which they could have been observed . We chose the first, most conservative option.
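The first, most conservative option can be sketched as a random thinning of each subsample. This is an illustration under an assumption: each galaxy in a subsample is kept with probability equal to the smallest subsample volume divided by the volume of its own subsample, so that all subsamples end up with the same effective sampling density. The function name is hypothetical.

```python
import numpy as np


def composite_subsample(subsamples, volumes, rng):
    """Thin each volume-limited subsample to the density of the smallest volume.

    subsamples is a list of arrays (for example galaxy indices), one per
    volume-limited subsample; volumes holds the corresponding volumes.
    """
    volumes = np.asarray(volumes, dtype=float)
    keep_prob = volumes.min() / volumes   # equals 1 for the smallest volume
    kept = []
    for galaxies, p in zip(subsamples, keep_prob):
        galaxies = np.asarray(galaxies)
        mask = rng.uniform(size=len(galaxies)) < p
        kept.append(galaxies[mask])
    return kept
```

Calling the function repeatedly with the same inputs yields different realizations of the composite catalog, which is one way to gauge how much the measured correlations vary from one subsampling to the next.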

By piecing together three volume limited subsamples, we were able to construct composite catalogs of about objects each. Because the completeness limits are different in the different bands, the composite catalogs are different for each band. In addition, because any one composite catalog is obtained by subsampling the set of eligible galaxies, we can generate many realizations of a composite catalog by subsampling many times. This allows us to estimate the effects of sample variance on the various correlations we measure.