### Laser wakefield electron acceleration

DRACO^{49} at HZDR drove plasma wakes with laser pulses of 30 fs duration, 800 nm centre wavelength and ~2 J energy. An *f*/20 off-axis parabolic mirror with 2 m focal length focused these pulses to a spot size of 20 μm (FWHM) onto the entrance plane of a 3-mm-long Mach-10 helium (He) gas jet. The leading edge of the laser pulse fully ionized the He, after which its intense peak drove an LWFA in the He plasma. For down-ramp and ionization injection experiments, the plasma in the jet’s central density plateau (length *L* ≈ 1.5 mm) had an average electron density in the range of 2.0 × 10^{18} < *n*_{e} < 3.3 × 10^{18} cm^{−3}. For self-injection experiments, it had an average density of *n*_{e} ≈ 4 × 10^{18} cm^{−3}. For down-ramp injection, a knife edge was inserted into the gas jet, creating a thin dense shock near the beginning of the plateau, pinpointed by a transverse wide-bandwidth shadowgraphy probe with a centre wavelength of 800 nm (ref. ^{50}) (Supplementary Fig. 1b). Laser focus and shock positions were scanned to optimize the LWFA stability and performance. For STII, we removed the shock and doped the gas with 1% nitrogen. We then scanned the pulse shape with a programmable acousto-optic dispersive filter (Fastlite Dazzler) and adjusted the pulse energy to optimize the LWFA performance. For self-injection, we used a pure He jet with no shock and shifted the laser focus several millimetres before the jet. This mismatched the laser divergence and the plasma wake’s focusing force, inducing radial wake oscillations that triggered the injection of plasma electrons into the wake.

A magnetic electron spectrometer with its entrance plane at *z* = 30 cm downstream of the gas-jet exit determined the electron energy distribution for each shot^{49}. A Konica Minolta OG 400 scintillating screen recorded the dispersed electron beam (Fig. 1 (top right) shows examples). We converted the luminescence intensity into charge per unit energy per pixel using methods described elsewhere^{38}. The absolute charge calibration uncertainty was ~20%. The r.m.s. shot-to-shot charge fluctuations were only a few per cent. Uncertainties in the >200 MeV electron energy measurement primarily originated from pointing and divergence fluctuations of LWFA electrons entering the spectrometer and amounted to ~2% for electrons in the range of 200–350 MeV.

### Calculations and simulations of wakefield acceleration

In the ‘Substructure of wakefield-accelerated e^{–} bunches’ section, we estimated the longitudinal profile *N*_{e}(*ξ*) and length Δ*ξ* of down-ramp-injected e^{–} bunches from their measured energy distribution ∂*N*_{e}/∂*U*_{e} and energy spread Δ*U*_{e}, respectively. Here *ξ* = *z* − *v*_{0}*t* is the longitudinal coordinate in the frame of a bubble moving with velocity *v*_{0} along the lab coordinate *z*. This estimate assumed a quasi-static bubble and an accelerating field *E*_{z} = (*e**n*_{e}/2*ϵ*_{0})*ξ* that is linear in *ξ* but exceeds the field (*e**n*_{e}/3*ϵ*_{0})*ξ* of a uniformly charged sphere because of the concentration of electrons at its rear^{39}. The PIC simulations of down-ramp-injected LWFA for our conditions, such as the example shown in Supplementary Fig. 1c, validate these assumptions for our conditions. Linear *E*_{z} is expected when accelerated charge, which averaged ~50 pC for our down-ramp-injected LWFA, is well below the beam-loading limit of ~300 pC (ref. ^{35}). The estimate also assumed that injection is localized in space and time. Transverse shadowgraphs^{50} such as the example shown in Supplementary Fig. 1b validated this assumption. Under these conditions, an electron injected at position *ξ*_{i} within the bubble and accelerated to *ξ*_{f}, positions that are at potentials \({\varPhi }_{x}=(e{n}_{{\rm{e}}}/4{\epsilon }_{0}){\xi }_{x}^{2}\) (where *x* = *i*, *f*) with respect to the bubble centre, gains energy *U*_{e} = *e*(*Φ*_{i} − *Φ*_{f}) in the bubble frame. Applying the chain rule d*N*_{e}/d*ξ* = (∂*N*_{e}/∂*U*_{e})(∂*U*_{e}/∂*ξ*) and transforming to the lab frame, we find that ∂*N*_{e}/∂*U*_{e} maps onto the bunch’s longitudinal profile d*N*_{e}/d*z* via the linear chirp ∂*U*_{e}/∂*z*, that is,

$$\frac{{\rm{d}}{N}_{{\rm{e}}}}{{\rm{d}}z}=\frac{{n}_{{\rm{e}}}{e}^{2}}{2{\epsilon }_{0}}\frac{\partial {N}_{{\rm{e}}}}{\partial {U}_{{\rm{e}}}}{L}_{{{{\rm{acc}}}}},$$

(1)

where *L*_{acc} is the acceleration length in the lab frame. This blue curve in Fig. 4a plots this function using *L*_{acc} ≈ *L*, *n*_{e} = 2.3 × 10^{18} cm^{−3} and the measured ∂*N*_{e}/∂*U*_{e} (Fig. 1, top right). For the down-ramp-injected shot analysed in the main text, the chirp was ∂*U*_{e}/∂*ξ* ≈ 3 MeV μm^{–1} and bunch length Δ*ξ* = (2*ϵ*_{0}/*e*^{2}*n*_{e})(Δ*U*_{e}/*L*) ≈ 850 nm FWHM for ~10% energy spread at 280 MeV, which is consistent with published^{41} and our own data (Supplementary Fig. 1c shows the PIC simulations for this injection method). Since our spectrometer has a resolution of \(\Delta {U}_{{\rm{e}}}^{\,({\rm{res}})}\) ≈ 10 MeV, this approach determines bunch length Δ*ξ* (bunch duration Δ*ξ*/*c*) with 300 nm (1 fs) resolution. The r.m.s. fluctuation of ~20% in *n*_{e}*L*_{acc} was the principal uncertainty in the estimated d*N*_{e}/d*z* and Δ*ξ*. In iterative reconstructions of the 3D profile *N*_{e}(*x*, *y*, *ξ*) of down-ramp-injected bunches, the estimated longitudinal profile served as a parameterized initial guess (rather than a fixed function) that was allowed to vary over a range consistent with this experimental uncertainty, yielding, for example, the orange curve in Fig. 4a.

The 3D PIC simulations of STII were performed using the code PIConGPU^{51} to evaluate the spatial structure of the bunch and the longitudinal phase space. For the STII simulation (Fig. 5a), we used a simulation box of 768 × 4,608 × 768 cells with a transverse resolution of 4.5 and a longitudinal resolution of 36 sampling points per laser wavelength. The laser is modelled using a Gauss–Laguerre reconstructed laser profile measured during the experiment. To avoid numerical Cherenkov radiation, the Lehe field solver^{52} was used together with the Boris pusher^{53} and the Esirkepov current deposition scheme^{54}. The exact code version used and all the setup files are available elsewhere^{55}.

The COTR images directly calculated from the PIC-simulated density distribution (Fig. 5a) are shown in Supplementary Fig. 6. By extracting the oscillatory leading and trailing edges and smooth middle section of this distribution, we verified that nearly all the calculated COTR data originated from the leading and trailing edges. To optimize the match of the calculated and measured COTR images, we first constructed the following parameterized representation of the leading portion of the Fig. 5a distribution:

$${\rho }_{{\rm{l}}}(x,y,z)=A{{\rm{e}}}^{-{[(x+\delta x)/2{\sigma }_{x}]}^{2}}\times {{\rm{e}}}^{-{(z/2{\sigma }_{z})}^{2}}\times {{\rm{e}}}^{{\left\{\left[-y+\delta y-B\cos \left(\frac{2\uppi }{\lambda }z+{\phi }_{z}\right)\right]/2{\sigma }_{y}\right\}}^{2}},$$

(2)

where *A* is the overall amplitude in units of C μm^{–3}; and *x*, *y* and *z* and all the remaining parameters were normalized to 1 μm: *σ*_{x,y,z} are the Gaussian widths; δ*x* and δ*y* are *x* and *y* offsets; and *λ* and *B* are the oscillation wavelength and amplitude, respectively. The first two Gaussians in equation (2) constrain the overall distribution in the *x* and *z* directions, and the third represents the leading oscillation. The simulated trailing oscillations in Fig. 5a had less well-defined wavelength and amplitude; therefore, we specified only the spatial boundaries of this region and left the reconstruction algorithm to find the density distribution on its own. A smooth super-Gaussian function that generated negligible COTR represented the charge density in the middle section. We then ran dozens of iterative reconstructions with varied starting parameters until a dominant cluster of solutions representing the best fit emerged. Figure 5d represents the best-fit composite bunch distribution, and the bottom row in Fig. 5d shows the corresponding best-fit COTR data. The best-fit equation (2) parameters for the data in Fig. 5 were as follows: *A* = 0.323 ± 0.034 pC μm^{–3}, *B* = 7.70 ± 1.00 μm, *σ*_{x} = 3.50 ± 0.40 μm, *σ*_{y} = 5.40 ± 0.40 μm, *σ*_{z} = 0.16 ± 0.10 μm, δ*x* = −0.65 ± 0.20 μm, δ*y* = −0.90 ± 0.06 μm, *λ* = 0.89 ± 0.07 μm.

### COTR generation, imaging and calibration

A schematic of the 65 μm laser-block foil and 25 μm Kapton COTR foil with a 300-nm-thick aluminium-coated back surface in relation to the gas jet and LWFA drive pulse is shown in another work^{18}. For the results presented here, the Kapton foil also included a 65 μm low-*Z* adhesive layer. Here 67 such foil pairs were mounted at the perimeter of a remotely controlled wheel that maintained a fixed 1 mm distance between the foils and rotated a fresh pair into place after each shot. Before the next shot, a lamp temporarily illuminated the new foil’s COTR-emitting back surface, enabling it to be finely adjusted into focus. To image the COTR from the foil to detectors, we employed a ×10 infinity-corrected long-working-distance microscope objective (Mitutoyo Model 46-144; numerical aperture, 0.28; focal length, 20 mm) for all the measurements reported here. By inserting a resolution test target in place of the COTR foil, we measured almost ×40 magnification and better than 2 μm (σ) resolution at the foil surface. To extend the spectral range to *λ* = 193 nm, we temporarily used a reflective microscope objective (TECHSPEC 89-723), but detected no signals at this wavelength. All the data reported here were taken with the Mitutoyo objective.

The integrated COTR energy captured at each detector was calibrated by measuring the power of linearly polarized test lasers at 405, 532, 633 and 805 nm at the position of the COTR foil with a power meter and then exposing the cameras for both *s* and *p* polarizations for a known time, yielding a counts per nanojoule calibration. These measurements were checked against the manufacturer’s published transmission/reflection curves for beamsplitters and transmission curves for bandpass filters. The COTR is radially polarized, but encountered beamsplitters with polarization- and *λ*-dependent transmission/reflection coefficients en route to each detector. We strategically designed each optical path to deliver an *s*/*p* intensity ratio as close to unity as possible. Actual *s*/*p* ratios ranged from 0.7 to 2.0, except for *λ* = 700 nm, for which the *s*/*p* value was ~7.0. Supplementary Table 1 provides a full list. For the data analysis, each image’s actual *s*/*p* ratio was taken into account.

### e^{–} bunch reconstruction procedure

#### Differential evolution

For reconstructing the normalized 3D charge density *ρ*(*x*, *y*, *z*) of an e^{–} bunch from multispectral COTR data, we used a differential evolution^{56} algorithm, a highly parallelizable global-maximum-search algorithm that does not require computationally expensive gradient calculations. At each step ‘*n*’ of an iterative reconstruction, one calculates the COTR field \({{{{\bf{E}}}}}_{\perp }^{(n)}\) that an e^{–} bunch of candidate distribution *ρ*_{n}(*x*, *y*, *z*) propagating along *z* generates on emerging from a foil in the *x*–*y* plane. This starts with calculating the field produced at transverse displacement **r** from the point at which a single electron emerges. After a lens with numerical aperture *θ*_{max} images it to a detector, this field is^{57}

$${{{{\bf{E}}}}}_{\perp }^{({{{\rm{PSF}}}})}({{{\bf{r}}}})=\frac{2ek}{c}\int\nolimits_{0}^{{\theta }_{\max}}\frac{{\theta }^{2}{\rm{d}}\theta }{{\theta }^{2}+{\gamma }^{-2}}{J}_{1}(k\theta | {{{\bf{r}}}}| )\hat{{{{\bf{r}}}}},$$

(3)

where *e* and *γ* are the electron charge and Lorentz factor, respectively; *k* and *c* are the wave number and speed of emitted transition radiation, respectively; and *J*_{1} is a Bessel function of the first kind. Equation (3) is our point-spread function (PSF). The total transverse COTR field from an e^{–} bunch is then the convolution of *ρ*_{n}(*x*, *y*, *z*), multiplied by the phase delay e^{−ikz} of radiation from each longitudinal slice of *ρ*_{n}, with the PSF as follows:

$${{{{\bf{E}}}}}_{\perp }^{(n)}(k,x,y)=Q\int\,{\rm{d}}{x}^{{\prime} }\int\,{\rm{d}}{y}^{{\prime} }{{{{\bf{E}}}}}_{\perp }^{({{{\rm{PSF}}}})}({{{\bf{r}}}}-{{{{\bf{r}}}}}^{{\prime} })\int\,{{\rm{e}}}^{-{\rm{i}}kz}{\rho }_{n}({x}^{{\prime} },{y}^{{\prime} },z){\rm{d}}z,$$

(4)

where *Q* is the total charge and |**r**′|^{2} = *x*′^{2} + *y*′^{2}. The total COTR intensity, which is proportional to \(| {{{{\bf{E}}}}}_{\perp }^{(n)}{| }^{2}\), is then compared with the data. At the *γ* and *θ* values involved in this work, effects from the finite size of the COTR foil or milliradian-scale divergences^{57} are negligible.

#### Digital reconstruction

For automated digital reconstruction, we segmented *ρ*(*x*, *y*, *z*) into *N*_{v} voxels with Gaussian longitudinal profiles (Δ*z*_{r.m.s.} = 80 nm), effectively suppressing radiation at *λ* < 250 nm that we never observed^{27}, and transverse widths Δ*x* = Δ*y* = 1 μm, just below our resolution threshold in the foil plane. The voxel spacing was ~60 nm longitudinally and 1 μm transversely. Thus, *N*_{v} ≈ 15,000 = 21 × 21 × 35. The voxel charge was then substituted for a single-electron charge *e* in evaluating the PSF in equation (3), and the convolution integrals in equation (4) became sums over voxels. We similarly segmented the calculated \({{{{\bf{E}}}}}_{\perp }^{(n)}(k,x,y)\) into pixels for comparison with charge-coupled-device images. For each iteration, a dimensionless NSS error \({N}_{\lambda }^{-1}{\sum }_{p}{(\Delta {N}_{p})}^{2}\) is calculated from differences \(\Delta {N}_{p}={N}_{p}^{\,({\rm{c}})}-{N}_{p}^{\,({\rm{m}})}\) between the calculated (c) and measured (m) counts in pixel ‘*p*’ normalized to the summed squared count \({N}_{\lambda }={\sum }_{p}({N}_{p}^{\,({\rm{m}})})^{2}\) at each *λ*. Procedures for aligning the calculated and measured intensity distributions and for evaluating the alignment uncertainties are discussed in ref. ^{58} and in Supplementary Section 2. We generated a composite NSS for each multi-*λ* image set.

The overall cost function that reconstructions minimized was a weighted sum of this composite NSS and four additional terms. Three of the latter ensured that the reconstructed {*ρ*_{ij}} (where *i* and *j* are voxel and reconstruction labels, respectively, and brackets denote the set of values for *N*_{v} voxels) adhered within designated uncertainties to (1) a prescribed longitudinal profile from an independent measurement or simulation, (2) the total measured charge and (3) a positive-definite electron density *ρ*. The fourth ensured that it generated (4) the measured overall COTR intensity at each *λ* within the detector calibration uncertainty. The longitudinal profile term had the form \({\sum }_{\zeta }{({\Delta }_{\zeta }/{q}_{\zeta })}^{2}\varTheta ({\Delta }_{\zeta })\), where *q*_{ζ} is the prescribed charge at longitudinal position *ζ*, Δ_{ζ} is the difference *Q*_{ζ} − *q*_{ζ} between the candidate profile’s charge *Q*_{ζ} at *ζ* and *q*_{ζ} and *Θ* is the Heaviside step function. Thus, if *Q*_{ζ} > *q*_{ζ}, *Θ*(Δ_{ζ}) = 1 and *Q*_{ζ} increased the cost, whereas if *Q*_{ζ} < *q*_{ζ}, *Θ*(*q*_{ζ}) = 0 and *Q*_{ζ} incurred no cost. Similarly, the total charge term had the form Δ*Q*^{2}*Θ*(Δ*Q*), where Δ*Q* is the difference *Q* − *q*_{m} between the candidate’s *Q* and the measured *q*_{m} total charges. Although this term encouraged *Q* to converge towards *q*_{m}, we typically initialized *Q* at ~10*q*_{m} to enrich genetic diversity and hasten convergence^{56}. The positive-definite *ρ* constraint contributed a term \({\sum }_{i}{n}_{i}^{2}\varTheta (-{n}_{i})\), where *n*_{i} is the number of electrons in voxel *i*. This term is zero except where the number in a voxel is negative (that is, the charge is positive). Finally, the COTR intensity constraint contributed a term \({\sum }_{\lambda }{({\Delta }_{I\lambda }/{\epsilon }_{I\lambda })}^{2}\) to the cost function, where Δ_{Iλ} is the variation in the overall COTR intensity during a reconstruction relative to the nominal measured value at each *λ*, and *ϵ*_{Iλ} is the relative uncertainty of the measurement. Weights assigned to the various terms were strong, but finite and balanced. Underweighting led to unphysical solutions, overweighting to premature termination of a reconstruction in local minima and unbalanced weighting to over-prioritization of one term at the expense of others. Nevertheless, these criteria still left a wide range of flexibility in choosing workable weighting factors. Supplementary Fig. 2 diagrammatically depicts the digital reconstruction and cost analysis procedures.

#### Cluster analysis

Subjective judgements of the uniqueness of reconstructed {*ρ*_{ij}} for a given COTR dataset can be simply based on the visual inspection of similarities within and between small (*N* ≤ 10) groups of *N* randomly initialized reconstructions. Supplementary Fig. 4 shows an example of such a group. As *N* increases, the systematic assessments of uniqueness become necessary. Here quantitative judgements are based on cluster analysis^{44}. Specifically, we used a *K*-means clustering algorithm^{59}, which partitions *N* reconstructions {*ρ*_{ij}} (where 1 ≤ *j* ≤ *N*) for a COTR dataset into a small fixed number *K* ≪ *N* of clusters in which each {*ρ*_{ij}} belongs to the cluster with the nearest mean \(\{{\bar{\rho }}_{i\kappa }\}\), where \({\bar{\rho }}_{i\kappa }={N}_{\kappa }^{-1}\mathop{\sum }\nolimits_{j = 1}^{{N}_{\kappa }}{\rho }_{ij},{N}_{\kappa }\) is the number of reconstructions in the cluster and \(\mathop{\sum }\nolimits_{\kappa = 1}^{K}{N}_{\kappa }=N\). A simple measure of the closeness of reconstruction *j* to \(\{\,{\bar{\rho }}_{i\kappa }\}\) is the variance \({\delta }_{j\kappa }={N}_{{\rm{v}}}^{-1}\mathop{\sum }\nolimits_{i = 1}^{{N}_{{\rm{v}}}}{\left\Vert {\rho }_{ij}-{\bar{\rho }}_{i\kappa }\right\Vert }^{2}\). The average variance within cluster *κ* is then \({\Delta }_{\kappa }={N}_{\kappa }^{-1}\mathop{\sum }\nolimits_{j = 1}^{{N}_{\kappa }}{\delta }_{j\kappa }\). The algorithm then reassigns reconstructions among the *K* clusters until global variance \(\varLambda =\mathop{\sum }\nolimits_{\kappa = 1}^{K}{\Delta }_{\kappa }\) is minimized.

To visualize clusters within a set of *N* reconstructions, we generated two-dimensional plots (for example, Fig. 4e) using PCA^{60}. In PCA, each {*ρ*_{ij}} is treated as a point in *N*_{v}-dimensional space. Principal components are unit vectors, where the *m*th vector is the direction of a line that best fits the *N* points and being orthogonal to the first *m* − 1 vectors. Here best fit means that the line minimizes the average squared perpendicular distance of the *N* points to the line. The first principal component is, thus, a line along the direction of maximum variance of the *N* points. The second principal component defines the direction of maximum variance in what is left once the effect of the first component is removed. Figure 4e plots the reconstruction results with respect to the first two principal components only. Such plots help us to visualize the clusters of closely related points. This identification then provides a basis for evaluating the difference between solutions within each cluster and between different clusters.

In a *K*-means cluster analysis^{59}, *K* is an input parameter, that is, the algorithm identifies the designated number of clusters, assigning each point {*ρ*_{ij}} in a way that minimizes the global variance *Λ*. Designating *K* = 2 usually generated two well-separated clusters on a PCA plot of reconstruction sets {*ρ*_{ij}}. Members of these two clusters were approximately longitudinal mirror images of each other, that is, one is nearly a reflection of the other from a plane perpendicular to the *z* axis. Supplementary Figs. 3d and 4 show examples of this. This happens because an e^{–} bunch and its longitudinally inverted counterpart produce identical multispectral COTR data. Thus, any set of *N* reconstructions divides naturally and unavoidably into these two indistinguishable clusters of solutions. In principle, phase-sensitive measurements could distinguish them. However, the intensity measurements used here can, at best, unfold the e^{–}-beam structure to within a longitudinal mirror image of itself.

For occasional down-ramp-injected shots, a third well-separated cluster emerged and lowered *Λ* when we designated *K* = 3. Figure 4 is an example of this. In most such cases, the elements of the third cluster retained the main structural characteristics of the other clusters and was related to them through a simple geometric transformation. Supplementary Fig. 4 shows the details. Rarely, however, did further insights emerge from designating *K* ≥ 4. The main indicators that the maximum useful *K* had been designated for a given dataset were (1) designating a higher *K* did not yield a well-separated additional cluster on a PCA plot; (2) designating a higher *K* did not lower *Λ*; (3) cluster-averaged structures fit the COTR data as well as (or nearly as well as) individual reconstructions; and (4) visual inspection revealed greater structural differences between clusters than within a cluster.

Indicator (3) above provided our formal metric for uniqueness. Most down-ramp-injected shots yielded two or three clusters of reconstructions in each of which NSS of the cluster-averaged structure was nearly as small as that of individual reconstructions. This indicated that cluster members possessed common features that survived averaging. We relied on visual inspection to evaluate the geometric relationship between clusters. When, as in most cases, this relationship was simple, these reconstructions were deemed unique. Most self-injected shots, on the other hand, yielded clusters in which the NSS of the cluster-averaged structure substantially exceeded that of individual reconstructions, indicating that the reconstruction had not found a unique solution. Supplementary Figs. 7 and 8 show representative examples of this.