Data Abstraction¶
The core data primitive in PyARPES is the xarray.DataArray
. However, adding additional scientific functionality is needed since xarray
provides only very general functionality. The approach that we take is described in some detail in the xarray
documentation at extending xarray, which allows putting additional functionality on all arrays and datasets on particular, registered attributes.
In PyARPES we use a few of these:
.S
attribute: functionality associated with spectra (physics here).G
attribute: general abstract functionality that could reasonably be a part of xarray core.F
attribute: functionality associated with curve fitting.X
attribute: functionality related to selecting data
Caveat: In general these accessors can and do behave slightly differently between datasets and arrays, depending on what makes contextual sense.
This section will describe just some of the functionality provided by the .S
attribute, while the following section will describe some of the functionality on .G
and the section on curve fitting describes much of what is available through .F
and .X
.
Much more can be learned about them by viewing the definitions in arpes.xarray_extensions
.
Data selection¶
select_around
and select_around_data
¶
As an alternative to interpolating, you can integrate in a small rectangular or ellipsoidal region around a point using .S.select_around
. You can also do this for a sequence of points using .S.select_around_data
.
These functions can be run in either summing or averaging mode using either mode='sum'
or mode='mean'
respectively. Using the radius parameter you can specify the integration radius in pixels (int
) or in unitful (float
) values for all (pass a single value) or for specific (dict
) axes.
select_around_data
operates in the same way, except that instead of passing a single point, select_around_data
expects a dictionary or Dataset mapping axis names to iterable collections of coordinates.
As a concrete example, let’s consider the example_data.temperature_dependence
dataset with axes (eV, phi, T)
consisting of cuts at different temperatures. Suppose we wish to obtain EDCs at the Fermi momentum for each value of the temperature.
First we will load the data, and combine datasets to get a full temperature dependence.
[2]:
from arpes.io import example_data
from matplotlib import pyplot as plt
temp_dep = example_data.temperature_dependence
near_ef = temp_dep.sel(eV=slice(-0.05, 0.05), phi=slice(-0.2, None)).sum("eV")
near_ef.S.plot()
Finding \(\phi_F\)/\(k_F\)¶
Now, we want to find the location of the peak in each slice of temperature so we know where to take EDCs.
We will do this in two ways:
Taking the argmax across
phi
Curve fitting MDCs
Argmax¶
[3]:
argmax_indices = near_ef.spectrum.argmax(dim="phi")
argmax_phis = near_ef.phi[argmax_indices]
fig, ax = plt.subplots()
near_ef.S.plot(ax=ax)
ax.scatter(*argmax_phis.G.to_arrays()[::-1], color="red")
[3]:
<matplotlib.collections.PathCollection at 0x1bfc833f4c0>
In the process of plotting we saw another convenience function .G.to_arrays()
which turns a 1D dataset into two plain arrays of its coordinates (X) and values (Y) so that we could scatter phi
as a function of temperature
.
Curve fitting¶
This might be okay depending on what we are doing, but curve fitting is also straightforward and gives better results.
[4]:
from arpes.fits.utilities import broadcast_model
from arpes.fits.fit_models import LorentzianModel, AffineBackgroundModel
phis = broadcast_model(
# Fit with two components, a linear background and a Lorentzian peak
[AffineBackgroundModel, LorentzianModel],
# fit across the entire `near_ef` dataset
near_ef,
# fit along each value of "temperature"
"temperature"
)
phis
Running on multiprocessing pool... this may take a while the first time.
Deserializing...
Finished deserializing
[4]:
<xarray.Dataset> Dimensions: (phi: 30, temperature: 34) Coordinates: * temperature (temperature) float64 16.0 18.0 20.0 ... 90.06 95.44 99.98 x float64 0.193 y float64 -1.298 z float64 -2.939 theta float64 0.2273 beta float64 0.0 chi float64 -0.2618 hv float64 55.4 alpha float64 1.571 psi int32 0 * phi (phi) float64 -0.1978 -0.1946 -0.1913 ... -0.1063 -0.103 Data variables: results (temperature) object <lmfit.model.ModelResult object at 0x... data (temperature, phi) float64 1.535e+03 1.591e+03 ... 3.066e+03 residual (temperature, phi) float64 -126.9 -29.14 ... 7.805 -29.28 norm_residual (temperature, phi) float64 -0.08268 -0.01831 ... -0.009548
- phi: 30
- temperature: 34
- temperature(temperature)float6416.0 18.0 20.0 ... 95.44 99.98
array([16. , 18. , 20. , 22. , 24. , 26. , 28. , 30. , 32. , 34. , 36. , 38. , 40. , 42. , 44. , 46. , 48. , 50. , 52. , 54. , 56. , 58. , 60. , 62. , 64. , 66. , 68. , 70. , 74.88, 80.02, 84.97, 90.06, 95.44, 99.98])
- x()float640.193
array(0.193)
- y()float64-1.298
array(-1.298)
- z()float64-2.939
array(-2.939)
- theta()float640.2273
array(0.22728807)
- beta()float640.0
array(0.)
- chi()float64-0.2618
array(-0.26179939)
- hv()float6455.4
array(55.4)
- alpha()float641.571
array(1.57079633)
- psi()int320
array(0)
- phi(phi)float64-0.1978 -0.1946 ... -0.1063 -0.103
array([-0.197836, -0.194566, -0.191296, -0.188026, -0.184756, -0.181487, -0.178217, -0.174947, -0.171677, -0.168407, -0.165137, -0.161868, -0.158598, -0.155328, -0.152058, -0.148788, -0.145518, -0.142249, -0.138979, -0.135709, -0.132439, -0.129169, -0.125899, -0.12263 , -0.11936 , -0.11609 , -0.11282 , -0.10955 , -0.106281, -0.103011])
- results(temperature)object<lmfit.model.ModelResult object ...
array([<lmfit.model.ModelResult object at 0x000001BFC8484700>, <lmfit.model.ModelResult object at 0x000001BFC8484220>, <lmfit.model.ModelResult object at 0x000001BFC84A4250>, <lmfit.model.ModelResult object at 0x000001BFC84B1070>, <lmfit.model.ModelResult object at 0x000001BFC84BD070>, <lmfit.model.ModelResult object at 0x000001BFC84BDDC0>, <lmfit.model.ModelResult object at 0x000001BFC84C6970>, <lmfit.model.ModelResult object at 0x000001BFC84CF160>, <lmfit.model.ModelResult object at 0x000001BFC84DD370>, <lmfit.model.ModelResult object at 0x000001BFC84EB160>, <lmfit.model.ModelResult object at 0x000001BFC84EBF70>, <lmfit.model.ModelResult object at 0x000001BFC84F6D00>, <lmfit.model.ModelResult object at 0x000001BFC84FCA90>, <lmfit.model.ModelResult object at 0x000001BFC85098E0>, <lmfit.model.ModelResult object at 0x000001BFC85146A0>, <lmfit.model.ModelResult object at 0x000001BFC851E2E0>, <lmfit.model.ModelResult object at 0x000001BFC852E0D0>, <lmfit.model.ModelResult object at 0x000001BFC85390D0>, <lmfit.model.ModelResult object at 0x000001BFC8539E20>, <lmfit.model.ModelResult object at 0x000001BFC8542AC0>, <lmfit.model.ModelResult object at 0x000001BFC854B9D0>, <lmfit.model.ModelResult object at 0x000001BFC8557880>, <lmfit.model.ModelResult object at 0x000001BFC8560700>, <lmfit.model.ModelResult object at 0x000001BFC856B520>, <lmfit.model.ModelResult object at 0x000001BFC85742B0>, <lmfit.model.ModelResult object at 0x000001BFC857E130>, <lmfit.model.ModelResult object at 0x000001BFC8574280>, <lmfit.model.ModelResult object at 0x000001BFC858EF70>, <lmfit.model.ModelResult object at 0x000001BFC8599D60>, <lmfit.model.ModelResult object at 0x000001BFC85A5BE0>, <lmfit.model.ModelResult object at 0x000001BFC85AFAF0>, <lmfit.model.ModelResult object at 0x000001BFC85B79A0>, <lmfit.model.ModelResult object at 0x000001BFC85C1820>, <lmfit.model.ModelResult object at 0x000001BFC85CC6A0>], dtype=object)
- data(temperature, phi)float641.535e+03 1.591e+03 ... 3.066e+03
- df :
- None
array([[1535.46875, 1591.46875, 1707. , ..., 3865.75 , 3727.65625, 3733.125 ], [1570.875 , 1604.15625, 1703.8125 , ..., 3875.75 , 3812.875 , 3782.78125], [1517.40625, 1595.125 , 1731.1875 , ..., 3909.96875, 3831.96875, 3747.53125], ..., [1628.09375, 1702.9375 , 1840.46875, ..., 3246.53125, 3207.84375, 3123.71875], [1616.84375, 1749.28125, 1846.375 , ..., 3157.5625 , 3119.21875, 3122.5 ], [1617. , 1774.71875, 1939.96875, ..., 3050.03125, 3046.625 , 3066.5 ]])
- residual(temperature, phi)float64-126.9 -29.14 ... 7.805 -29.28
- df :
- None
array([[-126.94994883, -29.13693519, 18.63945339, ..., -36.19510402, 65.35760567, 34.96075945], [-142.4684627 , -31.49782168, 22.3883113 , ..., -28.44311232, 23.97159731, 55.04812047], [-117.50952006, -40.1892131 , -11.76667984, ..., -39.63703342, 12.44967992, 82.29023245], ..., [ -52.47442691, 12.91062704, 26.19396169, ..., -50.98752256, -34.18617977, 37.20571017], [ -48.71434251, -27.44069062, 39.00174343, ..., -6.21762024, 2.091586 , -22.56193818], [ -18.62200775, -17.47872085, -14.00076949, ..., 29.78693761, 7.80522438, -29.28022723]])
- norm_residual(temperature, phi)float64-0.08268 -0.01831 ... -0.009548
array([[-0.0826783 , -0.0183082 , 0.01091942, ..., -0.00936302, 0.01753316, 0.00936501], [-0.0906937 , -0.01963513, 0.01314013, ..., -0.00733874, 0.00628701, 0.01455229], [-0.07744104, -0.02519502, -0.00679688, ..., -0.01013743, 0.0032489 , 0.02195852], ..., [-0.03223059, 0.00758139, 0.01423222, ..., -0.01570523, -0.01065706, 0.01191071], [-0.03012928, -0.01568684, 0.02112341, ..., -0.00196912, 0.00067055, -0.0072256 ], [-0.01151639, -0.00984873, -0.00721701, ..., 0.00976611, 0.00256192, -0.00954842]])
There’s a lot here to digest, the result of our curve fit also produced a xr.Dataset
! This is because it bundles the fitting results (a 1D array of the fitting instances), the original data, and the residual together.
We will see more about all of this in the curve fitting section to follow. For now, let’s just get the peak centers.
[5]:
fig, ax = plt.subplots()
near_ef.S.plot(ax=ax)
ax.scatter(*phis.F.p("b_center").G.to_arrays()[::-1], color="red")
[5]:
<matplotlib.collections.PathCollection at 0x1bfc8668af0>
This looks a lot cleaner, although our peaks may be biased too far to negative phi
due to the asymmetric background.
With the values of the Fermi momentum (in angle space) now in hand, we can select EDCs at the appropriate momentum for each value of the temperature.
Let’s average in a 10 mrad window.
[6]:
# take the Lorentzian (component `b`) center parameter
phi_values = phis.F.p("b_center")
temp_dep.spectrum.S.select_around_data(
{"phi": phi_values}, mode="mean", fast=True, radius={"phi": 0.005}).S.plot()
Exercises¶
Change the
phi
range of the selection to see how the fit responds. Can we deal with the asymmetric background this way?Inspect the first fit with
phis.results[0].item()
. What can you tell about the fit?Select a region of the temperature dependent data away from the band. Perform a broadcast fit for the Fermi edge using
arpes.fits.fit_models.AffineBroadenedFD
. Does the edge position shift at all? Does the edge width change at all? Look at the previous exercise to determine which parameters to look at.