cuvarbase package

Submodules

cuvarbase.bls module

Implementation of the box-least squares periodogram [K2002] and variants.

[K2002]Kovacs et al. 2002
class cuvarbase.bls.BLSMemory(max_ndata, max_nfreqs, stream=None, **kwargs)[source]

Bases: object

allocate_data(ndata=None)[source]
allocate_freqs(nfreqs=None)[source]
allocate_pinned_arrays(nfreqs=None, ndata=None)[source]
classmethod fromdata(t, y, dy, qmin=None, qmax=None, freqs=None, nf=None, transfer=True, **kwargs)[source]
setdata(t, y, dy, qmin=None, qmax=None, freqs=None, nf=None, transfer=True, **kwargs)[source]
transfer_data_to_cpu()[source]
transfer_data_to_gpu(transfer_freqs=True)[source]
cuvarbase.bls.compile_bls(block_size=256, function_names=['full_bls_no_sol', 'bin_and_phase_fold_custom', 'reduction_max', 'store_best_sols', 'store_best_sols_custom', 'bin_and_phase_fold_bst_multifreq', 'binned_bls_bst'], prepare=True, **kwargs)[source]

Compile BLS kernel

Parameters:
  • block_size (int, optional (default: _default_block_size)) – CUDA threads per CUDA block.
  • function_names (list, optional (default: _all_function_names)) – Function names to load and prepare
  • prepare (bool, optional (default: True)) – Whether or not to prepare functions (for slightly faster kernel launching)
Returns:

functions – Dictionary of (function name, PyCUDA function object) pairs

Return type:

dict

cuvarbase.bls.count_tot_nbins(nbins0, nbinsf, dlogq)[source]
cuvarbase.bls.dnbins(nbins, dlogq)[source]
cuvarbase.bls.eebls_gpu(t, y, dy, freqs, qmin=0.01, qmax=0.5, nstreams=5, noverlap=3, dlogq=0.2, max_memory=None, freq_batch_size=None, functions=None, **kwargs)[source]

Box-Least Squares, accelerated with PyCUDA

Parameters:
  • t (array_like, float) – Observation times
  • y (array_like, float) – Observations
  • dy (array_like, float) – Observation uncertainties
  • freqs (array_like, float) – Frequencies
  • qmin (float or array_like) – Minimum q value(s) to test for each frequency
  • qmax (float or array_like) – Maximum q value(s) to test for each frequency
  • nstreams (int, optional (default: 5)) – Number of CUDA streams to utilize.
  • noverlap (int, optional (default: 3)) – Number of overlapping q bins to use
  • dlogq (float, optional, (default: 0.5)) – logarithmic spacing of \(q\) values, where \(d\log q = dq / q\)
  • freq_batch_size (int, optional (default: None)) – Number of frequencies to compute in a single batch; determines this automatically based on max_memory
  • max_memory (float, optional (default: None)) – Maximum memory to use in bytes. Will ignore this if freq_batch_size is specified, and will use the total free memory as returned by pycuda.driver.mem_get_info if this is None.
  • functions (tuple of CUDA functions) – returned by compile_bls
Returns:

  • bls (array_like, float) – BLS periodogram, normalized to \(1 - \chi^2(f) / \chi^2_0\)
  • qphi_sols (list of (q, phi) tuples) – Best (q, phi) solution at each frequency

cuvarbase.bls.eebls_gpu_custom(t, y, dy, freqs, q_values, phi_values, freq_batch_size=None, nstreams=5, max_memory=None, functions=None, **kwargs)[source]

Box-Least Squares, with custom q and phi values. Useful if you’re honing the initial solution or testing between a relatively small number of possible solutions.

Parameters:
  • t (array_like, float) – Observation times
  • y (array_like, float) – Observations
  • dy (array_like, float) – Observation uncertainties
  • freqs (array_like, float) – Frequencies
  • q_values (array_like) – Set of q values to search at each trial frequency
  • phi_values (float or array_like) – Set of phi values to search at each trial frequency
  • nstreams (int, optional (default: 5)) – Number of CUDA streams to utilize.
  • freq_batch_size (int, optional (default: None)) – Number of frequencies to compute in a single batch; determines this automatically by default based on max_memory
  • max_memory (float, optional (default: None)) – Maximum memory to use in bytes. Will ignore this if freq_batch_size is specified. If None, will use the free memory given by pycuda.driver.mem_get_info()
  • functions (tuple of CUDA functions) – Dictionary of prepared functions from compile_bls().
  • **kwargs – passed to compile_bls()
Returns:

  • bls (array_like, float) – BLS periodogram, normalized to 1 - chi2(best_fit) / chi2(constant)
  • qphi_sols (list of (q, phi) tuples) – Best (q, phi) solution at each frequency

cuvarbase.bls.eebls_gpu_fast(t, y, dy, freqs, qmin=0.01, qmax=0.5, functions=None, stream=None, dlogq=0.3, memory=None, noverlap=2, max_nblocks=5000, force_nblocks=None, dphi=0.0, shmem_lim=None, freq_batch_size=None, transfer_to_device=True, transfer_to_host=True, **kwargs)[source]

Box-Least Squares with PyCUDA but about 2-3 orders of magnitude faster than eebls_gpu. Uses shared memory for the binned data, which means that there is a lower limit on the q values that this function can handle.

To save memory and improve speed, the best solution is not kept. To get the best solution, run eebls_gpu at the optimal frequency.

Warning

If you are running on a single-GPU machine, there may be a kernel time limit set by your OS. If running this function produces a timeout error, try setting freq_batch_size to a reasonable number (~10). That will split up the computations by frequency.

Note

No extra global memory is needed, meaning you likely do not need to use large_run with this function.

Note

There is no noverlap parameter here yet. This is only a problem if the optimal q value is close to qmin. To alleviate this, you can run this function noverlap times with dphi = i/noverlap for the i-th run. Then take the best solution of all runs.

Parameters:
  • t (array_like, float) – Observation times
  • y (array_like, float) – Observations
  • dy (array_like, float) – Observation uncertainties
  • freqs (array_like, float) – Frequencies
  • qmin (float or array_like, optional (default: 1e-2)) – minimum q values to search at each frequency
  • qmax (float or array_like (default: 0.5)) – maximum q values to search at each frequency
  • dphi (float, optional (default: 0.)) – Phase offset (in units of the finest grid spacing). If you want noverlap bins at the smallest q value, run this function noverlap times, with dphi = i / noverlap for the i-th run and take the best solution for all the runs.
  • dlogq (float) – The logarithmic spacing of the q values to use. If negative, the q values increase by dq = qmin.
  • functions (dict) – Dictionary of compiled functions (see compile_bls())
  • freq_batch_size (int, optional (default: None)) – Number of frequencies to compute in a single batch; if None this will run a single batch for all frequencies simultaneously
  • shmem_lim (int, optional (default: None)) – Maximum amount of shared memory to use per block in bytes. This is GPU-dependent but usually around 48KB. If None, uses device information provided by PyCUDA (recommended).
  • max_nblocks (int, optional (default: 200)) – Maximum grid size to use
  • force_nblocks (int, optional (default: None)) – If this is set the gridsize is forced to be this value
  • memory (BLSMemory instance, optional (default: None)) – See BLSMemory.
  • transfer_to_host (bool, optional (default: True)) – Transfer BLS back to CPU.
  • transfer_to_device (bool, optional (default: True)) – Transfer data to GPU
  • **kwargs – passed to compile_bls
Returns:

bls – BLS periodogram, normalized to \(1 - \chi_2(\omega) / \chi_2(constant)\)

Return type:

array_like, float

cuvarbase.bls.eebls_transit_gpu(t, y, dy, fmax_frac=1.0, fmin_frac=1.0, qmin_fac=0.5, qmax_fac=2.0, fmin=None, fmax=None, freqs=None, qvals=None, use_fast=False, **kwargs)[source]

Compute BLS for timeseries assuming edge-on keplerian orbit of a planet with Mp/Ms << 1, Rp/Rs < 1, Lp/Ls << 1 and negligible eccentricity.

Parameters:
  • t (array_like, float) – Observation times
  • y (array_like, float) – Observations
  • dy (array_like, float) – Observation uncertainties
  • fmax_frac (float, optional (default: 1.0)) – Maximum frequency is fmax_frac * fmax, where fmax is automatically selected by fmax_transit.
  • fmin_frac (float, optional (default: 1.5)) – Minimum frequency is fmin_frac * fmin, where fmin is automatically selected by fmin_transit.
  • fmin (float, optional (default: None)) – Overrides automatic frequency minimum with this value
  • fmax (float, optional (default: None)) – Overrides automatic frequency maximum with this value
  • qmin_fac (float, optional (default: 0.5)) – Fraction of the fiducial q value to search at each frequency (minimum)
  • qmax_fac (float, optional (default: 2.0)) – Fraction of the fiducial q value to search at each frequency (maximum)
  • freqs (array_like, optional (default: None)) – Overrides the auto-generated frequency grid
  • qvals (array_like, optional (default: None)) – Overrides the keplerian q values
  • functions (tuple, optional (default=None)) – result of compile_bls(**kwargs).
  • use_fast (bool, optional (default: False)) –
  • **kwargs – passed to eebls_gpu, compile_bls, fmax_transit, fmin_transit, and transit_autofreq
Returns:

  • freqs (array_like, float) – Frequencies where BLS is evaluated

  • bls (array_like, float) – BLS periodogram, normalized to \(1 - \chi^2(f) / \chi^2_0\)

  • solutions (list of (q, phi) tuples) – Best (q, phi) solution at each frequency

    Note

    Only returned when use_fast=False.

cuvarbase.bls.fmax_transit(rho=1.0, qmax=0.5, **kwargs)[source]
cuvarbase.bls.fmax_transit0(rho=1.0, **kwargs)[source]
cuvarbase.bls.fmin_transit(t, rho=1.0, min_obs_per_transit=5, **kwargs)[source]
cuvarbase.bls.freq_transit(q, rho=1.0, **kwargs)[source]
cuvarbase.bls.hone_solution(t, y, dy, f0, df0, q0, dlogq0, phi0, stop=1e-05, samples_per_peak=5, max_iter=50, noverlap=3, **kwargs)[source]

Experimental!

cuvarbase.bls.nbins_iter(i, nb0, dlogq)[source]
cuvarbase.bls.q_transit(freq, rho=1.0, **kwargs)[source]
cuvarbase.bls.single_bls(t, y, dy, freq, q, phi0)[source]

Evaluate BLS power for a single set of (freq, q, phi0) parameters.

Parameters:
  • t (array_like, float) – Observation times
  • y (array_like, float) – Observations
  • dy (array_like, float) – Observation uncertainties
  • freq (float) – Frequency of the signal
  • q (float) – Transit duration in phase
  • phi0 (float) – Phase offset of transit
Returns:

bls – BLS power for this set of parameters

Return type:

float

cuvarbase.bls.transit_autofreq(t, fmin=None, fmax=None, samples_per_peak=2, rho=1.0, qmin_fac=0.2, qmax_fac=None, **kwargs)[source]

Produce list of frequencies for a given frequency range suitable for performing Keplerian BLS.

Parameters:
  • t (array_like, float) – Observation times.
  • fmin (float, optional (default: None)) – Minimum frequency. By default this is determined by fmin_transit.
  • fmax (float, optional (default: None)) – Maximum frequency. By default this is determined by fmax_transit.
  • samples_per_peak (float, optional (default: 2)) – Oversampling factor. Frequency spacing is multiplied by 1/samples_per_peak.
  • rho (float, optional (default: 1)) – Mean stellar density of host star in solar units \(\rho=\rho_{\star} / \rho_{\odot}\), where \(\rho_{\odot}\) is the mean density of the sun
  • qmin_fac (float, optional (default: 0.2)) – The minimum \(q\) value to search in units of the Keplerian \(q\) value
  • qmax_fac (float, optional (default: None)) – The maximum \(q\) value to search in units of the Keplerian \(q\) value. If None, this defaults to 1/qmin_fac.
Returns:

  • freqs (array_like) – The frequency grid
  • q0vals (array_like) – The list of Keplerian \(q\) values.

cuvarbase.ce module

Implementation of Graham et al. 2013’s Conditional Entropy period finding algorithm

class cuvarbase.ce.ConditionalEntropyAsyncProcess(*args, **kwargs)[source]

Bases: cuvarbase.core.GPUAsyncProcess

GPUAsyncProcess for the Conditional Entropy period finder

Parameters:
  • phase_bins (int, optional (default: 10)) – Number of phase bins to use.
  • mag_bins (int, optional (default: 10)) – Number of mag bins to use.
  • max_phi (float, optional (default: 3.)) – For weighted CE; skips contibutions to bins that are more than max_phi sigma away.
  • weighted (bool, optional (default: False)) – If true, uses the weighted version of the CE periodogram. Slower, but accounts for data uncertainties.
  • block_size (int, optional (default: 256)) – Number of CUDA threads per CUDA block.
  • phase_overlap (int, optional (default: 0)) – If > 0, the phase bins are overlapped with each other
  • mag_overlap (int, optional (default: 0)) – If > 0, the mag bins are overlapped with each other
  • use_fast (bool, optional (default: False)) – Use a somewhat experimental function to speed up computations. This is perfect for large Nfreqs and nobs <~ 2000. If True, use run() and not large_run() and set nstreams = 1.

Example

>>> proc = ConditionalEntropyAsyncProcess()
>>> Ndata = 1000
>>> t = np.sort(365 * np.random.rand(N))
>>> y = 12 + 0.01 * np.cos(2 * np.pi * t / 5.0)
>>> y += 0.01 * np.random.randn(len(t))
>>> dy = 0.01 * np.ones_like(y)
>>> results = proc.run([(t, y, dy)])
>>> proc.finish()
>>> ce_freqs, ce_powers = results[0]
allocate(data, freqs=None, **kwargs)[source]

Allocate GPU memory for Conditional Entropy computations

Parameters:
  • data (list of (t, y, dy) tuples) – List of data, [(t_1, y_1, w_1), ...] * t: Observation times * y: Observations * dy: Observation uncertainties
  • freqs (list, optional) – Either a list of floats (same frequencies for all data), or a list of length n=len(data), with element i of the list being a list of frequencies for the i-th lightcurve.
  • **kwargs
Returns:

allocated_memory – list of allocated memory objects for each lightcurve

Return type:

list of ConditionalEntropyMemory

allocate_for_single_lc(t, y, freqs, dy=None, stream=None, **kwargs)[source]

Allocate GPU (and possibly CPU) memory for single lightcurve

Parameters:
  • t (array_like) – Observation times
  • y (array_like) – Observations
  • freqs (array_like) – frequencies
  • dy (array_like, optional) – Observation uncertainties
  • stream (pycuda.driver.Stream, optional) – CUDA stream you want this to run on
  • **kwargs
Returns:

mem – Memory object.

Return type:

ConditionalEntropyMemory

autofrequency(*args, **kwargs)[source]

calls cuvarbase.utils.autofrequency()

batched_run_const_nfreq(data, batch_size=10, freqs=None, **kwargs)[source]

Same as batched_run but is more efficient when the frequencies are the same for each lightcurve. Doesn’t reallocate memory for each batch.

Note

To get best efficiency, make sure the maximum number of observations is not much larger than the typical number of observations.

large_run(data, freqs=None, max_memory=None, **kwargs)[source]

Run Conditional Entropy on a large frequency grid

Parameters:
  • data (list of tuples) – list of [(t, y, dy), …] containing * t: observation times * y: observations * dy: observation uncertainties
  • freqs (optional, list of np.ndarray frequencies) – List of custom frequencies. If not specified, calls autofrequency with default arguments
  • max_memory (float, optional (default: None)) – Maximum memory per batch in bytes. If None, it will use 90% of the total free memory available as specified by pycuda.driver.mem_get_info()
  • **kwargs
Returns:

results – list of (freqs, ce) corresponding to CE for each element of the data array

Return type:

list of lists

memory_requirement(data, **kwargs)[source]

Return an approximate GPU memory requirement in bytes. Will throw a NotImplementedError if called, so … don’t call it.

preallocate(max_nobs, freqs, nlcs=1, streams=None, **kwargs)[source]

Preallocate memory for future runs.

Parameters:
  • max_nobs (int) – Upper limit for the number of observations
  • freqs (array_like) – Frequency array to be used by future run calls
  • nlcs (int, optional (default: 1)) – Maximum batch size for run calls
  • streams (list of pycuda.driver.Stream) – Length of list must be >= nlcs
Returns:

self.memory – List of ConditionalEntropyMemory objects

Return type:

list

run(data, memory=None, freqs=None, set_data=True, **kwargs)[source]

Run Conditional Entropy on a batch of data.

Parameters:
  • data (list of tuples) – list of [(t, y, dy), …] containing * t: observation times * y: observations * dy: observation uncertainties
  • freqs (optional, list of np.ndarray frequencies) – List of custom frequencies. If not specified, calls autofrequency with default arguments
  • memory (optional, list of ConditionalEntropyMemory objects) – List of memory objects, length of list must be >= len(data)
  • set_data (boolean, optional (default: True)) – Transfers data to gpu if memory is provided
  • **kwargs
Returns:

results – list of (freqs, ce) corresponding to CE for each element of the data array

Return type:

list of lists

class cuvarbase.ce.ConditionalEntropyMemory(**kwargs)[source]

Bases: future.types.newobject.newobject

allocate(**kwargs)[source]
allocate_bins(**kwargs)[source]
allocate_buffered_data_arrays(**kwargs)[source]
allocate_data(**kwargs)[source]
allocate_freqs(**kwargs)[source]
allocate_pinned_cpu(**kwargs)[source]
balance_magbins(y, **kwargs)[source]
compute_mag_bin_fracs(y, **kwargs)[source]
fromdata(t, y, **kwargs)[source]
set_gpu_arrays_to_zero(**kwargs)[source]
setdata(t, y, **kwargs)[source]
transfer_ce_to_cpu(**kwargs)[source]
transfer_data_to_gpu(**kwargs)[source]
transfer_freqs_to_gpu(**kwargs)[source]
cuvarbase.ce.conditional_entropy(memory, functions, block_size=256, transfer_to_host=True, transfer_to_device=True, **kwargs)[source]
cuvarbase.ce.conditional_entropy_fast(memory, functions, block_size=256, transfer_to_host=True, transfer_to_device=True, freq_batch_size=None, shmem_lc=True, shmem_lim=None, max_nblocks=200, force_nblocks=None, stream=None, **kwargs)[source]

cuvarbase.core module

class cuvarbase.core.GPUAsyncProcess(*args, **kwargs)[source]

Bases: future.types.newobject.newobject

batched_run(data, batch_size=10, **kwargs)[source]

Run your data in batches (avoids memory problems)

finish()[source]

synchronize all active streams

run(*args, **kwargs)[source]

cuvarbase.cunfft module

class cuvarbase.cunfft.NFFTAsyncProcess(*args, **kwargs)[source]

Bases: cuvarbase.core.GPUAsyncProcess

GPUAsyncProcess for the adjoint NFFT.

Parameters:
  • sigma (float, optional (default: 2)) – Size of NFFT grid will be NFFT_SIZE * sigma
  • m (int, optional (default: 8)) – Maximum radius for grid contributions (by default, this value will automatically be set based on a specified error tolerance)
  • autoset_m (bool, optional (default: True)) – Automatically set the m parameter based on the error tolerance given by the m_tol parameter
  • tol (float, optional (default: 1E-8)) – Error tolerance for the NFFT (used to auto set m)
  • block_size (int, optional (default: 256)) – CUDA block size.
  • use_double (bool, optional (default: False)) – Use double precision. On non-Tesla cards this will make things ~24 times slower.
  • use_fast_math (bool, optional (default: True)) – Compile kernel with the --use_fast_math option supplied to nvcc.

Example

>>> import numpy as np
>>> t = np.random.rand(100)
>>> y = np.cos(10 * t - 0.4) + 0.1 * np.random.randn(len(t))
>>> proc = NFFTAsyncProcess()
>>> data = [(t, y, 2 * len(t))]
>>> nfft_adjoint = proc.run(data)
allocate(data, **kwargs)[source]

Allocate GPU memory for NFFT-related computations

Parameters:
  • data (list of (t, y, N) tuples) – List of data, [(t_1, y_1, N_1), ...] * t: Observation times. * y: Observations. * nf: int, FFT size
  • **kwargs
Returns:

allocated_memory – List of allocated memory for each dataset

Return type:

list of NFFTMemory objects

estimate_m(N)[source]

Estimate m based on an error tolerance of self.tol.

Parameters:N (int) – size of NFFT
Returns:m – Maximum grid radius
Return type:int

Notes

Pulled from <https://github.com/jakevdp/nfft>_.

get_m(N=None)[source]

Returns the m value for N frequencies.

Parameters:N (int) – Number of frequencies, only needed if autoset_m is False.
Returns:m – The filter radius (in grid points)
Return type:int
m_from_C(C, sigma)[source]

Returns an estimate for what m value to use from C, where C is something like err_tolerance/N_freq.

Pulled from <https://github.com/jakevdp/nfft>_

run(data, memory=None, **kwargs)[source]

Run the adjoint NFFT on a batch of data

Parameters:
  • data (list of tuples) – list of [(t, y, w), …] containing * t: observation times * y: observations * nf: int, size of NFFT
  • memory
  • **kwargs
Returns:

powers – List of adjoint NFFTs

Return type:

list of np.ndarrays

class cuvarbase.cunfft.NFFTMemory(sigma, stream, m, use_double=False, precomp_psi=True, **kwargs)[source]

Bases: future.types.newobject.newobject

allocate(**kwargs)[source]
allocate_data(**kwargs)[source]
allocate_grid(**kwargs)[source]
allocate_pinned_cpu(**kwargs)[source]
allocate_precomp_psi(**kwargs)[source]
fromdata(t, y, allocate=True, **kwargs)[source]
is_ready()[source]
transfer_data_to_gpu(**kwargs)[source]
transfer_nfft_to_cpu(**kwargs)[source]
cuvarbase.cunfft.nfft_adjoint_async(memory, functions, minimum_frequency=0.0, block_size=256, just_return_gridded_data=False, use_grid=None, fast_grid=True, transfer_to_device=True, transfer_to_host=True, precomp_psi=True, samples_per_peak=1, **kwargs)[source]

Asynchronous NFFT adjoint operation.

Use the NFFTAsyncProcess class and related subroutines when possible.

Parameters:
  • memory (NFFTMemory) – Allocated memory, must have data already set (see, e.g., NFFTAsyncProcess.allocate())
  • functions (tuple, length 5) – Tuple of compiled functions from SourceModule. Must be prepared with their appropriate dtype.
  • minimum_frequency (float, optional (default: 0)) – First frequency of transform
  • block_size (int, optional) – Number of CUDA threads per block
  • just_return_gridded_data (bool, optional) – If True, returns grid via grid_g.get() after gridding
  • use_grid (GPUArray, optional) – If specified, will skip gridding procedure and use the GPUArray provided
  • fast_grid (bool, optional, default: True) – Whether or not to use the “fast” gridding procedure
  • transfer_to_device (bool, optional, (default: True)) – If the data is already on the gpu, set as False
  • transfer_to_host (bool, optional, (default: True)) – If False, will not transfer the resulting nfft to CPU memory
  • precomp_psi (bool, optional, (default: True)) – Only relevant if fast is True. Will precompute values for the fast gridding procedure.
  • samples_per_peak (float, optional (default: 1)) – Frequency spacing is reduced by this factor, but number of frequencies is kept the same
Returns:

ghat_cpu – The resulting NFFT

Return type:

np.array

cuvarbase.lombscargle module

class cuvarbase.lombscargle.LombScargleAsyncProcess(*args, **kwargs)[source]

Bases: cuvarbase.core.GPUAsyncProcess

GPUAsyncProcess for the Lomb Scargle periodogram

Parameters:**kwargs (passed to NFFTAsyncProcess) –

Example

>>> proc = LombScargleAsyncProcess()
>>> Ndata = 1000
>>> t = np.sort(365 * np.random.rand(N))
>>> y = 12 + 0.01 * np.cos(2 * np.pi * t / 5.0)
>>> y += 0.01 * np.random.randn(len(t))
>>> dy = 0.01 * np.ones_like(y)
>>> freqs, powers = proc.run([(t, y, dy)])
>>> proc.finish()
>>> ls_freqs, ls_powers = freqs[0], powers[0]
allocate(data, nfreqs=None, k0s=None, **kwargs)[source]

Allocate GPU memory for Lomb Scargle computations

Parameters:
  • data (list of (t, y, dy) tuples) – List of data, [(t_1, y_1, dy_1), ...] * t: Observation times * y: Observations * dy: Observation uncertainties
  • **kwargs
Returns:

allocated_memory – list of allocated memory objects for each lightcurve

Return type:

list of LombScargleMemory

allocate_for_single_lc(t, y, dy, nf, k0=0, stream=None, **kwargs)[source]

Allocate GPU (and possibly CPU) memory for single lightcurve

Parameters:
  • t (array_like) – Observation times
  • y (array_like) – Observations
  • dy (array_like) – Observation uncertainties
  • nf (int) – Number of frequencies
  • k0 (int) – The starting index for the Fourier transform. The minimum frequency f0 = k0 * df, where df is the frequency spacing
  • stream (pycuda.driver.Stream) – CUDA stream you want this to run on
  • **kwargs
Returns:

mem – Memory object.

Return type:

LombScargleMemory

autofrequency(*args, **kwargs)[source]
batched_run_const_nfreq(data, batch_size=10, use_fft=True, freqs=None, **kwargs)[source]

Same as batched_run but is more efficient when the frequencies are the same for each lightcurve. Doesn’t reallocate memory for each batch.

Notes

To get best efficiency, make sure the maximum number of observations is not much larger than the typical number of observations

memory_requirement(n0, nf, k0, nbatch=1, autoadjust_sigma=False, **kwargs)[source]

return an approximate GPU memory requirement in bytes

preallocate(max_nobs, nlcs=1, nf=None, k0=None, freqs=None, streams=None, **kwargs)[source]
run(data, use_fft=True, memory=None, freqs=None, **kwargs)[source]

Run Lomb Scargle on a batch of data.

Parameters:
  • data (list of tuples) – list of [(t, y, dy), …] containing * t: observation times * y: observations * dy: observation uncertainties
  • freqs (optional, list of np.ndarray frequencies) – List of custom frequencies. Right now, this has to be linearly spaced with freqs[0] / (freqs[1] - freqs[0]) being an integer.
  • memory (optional, list of LombScargleMemory objects) – List of memory objects, length of list must be >= len(data)
  • use_fft (optional, bool (default: True)) – Uses the NFFT, otherwise just does direct summations (which are quite slow…)
  • floating_mean (optional, bool (default: True)) – Add a floating mean to the model (see Zechmeister & Kurster 2009)
  • window (optional, bool (default: False)) – If true, computes the window function for the data instead of Lomb-Scargle
  • amplitude_prior (optional, float (default: None)) – If not None, sets the variance of a Gaussian prior on the amplitude (sometimes useful for suppressing aliases)
  • **kwargs
Returns:

results – list of (freqs, pows) for each LS periodogram

Return type:

list of lists

class cuvarbase.lombscargle.LombScargleMemory(sigma, stream, m, **kwargs)[source]

Bases: future.types.newobject.newobject

Container class for allocating memory and transferring data between the GPU and CPU for Lomb-Scargle computations

Parameters:
  • sigma (int) – The sigma parameter for the NFFT
  • stream (pycuda.driver.Stream instance) – The CUDA stream used for calculations/data transfer
  • m (int) – The m parameter for the NFFT
allocate(**kwargs)[source]

Allocate all memory necessary

allocate_buffered_data_arrays(**kwargs)[source]

Allocates pinned memory for lightcurves if we’re reusing this container

allocate_data(**kwargs)[source]

Allocates memory for lightcurve

allocate_grids(**kwargs)[source]

Allocates memory for NFFT grids, NFFT precomputation vectors, and the GPU vector for the Lomb-Scargle power

allocate_pinned_cpu(**kwargs)[source]

Allocates pinned CPU memory for asynchronous transfer of result

fromdata(**kwargs)[source]

Sets and (optionally) allocates memory for data

is_ready()[source]

don’t use this.

set_gpu_arrays_to_zero(**kwargs)[source]

Sets all gpu arrays to zero

setdata(**kwargs)[source]

Sets the value of the data arrays.

transfer_data_to_gpu(**kwargs)[source]

Transfers the lightcurve to the GPU

transfer_lsp_to_cpu(**kwargs)[source]

Asynchronous transfer of LSP result to CPU

cuvarbase.lombscargle.add_regularization(sums, amplitude_priors=None, cn0=None, sn0=None)[source]

Add regularization to sums. See Zechmeister & Kuerster 2009 for details about notation.

Parameters:
  • sums (tuple of array_like) – C, S, CC, CS, SS, YC, YS. See mhdirect_sums for more information.
  • amplitude_priors (float or array_like, optional) – Corresponds to standard deviation of a Gaussian prior on the amplitudes of all harmonics (if its a float), or for each of the harmonics (if it’s an array_like).
  • cn0 (array_like) – Location of the centroid of the Gaussian amplitude prior on each of the cosine amplitudes
  • sn0 (array_like) – Location of the centroid of the Gaussian amplitude prior on each of the sine amplitudes
Returns:

C, S, CC, CS, SS, YC, YS – The set of sums with regularization added.

Return type:

array_like

See also

mhdirect_sums()

cuvarbase.lombscargle.check_k0(freqs, k0=None, rtol=0.01, atol=1e-07)[source]
cuvarbase.lombscargle.fap_baluev(t, dy, z, fmax, d_K=3, d_H=1, use_gamma=True)[source]

False alarm probability for periodogram peak based on Baluev (2008) [2008MNRAS.385.1279B]

Parameters:
  • t (array_like) – Observation times.
  • dy (array_like) – Observation uncertainties.
  • z (array_like or float) – Periodogram value(s)
  • fmax (float) – Maximum frequency searched
  • d_K (int, optional (default: 3)) – Number of degrees of fredom for periodgram model. 2H - 1 where H is the number of harmonics
  • d_H (int, optional (default: 1)) – Number of degrees of freedom for default model.
  • use_gamma (bool, optional (default: True)) – Use gamma function for computation of numerical coefficient; replaced with scipy.special.gammaln and should be stable now
Returns:

fap – False alarm probability

Return type:

float

Example

>>> rand = np.random.RandomState(100)
>>> t = np.sort(rand.rand(100))
>>> y = 12 + 0.01 * np.cos(2 * np.pi * 10. * t)
>>> dy = 0.01 * np.ones_like(y)
>>> y += dy * rand.rand(len(t))
>>> proc = LombScargleAsyncProcess()
>>> results = proc.run([(t, y, dy)])
>>> freqs, powers = results[0]
>>> fap_baluev(t, dy, powers, max(freqs))
cuvarbase.lombscargle.get_k0(freqs)[source]
cuvarbase.lombscargle.lomb_scargle_async(memory, functions, freqs, block_size=256, use_fft=True, python_dir_sums=False, transfer_to_device=True, transfer_to_host=True, window=False, **kwargs)[source]

Asynchronous Lomb Scargle periodogram

Use the LombScargleAsyncProcess class and related subroutines when possible.

Parameters:
  • memory (LombScargleMemory) – Allocated memory, must have data already set (see, e.g., LombScargleAsyncProcess.allocate())
  • functions (tuple (lombscargle_functions, nfft_functions)) – Tuple of compiled functions from SourceModule. Must be prepared with their appropriate dtype.
  • freqs (array_like, optional (default: 0)) – Linearly-spaced frequencies starting at an integer multiple of the frequency spacing (i.e. freqs = df * (k0 + np.arange(nf)))
  • block_size (int, optional) – Number of CUDA threads per block
  • use_fft (bool, optional (default: True)) – If False, uses direct sums.
  • python_dir_sums (bool, optional (default: False)) – If True, performs direct sums with Python on the CPU
  • transfer_to_device (bool, optional, (default: True)) – If the data is already on the gpu, set as False
  • transfer_to_host (bool, optional, (default: True)) – If False, will not transfer the resulting periodogram to CPU memory
  • window (bool, optional (default: False)) – If True, computes the window function for the data
Returns:

lsp_c – The resulting periodgram (memory.lsp_c)

Return type:

np.array

cuvarbase.lombscargle.lomb_scargle_direct_sums(t, yw, w, freqs, YY, nharms=1, **kwargs)[source]

Compute Lomb-Scargle periodogram using direct summations. This is usually only useful for debugging and/or small numbers of frequencies.

Parameters:
  • t (array_like) – Observation times.
  • yw (array_like) – Observations multiplied by their corresponding weights. sum(yw) is the weighted mean.
  • w (array_like) – Weights for each of the observations. Usually proportional to 1/sigma ** 2 where sigma is the observation uncertainties. Normalized so that sum(w) = 1.
  • freqs (array_like) – Trial frequencies to evaluate the periodogram
  • YY (float) – Weighted variance of the observations.
  • nharms (int, optional (default: 1)) – Number of harmonics to use in the model. Lomb Scargle only uses 1, but more harmonics allow for greater model flexibility at the cost of higher complexity and therefore reduced signal- to-noise.
Returns:

power – The periodogram powers at each of the trial frequencies

Return type:

array_like

cuvarbase.lombscargle.lomb_scargle_simple(t, y, dy, **kwargs)[source]

Simple lomb-scargle interface for testing that things work on the GPU. Note: This will be substantially slower than working with the LombScargleAsyncProcess interface.

cuvarbase.lombscargle.mhdirect_sums(t, yw, w, freq, YY, nharms=1)[source]

Compute the set of frequency-dependent sums for (multi-harmonic) Lomb Scargle at a given frequency

Parameters:
  • t (array_like) – Observation times.
  • yw (array_like) – Observations multiplied by their corresponding weights. sum(yw) is the weighted mean.
  • w (array_like) – Weights for each of the observations. Usually proportional to 1/sigma ** 2 where sigma is the observation uncertainties. Normalized so that sum(w) = 1.
  • freq (float) – Signal frequency.
  • YY (float) – Weighted variance of the observations.
  • nharms (int, optional (default: 1)) – Number of harmonics to compute. This is 1 for the standard Lomb-Scargle
Returns:

C, S, CC, CS, SS, YC, YS – The set of sums with regularization added.

Return type:

array_like

See also

mhdirect_sums()

cuvarbase.lombscargle.mhgls_from_sums(sums, YY, ybar)[source]

Compute multiharmonic periodogram power from set of sums. See Zechmeister & Kuerster 2009 for details about notation.

Parameters:
  • sums (tuple of array_like) – C, S, CC, CS, SS, YC, YS. See mhdirect_sums for more information.
  • YY (float) – Weighted variance of y - ybar, where ybar is the weighted mean and y are the observations
  • ybar (float) – Weighted mean of the data (np.dot(w, y)), where w are the weights and y are the observations.
Returns:

power – periodogram power

Return type:

float

See also

mhdirect_sums()

cuvarbase.lombscargle.mhgls_params_from_sums(sums, YY, ybar)[source]

Compute optimal amplitudes and offset from set of sums. See Zechmeister & Kuerster 2009 for details about notation.

Parameters:
  • sums (tuple of array_like) – C, S, CC, CS, SS, YC, YS. See mhdirect_sums for more information.
  • YY (float) – Weighted variance of y - ybar, where ybar is the weighted mean and y are the observations
  • ybar (float) – Weighted mean of the data (np.dot(w, y)), where w are the weights and y are the observations.
Returns:

  • cn (array_like) – Cosine amplitudes of each harmonic
  • sn (array_like) – Sine amplitudes of each harmonic
  • offset (float) – Constant offset

See also

mhdirect_sums()

cuvarbase.pdm module

class cuvarbase.pdm.PDMAsyncProcess(*args, **kwargs)[source]

Bases: cuvarbase.core.GPUAsyncProcess

allocate(data)[source]
run(data, gpu_data=None, pow_cpus=None, kind='binned_linterp', nbins=10, **pdm_kwargs)[source]
cuvarbase.pdm.binless_pdm_cpu(t, y, w, freqs, dphi=0.05)[source]
cuvarbase.pdm.binned_pdm_model(t, y, w, freq, nbins, linterp=True)[source]
cuvarbase.pdm.pdm2_cpu(t, y, w, freqs, nbins=30, linterp=True)[source]
cuvarbase.pdm.pdm2_single_freq(t, y, w, freq, nbins=30, linterp=True)[source]
cuvarbase.pdm.pdm_async(stream, data_cpu, data_gpu, pow_cpu, function, dphi=0.05, block_size=256)[source]
cuvarbase.pdm.var_binned(t, y, w, freq, nbins, linterp=True)[source]
cuvarbase.pdm.var_tophat(t, y, w, freq, dphi)[source]

cuvarbase.utils module

cuvarbase.utils.autofrequency(t, nyquist_factor=5, samples_per_peak=5, minimum_frequency=None, maximum_frequency=None, **kwargs)[source]

Determine a suitable frequency grid for data.

Note that this assumes the peak width is driven by the observational baseline, which is generally a good assumption when the baseline is much larger than the oscillation period. If you are searching for periods longer than the baseline of your observations, this may not perform well.

Even with a large baseline, be aware that the maximum frequency returned is based on the concept of “average Nyquist frequency”, which may not be useful for irregularly-sampled data. The maximum frequency can be adjusted via the nyquist_factor argument, or through the maximum_frequency argument.

Parameters:
  • samples_per_peak (float (optional, default=5)) – The approximate number of desired samples across the typical peak
  • nyquist_factor (float (optional, default=5)) – The multiple of the average nyquist frequency used to choose the maximum frequency if maximum_frequency is not provided.
  • minimum_frequency (float (optional)) – If specified, then use this minimum frequency rather than one chosen based on the size of the baseline.
  • maximum_frequency (float (optional)) – If specified, then use this maximum frequency rather than one chosen based on the average nyquist frequency.
Returns:

frequency – The heuristically-determined optimal frequency bin

Return type:

ndarray or Quantity

cuvarbase.utils.dphase(dt, freq)[source]
cuvarbase.utils.find_kernel(name)[source]
cuvarbase.utils.gaussian_window(t, t0, d)[source]
cuvarbase.utils.get_autofreqs(t, **kwargs)[source]
cuvarbase.utils.tophat_window(t, t0, d)[source]
cuvarbase.utils.weights(err)[source]

generate observation weights from uncertainties

Module contents