
package cuda

import "github.com/mumax/3/cuda"

Package cuda provides GPU interaction.


Package Files

alloc.go angles.go angles_wrapper.go anisotropy.go buffer.go bytes.go conv_common.go conv_copypad.go conv_demag.go conv_kernmul.go conv_mfm.go conv_selftest.go copypadmul2_wrapper.go copyunpad_wrapper.go crop.go crop_wrapper.go crossproduct.go crossproduct_wrapper.go cubicanisotropy2_wrapper.go div_wrapper.go dmi.go dmi_wrapper.go dmibulk.go dmibulk_wrapper.go dotproduct.go dotproduct_wrapper.go exchange.go exchange_wrapper.go exchangedecode_wrapper.go fatbin.go fft3dc2r.go fft3dr2c.go fftplan.go init.go kernmulc_wrapper.go kernmulrsymm2dxy_wrapper.go kernmulrsymm2dz_wrapper.go kernmulrsymm3d_wrapper.go llnoprecess_wrapper.go lltorque.go lltorque2_wrapper.go lut.go madd.go madd2_wrapper.go madd3_wrapper.go magnetoelastic.go magnetoelasticfield_wrapper.go magnetoelasticforce_wrapper.go maxangle.go maxangle_wrapper.go minimize.go minimize_wrapper.go mslice.go mul_wrapper.go normalize.go normalize_wrapper.go phi_wrapper.go reduce.go reducedot_wrapper.go reducemaxabs_wrapper.go reducemaxdiff_wrapper.go reducemaxvecdiff2_wrapper.go reducemaxvecnorm2_wrapper.go reducesum_wrapper.go region.go regionadds_wrapper.go regionaddv_wrapper.go regiondecode_wrapper.go regionselect_wrapper.go resize.go resize_wrapper.go shift.go shiftbytes_wrapper.go shiftbytesy_wrapper.go shiftx_wrapper.go shifty_wrapper.go shiftz_wrapper.go slice.go slonczewski.go slonczewski2_wrapper.go temperature.go temperature2_wrapper.go theta_wrapper.go topologicalcharge.go topologicalcharge_wrapper.go uniaxialanisotropy2_wrapper.go util.go zeromask.go zeromask_wrapper.go zhangli.go zhangli2_wrapper.go


const (
    BlockSize    = 512
    TileX, TileY = 32, 32
    MaxGridSize  = 65535
)

CUDA launch parameters. There might be better choices for recent hardware, but it barely makes a difference in the end.
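
As an illustration of how such launch parameters are typically used, the sketch below splits a 1D workload into blocks of BlockSize and spreads the blocks over a 2D grid so that neither grid dimension exceeds MaxGridSize. The helper names divUp and launchConfig are hypothetical and are not part of the documented API; this is a sketch under those assumptions, not the package's actual launch code.

```go
package main

import "fmt"

// Values copied from the const block above.
const (
	BlockSize   = 512
	MaxGridSize = 65535
)

// divUp returns ceil(x / y) for positive integers.
func divUp(x, y int) int {
	return (x + y - 1) / y
}

// launchConfig is a hypothetical helper. It covers n threads with
// blocks of BlockSize and arranges the blocks in a gridX x gridY grid.
// gridY never exceeds MaxGridSize; gridX stays within the limit as long
// as the block count is at most MaxGridSize*MaxGridSize.
func launchConfig(n int) (gridX, gridY, block int) {
	blocks := divUp(n, BlockSize)
	gridX = divUp(blocks, MaxGridSize)
	gridY = divUp(blocks, gridX)
	return gridX, gridY, BlockSize
}

func main() {
	gx, gy, b := launchConfig(1000)
	fmt.Println(gx, gy, b)
}
```

The grid may contain more threads than n, so a kernel launched this way would still need a bounds check on its thread index.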

const (
    X   = 0
    Y   = 1
    Z   = 2
)

const CONV_TOLERANCE = 1e-6

Maximum tolerable error on demag convolution self-test.


Maximum tolerable imaginary/real part for demag kernel in Fourier space. Assures kernel has correct symmetry.


Block size for reduce kernels.


var (
    Version     int    // cuda version
    DevName     string // GPU name
    TotalMem    int64  // total GPU memory
    GPUInfo     string // Human-readable GPU description
    Synchronous bool   // for debug: synchronize stream0 at every kernel launch
)

var UseCC = 0

func Add

func Add(dst, src1, src2 *data.Slice)

Add: dst = src1 + src2.

func AddCubicAnisotropy2

func AddCubicAnisotropy2(Beff, m *data.Slice, Msat, k1, k2, k3, c1, c2 MSlice)

Add cubic magnetocrystalline anisotropy field to Beff. See cubicanisotropy2.cu

func AddDMI

func AddDMI(Beff *data.Slice, m *data.Slice, Aex_red, Dex_red SymmLUT, Msat MSlice, regions *Bytes, mesh *data.Mesh, OpenBC bool)

Add effective field of Dzyaloshinskii-Moriya interaction to Beff (Tesla). According to Bogdanov and Rößler, PRL 87, 3, 2001, eq. 8 (out-of-plane symmetry breaking). See dmi.cu

func AddDMIBulk

func AddDMIBulk(Beff *data.Slice, m *data.Slice, Aex_red, D_red SymmLUT, Msat MSlice, regions *Bytes, mesh *data.Mesh, OpenBC bool)

Add effective field due to bulk Dzyaloshinskii-Moriya interaction to Beff. See dmibulk.cu

func AddDotProduct

func AddDotProduct(dst *data.Slice, prefactor float32, a, b *data.Slice)

dst += prefactor * dot(a, b), as used for energy density.

func AddExchange

func AddExchange(B, m *data.Slice, Aex_red SymmLUT, Msat MSlice, regions *Bytes, mesh *data.Mesh)

Add exchange field to Beff.

m: normalized magnetization
B: effective field in Tesla
Aex_red: Aex / (Msat * 1e18 m2)

see exchange.cu

func AddMagnetoelasticField

func AddMagnetoelasticField(Beff, m *data.Slice, exx, eyy, ezz, exy, exz, eyz, B1, B2, Msat MSlice)

Add magnetoelastic coupling field to the effective field. See magnetoelasticfield.cu

func AddSlonczewskiTorque2

func AddSlonczewskiTorque2(torque, m *data.Slice, Msat, J, fixedP, alpha, pol, λ, ε_prime MSlice, flp float64, mesh *data.Mesh)

Add Slonczewski ST torque to torque (Tesla). see slonczewski.cu

func AddUniaxialAnisotropy2

func AddUniaxialAnisotropy2(Beff, m *data.Slice, Msat, k1, k2, u MSlice)

Add uniaxial magnetocrystalline anisotropy field to Beff. see uniaxialanisotropy.cu

func AddZhangLiTorque

func AddZhangLiTorque(torque, m *data.Slice, Msat, J, alpha, xi, pol MSlice, mesh *data.Mesh)

Add Zhang-Li ST torque (Tesla) to torque. see zhangli.cu

func Buffer

func Buffer(nComp int, size [3]int) *data.Slice

Returns a GPU slice for temporary use. To be returned to the pool with Recycle.
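
The get/return pattern described here can be illustrated with a small host-side free list. The pool type and its get and recycle methods below are hypothetical stand-ins written for this sketch; they only mirror the idea of Buffer and Recycle and say nothing about how the package manages GPU memory internally.

```go
package main

import "fmt"

// pool is a hypothetical free list keyed by buffer length.
type pool struct {
	free map[int][][]float32
}

func newPool() *pool {
	return &pool{free: make(map[int][][]float32)}
}

// get returns a recycled buffer of length n if one is available,
// otherwise it allocates a new one.
func (p *pool) get(n int) []float32 {
	if l := p.free[n]; len(l) > 0 {
		buf := l[len(l)-1]
		p.free[n] = l[:len(l)-1]
		return buf
	}
	return make([]float32, n)
}

// recycle hands a buffer back so that a later get of the same
// length can reuse its storage.
func (p *pool) recycle(buf []float32) {
	p.free[len(buf)] = append(p.free[len(buf)], buf)
}

func main() {
	p := newPool()
	a := p.get(8)
	p.recycle(a)
	b := p.get(8) // reuses the storage of a
	fmt.Println(&a[0] == &b[0])
}
```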

func Crop

func Crop(dst, src *data.Slice, offX, offY, offZ int)

Crop stores in dst a rectangle cropped from src at given offset position. dst size may be smaller than src.
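
A CPU reference for the cropping semantics might look like the sketch below. It works on flat host slices, and the index helper with its x-fastest layout is an assumption made for this example only, not a statement about how data.Slice stores cells.

```go
package main

import "fmt"

// index is a hypothetical x-fastest cell layout used only in this sketch.
func index(size [3]int, ix, iy, iz int) int {
	return (iz*size[1]+iy)*size[0] + ix
}

// cropRef copies the box of shape dstSize that starts at
// (offX, offY, offZ) in src into dst.
func cropRef(dst []float32, dstSize [3]int, src []float32, srcSize [3]int, offX, offY, offZ int) {
	for iz := 0; iz < dstSize[2]; iz++ {
		for iy := 0; iy < dstSize[1]; iy++ {
			for ix := 0; ix < dstSize[0]; ix++ {
				dst[index(dstSize, ix, iy, iz)] = src[index(srcSize, ix+offX, iy+offY, iz+offZ)]
			}
		}
	}
}

func main() {
	srcSize := [3]int{4, 2, 1}
	src := []float32{0, 1, 2, 3, 4, 5, 6, 7}
	dstSize := [3]int{2, 1, 1}
	dst := make([]float32, 2)
	cropRef(dst, dstSize, src, srcSize, 1, 1, 0)
	fmt.Println(dst)
}
```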

func CrossProduct

func CrossProduct(dst, a, b *data.Slice)

func Div

func Div(dst, a, b *data.Slice)

Divide: dst[i] = a[i] / b[i]. Divide-by-zero yields zero.
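
The divide-by-zero rule can be spelled out with a CPU reference on plain host slices. The function name divRef is hypothetical; the sketch only restates the documented element-wise behaviour.

```go
package main

import "fmt"

// divRef is a CPU reference for the documented behaviour:
// dst[i] = a[i] / b[i], with 0 stored where b[i] == 0.
func divRef(dst, a, b []float32) {
	for i := range dst {
		if b[i] == 0 {
			dst[i] = 0
		} else {
			dst[i] = a[i] / b[i]
		}
	}
}

func main() {
	a := []float32{1, 2, 3}
	b := []float32{2, 0, 4}
	dst := make([]float32, 3)
	divRef(dst, a, b)
	fmt.Println(dst)
}
```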

func Dot

func Dot(a, b *data.Slice) float32

Dot product.

func ExchangeDecode

func ExchangeDecode(dst *data.Slice, Aex_red SymmLUT, regions *Bytes, mesh *data.Mesh)

Finds the average exchange strength around each cell, for debugging.

func FreeBuffers

func FreeBuffers()

Frees all buffers. Called after mesh resize.

func GPUCopy

func GPUCopy(in *data.Slice) *data.Slice

Returns a copy of in, allocated on GPU.

func GetCell

func GetCell(s *data.Slice, comp, ix, iy, iz int) float32

func GetElem

func GetElem(s *data.Slice, comp int, index int) float32

func GetMagnetoelasticForceDensity

func GetMagnetoelasticForceDensity(out, m *data.Slice, B1, B2 MSlice, mesh *data.Mesh)

Calculate magnetoelastic force density. See magnetoelasticforce.cu

func Init

func Init(gpu int)

Locks to an OS thread and initializes CUDA for that thread.

func LLNoPrecess

func LLNoPrecess(torque, m, B *data.Slice)

Landau-Lifshitz torque with precession disabled. Used by engine.Relax().

func LLTorque

func LLTorque(torque, m, B *data.Slice, alpha MSlice)

Landau-Lifshitz torque divided by gamma0:

- 1/(1+α²) [ m x B +  α m x (m x B) ]
torque in Tesla
m normalized
B in Tesla

see lltorque.cu
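
For a single cell, the torque expression above can be evaluated on the CPU as in the following sketch. The vec type and the cross and llTorque helpers are hypothetical names introduced for this example; the GPU kernel in lltorque.cu may organise the computation differently.

```go
package main

import "fmt"

// vec is a hypothetical 3-component vector type for this sketch.
type vec [3]float64

// cross returns the vector product a x b.
func cross(a, b vec) vec {
	return vec{
		a[1]*b[2] - a[2]*b[1],
		a[2]*b[0] - a[0]*b[2],
		a[0]*b[1] - a[1]*b[0],
	}
}

// llTorque evaluates -1/(1+α²) [ m x B + α m x (m x B) ] for one cell.
// m is assumed to be normalized and B is in Tesla.
func llTorque(m, B vec, alpha float64) vec {
	mxB := cross(m, B)
	mxmxB := cross(m, mxB)
	f := -1 / (1 + alpha*alpha)
	var t vec
	for c := range t {
		t[c] = f * (mxB[c] + alpha*mxmxB[c])
	}
	return t
}

func main() {
	m := vec{1, 0, 0}
	B := vec{0, 1, 0}
	fmt.Println(llTorque(m, B, 0)) // precession term only
	fmt.Println(llTorque(m, B, 1)) // precession plus damping
}
```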

func Madd2

func Madd2(dst, src1, src2 *data.Slice, factor1, factor2 float32)

multiply-add: dst[i] = src1[i] * factor1 + src2[i] * factor2

func Madd3

func Madd3(dst, src1, src2, src3 *data.Slice, factor1, factor2, factor3 float32)

multiply-add: dst[i] = src1[i] * factor1 + src2[i] * factor2 + src3[i] * factor3

func MaxAbs

func MaxAbs(in *data.Slice) float32

Maximum of absolute values of all elements.

func MaxVecDiff

func MaxVecDiff(x, y *data.Slice) float64

Maximum of the norms of the difference between all vectors (x1,y1,z1) and (x2,y2,z2)

(dx, dy, dz) = (x1, y1, z1) - (x2, y2, z2)
max_i sqrt( dx[i]*dx[i] + dy[i]*dy[i] + dz[i]*dz[i] )

func MaxVecNorm

func MaxVecNorm(v *data.Slice) float64

Maximum of the norms of all vectors (x[i], y[i], z[i]).

max_i sqrt( x[i]*x[i] + y[i]*y[i] + z[i]*z[i] )
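
A CPU reference for this reduction is sketched below on three host slices, one per component. The function name maxVecNorm is hypothetical. The sketch tracks the largest squared norm and takes a single square root at the end, which gives the same result because the square root is monotonic.

```go
package main

import (
	"fmt"
	"math"
)

// maxVecNorm returns max_i sqrt(x[i]² + y[i]² + z[i]²).
func maxVecNorm(x, y, z []float64) float64 {
	max2 := 0.0
	for i := range x {
		n2 := x[i]*x[i] + y[i]*y[i] + z[i]*z[i]
		if n2 > max2 {
			max2 = n2
		}
	}
	return math.Sqrt(max2)
}

func main() {
	x := []float64{3, 0}
	y := []float64{4, 0}
	z := []float64{0, 1}
	fmt.Println(maxVecNorm(x, y, z))
}
```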

func MemAlloc

func MemAlloc(bytes int64) unsafe.Pointer

Wrapper for cu.MemAlloc, fatal exit on out of memory.

func MemCpy

func MemCpy(dst, src unsafe.Pointer, bytes int64)

func MemCpyDtoH

func MemCpyDtoH(dst, src unsafe.Pointer, bytes int64)

func MemCpyHtoD

func MemCpyHtoD(dst, src unsafe.Pointer, bytes int64)

func Memset

func Memset(s *data.Slice, val ...float32)

Memset sets the Slice's components to the specified values. Use with care on unified slices (a sync is needed).

func Minimize

func Minimize(m, m0, torque *data.Slice, dt float32)

m = 1 / (4 + τ²(m x H)²) [{4 - τ²(m x H)²} m - 4τ(m x m x H)]

Note: torque from LLNoPrecess has negative sign.
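
One way to read this update for a single cell is sketched below. It assumes that the torque argument already equals -m x (m x H), following the sign note for LLNoPrecess, and that (m x H)² can be replaced by the squared torque magnitude, which holds for a normalized m. The names vec, dot and minimizeStep are hypothetical.

```go
package main

import "fmt"

// vec is a hypothetical 3-component vector type for this sketch.
type vec [3]float64

// dot returns the scalar product of a and b.
func dot(a, b vec) float64 {
	return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]
}

// minimizeStep applies one update to m0 with step dt (τ in the formula).
// t is assumed to be the torque -m x (m x H), so the term -4τ(m x m x H)
// becomes +4*dt*t.
func minimizeStep(m0, t vec, dt float64) vec {
	t2 := dt * dt * dot(t, t)
	var m vec
	for c := range m {
		m[c] = ((4-t2)*m0[c] + 4*dt*t[c]) / (4 + t2)
	}
	return m
}

func main() {
	m0 := vec{1, 0, 0}
	t := vec{0, 2, 0}
	fmt.Println(minimizeStep(m0, t, 1))
}
```

When t is perpendicular to a unit-length m0, the squared length of the numerator is (4 - t2)² + 16 t2 = (4 + t2)², so the result keeps unit length without an explicit normalization.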

func Mul

func Mul(dst, a, b *data.Slice)

multiply: dst[i] = a[i] * b[i]. a and b must have the same number of components.

func NewSlice

func NewSlice(nComp int, size [3]int) *data.Slice

Make a GPU Slice with nComp components, each of the given size.

func Normalize

func Normalize(vec, vol *data.Slice)

Normalize vec to unit length, unless length or vol are zero.

func Recycle

func Recycle(s *data.Slice)

Returns a buffer obtained from Buffer to the pool.

func RegionAddS

func RegionAddS(dst *data.Slice, lut LUTPtr, regions *Bytes)

dst += LUT[region], for scalar. Used to add terms to scalar excitation.

func RegionAddV

func RegionAddV(dst *data.Slice, lut LUTPtrs, regions *Bytes)

dst += LUT[region], for vectors. Used to add terms to excitation.

func RegionDecode

func RegionDecode(dst *data.Slice, lut LUTPtr, regions *Bytes)

Decode the regions+LUT pair into an uncompressed array.
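
On the host, decoding a regions+LUT pair amounts to one table lookup per cell, as in this sketch. The function name regionDecodeRef is hypothetical, and the 256-entry table follows the LUTPtr comment ("points to 256 float32's").

```go
package main

import "fmt"

// regionDecodeRef expands a per-region table into a per-cell array:
// dst[i] = lut[regions[i]].
func regionDecodeRef(dst []float32, lut *[256]float32, regions []byte) {
	for i, r := range regions {
		dst[i] = lut[r]
	}
}

func main() {
	var lut [256]float32
	lut[0] = 1.5
	lut[3] = -2
	regions := []byte{0, 3, 3, 0}
	dst := make([]float32, len(regions))
	regionDecodeRef(dst, &lut, regions)
	fmt.Println(dst)
}
```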

func RegionSelect

func RegionSelect(dst, src *data.Slice, regions *Bytes, region byte)

Select the part of src within the specified region; set zeros everywhere else.

func Resize

func Resize(dst, src *data.Slice, layer int)

Select and resize one layer for interactive output.

func SetCell

func SetCell(s *data.Slice, comp int, ix, iy, iz int, value float32)

func SetElem

func SetElem(s *data.Slice, comp int, index int, value float32)

func SetMaxAngle

func SetMaxAngle(dst, m *data.Slice, Aex_red SymmLUT, regions *Bytes, mesh *data.Mesh)

SetMaxAngle sets dst to the maximum angle of each cell's magnetization with all of its neighbors, provided the exchange stiffness with that neighbor is nonzero.

func SetPhi

func SetPhi(s *data.Slice, m *data.Slice)

func SetTemperature

func SetTemperature(Bth, noise *data.Slice, k2mu0_Mu0VgammaDt float64, Msat, Temp, Alpha MSlice)

Set Bth to thermal noise (Brown). see temperature.cu

func SetTheta

func SetTheta(s *data.Slice, m *data.Slice)

func SetTopologicalCharge

func SetTopologicalCharge(s *data.Slice, m *data.Slice, mesh *data.Mesh)

Set s to the topological charge density s = m · (∂m/∂x × ∂m/∂y). See topologicalcharge.cu

func ShiftBytes

func ShiftBytes(dst, src *Bytes, m *data.Mesh, shiftX int, clamp byte)

Like ShiftX, but for bytes.

func ShiftBytesY

func ShiftBytesY(dst, src *Bytes, m *data.Mesh, shiftY int, clamp byte)

func ShiftX

func ShiftX(dst, src *data.Slice, shiftX int, clampL, clampR float32)

Shift src by shiftX cells (positive or negative) along the X-axis and store the result in dst. The new edge value is clampL at the left edge or clampR at the right edge.
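
The clamping behaviour can be illustrated in one dimension with a CPU reference. The function name shiftXRef is hypothetical, and the direction convention (a positive shift moves data toward larger x) is an assumption of this sketch.

```go
package main

import "fmt"

// shiftXRef writes src shifted by shift cells into dst.
// Cells that would be read from before the start of src get clampL,
// cells that would be read from past the end of src get clampR.
func shiftXRef(dst, src []float32, shift int, clampL, clampR float32) {
	for ix := range dst {
		from := ix - shift
		switch {
		case from < 0:
			dst[ix] = clampL
		case from >= len(src):
			dst[ix] = clampR
		default:
			dst[ix] = src[from]
		}
	}
}

func main() {
	src := []float32{1, 2, 3, 4}
	dst := make([]float32, len(src))
	shiftXRef(dst, src, 1, 9, 7)
	fmt.Println(dst)
	shiftXRef(dst, src, -1, 9, 7)
	fmt.Println(dst)
}
```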

func ShiftY

func ShiftY(dst, src *data.Slice, shiftY int, clampL, clampR float32)

func ShiftZ

func ShiftZ(dst, src *data.Slice, shiftZ int, clampL, clampR float32)

func Sum

func Sum(in *data.Slice) float32

Sum of all elements.

func Sync

func Sync()

Synchronize the global stream. This is called before and after all memcopy operations between host and device.

func Zero

func Zero(s *data.Slice)

Set all elements of all components to zero.

func ZeroMask

func ZeroMask(dst *data.Slice, mask LUTPtr, regions *Bytes)

Sets vector dst to zero where mask != 0.

type Bytes

type Bytes struct {
    Ptr unsafe.Pointer
    Len int
}

3D byte slice, used for region lookup.

func NewBytes

func NewBytes(Len int) *Bytes

Construct new byte slice with given length, initialised to zeros.

func (*Bytes) Copy

func (dst *Bytes) Copy(src *Bytes)

Copy on device: dst = src.

func (*Bytes) Download

func (src *Bytes) Download(dst []byte)

Copy to host: dst = src.

func (*Bytes) Free

func (b *Bytes) Free()

Frees the GPU memory and disables the slice.

func (*Bytes) Get

func (src *Bytes) Get(index int) byte

Get one element. data.Index can be used to find the index for x,y,z.

func (*Bytes) Set

func (dst *Bytes) Set(index int, value byte)

Set one element to value. data.Index can be used to find the index for x,y,z.

func (*Bytes) Upload

func (dst *Bytes) Upload(src []byte)

Upload src (host) to dst (gpu).

type DemagConvolution

type DemagConvolution struct {
    // contains filtered or unexported fields
}

Stores the necessary state to perform FFT-accelerated convolution with magnetostatic kernel (or other kernel of same symmetry).

func NewDemag

func NewDemag(inputSize, PBC [3]int, kernel [3][3]*data.Slice, test bool) *DemagConvolution

Initializes a convolution to evaluate the demag field for the given mesh geometry. Sanity-checked if test == true (slow-ish for large meshes).

func (*DemagConvolution) Exec

func (c *DemagConvolution) Exec(B, m, vol *data.Slice, Msat MSlice)

Calculate the demag field of m * vol * Bsat, store result in B.

m:    magnetization normalized to unit length
vol:  unitless mask used to scale m's length, may be nil
Bsat: saturation magnetization in Tesla
B:    resulting demag field, in Tesla

func (*DemagConvolution) Free

func (c *DemagConvolution) Free()

type LUTPtr

type LUTPtr unsafe.Pointer // points to 256 float32's

type LUTPtrs

type LUTPtrs []unsafe.Pointer // elements point to 256 float32's

type MFMConvolution

type MFMConvolution struct {
    // contains filtered or unexported fields
}

Stores the necessary state to perform FFT-accelerated convolution.

func NewMFM

func NewMFM(mesh *data.Mesh, lift, tipsize float64, cachedir string) *MFMConvolution

Initializes a convolution to evaluate the MFM image for the given mesh geometry.

func (*MFMConvolution) Exec

func (c *MFMConvolution) Exec(outp, inp, vol *data.Slice, Msat MSlice)

Stores the MFM image in outp, based on the magnetization in inp.

func (*MFMConvolution) Free

func (c *MFMConvolution) Free()

func (*MFMConvolution) Reinit

func (c *MFMConvolution) Reinit(lift, tipsize float64, cachedir string)

type MSlice

type MSlice struct {
    // contains filtered or unexported fields
}

Slice + scalar multiplier.

func MakeMSlice

func MakeMSlice(arr *data.Slice, mul []float64) MSlice

func ToMSlice

func ToMSlice(s *data.Slice) MSlice

func (MSlice) DevPtr

func (m MSlice) DevPtr(c int) unsafe.Pointer

func (MSlice) Len

func (m MSlice) Len() int

func (MSlice) Mul

func (m MSlice) Mul(c int) float32

func (MSlice) Recycle

func (m MSlice) Recycle()

func (MSlice) SetMul

func (m MSlice) SetMul(c int, mul float32)

func (MSlice) Size

func (m MSlice) Size() [3]int

type SymmLUT

type SymmLUT unsafe.Pointer // points to 256x256 symmetric matrix, only lower half stored. See exchange.cu
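
Storing only the lower half of a symmetric 256x256 matrix is commonly done with a packed triangular index, as in the sketch below. The helper symIdx and the constant nRegion are hypothetical names, and whether exchange.cu uses exactly this packing is an assumption.

```go
package main

import "fmt"

// nRegion is the matrix dimension, taken from the SymmLUT comment above.
const nRegion = 256

// symIdx maps a pair (i, j) to its position in a packed lower-triangular
// array. The pair is reordered so that the result is the same for
// (i, j) and (j, i).
func symIdx(i, j int) int {
	if j > i {
		i, j = j, i
	}
	return i*(i+1)/2 + j
}

func main() {
	// Number of stored elements for the full lower half including the diagonal.
	fmt.Println(nRegion * (nRegion + 1) / 2)
	fmt.Println(symIdx(2, 1), symIdx(1, 2))
}
```

With this packing, 32896 values are stored instead of 65536, and a lookup for regions (i, j) never needs to know which of the two indices is larger.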


Directories

cu       Go bindings for the CUDA driver API.
cufft    Go bindings for the CUDA CUFFT API.

Package cuda imports 14 packages and is imported by 15 packages. Updated 2019-11-05.