fusera

package module
v0.0.4
Published: Apr 19, 2018 License: Apache-2.0 Imports: 24 Imported by: 0

README

Fusera

Fusera (FUSE for SRA) is a FUSE implementation for the cloud extension to the NCBI Sequence Read Archive (SRA). SRA accepts data from all kinds of sequencing projects, including clinically important studies that involve human subjects or their metagenomes, which may contain human sequences. Access to these data is often controlled through dbGaP (the database of Genotypes and Phenotypes). The SRA provides access to cloud-hosted data through a web-services API (documented here) that provides signed URL access to data objects. Fusera presents selected SRA data elements as a read-only file system, enabling users and tools to access the data through a file system interface. The related sracp tool (reference) provides a convenient interface for copying the data to a mounted file system within a virtual machine. These tools are intended for deployment on Linux.

Fundamentally, Fusera presents all of the cloud-hosted SRA data for a set of SRA Accession numbers as a mounted directory, with one subdirectory per SRA Accession number. The user’s credentials are passed through a dbGaP repository key, or ngc file, that is obtained from dbGaP. Fusera relies on the SRA’s Nameservice API (reference), which may limit Fusera's ability to ‘see’ certain data sets depending on where the Fusera service is deployed, with the aim of limiting charges for data egress.

Installation

Both apt-get and Homebrew packages are planned to ease the install process.

Dependencies
Fusera

Depending on the Linux distribution, fuse-utils may need to be installed.

Mac users must install osxfuse, either from its website or through Homebrew:

$ brew cask install osxfuse
Sracp

For now, sracp requires an installation of curl. This is an "alpha" solution, and removing this dependency is on the roadmap.

Pre-built Releases

For easy installation, releases of Fusera and Sracp can be found at https://github.com/mitre/fusera/releases

For Linux: fusera-linux-amd64 and sracp-linux-amd64

For Mac: fusera-darwin-amd64 and sracp-darwin-amd64

After downloading, it's advised to rename the files to fusera and sracp to fit with the rest of this document.

Make sure to grab the latest release, which is marked with a green badge in the left sidebar. You will probably also need to make the binaries executable:

chmod +x fusera
chmod +x sracp

It is advised to move these files to a directory in your PATH so that fusera and sracp can be called from anywhere.

Build from source:
$ go get github.com/mitre/fusera/cmd/fusera
$ go install github.com/mitre/fusera/cmd/fusera
$ go get github.com/mitre/fusera/cmd/sracp
$ go install github.com/mitre/fusera/cmd/sracp

Usage

In an effort to keep the instructions as up to date as possible, please refer to the wiki for instructions on how to use fusera or sracp:
https://github.com/mitre/fusera/wiki/Running-Fusera
https://github.com/mitre/fusera/wiki/Running-Sracp

Troubleshooting

https://github.com/mitre/fusera/wiki/Troubleshooting

License

Fusera started its life as a hard fork of the Goofys project.

Copyright (C) 2015 - 2017 Ka-Hing Cheung

Modifications Copyright (C) 2018 The MITRE Corporation

The modifications were developed for the NIH Cloud Commons Pilot. General questions can be forwarded to:

opensource@mitre.org
Technology Transfer Office
The MITRE Corporation
7515 Colshire Drive
McLean, VA 22102-7539

Licensed under the Apache License, Version 2.0

Only the functionality needed was retained from the Goofys project. Here is a list of files removed from the original source:

  • api/api.go
  • internal/
    • perms.go
    • ticket.go
    • ticket_test.go
    • v2signer.go
    • minio_test.go
    • goofys_test.go
    • aws_test.go

There has also been some refactoring of the codebase, so while some files have been removed, the code in them might exist in other files. License headers and copyright have been kept in these circumstances.

The major changes to the original source stem from Fusera's use case. Instead of communicating with only one bucket, Fusera is redesigned to access many different files distributed over multiple cloud services. One file can be in Google Cloud Storage while another file that appears to be in the same folder actually exists in an AWS S3 bucket. This flexibility is possible partly because Fusera only needs read access to the files and can therefore use either public or signed URLs to make HTTP requests directly to the cloud service's API.

Accordingly, Goofys' use of the aws-sdk to interact with AWS-compatible endpoints was replaced with a more flexible way of communicating, since Fusera has no need to authenticate for write access to any of these files.

Goofys' startup was also modified so that Fusera can communicate with an NIH API that provides the URLs used to access the desired files.
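
To illustrate what reading through a public or signed URL can look like, here is a minimal sketch of a ranged read over plain HTTP. It is not Fusera's actual code: readRange, memTransport, and the storage.example URL are hypothetical names introduced for this example, and the in-memory transport stands in for a cloud storage service so that nothing touches the network.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// readRange fetches length bytes starting at offset using an HTTP Range request.
// The URL may be public or signed; no SDK or credentials are involved.
func readRange(c *http.Client, url string, offset, length int64) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", offset, offset+length-1))
	resp, err := c.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusPartialContent {
		return nil, fmt.Errorf("expected 206 Partial Content, got %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// memTransport is a hypothetical stand-in for a storage service: it answers
// every request from an in-memory object and honors Range headers.
type memTransport struct{ object string }

func (t memTransport) RoundTrip(r *http.Request) (*http.Response, error) {
	rec := httptest.NewRecorder()
	http.ServeContent(rec, r, "", time.Time{}, strings.NewReader(t.object))
	return rec.Result(), nil
}

// demo reads bytes 4 through 7 of a ten-byte object via a placeholder URL.
func demo() (string, error) {
	client := &http.Client{Transport: memTransport{object: "ACGTACGTNN"}}
	b, err := readRange(client, "https://storage.example/object?sig=placeholder", 4, 4)
	return string(b), err
}

func main() {
	s, err := demo()
	fmt.Println(s, err) // prints "ACGT <nil>"
}
```

Because the request carries no provider-specific authentication, the same function works regardless of which cloud service the URL points to, which is the flexibility described above.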

References

Fusera owes a lot to the Goofys project: a high-performance, POSIX-ish file system written in Go. This was used as a starting point.

Documentation

Constants

const BUF_SIZE = 5 * 1024 * 1024
const MAX_READAHEAD = uint32(100 * 1024 * 1024)
const READAHEAD_CHUNK = uint32(20 * 1024 * 1024)
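
The three constants are plain byte counts. As a worked example of the arithmetic only (how Fusera uses them internally is not described on this page), the sketch below redeclares them locally and computes how many read-ahead chunks fit in the maximum read-ahead window and how many BUF_SIZE buffers cover one chunk. The helper names are hypothetical.

```go
package main

import "fmt"

// Local copies of the documented constants, so the example runs
// without importing the fusera package.
const (
	bufSize        = 5 * 1024 * 1024
	maxReadahead   = uint32(100 * 1024 * 1024)
	readaheadChunk = uint32(20 * 1024 * 1024)
)

// chunksPerWindow reports how many read-ahead chunks fit in the maximum window.
func chunksPerWindow() uint32 { return maxReadahead / readaheadChunk }

// buffersPerChunk reports how many bufSize buffers cover one read-ahead chunk.
func buffersPerChunk() uint32 { return readaheadChunk / uint32(bufSize) }

func main() {
	fmt.Println(chunksPerWindow(), buffersPerChunk()) // prints "5 4"
}
```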

Variables

This section is empty.

Functions

func Dup

func Dup(value []byte) []byte

func MaxInt

func MaxInt(a, b int) int

func MaxUInt32

func MaxUInt32(a, b uint32) uint32

func MaxUInt64

func MaxUInt64(a, b uint64) uint64

func MinInt

func MinInt(a, b int) int

func MinUInt32

func MinUInt32(a, b uint32) uint32

func MinUInt64

func MinUInt64(a, b uint64) uint64
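
The Min and Max helpers are undocumented; judging by their names and signatures, they presumably return the smaller or larger of their two arguments. Under that assumption, a typical use is capping a read so it does not run past the end of a file. The sketch below uses a local stand-in (minUInt64) rather than the package function, and bytesToRead is a hypothetical helper.

```go
package main

import "fmt"

// minUInt64 is a local stand-in with the same signature as MinUInt64.
// It is assumed, not confirmed, to match the package function's behavior.
func minUInt64(a, b uint64) uint64 {
	if a < b {
		return a
	}
	return b
}

// bytesToRead caps a requested read length so that it does not extend
// past the end of a file of the given size.
func bytesToRead(size, offset, requested uint64) uint64 {
	if offset >= size {
		return 0
	}
	return minUInt64(requested, size-offset)
}

func main() {
	fmt.Println(bytesToRead(100, 90, 32)) // prints "10"
	fmt.Println(bytesToRead(100, 0, 32))  // prints "32"
}
```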

func TryUnmount

func TryUnmount(mountPoint string) (err error)

Types

type Buffer

type Buffer struct {
	// contains filtered or unexported fields
}

func (*Buffer) Close

func (b *Buffer) Close() (err error)

func (Buffer) Init

func (b Buffer) Init(buf *MBuf, r ReaderProvider) *Buffer

func (*Buffer) Read

func (b *Buffer) Read(p []byte) (n int, err error)

type BufferPool

type BufferPool struct {
	// contains filtered or unexported fields
}

func NewBufferPool

func NewBufferPool(maxSizeGlobal uint64) *BufferPool

for testing

func (*BufferPool) Free

func (pool *BufferPool) Free(buf []byte)

func (BufferPool) Init

func (pool BufferPool) Init() *BufferPool

func (*BufferPool) MaybeGC

func (pool *BufferPool) MaybeGC()

func (*BufferPool) RequestBuffer

func (pool *BufferPool) RequestBuffer() (buf []byte)

func (*BufferPool) RequestMultiple

func (pool *BufferPool) RequestMultiple(size uint64, block bool) (buffers [][]byte)

type DirHandle

type DirHandle struct {
	Entries    []*DirHandleEntry
	Marker     *string
	BaseOffset int
	// contains filtered or unexported fields
}

func NewDirHandle

func NewDirHandle(inode *Inode) (dh *DirHandle)

func (*DirHandle) CloseDir

func (dh *DirHandle) CloseDir() error

func (*DirHandle) ReadDir

func (dh *DirHandle) ReadDir(offset fuseops.DirOffset) (en *DirHandleEntry, err error)

LOCKS_REQUIRED(dh.mu)

type DirHandleEntry

type DirHandleEntry struct {
	Name   *string
	Inode  fuseops.InodeID
	Type   fuseutil.DirentType
	Offset fuseops.DirOffset

	Attributes   *InodeAttributes
	ETag         *string
	StorageClass *string
}

type DirInodeData

type DirInodeData struct {
	DirTime time.Time

	Children []*Inode
	// contains filtered or unexported fields
}

type FileHandle

type FileHandle struct {
	// contains filtered or unexported fields
}

func NewFileHandle

func NewFileHandle(in *Inode) *FileHandle

func (*FileHandle) ReadFile

func (fh *FileHandle) ReadFile(offset int64, buf []byte) (bytesRead int, err error)

func (*FileHandle) Release

func (fh *FileHandle) Release()

type Fusera

type Fusera struct {
	fuseutil.NotImplementedFileSystem
	// contains filtered or unexported fields
}

func Mount

func Mount(ctx context.Context, opt *Options) (*Fusera, *fuse.MountedFileSystem, error)

func NewFusera

func NewFusera(ctx context.Context, opt *Options) (*Fusera, error)

func (*Fusera) GetInodeAttributes

func (fs *Fusera) GetInodeAttributes(ctx context.Context, op *fuseops.GetInodeAttributesOp) (err error)

func (*Fusera) GetXattr

func (fs *Fusera) GetXattr(ctx context.Context, op *fuseops.GetXattrOp) (err error)

func (*Fusera) ListXattr

func (fs *Fusera) ListXattr(ctx context.Context, op *fuseops.ListXattrOp) (err error)

func (*Fusera) LookUpInode

func (fs *Fusera) LookUpInode(ctx context.Context, op *fuseops.LookUpInodeOp) (err error)

func (*Fusera) OpenDir

func (fs *Fusera) OpenDir(ctx context.Context, op *fuseops.OpenDirOp) (err error)

func (*Fusera) OpenFile

func (fs *Fusera) OpenFile(ctx context.Context, op *fuseops.OpenFileOp) (err error)

func (*Fusera) ReadDir

func (fs *Fusera) ReadDir(ctx context.Context, op *fuseops.ReadDirOp) (err error)

LOCKS_EXCLUDED(fs.mu)

func (*Fusera) ReadFile

func (fs *Fusera) ReadFile(ctx context.Context, op *fuseops.ReadFileOp) (err error)

func (*Fusera) ReleaseDirHandle

func (fs *Fusera) ReleaseDirHandle(ctx context.Context, op *fuseops.ReleaseDirHandleOp) (err error)

func (*Fusera) ReleaseFileHandle

func (fs *Fusera) ReleaseFileHandle(ctx context.Context, op *fuseops.ReleaseFileHandleOp) (err error)

func (*Fusera) SigUsr1

func (fs *Fusera) SigUsr1()

func (*Fusera) StatFS

func (fs *Fusera) StatFS(ctx context.Context, op *fuseops.StatFSOp) (err error)

func (*Fusera) SyncFile

func (fs *Fusera) SyncFile(ctx context.Context, op *fuseops.SyncFileOp) (err error)

type Inode

type Inode struct {
	Id          fuseops.InodeID
	Name        *string
	Link        string
	Acc         string
	ErrContents string

	Attributes InodeAttributes
	KnownSize  *uint64
	AttrTime   time.Time

	Parent *Inode

	Invalid     bool
	ImplicitDir bool
	// contains filtered or unexported fields
}

func NewInode

func NewInode(fs *Fusera, parent *Inode, name *string, fullName *string) (inode *Inode)

func (*Inode) DeRef

func (inode *Inode) DeRef(n uint64) (stale bool)

LOCKS_REQUIRED(fs.mu)

func (*Inode) FullName

func (inode *Inode) FullName() *string

func (*Inode) GetAttributes

func (inode *Inode) GetAttributes() (*fuseops.InodeAttributes, error)

func (*Inode) GetXattr

func (inode *Inode) GetXattr(name string) ([]byte, error)

func (*Inode) InflateAttributes

func (inode *Inode) InflateAttributes() (attr fuseops.InodeAttributes)

func (*Inode) ListXattr

func (inode *Inode) ListXattr() ([]string, error)

func (*Inode) OpenDir

func (inode *Inode) OpenDir() (dh *DirHandle)

func (*Inode) OpenFile

func (inode *Inode) OpenFile() (fh *FileHandle, err error)

func (*Inode) Ref

func (inode *Inode) Ref()

LOCKS_REQUIRED(fs.mu) XXX why did I put lock required? This used to return a resurrect bool which no longer does anything; need to look into that to see if that was legacy

func (*Inode) ToDir

func (inode *Inode) ToDir()

type InodeAttributes

type InodeAttributes struct {
	Size           uint64
	Mtime          time.Time
	ExpirationDate time.Time
}
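
The ExpirationDate field is not documented here. Assuming it records when an inode's access (for example, its signed URL) stops being valid, a staleness check could look like the sketch below. The inodeAttributes type is a local mirror of the documented fields, and expired and exampleAttrs are hypothetical helpers; treating a zero ExpirationDate as "never expires" is also an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// inodeAttributes mirrors the documented InodeAttributes fields.
type inodeAttributes struct {
	Size           uint64
	Mtime          time.Time
	ExpirationDate time.Time
}

// expired reports whether attr is past its expiration date at time now.
// A zero ExpirationDate is treated as "never expires" (an assumption).
func expired(attr inodeAttributes, now time.Time) bool {
	return !attr.ExpirationDate.IsZero() && now.After(attr.ExpirationDate)
}

// exampleAttrs returns attributes that expired one minute before the
// returned reference time.
func exampleAttrs() (inodeAttributes, time.Time) {
	now := time.Date(2018, time.April, 19, 12, 0, 0, 0, time.UTC)
	attr := inodeAttributes{
		Size:           1024,
		Mtime:          now.Add(-time.Hour),
		ExpirationDate: now.Add(-time.Minute),
	}
	return attr, now
}

func main() {
	attr, now := exampleAttrs()
	fmt.Println(expired(attr, now)) // prints "true"
}
```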

type MBuf

type MBuf struct {
	// contains filtered or unexported fields
}

func (*MBuf) Free

func (mb *MBuf) Free()

func (*MBuf) Full

func (mb *MBuf) Full() bool

func (MBuf) Init

func (mb MBuf) Init(h *BufferPool, size uint64, block bool) *MBuf

func (*MBuf) Read

func (mb *MBuf) Read(p []byte) (n int, err error)

func (*MBuf) Seek

func (mb *MBuf) Seek(offset int64, whence int) (int64, error)

seek only seeks the reader

func (*MBuf) Write

func (mb *MBuf) Write(p []byte) (n int, err error)

func (*MBuf) WriteFrom

func (mb *MBuf) WriteFrom(r io.Reader) (n int, err error)

type Options

type Options struct {
	Ngc         []byte
	Acc         map[string]bool
	Loc         string
	ApiEndpoint string
	AwsBatch    int
	GcpBatch    int
	// SRR# has a map of file names that map to urls where the data is
	Urls map[string]map[string]string

	// File system
	MountOptions      map[string]string
	MountPoint        string
	MountPointArg     string
	MountPointCreated string

	Cache    []string
	DirMode  os.FileMode
	FileMode os.FileMode
	Uid      uint32
	Gid      uint32

	// Tuning
	StatCacheTTL time.Duration
	TypeCacheTTL time.Duration

	// // Debugging
	Debug bool
}
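
The Urls field maps each accession to a map from file name to URL, which lines up with the README's description of one subdirectory per accession. As a rough sketch of that correspondence (not the package's own code), the hypothetical mountPaths helper below flattens such a map into the relative paths a mount would be expected to expose. The accession numbers, file names, and URLs are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// mountPaths flattens an accession -> file name -> URL map (the shape of
// Options.Urls) into sorted "accession/file" relative paths.
func mountPaths(urls map[string]map[string]string) []string {
	var paths []string
	for acc, files := range urls {
		for name := range files {
			paths = append(paths, acc+"/"+name)
		}
	}
	sort.Strings(paths)
	return paths
}

func main() {
	// Placeholder data; real URLs would come from the SRA API.
	urls := map[string]map[string]string{
		"SRR0000001": {"reads.bam": "https://storage.example/a?sig=placeholder"},
		"SRR0000002": {"reads.sra": "https://storage.example/b?sig=placeholder"},
	}
	for _, p := range mountPaths(urls) {
		fmt.Println(p)
	}
}
```

Note that files under the same accession may point at different cloud services; only the URL values differ, not the directory layout.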

type ReaderProvider

type ReaderProvider func() (io.ReadCloser, error)
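
A ReaderProvider is a function that opens a stream on demand, which lets a consumer defer opening (or reopen) the underlying source. The sketch below declares a local type of the same shape and builds a provider over an in-memory string; fromString and readAll are hypothetical helpers, and how Buffer.Init actually uses its provider is not documented here.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readerProvider has the same shape as the documented ReaderProvider type.
type readerProvider func() (io.ReadCloser, error)

// fromString returns a provider that opens a fresh reader over s on each call.
func fromString(s string) readerProvider {
	return func() (io.ReadCloser, error) {
		return io.NopCloser(strings.NewReader(s)), nil
	}
}

// readAll opens the provider, drains the stream, and closes it.
func readAll(p readerProvider) (string, error) {
	rc, err := p()
	if err != nil {
		return "", err
	}
	defer rc.Close()
	b, err := io.ReadAll(rc)
	return string(b), err
}

func main() {
	p := fromString("ACGT")
	s, err := readAll(p)
	fmt.Println(s, err) // prints "ACGT <nil>"
}
```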

type S3ReadBuffer

type S3ReadBuffer struct {
	// contains filtered or unexported fields
}

func (*S3ReadBuffer) Read

func (b *S3ReadBuffer) Read(offset uint64, p []byte) (n int, err error)

Directories

Path Synopsis
cmd
