Code
Repo Manager
dgit supports multiple ways to store datasets: git itself, or the local filesystem (possibly with an S3 backend). We expect to support Instabase in the future.
-
class dgitcore.plugins.repomanager.RepoManagerBase(name, version, description, supported=[])
Bases: object
The repository manager handles the specifics of the version control system. Currently only the git manager is supported.
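The constructor signature above suggests that each plugin identifies itself with a name, a version, a description, and a list of supported items. The sketch below illustrates that pattern only. `PluginBase` is a stand-in defined locally so the example runs without dgit installed, it is not dgit's `RepoManagerBase`, and `DummyRepoManager`, its attribute names, and its `calls` list are hypothetical.

```python
class PluginBase:
    """Local stand-in mirroring the documented constructor signature (assumption)."""

    def __init__(self, name, version, description, supported=None):
        self.name = name
        self.version = version
        self.description = description
        # Copy the list so instances never share one mutable default.
        self.supported = list(supported) if supported is not None else []


class DummyRepoManager(PluginBase):
    """Hypothetical repo manager that only records what it is asked to do."""

    def __init__(self):
        super().__init__("dummy", "v0", "Illustrative repo manager", supported=["dummy"])
        self.calls = []

    def log(self, repo, args=None):
        # A real manager would delegate to its version control system here.
        entry = ("log", repo, list(args or []))
        self.calls.append(entry)
        return entry


manager = DummyRepoManager()
result = manager.log("my-repo", ["--oneline"])
```

Using `supported=None` in the stand-in instead of `supported=[]` is a deliberate choice for the sketch: it avoids Python's shared mutable default argument pitfall.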
-
class dgitcore.contrib.repomanagers.gitmanager.GitRepoManager
Bases: dgitcore.plugins.repomanager.RepoManagerBase
Git-based versioning service. This implements the RepoManagerBase class.
-
clone(url, backend=None)
Clone a URL.
Parameters:
url: URL of the repo. Supports s3://, git@, http://
-
commit(repo, args=[])
Commit the changes to the repo (passed through to the git command).
Parameters:
repo: Repository object
args: git-specific args
-
diff(repo, args=[])
Diff two repo versions (passed through to the git command).
Parameters:
repo: Repository object
args: git-specific args
-
init(username, reponame, force, backend=None)
Initialize a Git repo.
Parameters:
username, reponame: the repo is identified by the tuple (username, reponame)
force: force initialization of the repo even if it already exists
backend: backend that must be used for this repo (e.g. s3)
-
log(repo, args=[])
Show the log (passed through to the git command).
Parameters:
repo: Repository object
args: git-specific args
-
notes(repo, args=[])
Add notes to the commit.
Parameters:
repo: Repository object
args: notes-specific args
-
pull(repo, args=[])
Pull from the origin (or filesystem-based) master.
Parameters:
repo: Repository object
args: git-specific args
-
push(repo, args=[])
Push to origin master.
Parameters:
repo: Repository object
args: git-specific args
-
remote(repo, args=[])
Check the remote URL.
Parameters:
repo: Repository object
args: git-specific args
-
show(repo, args=[])
Show the content of the repo (passed through to the git command).
Parameters:
repo: Repository object
args: git-specific args
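Several of the methods above are described as passing their `args` through to the corresponding git command. The following is a minimal sketch of what such a pass-through could look like, not dgit's implementation. `build_git_command` and `run_git` are hypothetical helpers, and the sketch assumes the repository is addressed by a directory path rather than by dgit's Repository object.

```python
import subprocess


def build_git_command(subcommand, args=None):
    """Return the argv list for `git <subcommand> <args...>`."""
    return ["git", subcommand] + list(args or [])


def run_git(repo_dir, subcommand, args=None):
    """Run the git command inside repo_dir and return its stdout as text.

    Raises subprocess.CalledProcessError if git exits with a non-zero status.
    """
    completed = subprocess.run(
        build_git_command(subcommand, args),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# Building the command does not require git to be installed.
cmd = build_git_command("log", ["--oneline", "-n", "5"])
```

Passing the arguments as a list rather than a shell string keeps user-supplied `args` from being interpreted by a shell.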
Backends
dgit is designed to support multiple backends. Initially, the local filesystem and S3 are supported. We plan to support more in the future.
-
class dgitcore.plugins.backend.BackendBase(name, version, description, supported=[])
Bases: object
Backend object implements
-
class dgitcore.contrib.backends.s3.S3Backend
Bases: dgitcore.plugins.backend.BackendBase
S3 backend for the datasets.
Parameters: Configuration (s3 enable, access, secret, bucket, prefix)
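The configuration fields listed above can be pictured as a small settings mapping. The sketch below is an illustration under assumptions, not dgit's configuration format: the dictionary layout, the helper names `missing_s3_settings` and `s3_url`, the sample values, and the `s3://bucket/prefix/repo` layout are all hypothetical.

```python
# Field names taken from the parameter list above; everything else is assumed.
REQUIRED_KEYS = ("enable", "access", "secret", "bucket", "prefix")


def missing_s3_settings(config):
    """Return the required keys that are absent, empty, or None."""
    return [
        key for key in REQUIRED_KEYS
        if key not in config or config[key] in ("", None)
    ]


def s3_url(config, reponame):
    """Compose an s3:// URL from bucket, prefix and repo name (assumed layout)."""
    prefix = config["prefix"].strip("/")
    parts = [part for part in (prefix, reponame) if part]
    return "s3://{}/{}".format(config["bucket"], "/".join(parts))


config = {
    "enable": True,
    "access": "EXAMPLE-ACCESS-KEY",  # placeholder, not a real credential
    "secret": "example-secret",      # placeholder, not a real credential
    "bucket": "my-datasets",
    "prefix": "dgit/",
}
url = s3_url(config, "alice/sales")
```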
Instrumentation
Plugins that can be used to instrument any process that generates a dataset.
-
class dgitcore.plugins.instrumentation.InstrumentationBase(name, version, description, supported=[])
Bases: object
Pre-computed patterns
-
class dgitcore.contrib.instrumentations.content.ContentInstrumentation
Bases: dgitcore.plugins.instrumentation.InstrumentationBase
Instrumentation to extract content summaries including mimetypes, sha1 signature and schema where possible.
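As an illustration of the kind of summary described here, the sketch below computes a guessed MIME type and a SHA-1 digest for a single file using only the Python standard library. It is not dgit's implementation: `summarize_file` and the shape of the returned dictionary are hypothetical, and schema extraction is left out.

```python
import hashlib
import mimetypes
import os
import tempfile


def summarize_file(path):
    """Return a small content summary: guessed MIME type and SHA-1 hex digest."""
    # guess_type works from the file name only and may return None.
    mimetype, _encoding = mimetypes.guess_type(path)
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large datasets are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            sha1.update(chunk)
    return {"path": path, "mimetype": mimetype, "sha1": sha1.hexdigest()}


# Summarize a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
    tmp.write(b"id,value\n1,42\n")
    sample_path = tmp.name

summary = summarize_file(sample_path)
os.remove(sample_path)
```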
-
class dgitcore.contrib.instrumentations.platform.PlatformInstrumentation
Bases: dgitcore.plugins.instrumentation.InstrumentationBase
Instrumentation to extract platform-specific information
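Platform-specific information of this sort can be gathered with Python's standard `platform` module. The sketch below shows one possible selection of fields. Which fields dgit actually records is not stated here, so `platform_summary` and its keys are assumptions.

```python
import platform


def platform_summary():
    """Collect basic facts about the machine and interpreter running the code."""
    return {
        "system": platform.system(),          # e.g. operating system name
        "release": platform.release(),        # OS release string
        "machine": platform.machine(),        # hardware architecture
        "node": platform.node(),              # network name of the host
        "python": platform.python_version(),  # interpreter version
    }


summary = platform_summary()
```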
-
class dgitcore.contrib.instrumentations.executable.ExecutableInstrumentation
Bases: dgitcore.plugins.instrumentation.InstrumentationBase
Instrumentation to extract executable-related summaries such as the git commit, the nature of the executable, parameters, etc.
Metadata
dgit supports posting metadata to simple API servers to enable search, lineage computation, and sharing. A minimal posting client is supported for now.
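To make the idea of a minimal posting client concrete, the sketch below builds a JSON POST request with the standard library without sending it. It is an illustration only: the server URL, the payload fields, the bearer-token header, and the helper name `build_metadata_request` are all hypothetical and are not taken from dgit.

```python
import json
import urllib.request


def build_metadata_request(server_url, metadata, token=None):
    """Build (but do not send) a JSON POST request carrying dataset metadata."""
    headers = {"Content-Type": "application/json"}
    if token is not None:
        # Assumed authentication scheme; a real server may expect something else.
        headers["Authorization"] = "Bearer " + token
    body = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return urllib.request.Request(server_url, data=body, headers=headers, method="POST")


payload = {"username": "alice", "reponame": "sales", "files": 3}
request = build_metadata_request("https://metadata.example.com/api/datasets", payload)

# Sending the request would look like:
#     urllib.request.urlopen(request, timeout=10)
```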
Validation
-
class dgitcore.plugins.validator.ValidatorBase(name, version, description, supported=[])
Bases: object
This is the base class for all backends including
-
class dgitcore.contrib.validators.metadata_validator.MetadataValidator
Bases: dgitcore.plugins.validator.ValidatorBase
Validate repository metadata
-
class dgitcore.contrib.validators.regression_quality.RegressionQualityValidator
Bases: dgitcore.plugins.validator.ValidatorBase
Validate repository metadata