Unleash Data Potential | Drive Innovation with Quilt

Data packages are searchable, sharable, versioned, reproducible, and self-contained. In essence, a data package is an intelligent manifest that combines a list of pointers to data (for example, objects in Amazon S3 or files in SharePoint) with context about those data, including lineage, metadata, and revision comments.

Data packages track not only their own versions but also the versions of the underlying data, so teams can review every version of every document contained in a package. Teams can grab a specific version of a package and run it through pipelines to test repeatability, or send a colleague a link to a historical package to ensure they are iterating on the same data.
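To make the idea of a manifest concrete, here is a minimal sketch in plain Python. It is an illustration only, not Quilt's actual manifest format: the `Entry` and `Manifest` classes, their field names, the bucket URI, and the version string are all hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Entry:
    """One pointer to data plus the version of the underlying object (hypothetical layout)."""
    logical_key: str   # name of the file inside the package
    physical_key: str  # where the bytes actually live (made-up URI)
    version_id: str    # version of the underlying object


@dataclass
class Manifest:
    """Pointers to data combined with context: metadata and a revision comment."""
    name: str
    message: str
    metadata: dict = field(default_factory=dict)
    entries: list = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested Entry objects.
        return json.dumps(asdict(self), sort_keys=True, indent=2)


manifest = Manifest(
    name="team/experiment",
    message="Initial revision",
    metadata={"instrument": "sequencer-01"},
    entries=[Entry("raw/reads.csv", "s3://example-bucket/raw/reads.csv", "v1")],
)

print(manifest.to_json())
```

Because each entry records the version of the object it points to, an older manifest keeps resolving to the same underlying data even after the file changes.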

Data with context

Data packages offer a streamlined approach to data management by storing both data and metadata together in object storage. This integrated method contrasts with traditional practices where metadata and versioning information are stored separately in databases, while the actual data resides in object stores. By consolidating data and its contextual information within the same storage unit, data packages enhance the integrity and coherence of data management, ensuring that context and content are always aligned and readily accessible.

Deeply Versioned

One significant advantage of data packages is their support for deep versioning, which relies on a SHA-256 checksum for each revision. Because a checksum is computed from the content it describes, this cryptographic hash gives every version an effectively unique identifier, allowing users to track changes, verify that data is intact, and revert to previous versions if necessary. This versioning capability makes data more reliable and supports careful data management across iterations.
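The checksum idea can be sketched with Python's standard `hashlib` module. This is an assumption-laden illustration, not Quilt's actual hashing scheme: the `revision_hash` helper and the shape of the revision dictionaries are hypothetical.

```python
import hashlib
import json


def revision_hash(manifest: dict) -> str:
    """Return a SHA-256 hex digest of a manifest (hypothetical helper).

    The manifest is serialized with sorted keys so that the same content
    always produces the same bytes, and therefore the same hash.
    """
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two made-up revisions of the same package.
rev1 = {"entries": {"data.csv": {"size": 10}}, "message": "first"}
rev2 = {"entries": {"data.csv": {"size": 11}}, "message": "second"}

h1 = revision_hash(rev1)
h2 = revision_hash(rev2)

print(h1)
print(h2)
```

Any change to a revision's content yields a different digest, so a stored hash can later be recomputed to confirm that the revision has not been altered.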

Flexible Metadata

Data packages excel in accommodating diverse metadata needs through their flexible metadata schema. Unlike rigid systems that enforce a single, uniform metadata schema across all datasets, data packages allow teams to capture and manage metadata in a way that best suits their specific requirements. This flexibility ensures that relevant details are preserved and easily accessible without the need for a one-size-fits-all approach, thus supporting a wide range of data use cases and applications.

Interoperable Data

Interoperability is a crucial feature of data packages, as it allows data to be integrated and shared across different platforms. Because data is stored in an open-source, customer-owned data storage system, a data package can attach data from various platforms (for example, Platform A and Platform B) while maintaining compatibility. This open approach ensures that data can be used across diverse systems and environments, fostering greater collaboration and data exchange.

Easily Accessed

Data packages are designed to be easily accessible, with features like SQL querying and faceted search to enhance data retrieval. These functionalities allow teams to efficiently locate and extract the data packages they need, regardless of the cloud environment they are using. The ability to perform advanced searches and queries ensures that users can quickly find relevant datasets and integrate them into their workflows, significantly improving data accessibility and usability.
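As a rough sketch of what SQL querying and faceted search over package metadata might look like, the example below uses Python's built-in `sqlite3` module as a stand-in. The source does not say which query engine is used, so the `packages` table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory database standing in for a metadata index (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (name TEXT, assay TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?)",
    [
        ("lab/run-a", "rnaseq", 2023),
        ("lab/run-b", "imaging", 2023),
        ("lab/run-c", "rnaseq", 2022),
        ("lab/run-d", "rnaseq", 2024),
    ],
)

# SQL query: find packages by metadata values.
rows = conn.execute(
    "SELECT name FROM packages WHERE assay = ? AND year >= ? ORDER BY name",
    ("rnaseq", 2023),
).fetchall()
matches = [row[0] for row in rows]

# Facet counts: how many packages fall under each assay value.
facets = dict(
    conn.execute("SELECT assay, COUNT(*) FROM packages GROUP BY assay").fetchall()
)

print(matches)
print(facets)
```

The first query narrows the results by metadata values, while the facet counts show how many packages each value would return before a filter is applied.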

FLEXIBLE INTEROPERABLE DATA PACKAGES

INFINITE POTENTIAL: THE DATA REVOLUTION

A PARADIGM SHIFT

DATA: REIMAGINED

If you can't find the data, you can't reproduce the analysis.

Why build data packages?

Data is more powerful with context

Data without context is not reusable

LINKED DATA IS REUSABLE DATA

What are Data Packages?

Data Packages Defined

Data packages aren't theoretical

Build data packages with Quilt TODAY

Quilt is an Advanced AWS Technology Partner