
Stage Direction - Package

Package

Letter from the editor

TL;DR: For the next several milestones, we will focus the effort of the Package group on the functionality and usability of GitLab-hosted packages and container images and delay work on virtual registries.

To the GitLab Community and customers,

I'm Tim. I have been updating this direction page for nearly two years. I thought that adding a personal touch to this page might be appreciated. Moving forward, I'll try to add a short note about progress, prioritization, or problems we've been encountering.

Looking back, it's been an exciting two years working on the GitLab Package stage. We've added support for many new package manager formats. We moved the Package Registry and Dependency Proxy to open source so that everyone has access to a private registry and can pull container images from Docker Hub without a paid subscription. Finally, we completed development of a significant change to the Container Registry that will reduce the cost of storage and pave the way for a much more integrated user experience.

During that time, adoption of the Package Registry grew by more than 500%. Growth can be challenging though and with increased usage we’ve seen an increase in the number of bugs, usability concerns, and production incidents. I know how frustrating these incidents can be and it's our desire to build features that work reliably and efficiently. We are committed to reducing the total number of incidents and the mean time to resolution of any that do occur.

Speaking of the team, we've had a few changes recently and some exciting news to share. First, one of our backend engineers has decided to move on from GitLab, and we wish them well. Thank you, Giorgenes, for your contributions, including adding support for Composer and PyPI! Not that we can replace Giorgenes, but we will be backfilling this role as soon as possible. In the meantime, we are a bit short-handed.

Not all is lost though. We have received sign-off from the executive team to add two additional backend developers in May 2021! It will take us time to hire and on-board people, but I wanted to share GitLab's commitment to growing the breadth and depth of the Package stage.

Here is the part that isn't fun.

As you read through the vision and strategy of the Package group below, you’ll read about the importance of virtual registries in helping larger enterprises consolidate on GitLab. For the past few months I’ve been sharing that work would begin in February on this feature. However, with everything I've described above, we'll be postponing work on virtual registries until July 2021. This will allow us to better support our existing product and prepare for sustainable growth.

What does that mean? Well, I'd like to highlight three themes that we'd like to focus on for the next several months:

  1. Reduce the cost of the Container Registry by rolling out cleanup policies, online garbage collection and helping customers migrate to the new registry.
  2. Resolve Package Registry bugs that prevent you from publishing, viewing, installing, and finding packages.
  3. Update the Package Registry and Dependency Proxy user interface to display all of the important information and metadata.

As a team we are using the label package:scaling to identify and prioritize issues related to the above themes. You can view this list here.

I understand this delay means that you might not be able to migrate away from your existing vendors as quickly as you'd prefer, but I believe that improving the usability and reliability of our core product is more important right now. If you have questions or concerns, feel free to reach out to me via E-mail or in an issue.

Thank you for reading! Tim

Goal

The goal of the Package Group is to build a product that, within three years, is our customers' single source of truth for storing and distributing images and packages.

Do customers want this?

Yes. As the PM for the Package stage, I hear regularly from customers and prospects who would like to migrate off of JFrog's Artifactory. Their reasons for wanting to consolidate on GitLab are:

  1. Convenience (authentication, management, improved UX)
  2. Cost
  3. Lack of support (getting to meet with GitLab PMs is a big + for these folks)

Typically the needs of these customers can be predictably segmented by the size of their organization. For the sake of simplicity, let's classify their needs as enterprise and non-enterprise.

Non-enterprise organizations

Typically they'd like to know whether we support format X and, if not, when we will support it. The unsupported formats that we hear about most often are:

(All of the above will be useful for ~Dogfooding as well)

If we support their requested format, these customers are often able to consolidate.

They are typically blocked by issues and bugs that are fairly straightforward to address. They are most likely to engage in issues or on Twitter. They may use a single project as their universal registry. They are concerned about inconsistent token support, storage costs, and management.

Enterprise organizations

Typically the types of questions we hear depend on how far along the customer is in their evaluation of GitLab. Early on, we typically hear requests for new formats, mainly Linux packages. But, the fact that we support Maven, NPM, PyPI, generic and NuGet packages is usually enough to get them started. As they progress, we are more likely to hear that they need our existing integrations to be more fully-featured. Artifactory offers a feature (virtual registries) that allows you to publish, proxy, and cache multiple package repositories behind a single, logical URL. Without supporting this, no large organization will be able to migrate from Artifactory to GitLab.

This is where bringing together the Dependency Proxy and Package Registry is critical. Our vision to make the Dependency Proxy Complete captures the core requirements needed to help these customers consolidate on GitLab.

Categories

If you'd like to learn more, the below information contains a summary, competitive info, and helpful links for each product category associated with the Package stage.

Container Registry

The GitLab Container Registry is a secure and private registry for Docker images. It is built on open source software and completely integrated within GitLab. Use GitLab CI/CD to create and publish branch- or release-specific images. Use the GitLab API to manage the registry across groups and projects. Use the user interface to discover and manage your team's images. GitLab will provide a Lovable container registry experience by being the single location for the entire DevOps Lifecycle, not just a portion of it. We will provide many of the features expected of a container registry, but without the weight and complexity of a single-point solution.

What's next and why

Every day the registry is used to publish and install images. On GitLab.com, we see an average of nearly 200k images published per day. In the past year, we've primarily focused on GitLab-#2313, which will add zero downtime garbage collection, saving GitLab.com and our users running their own self-managed instance of GitLab tens of thousands of dollars per month.

Beyond garbage collection, the above work will unblock features that have been blocked by the way Docker stores container image manifests. So, we have many months of feature requests and bugs that have been blocked by the aforementioned project.

This includes things like:

Support for signing images, tag immutability, and making it easier to contribute are all part of the make the container registry complete epic.

In addition, we are starting to hear requests for more enterprise-focused features like HA and multi-cloud registry mirroring.

Competitive Landscape

Open source container registries such as Docker Hub and Red Hat's Quay offer users a single location to build, analyze, and distribute their container images. Docker Hub recently introduced rate limits for pulls from Docker Hub.

The primary reason people don't use Docker Hub is that they need a private registry, and one that lives alongside their source code and pipelines. They like to be able to use pre-defined environment variables for cataloging and discovering images. Often a Docker Hub image is used as a base image for a test, but if you are building an app, you will likely customize an image to fit your application and save it to GitLab's private registry alongside your source code.

Artifactory and Nexus both offer support for building and deploying Docker images. Artifactory offers their container registry as part of their community edition as well.

Artifactory integrates with several different CI servers through dedicated plug-ins, including Jenkins and Azure DevOps, but does not yet support GitLab. However, you can still connect to your Artifactory repository from GitLab CI. Here is an example of how to deploy Maven projects to Artifactory with GitLab CI/CD.

GitHub has recently released an open beta of their container registry. Currently, the GitHub Container Registry only supports Docker image formats. During the beta, storage and bandwidth are free. After the beta, you can expect each tier to come with an included amount of storage and data transfers. Once you pass those limits, you will pay $0.25 USD per GB of storage and $0.50 USD per GB of data transfer. One concern worth raising is that we don't see a way to programmatically delete images. Given the cost of storing images, this could be a concern for organizations that heavily use GitHub's registry. Another limitation is that they only support authentication using your Personal Access Token. This is not ideal for organizations that would like to avoid using individual-level credentials. With the GitLab Container Registry, you may use a PAT, Deploy, or Job token to authenticate to the registry.

There are several nice features that they've included. One nice feature is that you can publish images to your namespace or your user account. We would like to create that same functionality via gitlab-#241027. Also, their user interface includes helpful metadata, such as how often it's downloaded and a readme.

Amazon offers a fully-featured registry and plans to add support for highly available, publicly hosted images.

Google Cloud offers a container registry that allows you to integrate with any CI/CD platform. The registry is free, although they do charge for storage and network egress. Google's registry includes container scanning and high availability.

JetBrains offers a container registry that allows you to add a project repository and publish images and tags using the Docker client or your JetBrains project. However, they do not currently have any documentation for administrative features, such as cleanup policies or garbage collection.

Digital Ocean offers a container registry that allows you to store and configure private Docker images. In addition, they support global load balancing and caching in multiple regions. One potential drawback is that each Digital Ocean account is limited to one registry, whereas with GitLab each project can have its own registry.

Package Registry

Our goal is for you to rely on GitLab as a universal package manager, so that you can reduce costs and drive operational efficiencies. The backbone of this category is your ability to easily publish and install packages, no matter where they are hosted.

You can view the list of supported and planned formats in our documentation here.

What's Next & Why

Since moving the Package Registry to Core, we've seen increased interest in and adoption of the registry. With that adoption, we've seen an increase in bugs and user experience issues. So, we are working through a list of issues to ensure the registry works reliably and seamlessly.

We have a few issues in progress that are focused on usability and reliability for Conan, Composer, and Generic packages. For Composer, we are focused on GitLab-#259840 and GitLab-#247531, which will add support for v2 of Composer and for prefer-source and preferred-install.

For Conan, we will resolve GitLab-#270129, which fixes an issue with download URLs being returned incorrectly.

Finally, for Generic packages, we'll address GitLab-#273034, which adds support for packages with semantic versions.

We are also breaking ground on adding support for publishing and installing RubyGems. You can follow gitlab-#803 for more details on timing and implementation.

Competitive Landscape
Universal package management tools

Artifactory and Nexus are the two leading universal package manager applications on the market. They both offer products that support the most common formats and additional security and compliance features. A critical gap between those two products and GitLab's Package offering is the ability to easily connect to and group external, remote registries. To date, GitLab has been focused on delivering Project and Group-level private package registries for the most commonly used formats. We plan on bridging this gap by expanding the Dependency Proxy to support remote and virtual registries.

Cloud providers

Azure and AWS both offer support for hosted and remote registries for a limited number of formats. Google has a product called Artifact Registry that is in Alpha and supports Java and Node. All of the cloud providers charge for cloud storage and network egress.

DevOps Platforms

GitHub offers a package management solution as well. They offer project-level package registries for a variety of formats. However, looking at GitHub's roadmap, they've moved features

GitHub charges for storage and network transfers. GitHub does a nice job with search and reporting usage data on how many times a given package has been downloaded. They do not have anything on their roadmap about supporting remote and virtual registries, which would allow them to group registries behind a single URL and allow them to act as a universal package manager, like Artifactory or Nexus or GitLab.

JetBrains offers a Package Registry with support for npm and more planned formats. They have an ambitious and exciting roadmap for 2021, including adding support for Maven, Python, and PHP. It's interesting to see that they'd like to support signing of packages and virtual registries, two features we are interested in adding at GitLab.

Supported formats

The below table lists our supported and most frequently requested package manager formats. Artifactory and Nexus both support a longer list of formats, but we have not heard many requests from our customers for these formats. If you'd like to suggest we consider a new format, please open an issue here.

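| Format | GitLab | Artifactory | Nexus | GitHub | Azure Artifacts | AWS CodeArtifact | Google Artifact Registry |
|---|---|---|---|---|---|---|---|
| Composer | ✔️ | ✔️ | ✔️ | - | - | - | - |
| Conan | ✔️ | ✔️ | ☑️ | - | - | - | - |
| Debian | - | ✔️ | ✔️ | - | - | - | - |
| Gradle | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Maven | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| NPM | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| NuGet | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | - | - |
| PyPI | ✔️ | ✔️ | ✔️ | - | ✔️ | ✔️ | - |
| RPM | - | ✔️ | ✔️ | - | - | - | - |
| RubyGems | - | ✔️ | ✔️ | ✔️ | - | - | - |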

☑️ indicates support is through community plugin or beta feature

Interested in contributing a new format? Please check out our suggested contributions.

Dependency Proxy

Many projects depend on a growing number of packages that must be fetched from external sources with each build. This slows down build times and introduces availability issues into the supply chain. For organizations, this presents a critical problem. By providing a mechanism for storing and accessing external packages, we enable faster and more reliable builds.

Our vision for the Dependency Proxy is a product that provides fast, reliable access to all of your dependencies, whether they are hosted on GitLab or with any other vendor. In addition, the Dependency Proxy will work hand-in-hand with the planned Dependency Firewall, which will help to prevent any unknown or unverified providers from introducing potential security vulnerabilities.

Currently, the Dependency Proxy allows you to proxy and cache images from Docker Hub. This can help you speed up your pipelines and reduce your external dependencies. However, this is only the first step. In the coming milestones, we will expand the Dependency Proxy from a single, hardcoded endpoint to the one place where you can set up and manage all of your registries (both packages and images).
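To make the proxy-and-cache idea concrete, here is a minimal sketch of a pull-through cache. It is not how the Dependency Proxy is implemented; the class name `PullThroughCache` and the stand-in upstream function `fake_docker_hub` are hypothetical, and storage is a plain in-memory dictionary.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Toy pull-through cache: serve from local storage, else fetch upstream once."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream  # hypothetical upstream client
        self._store: Dict[str, bytes] = {}     # image reference -> cached bytes
        self.upstream_calls = 0

    def get(self, reference: str) -> bytes:
        # Cache hit: no round trip to the upstream registry.
        if reference in self._store:
            return self._store[reference]
        # Cache miss: fetch once and keep a copy for later builds.
        self.upstream_calls += 1
        blob = self._fetch_upstream(reference)
        self._store[reference] = blob
        return blob


def fake_docker_hub(reference: str) -> bytes:
    # Stand-in for a real registry request; returns deterministic bytes.
    return f"image-bytes-for-{reference}".encode()


cache = PullThroughCache(fake_docker_hub)
first = cache.get("alpine:3.13")   # fetched from the (fake) upstream
second = cache.get("alpine:3.13")  # served from the local cache
```

The second request never reaches the upstream, which is where the faster pipelines and reduced external dependency come from.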

There are a few important terms that are worth sharing:

What's Next & Why

gitlab-#190944 will add support for pulling container images from Docker Hub by digest. This change is required to support containerd and Docker 20.x+. The former is critical, especially with Kubernetes deprecating support for the Docker runtime in 1.22.
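As background on what pulling "by digest" means: a digest identifies an image by a hash of its content rather than by a mutable tag. The sketch below shows the general idea with SHA-256; the function names and the sample bytes are illustrative assumptions, not GitLab or Docker code.

```python
import hashlib


def compute_digest(content: bytes) -> str:
    """Return a content digest in the 'sha256:<hex>' form used in image references."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def verify_digest(content: bytes, expected: str) -> bool:
    """Check that fetched content matches the digest the client asked for."""
    return compute_digest(content) == expected


manifest = b'{"schemaVersion": 2}'  # illustrative bytes, not a real manifest
digest = compute_digest(manifest)
```

Because the digest is derived from the content, a proxy can cache by digest safely: the same digest always refers to the same bytes.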

gitlab-#231239 is the first step in unifying the Dependency Proxy and the Package Registry. The issue proposes turning the npm request forwarding feature from a simple forwarding mechanism into a proxy. This will allow GitLab to store and present package metadata and will eventually lead to gitlab-#24123, which will add support for caching.

Usecases listed
  1. Provide a single method of reaching upstream package management utilities, in the event they are not otherwise reachable.
  2. Cache images and packages for faster build times.
  3. Track which dependencies are utilized by which projects when pulled through the proxy.
  4. Audit logs in order to find out exactly what happened and with what code.
  5. Operate when fully cut off from the internet with local dependencies.
User flow

The below diagram demonstrates how you can use the Dependency Proxy to create a virtual registry which will look for and fetch dependencies from your hosted and remote registries. This will allow you to download all of your dependencies with a single URL, instead of having to remember which packages are hosted where.

Diagram Note: The above diagram shows all of your dependencies being resolved through the Dependency Proxy. Usage of this feature is not required. You can easily use your hosted and remote registries without grouping them in a virtual registry.
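The lookup behavior of a virtual registry can be sketched in a few lines. This is only an illustration of the concept, assuming a simple first-match-wins priority order; the registry names, URLs, and the `resolve` function are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Each registry is modeled as a name plus a mapping of package -> download URL.
Registry = Tuple[str, Dict[str, str]]


def resolve(package: str, registries: List[Registry]) -> Optional[Tuple[str, str]]:
    """Return (registry name, URL) from the first registry that has the package."""
    for name, index in registries:
        if package in index:
            return name, index[package]
    return None


hosted = ("hosted", {"my-lib": "https://gitlab.example.com/packages/my-lib"})
remote = ("remote", {
    "lodash": "https://registry.example.org/lodash",
    "my-lib": "https://registry.example.org/my-lib",
})
virtual = [hosted, remote]  # list order expresses priority: hosted first
```

With this ordering, `resolve("my-lib", virtual)` returns the hosted copy even though the remote registry also has a package with that name, and `resolve("lodash", virtual)` falls through to the remote registry.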

Competitive landscape

Artifactory is the leader in this category. They offer 'remote repositories' which serve as a caching repository for various package manager integrations. Utilizing the command line, API, or a user interface, a user may create policies and control caching and proxying behavior. A Docker image or package may be requested from a remote repository on demand, and if no content is available it will be fetched and cached according to the user's policies. In addition, they offer support for many of the major packaging formats in use today. For storage optimization, they offer checksum-based storage, deduplication, copying, moving, and deletion of files.
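For readers unfamiliar with checksum-based storage, the sketch below shows the general technique: content is stored once under its checksum, and paths are just references to it. This is not Artifactory's implementation; `ChecksumStore` and the file paths are hypothetical.

```python
import hashlib
from typing import Dict


class ChecksumStore:
    """Toy content-addressed store with deduplication."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}  # checksum -> content, stored once
        self._paths: Dict[str, str] = {}    # logical path -> checksum

    def put(self, path: str, content: bytes) -> str:
        checksum = hashlib.sha256(content).hexdigest()
        # Identical content maps to the same checksum, so it is kept only once.
        self._blobs.setdefault(checksum, content)
        self._paths[path] = checksum
        return checksum

    def copy(self, src: str, dst: str) -> None:
        # A copy only adds a reference; no bytes are duplicated.
        self._paths[dst] = self._paths[src]

    def get(self, path: str) -> bytes:
        return self._blobs[self._paths[path]]

    def blob_count(self) -> int:
        return len(self._blobs)


store = ChecksumStore()
store.put("libs/a-1.0.jar", b"same bytes")
store.put("mirror/a-1.0.jar", b"same bytes")
store.copy("libs/a-1.0.jar", "archive/a-1.0.jar")
```

Three paths now point at a single stored blob, which is why copies and moves are cheap in this model.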

The below tables outline our current capabilities compared to JFrog's Artifactory and Sonatype's Nexus.

| Container Registry | GitLab | Artifactory | Nexus |
|---|---|---|---|
| Local registries | ✔️ | ✔️ | ✔️ |
| Remote registries | Partial* | ✔️ | ✔️ |
| Virtual registries | Coming soon | ✔️ | ✔️ |

*The Dependency Proxy currently supports one hardcoded remote registry, which allows you to proxy and cache container images hosted on DockerHub.

| Package Registry | GitLab | Artifactory | Nexus |
|---|---|---|---|
| Local registries | ✔️ | ✔️ | ✔️ |
| Remote registries | Partial* | ✔️ | ✔️ |
| Virtual registries | Coming soon | ✔️ | ✔️ |

*By default, when an NPM package is not found in the GitLab NPM Registry, the request will be forwarded to npmjs.com. Check out this speed-run to see how it works.

Dependency Firewall

Many projects depend on packages that may come from unknown or unverified providers, introducing potential security vulnerabilities. GitLab already provides dependency scanning across a variety of languages to alert users of any known security vulnerabilities, but we currently do not allow organizations to prevent those vulnerabilities from being downloaded to begin with.

The goal of this category will be to leverage the dependency proxy, which proxies and caches dependencies, to give more control and visibility to security and compliance teams. We will do this by allowing users to create and maintain an approved/banned list of dependencies, providing more insight into the usage and impact of external dependencies and by ensuring the GitLab Security Dashboard is the single source of truth for all security related issues.

By preventing the introduction of security vulnerabilities further upstream, organizations can let their development teams work faster and more efficiently.

Use cases
What's next & why

Today, if you try to install an npm package from the GitLab Package Registry and it's not found, GitLab will forward the request to the npm public registry. As an MVC of the Dependency Firewall, GitLab will flag any known vulnerabilities or suspicious activity prior to the package being downloaded. In order to accomplish this, we must update the architecture of the feature to store package metadata.

gitlab-#241239 will ensure that when a package is pulled through the proxy, GitLab stores the package's metadata, so that we can flag suspicious packages based on that metadata.

Once the above is complete, gitlab-#215393 will flag any npm packages that are pulled through the Dependency Proxy that recently had the author name or email updated to ensure that users are aware of any suspicious changes.
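As an illustration of the kind of check described above, the sketch below flags a package whose most recent version changed author email within a recent window. The metadata fields (`author_email`, `published_at`), the 30-day default, and the function name are all assumptions made for this example; the actual design lives in the linked issues.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def recently_changed_author(versions: List[Dict], now: datetime, window_days: int = 30) -> bool:
    """Flag a package whose newest version has a different author email than the
    previous version and was published within the last `window_days` days."""
    if len(versions) < 2:
        return False
    ordered = sorted(versions, key=lambda v: v["published_at"])
    previous, latest = ordered[-2], ordered[-1]
    changed = previous["author_email"] != latest["author_email"]
    recent = now - latest["published_at"] <= timedelta(days=window_days)
    return changed and recent


# Hypothetical stored metadata for one package.
history = [
    {"version": "1.0.0", "author_email": "a@example.com", "published_at": datetime(2020, 6, 1)},
    {"version": "1.0.1", "author_email": "b@example.com", "published_at": datetime(2021, 1, 20)},
]
flagged = recently_changed_author(history, now=datetime(2021, 2, 1))
```

A check like this only works once package metadata is stored on the GitLab side, which is why the metadata work comes first.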

Competitive landscape

JFrog utilizes a combination of their Bintray and XRay products to proxy, cache, and screen dependencies. They also provide dependency graphs across multiple languages and centralized dashboards for the review and remediation of vulnerabilities. It is a mature product that is generally well received by users.

GitHub's new package registry does a really nice job of creating visibility into the dependency graph for a given package, but they do not give users the ability to control which packages are used in a given group/project.

Helm Chart Registry

Users or organizations that deploy complex pieces of software to Kubernetes-managed environments depend on a standardized way to automate provisioning those external environments. Helm is the package manager for Kubernetes and helps users define, manage, install, upgrade, and roll back even the most complex Kubernetes application. Helm uses a package format called Charts to describe a set of Kubernetes resources. There is a clear path towards utilizing GitLab's Container Registry and new features in Helm 3 to provide a single location for Helm Charts within the Container Registry. GitLab will join the open source community in enabling this capability and improve upon it with features targeted at our EE offering.

Helm charts will be easy to create, version, share and publish right within GitLab. This would provide an official and integrated method to publish, control, and version control Helm charts.

Usecases listed
  1. Public and private repositories for Helm charts
  2. Fine-grained access control
  3. Standardized workflow to version control and publish charts making use of GitLab's other services
What's Next & Why

With the launch of Helm 3, which is now in beta, pushing and pulling charts can now be done via OCI Registry. This means that users can now utilize the GitLab Container Registry for hosting Helm charts. However, due to the way metadata is passed and stored via Docker, it is not currently possible for us to parse this data and meet our performance standards.

The first step is gitlab-#207147, which defines a new database schema for storing Docker manifests.

Competitive Landscape

Helm Hub is the official Helm charts repository, which is supported by products like Artifactory from JFrog and by Codefresh. Additionally, ChartMuseum offers an open source, self-managed solution. You can also build one yourself with GitLab Pages, Apache, or a GitHub repository's raw publishing URL.

The Azure container registry can be used as a host for Helm chart repositories. With Helm 3 changing the storage backend to container registries, we are evaluating if we can offer the same level of support.

GitHub has added beta support for storing and managing public and private Helm Charts in the GitHub Container Registry to their roadmap for Q2 2021.

Release Evidence

Release Evidence is about addressing the demand of the business to understand what is changing in your software. Our focus is on supporting the variety of controls and automation (security, compliance, or otherwise) to ensure your releases are managed in an auditable and trackable way.

The backbone of this category is creating a single artifact for our users to furnish during an audit or compliance process. The strong integration across GitLab enables the creation of an auditable chain of custody for assets, commits, and issues, including satisfactorily meeting quality and security gates. Connecting the changes made in source code to your production state is a unique opportunity GitLab can offer to users. Table stakes for enterprise-grade governance include traceability of automated actions alongside the gathering of appropriate approvals throughout the release process. Our intention is to streamline the experience of preparing for an audit or compliance review as an organic byproduct of using GitLab.

Release Evidence is complemented by the tangential category within Release of Secrets Management. Also related is Requirements Management from the Plan stage.

What's Next & Why

Currently, we are monitoring feedback and evaluating usage of Release Evidence. Our next goal will be to improve the release evidence performance in GitLab-#196185.

Competitive Landscape

A great differentiator for GitLab is the expansion of the evidence to include test results (gitlab#32773), security scans, and other artifacts (gitlab#2207) collected as part of generating a release. This will uniquely enable us to be the single source of truth application for the DevOps lifecycle throughout the audit process.

In today's landscape, "chain of custody" features are ill-defined and not well articulated by our largest competitors. Release Evidence is a strategic feature set that XebiaLabs, Spinnaker, and other CDRA solutions do not readily offer to their users.

Git LFS

Git LFS (Large File Storage) is a Git extension that reduces the impact of large files in your repository by downloading the relevant versions of them lazily. Specifically, large files are downloaded during the checkout process rather than during cloning or fetching.
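The lazy download works because the repository stores a small text pointer in place of each large file, and the real content is fetched at checkout. The sketch below parses such a pointer; the helper names are hypothetical and the validation is deliberately loose rather than a full implementation of the pointer specification.

```python
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a pointer file into a dictionary."""
    fields: Dict[str, str] = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def is_lfs_pointer(text: str) -> bool:
    """Loose check for the three fields a pointer file carries."""
    fields = parse_lfs_pointer(text)
    return (
        fields.get("version", "").startswith("https://git-lfs.github.com/spec/")
        and fields.get("oid", "").startswith("sha256:")
        and fields.get("size", "").isdigit()
    )


pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393\n"
    "size 12345\n"
)
info = parse_lfs_pointer(pointer)
```

Since clones and fetches only transfer these small pointers, the large objects are downloaded only for the versions you actually check out.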

This page is maintained by the Product Manager for Package, Tim Rizzi (E-mail), however the prioritization, design and implementation of features and bugs is owned by the Create:Source Code Group.

Use cases
  1. Version large files—even those as large as a couple GB in size—with Git.
  2. Automatically detect LFS-tracked files and clone them via HTTP.
  3. Download less data. This means faster cloning and fetching from repositories that deal with large files.
  4. Host more in your Git repositories. External file storage makes it easy to keep your repository at a manageable size.
What's next & why

Up next is gitlab-#219957 which will clarify in the documentation how file locking works with LFS.

Git is a trademark of Software Freedom Conservancy and our use of 'GitLab' is under license