Harbor started in 2014 as a humble internal project meant to address a simple use case: storing images for developers leveraging containers. The cloud native landscape was wildly different, and tools like Kubernetes were just starting to see the light of day. It took a few years for Harbor to mature to the point of being open sourced in 2016, but the project was a breath of fresh air for individuals and organizations attempting to find a solid container registry solution. We were confident Harbor was addressing critical use cases based on its strong growth in user base early on.
We were incredibly excited when Harbor was accepted to the Cloud Native Sandbox in the summer of 2018. Although Harbor had been open source for some years by this point, having a vendor-neutral home had an immediate impact on the project, resulting in increased engagement on our community channels and more GitHub activity.
There were many things we immediately began tackling after joining the Sandbox, including addressing some technical debt, laying out a roadmap based solely on community feedback, and expanding the number of contributors to include folks that have consistently worked on improving Harbor from other organizations. We’ve also started a bi-weekly community call where we hear directly from Harbor users on what’s working well and what’s not. Finally, we’ve ratified a project governance model that defines how the project operates at various levels.
Given Harbor’s already-large global user base across organizations small and large, proposing the project mature into the CNCF Incubator was a natural next step. The processes around progressing to Incubation are defined here. In order to be considered, certain growth and maturity characteristics must first be demonstrated by the project:
- Production usage: There must be users of the project that have deployed it to production environments and depend on its functionality for their business needs. We’ve worked closely with a number of large organizations leveraging Harbor over the last several years, so: check!
- Healthy maintainer team: There must be a healthy number of members on the team that can approve and accept new contributions to the project from the community. We have a number of maintainers that founded the project and continue to work on it full time, in addition to new maintainers joining the party: check!
- Healthy flow of contributions: The project must have a continuous and ongoing flow of new features and code being submitted and accepted into the codebase. Harbor released v1.6 in the summer of 2018, and we’re on the verge of releasing v1.7: check!
CNCF’s Technical Oversight Committee (TOC) evaluated the proposal from the Harbor team and concluded that we had met all the required criteria. It is both deeply humbling and an honor to be in the company of other highly respected incubated projects like gRPC, Fluentd, Envoy, Jaeger, Rook, NATS, and more.
What’s Harbor anyway?
Harbor is an open source cloud native registry that stores, signs, and scans container images for vulnerabilities.
Harbor solves common challenges by delivering trust, compliance, performance, and interoperability. It fills a gap for organizations and applications that cannot use a public or cloud-based registry, or want a consistent experience across clouds.
Harbor addresses the following common use cases:
- On-prem container registry – organizations with the desire to host sensitive production images on-premises can do so with Harbor.
- Vulnerability scanning – organizations can scan images before they are used in production. Images with failed vulnerability scans can be blocked from being pulled.
- Image signing – images can be signed via Notary to ensure provenance.
- Role-based Access Control – integration with LDAP (and AD) to provide user- and group-level permissions.
- Image replication – production images can be replicated to disparate Harbor nodes, providing disaster recovery, load balancing and the ability for organizations to replicate images to different geos to provide a more expedient image pull.
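As a rough illustration of the vulnerability scanning use case above, the following minimal Python sketch shows how a pull could be gated on scan results. The names (`Severity`, `pull_allowed`, `prevent_vulnerable`) and the choice to block unscanned images are assumptions made for illustration; this is not Harbor’s actual implementation or API.

```python
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    """Hypothetical severity scale for the worst finding in an image scan."""
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def pull_allowed(scan_severity: Optional[Severity],
                 threshold: Severity,
                 prevent_vulnerable: bool = True) -> bool:
    """Decide whether an image may be pulled under a project's policy.

    scan_severity is the worst severity found, or None if the image was
    never scanned. threshold is the lowest severity that blocks a pull.
    """
    if not prevent_vulnerable:
        # The policy is configurable: with blocking disabled, every pull passes.
        return True
    if scan_severity is None:
        # Assumption for this sketch: unscanned images are treated as blocked.
        return False
    return scan_severity < threshold
```

With a threshold of `Severity.HIGH`, an image whose worst finding is `Severity.MEDIUM` can be pulled, while one with a `Severity.HIGH` finding is rejected.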
Architecture
The “Harbor stack” is composed of various third-party components, including nginx, Docker Distribution v2, Redis, and PostgreSQL. Harbor also relies on Clair for vulnerability scanning, and Notary for image signing.
The Harbor components, highlighted in blue, are the heart of Harbor and are responsible for most of the heavy lifting in Harbor:
- Core Services provides an API and a UI. It intercepts docker pushes / pulls to provide role-based access control and to prevent vulnerable images from being pulled and subsequently used in production (all of this is configurable).
- Admin service is being phased out for v1.7, with its functionality being merged into the core service.
- Job Service is responsible for running background tasks (e.g., replication, one-shot or recurring vulnerability scans, etc.). Jobs are submitted by the core service and run in the job service component.
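The division of labor between the core service and the job service can be pictured with a small Python sketch: one component submits background jobs, and a separate worker runs them and records the outcome. Everything here (`JobService`, `submit`, `wait`, the `replicate` function, and the host name) is hypothetical and only meant to illustrate the submit-and-run pattern, not how Harbor is actually built.

```python
import queue
import threading


class JobService:
    """Toy background-job runner: jobs are queued and run on a worker thread."""

    def __init__(self):
        self._jobs = queue.Queue()
        self.results = {}  # job_id -> (status, value or error message)
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, job_id, func, *args):
        """Called by the 'core' side: enqueue a job and return immediately."""
        self._jobs.put((job_id, func, args))

    def wait(self):
        """Block until every submitted job has finished."""
        self._jobs.join()

    def _run(self):
        while True:
            job_id, func, args = self._jobs.get()
            try:
                self.results[job_id] = ("success", func(*args))
            except Exception as exc:
                # A failing job is recorded and does not stop the worker.
                self.results[job_id] = ("error", str(exc))
            finally:
                self._jobs.task_done()


def replicate(image, target):
    """Stand-in for a replication task; it only describes the transfer."""
    return f"{image} -> {target}"
```

A caller would submit work with `svc.submit("job-1", replicate, "library/nginx:1.15", "harbor-dr.example.com")`, call `svc.wait()`, and then read the outcome from `svc.results["job-1"]`.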
Currently Harbor is packaged as both a docker-compose service definition and a Helm chart.
Want to learn more?
The best ways to learn about Harbor are:
- Our website: https://goharbor.io/
- Harbor’s CNCF webinar: https://www.cncf.io/event/webinar-harbor/
- Slides: https://drive.google.com/file/d/1F6nvZhtw6-bgwdlySXLq3OIvlC6f6Vvb/view?usp=sharing
Community stats and graphs
Harbor has continued an upward trajectory of community growth through 2018. The stats below visualize the consistent growth pre- and post-acceptance into the Cloud Native Sandbox:
Where we are
Harbor is both mature and production-ready. We know of dozens of large organizations leveraging Harbor in production, including at least one serving millions of container images to tens of thousands of compute nodes. The various components that comprise Harbor’s overall architecture are battle-tested in real-world deployments.
Harbor is API driven and is being used in custom SaaS and on-prem products by various vendors and companies. It’s easy to integrate Harbor in your environment, whether a customer-facing SaaS or an internal development pipeline.
The Harbor team strives to release quarterly. We’re currently working on our eighth major release, v1.7, due out soon. Over the last two releases alone we’ve made marked strides in achieving our long-term goals:
- Native support of Helm charts
- Initial support for deploying Harbor via Helm chart
- Refactoring of our persistence layer, now relying solely on PostgreSQL and Redis – this will help us achieve our high-availability goals
- Added labels and replication filtering based on labels
- Improvements to RBAC, including LDAP group-based access control
- Architecture simplification (i.e., collapsing admin server component responsibilities into core component)
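To make the label-based replication filtering in the list above more concrete, here is a minimal Python sketch of the idea: only images carrying every required label are selected for replication. The `Image` type, the `select_for_replication` function, and the sample labels are assumptions for illustration and do not reflect Harbor’s data model or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    """Hypothetical image record: repository, tag, and a set of labels."""
    repository: str
    tag: str
    labels: frozenset = field(default_factory=frozenset)


def select_for_replication(images, required_labels):
    """Return the images that carry every label in required_labels."""
    required = frozenset(required_labels)
    return [img for img in images if required <= img.labels]


images = [
    Image("library/nginx", "1.15", frozenset({"production", "scanned"})),
    Image("library/redis", "5.0", frozenset({"staging"})),
    Image("library/busybox", "latest"),
]
selected = select_for_replication(images, {"production"})
```

In this sample, `selected` contains only the `library/nginx` image, since it is the only one labeled `production`. An empty set of required labels selects every image.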
Where we’re going
This is the fun part. 🙂
Harbor is a vibrant community of users – those who use Harbor and publicly share their experiences, the individuals who report and respond to issues, the folks who hang around in our Slack community, and those who spend time on GitHub improving our code and documentation. We’re all incredibly fortunate to have the rich and exciting ideas that are proposed via GitHub issues on a regular basis.
We’re still working on our v1.8 roadmap, but here are some major features we’re considering and might land at some point in the future (timing to be determined, and contributions are welcome!):
- Quotas – system- and project-level quotas; networking quotas; bandwidth quotas; user quotas; etc.
- Replication – the ability to replicate to non-Harbor nodes.
- Image proxying and caching – a docker pull would proxy a request to, say, Docker Hub, then scan the image before providing it to the developer. Alternatively, pre-cache images and block images that do not meet vulnerability requirements.
- One-click upgrades and rollbacks of Harbor.
- Clustering – Harbor nodes should cluster, replicate metadata (users, RBAC and system configuration, vulnerability scan results, etc.). Support for wide-area clustering is a stretch goal.
- BitTorrent-backed storage – images are transparently transferred via BT protocol.
- Improved multi-tenancy – provide additional multi-tenancy construct (system → tenant → project)
Please feel free to share your wishlist of features via GitHub; just open an issue and share your thoughts. We keep track of items the community desires and will prioritize them based on demand.
How to get involved
Getting involved in Harbor is easy. Step 1: don’t be shy. We’re a friendly bunch of individuals working on an exciting open source project.
The lowest barrier to entry is joining us on Slack. Ask questions, give feedback, request help, share your ideas on how to improve the project, or just say hello!
We love GitHub issues and pull requests. If you think something can be improved, let us know. If you want to spend a few minutes fixing something yourself – docs, code, error messages, you name it – please feel free to open a PR. We’ve previously discussed how to contribute, so don’t be shy. If you need help with the PR process, the quickest way to get an answer is probably to ping us on Slack.
See you on GitHub!
Technical Oversight Committee Votes Harbor in as an Incubating Project
By James Zabala
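Harbor began in 2014 as an internal project meant to solve a simple problem: helping developers store container images. The cloud native landscape at the time was completely different from today’s, and tools like Kubernetes were only beginning to attract attention. By 2016 Harbor had matured and was open sourced, giving individuals and organizations trying to solve similar problems a new option. Continued growth in the user base also convinced us that Harbor was solving critical problems.
We were very excited when CNCF accepted Harbor as a Sandbox project in the summer of 2018. Although Harbor had already been open source for a few years, having a vendor-neutral home immediately increased community activity and user engagement on GitHub.
After becoming a Sandbox project, we immediately set about improving existing problems, including paying down technical debt, laying out a roadmap based on community feedback, and bringing developers who had actively contributed in the community onto the project. We also hold a community call every two weeks so that community users can give us their feedback more directly. Finally, we discussed and ratified a governance model that defines how the community operates.
Given Harbor’s already large user base, moving the project up to a CNCF Incubating project was naturally the right next step. The criteria for becoming an Incubating project can be found here. To qualify, we first had to demonstrate that we had met the requirements:
- Production usage: There must be users who have deployed the project to production environments and rely on its functionality for their business needs. Over the past few years we have worked closely with a number of large organizations to help solve the problems they encountered running Harbor in production. (Met!)
- Healthy maintainer team: There must be enough team members who can approve and accept new contributions to the project from the community. In addition to newly joined maintainers, the project’s founding team members still work on it full time. (Met!)
- Healthy, continuous growth: There must be a continuous flow of new code and features being submitted to the project’s codebase. Harbor released v1.6 in the summer of 2018, and v1.7 will be released soon. (Met!)
CNCF’s Technical Oversight Committee (TOC) evaluated the Harbor team’s proposal and concluded that we had met all the criteria. We are honored to join highly respected projects like gRPC, Fluentd, Envoy, Jaeger, Rook, and NATS.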
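What is Harbor?
Harbor is an open source cloud native registry that stores and signs container images and scans them for vulnerabilities.
Harbor solves common challenges by providing trust, compliance, performance, and interoperability. It gives a new option to organizations and applications that cannot use a public or cloud-based registry, or that want a consistent experience across multiple clouds.
Harbor addresses the following common use cases:
- On-prem container registry – organizations that want to host sensitive production images on-premises can use Harbor.
- Vulnerability scanning – organizations can scan images before they go into production. Images with security vulnerabilities are blocked from being pulled.
- Image signing – images are signed via Notary to ensure provenance.
- Role-based access control – integration with LDAP (and AD) to provide user- and group-level permissions.
- Image replication – production images can be replicated to different Harbor nodes, providing disaster recovery, load balancing, and the ability to replicate images to different geographic locations for more convenient image pulls.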
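Architecture
The “Harbor stack” contains several third-party components, including NGINX, Docker Distribution v2, Redis, and PostgreSQL. Harbor also relies on Clair for vulnerability scanning and on Notary for image signing.
The Harbor components highlighted in blue in the diagram are the core of Harbor and handle most of the processing:
- Core Services provides the API and UI. It intercepts docker push / pull requests to provide role-based access control, and prevents vulnerable images from being pulled and subsequently used in production (all of this is configurable).
- Admin service will be removed in v1.7, with its functionality merged into the core service.
- Job Service is mainly responsible for running background tasks (e.g., replication, one-shot or recurring vulnerability scans). Jobs are submitted by the core service and run in the job service component.
Harbor can currently be deployed via docker-compose and a Helm chart.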
More learning resources
The best ways to learn about Harbor:
- Harbor website: https://goharbor.io/
- Harbor’s CNCF webinar: https://www.cncf.io/event/webinar-harbor/
- Slides: https://drive.google.com/file/d/1F6nvZhtw6-bgwdlySXLq3OIvlC6f6Vvb/view?usp=sharing
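Community stats and graphs
Throughout 2018, the Harbor community has continued on an upward growth trend. The charts below show Harbor’s steady growth before and after joining CNCF as a Sandbox project.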
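Where we are
Harbor is a mature, production-ready project. A large number of companies are known to use Harbor in their production environments, including at least one serving millions of container images to tens of thousands of compute nodes, which shows that all of Harbor’s components have been tested in real-world deployments.
Harbor is API driven and has been integrated by many vendors and partner companies into their custom SaaS systems and on-prem products. Deploying Harbor in your environment is not difficult, whether that environment is a customer-facing SaaS or an internal development pipeline.
The Harbor team aims to release quarterly. We are working on our eighth major release, v1.7, which will be out soon. In the last two releases alone we have made significant progress toward Harbor’s long-term goals:
- Added support for Helm charts
- Support for deploying Harbor via Helm chart
- Refactored the persistence layer so that it relies solely on PostgreSQL and Redis, which helps achieve high availability
- Added labels, and support for replication filtered by label
- Enhanced RBAC, including LDAP group-based access control
- Simplified the architecture (merging the admin server component’s responsibilities into the core component)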
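Where we’re going
Now for the fun part. 🙂
Harbor is a vibrant community, and that depends on the active participation of its users: people who use Harbor and publicly share their experiences, report and respond to issues, follow the Slack community, and improve the code and documentation. Users regularly propose rich and exciting ideas on GitHub, and we are very fortunate to have such an active group of community participants.
We are still working on the v1.8 roadmap, but here are some major features we are considering that might land at some point in the future (timing to be determined, and contributions are welcome!):
- Quotas – system- and project-level quotas, networking quotas, bandwidth quotas, user quotas, etc.
- Replication – the ability to replicate to non-Harbor nodes
- Image proxying and caching – docker pull requests would be proxied; for example, an image pulled from Docker Hub could be scanned first and then provided to the developer. Images could also be pre-cached, and images that do not meet vulnerability requirements could be blocked from download
- One-click upgrades and rollbacks
- Clustering – Harbor nodes should cluster and replicate metadata among themselves (users, RBAC and system configuration, vulnerability scan results, etc.), eventually supporting wide-area clusters
- BitTorrent-backed storage – images are transferred transparently via the BT protocol
- Improved multi-tenancy – provide an additional multi-tenancy construct (system → tenant → project)
If there are new features you would like Harbor to have, just open an issue on GitHub and describe your idea. We keep track of the items the community asks for and prioritize them based on demand.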
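How to get involved
Getting involved in Harbor is easy. First, don’t be shy: the people working on this exciting open source project are a friendly bunch.
The lowest barrier to entry is joining us on Slack, where you can ask questions, give feedback, request help, share ideas on how to improve the project, or just say hello!
We love GitHub issues and pull requests. If you think something can be improved, let us know. If you want to spend some time fixing something yourself – docs, code, error messages, anything you can think of – please open a pull request (PR). We have previously discussed how to contribute, so don’t hesitate. If you need help with the PR process, find us on Slack, where you can get a quick answer.
See you on GitHub!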