GitLab CI checks off Wish’s list

Wish was looking for a scalable CI tool. What they found with GitLab offers so much more.

San Francisco
Wish was looking for a CI tool that could fulfill all their needs in one solution.
Wish was experiencing single points of failure, with no visibility into where breaks were happening because of toolchain complexity. Teams faced issues with testing, production, and scalability.
  • End-to-end visibility

  • Independently scalable features

  • Run multiple tests simultaneously

  • Effectively scales up testing

  • Frees up developers’ time

  • Continuous deployment cadence

  • Shifts workflow from debugging to innovation

  • Zero license restrictions

reduction in pipeline runtime
the number of runners
last minute fixes per release

the customer

Connecting consumers with products

Wish is one of the top five shopping applications in over 100 countries. The e-commerce retail platform connects consumers with global merchants and gives small businesses a connection to millions of customers worldwide.

Wish considers itself a ‘unicorn’ startup and is now valued at over $11.2 billion. The platform goes beyond connecting buyers with goods: it also provides a wish list of products that people might be interested in purchasing. Wish uses innovative technology to identify customers’ future shopping needs.

the challenge

Homegrown toolchain with a single point of failure

Wish was using TeamCity for CI and build management, but their automated jobs kept failing. The CI environment was precarious, and because the tool lacked visibility, it was difficult to see where breaks were happening. “It looked horrible. Very, very horrible. We had a really old TeamCity instance. It was very unmaintained, and had effectively been broken for at least six plus months, maybe a year when I came in and saw it,” said Weilu Jia, the Head of Tools and Automation Infrastructure at Wish.

Creating pipelines in TeamCity requires going through its UI. If there is a mismatch between what is written in the pipeline and what is available in the Git repository, it can cause the pipeline to fail, blocking any tests, builds, or deployments. Going back in to roll back changes is difficult. “If someone changed the pipeline, you didn’t know what they changed, and you better hope they remembered what they did,” Jia said. “So, there’s kind of this disjointedness between what was being run from what scripts the CI was trying to run and what scripts were available [in Git] for the CI to run.”

TeamCity has restrictive licensing, which means that Wish had to cap the number of agents that it could run. With a growing team, that just wasn’t going to work for Wish’s desire to scale. “It didn’t scale, so we couldn’t use it for testing every commit. It actually had a schedule, like an hourly job to run tests on an hourly cadence. It’s that you have so many commits in an hour, and now you don’t know which one caused the issue,” Jia said.
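
To make the cadence problem concrete, here is a minimal Python sketch. It is illustrative only: the commit IDs, the `is_broken` predicate, and both function names are hypothetical and do not come from Wish's setup or from any CI tool's API. It also assumes that a breaking commit keeps the build broken until it is fixed. Under those assumptions, a scheduled run over a batch can only report that something in the batch broke, while a per-commit run names the first failing commit.

```python
from typing import Callable, List, Optional


def hourly_batch_suspects(
    commits: List[str], is_broken: Callable[[str], bool]
) -> List[str]:
    """Simulate one scheduled run over a batch of commits.

    The run tests only the combined result, so when it fails every
    commit in the batch is a suspect. Returns an empty list on success.
    """
    failed = any(is_broken(commit) for commit in commits)
    return list(commits) if failed else []


def per_commit_culprit(
    commits: List[str], is_broken: Callable[[str], bool]
) -> Optional[str]:
    """Simulate a run on every commit and return the first failing one."""
    for commit in commits:
        if is_broken(commit):
            return commit
    return None


if __name__ == "__main__":
    # Hypothetical hour of work: five commits, one of which breaks the tests.
    commits = ["c1", "c2", "c3", "c4", "c5"]
    broken = {"c4"}
    suspects = hourly_batch_suspects(commits, lambda c: c in broken)
    culprit = per_commit_culprit(commits, lambda c: c in broken)
    print(len(suspects), culprit)  # prints "5 c4"
```

With the hourly run, all five commits have to be investigated; with per-commit runs, the failure points directly at `c4`.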

The development team used a homegrown toolchain that included GitHub for SCM and Phabricator for code reviews. Weak integration between these tools caused instability. The monitoring capabilities were also lacking, so the team was looking for a tool with solid integration with their GitHub repos.

the solution

A CI tool that works 100% of the time

Jia and the engineering team came up with a very specific list of requirements that they were looking for in a CI tool. The list includes:

  1. No single point of failure
  2. Scalability
  3. Monitoring to detect failures
  4. Integration with GitHub
  5. Service for customers
  6. Sensitive repos
  7. Support, particularly paid support
  8. Support for Docker
  9. Pipelines that run together
  10. No downtime upgrades

Jia and his team started evaluating CI systems in order to find a tool that would grow with the company. They had lots of tests and lots of automated jobs that they needed to run, so the team was looking for a tool that would work now and with future expansions. Wish evaluated more than five tools, including Jenkins, CircleCI, Travis CI, Drone, and others, before discovering GitLab.

Wish initially ran a POC with Jenkins. “It was the best thing at the time and it had been maybe 80% working out while we were POC-ing it,” Jia said. “Actually, funnily enough, we did not know about GitLab when we were doing that initial search, so we literally tried everything except for GitLab and out of the not-GitLab pieces, Jenkins was the best.”

Usually, when a team is in the midst of a POC, the tool under evaluation becomes part of the company’s tooling. However, soon after starting work in Jenkins, the team discovered GitLab and compared the two. “First, from the usability side, and then just from the list of requirements we had,” Jia said. “But, then, we also took a look at the architecture of GitLab and the architecture was better than Jenkins.”

“Jenkins has a single master host, and if that host ever went down, we would have no CI, especially. If we also want to use that for CD, then we have a problem that we can't do rollbacks, hotfixes, etc.”
Weilu Jia
Head of Tools and Automation Infrastructure at Wish

the results

More than just a CI tool

After evaluating and settling on GitLab for its CI capabilities, Wish expedited the move to GitLab and shut down TeamCity completely. They’ve since discovered other services that GitLab offers, including SCM, Kubernetes support for CD, Docker containers, support for Prometheus, and security features. Wish currently uses GitLab for its CI and CD pipelines. GitLab now allows the team to deploy products multiple times a day, depending on how often the code changes. They no longer have a restricted cadence and can practice continuous delivery.

All the tests that were previously unmaintained have been migrated to GitLab and now run on every commit. Because GitLab runners scale, the team was able to add several servers, which saves engineering time. “That was really great for us because it immediately caught a bunch of bugs before that would have normally hit QA,” Jia said. The QA process became a lot faster because bugs are easier to catch and are caught sooner in the process. Servers now absorb work that used to cost engineers time. “Just having that scalability and flexibility on that is super helpful for us because servers are way cheaper than engineering time,” he said.

The engineering team has also reduced the time it takes to make last-minute fixes in the QA environment. It previously took about 15 to 25 minutes and has now dropped to under 10 minutes. The team initially set an internal SLA of 10 minutes, but they haven’t needed to leverage it, which means that testing is catching more bugs earlier.

The benefits of having a simplified toolchain are being felt throughout the engineering team. “I think most of the value is in engineering time saved,” Jia said. “Having our toolchain be highly available and having it able to scale is what we were seeking. Now that we have more engineers and saving them time, that is the big value we’re getting out of it.”

With the move to GitLab for their CI/CD needs, Wish has removed the bottlenecks associated with complex toolchains. For an e-commerce company, a development toolchain that scales is vital. The updated toolchain helps the company meet its Black Friday demands by adding hardware to maintain pipeline speed.

“Just having the real scalable architecture of GitLab, how every piece is independently scalable was good for us because we [the team] are all scalability nerds. So, looking at that architecture, we can say, ‘Oh my god, it’s sane architecture.’”
Weilu Jia
Head of Tools and Automation Infrastructure at Wish

All information and persons involved in this case study are accurate at the time of publication.

Git is a registered trademark of the Software Freedom Conservancy, and GitLab is a registered trademark of GitLab B.V.; we are authorized to use “极狐GitLab”.