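1. Digging Myself Deeper into Trouble
The GitLab version on my code server had fallen far behind the stable release: the latest was already 14.0.5, while mine was still on 13.9.1. So I started thinking about how to upgrade. Before upgrading I carefully made a backup, and that backup is exactly what I ended up tripping over (QAQ). I run the Docker version, so I reused a command I had kept from before: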
docker run -d --name gitlab --restart=always --hostname=code.lintian.co -p 9080:80 \
--volume $GITLAB_HOME/config:/var/opt/gitlab/config \
--volume $GITLAB_HOME/log:/var/opt/gitlab/log \
--volume $GITLAB_HOME/data:/var/opt/gitlab/data \
gitlab/gitlab-ce:13.9.1-ce.0
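Then my brain short-circuited: I forgot to set the environment variable, so the directories Docker mapped were nowhere near the real ones, and the GitLab that came up was a brand-new instance every time.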
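With GITLAB_HOME unset, the shell expands `$GITLAB_HOME/config` to `/config`, so the volumes silently point at the wrong place. One way to guard against this is to check the variable before starting the container. This is only a minimal sketch, assuming a POSIX shell; `require_gitlab_home` is a hypothetical helper name, not something Docker or GitLab provides:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if GITLAB_HOME is set and is an existing directory.
require_gitlab_home() {
    if [ -z "${GITLAB_HOME:-}" ]; then
        echo "GITLAB_HOME is not set; refusing to start the container" >&2
        return 1
    fi
    if [ ! -d "$GITLAB_HOME" ]; then
        echo "GITLAB_HOME=$GITLAB_HOME is not an existing directory" >&2
        return 1
    fi
}

# Usage (not executed here): only start the container when the check passes.
# require_gitlab_home && docker run -d --name gitlab ... gitlab/gitlab-ce:13.9.1-ce.0
```

The check is deliberately strict: it also rejects a path that does not exist yet, since starting against an empty directory would give the same "fresh instance" symptom.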
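And then... then I foolishly went hunting for the cause and never found it. In a fit of anger, and reasoning that I had a backup anyway, I deleted every file and all the configuration except the backup archive.
After that came the miserable road of restoring and upgrading: I found the mistake in the earlier command line, reinstalled the original version, restored the backup, and upgraded, going all the way 13.9.1 -> 13.10.5 -> 13.12.7 -> 14.0.5.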
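The upgrade stops above can be written down as a dry-run plan before touching the server. The sketch below only prints commands and runs nothing. The version numbers come from this post; the `-ce.0` suffix on the two middle tags and the helper name `print_upgrade_plan` are assumptions, and the volume and port options are left out:

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for each upgrade stop, executes none of them.
# Versions are the stops listed in the post; the "-ce.0" suffix is assumed for every tag.
UPGRADE_VERSIONS="13.9.1 13.10.5 13.12.7 14.0.5"

print_upgrade_plan() {
    for version in $UPGRADE_VERSIONS; do
        image="gitlab/gitlab-ce:${version}-ce.0"
        echo "# --- stop at $version ---"
        echo "docker pull $image"
        echo "docker stop gitlab"
        echo "docker rm gitlab"
        # Volume and port options are omitted here; reuse the ones from the original command.
        echo "docker run -d --name gitlab ... $image"
        echo "# wait until GitLab is healthy and background migrations have finished"
    done
}

print_upgrade_plan
```

Printing the plan first makes it easy to check that no stop is skipped and that every step uses the same container name.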
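By the time the restore and upgrade were done it was the middle of the night. I could log in and see the code, so I went to bed.
The next morning I found out something was badly wrong: many pages returned a 500 error as soon as I clicked them (T_T).
A quick search revealed that the files I had deleted the day before included the all-important gitlab-secrets.json.
The consequences were far-reaching......
The road to recovery was long.....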
2. Environment
Host: openSUSE Leap 15.2
Docker: Docker version 20.10.6-ce
GitLab: gitlab/gitlab-ce:14.0.5-ce.0
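3. Repair Steps
All the operations are listed here up front; their impact is covered in the next section. There are only a few steps, but finding them still cost me several hours of painful searching.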
root@code:/# gitlab-rails console
--------------------------------------------------------------------------------
Ruby: ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-linux]
GitLab: 14.0.5 (25fc1060aff) FOSS
GitLab Shell: 13.19.0
PostgreSQL: 12.6
--------------------------------------------------------------------------------
Loading production environment (Rails 6.1.3.2)
irb(main):003:0> settings = ApplicationSetting.current
=> #<ApplicationSetting id: 1, default_projects_limit: 100000, signup_enabled: true, gravatar_enabled: true, sign_i...
irb(main):004:0> settings.update_column(:runners_registration_token_encrypted, nil)
=> true
irb(main):005:0> exit
root@code:/etc/gitlab# gitlab-rails dbconsole
psql (12.6)
Type "help" for help.
gitlabhq_production=# DELETE FROM public."ci_group_variables";
(0 rows)
gitlabhq_production=# DELETE FROM public."ci_variables";
(0 rows)
gitlabhq_production=# UPDATE projects SET runners_token = null, runners_token_encrypted = null;
UPDATE 13
gitlabhq_production=# UPDATE namespaces SET runners_token = null, runners_token_encrypted = null;
UPDATE 11
gitlabhq_production=# UPDATE application_settings SET runners_registration_token_encrypted = null;
UPDATE 1
gitlabhq_production=# UPDATE application_settings SET encrypted_ci_jwt_signing_key = null;
UPDATE 1
gitlabhq_production=# UPDATE ci_runners SET token = null, token_encrypted = null;
UPDATE 3
gitlabhq_production=# UPDATE ci_builds SET token = null, token_encrypted = null;
UPDATE 44
gitlabhq_production=# TRUNCATE web_hooks CASCADE;
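4. The Price to Pay
gitlab-secrets.json stores the secret keys GitLab uses for encryption. If the file is lost, any operation that has to decrypt existing data fails because the key no longer matches, and the request returns a 500 error.
The following error shows up in production.log: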
Processing by HelpController#index as */*
Rendered layout layouts/help.html.haml (Duration: 72.3ms | Allocations: 80060)
Completed 200 OK in 76ms (Views: 71.8ms | ActiveRecord: 0.9ms | Elasticsearch: 0.0ms | Allocations: 81723)
Started PATCH "/admin/application_settings/ci_cd" for 192.168.1.254 at 2021-07-15 08:48:35 +0000
Processing by Admin::ApplicationSettingsController#ci_cd as HTML
Parameters: {"authenticity_token"=>"[FILTERED]", "application_setting"=>{"auto_devops_enabled"=>"0", "auto_devops_domain"=>"", "shared_runners_enabled"=>"1", "shared_runners_text"=>"", "max_artifacts_size"=>"100", "default_artifacts_expire_in"=>"30 days", "keep_latest_artifact"=>"1", "archive_builds_in_human_readable"=>"", "protected_ci_variables"=>"[FILTERED]", "default_ci_config_path"=>""}}
Completed 500 Internal Server Error in 175ms (ActiveRecord: 131.5ms | Elasticsearch: 0.0ms | Allocations: 30681)
OpenSSL::Cipher::CipherError ():
app/services/application_settings/update_service.rb:50:in `update_settings'
lib/gitlab/metrics/instrumentation.rb:160:in `block in update_settings'
lib/gitlab/metrics/method_call.rb:27:in `measure'
lib/gitlab/metrics/instrumentation.rb:160:in `update_settings'
app/services/application_settings/update_service.rb:12:in `execute'
lib/gitlab/metrics/instrumentation.rb:160:in `block in execute'
lib/gitlab/metrics/method_call.rb:27:in `measure'
lib/gitlab/metrics/instrumentation.rb:160:in `execute'
app/controllers/admin/application_settings_controller.rb:259:in `perform_update'
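To get a quick sense of how often this error occurs, the log can be searched for the exception name shown above. This is a minimal sketch: `count_cipher_errors` is a hypothetical helper name, and the default path is the usual Omnibus log location inside the container, which is an assumption about this setup:

```shell
#!/bin/sh
# Hypothetical helper: count log lines that mention OpenSSL::Cipher::CipherError.
# $1 = log file; defaults to the usual Omnibus location (assumed for this setup).
# Note: grep -c prints 0 but exits non-zero when nothing matches.
count_cipher_errors() {
    log_file=${1:-/var/log/gitlab/gitlab-rails/production.log}
    grep -c 'OpenSSL::Cipher::CipherError' "$log_file"
}

# Usage (not executed here):
# count_cipher_errors /var/log/gitlab/gitlab-rails/production.log
```

A count that keeps growing after the repair steps suggests some encrypted column is still holding data from the old key.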
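Losing this file means the following settings can no longer be encrypted or decrypted:
- Two-factor authentication: users with 2FA enabled get an error at login
- CI/CD: global and project-level CI/CD settings cannot be configured
- Kubernetes clusters: clusters cannot be configured
- Custom Pages domains
- Project error tracking
- Runners: global and project-level runners cannot be configured
- Project mirroring
- Web hooks
So how do you confirm that the current secrets are fine?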
gitlab-rake gitlab:doctor:secrets
One more thing:
If you can recover gitlab-secrets.json, so much the better. Put it back, run gitlab-ctl reconfigure, and everything is restored.
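The application backup does not contain gitlab-secrets.json or gitlab.rb, so both need to be copied separately. Below is a minimal sketch of such a copy step; `backup_gitlab_config` is a hypothetical helper, and the config directory argument is assumed to be whatever host directory is mounted at /etc/gitlab in the container:

```shell
#!/bin/sh
# Hypothetical helper: copy gitlab-secrets.json and gitlab.rb into a timestamped directory.
# $1 = GitLab config directory (the host directory mounted at /etc/gitlab; assumed)
# $2 = directory under which the timestamped copy is created
# Prints the path of the new backup directory on success.
backup_gitlab_config() {
    config_dir=$1
    dest_root=$2
    dest="$dest_root/gitlab-config-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    for name in gitlab-secrets.json gitlab.rb; do
        if [ -f "$config_dir/$name" ]; then
            # -p keeps mode and timestamps of the original file
            cp -p "$config_dir/$name" "$dest/" || return 1
        else
            echo "warning: $config_dir/$name not found" >&2
        fi
    done
    echo "$dest"
}

# Usage (not executed here):
# backup_gitlab_config "$GITLAB_HOME/config" /srv/backups
```

Keeping this copy next to the backup archive means a restore onto a clean machine has both the data and the key needed to decrypt it.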