OVHcloud Public Cloud Status

Error 500 can occur on image push/pull
Incident Report for Public Cloud
Resolved
There is a known issue impacting the Managed Private Registry service, which results in image push/pull operations sometimes failing with an Error 500.

The issue seems to be linked to the way Harbor interacts with the S3/Swift services.

Example:
\"error.message\":\"unknown error\"
\"err.detail\":\"s3aws: Path not found: /docker/registry/v2/repositories/{my_project}/{my_repository}/_uploads/{layer_id}/data\"
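
When scanning registry logs for occurrences of this incident, the error detail above can be matched with a regular expression. The following is a minimal Python sketch, not part of Harbor or of the Managed Private Registry service. The function name `match_upload_error` and the group names are hypothetical, and the sketch assumes the log line contains the detail string in the form quoted above.

```python
import re

# Pattern for the error detail quoted in this report. The group names are
# illustrative; the actual path components are whatever the registry logs.
UPLOAD_PATH_NOT_FOUND = re.compile(
    r"s3aws: Path not found: "
    r"/docker/registry/v2/repositories/"
    r"(?P<project>[^/]+)/(?P<repository>.+?)/"
    r"_uploads/(?P<layer_id>[^/]+)/data"
)


def match_upload_error(log_line):
    """Return the project, repository and layer id if the line shows this
    error, otherwise None."""
    found = UPLOAD_PATH_NOT_FOUND.search(log_line)
    return found.groupdict() if found else None
```

Lines that do not contain the `s3aws: Path not found` detail return `None`, so the function can be used as a filter over a log file.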

Update(s):

Date: 2021-07-28 16:01:05 UTC
Please see the latest commentary on the incident



Date: 2021-07-28 15:58:46 UTC
Start time: 29/03/2021, 09:15 UTC
End time: 28/07/2021, 15:14 UTC
Service impact: There is a known issue impacting the Managed Private Registry service, which results in image push/pull operations sometimes failing with an Error 500.
Root cause: Swift storage update
Comment: The bug is resolved on the new Harbor 2 registry architecture but is still present on the old Harbor 1.10 architecture. We recommend migrating to the new infrastructure by ordering new registries and synchronizing the old registries with the new ones. To avoid double billing, the following link provides a voucher and a synchronization guide: https://survey.ovh.com/index.php/852914?newtest=Y&lang=en

Date: 2021-07-28 15:08:38 UTC
The fix is deployed and we are in a monitoring phase.
If you still encounter this issue, the workaround is to push the image again.
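
The retry workaround can be automated. Below is a minimal Python sketch, assuming the Docker CLI is installed and already logged in to the registry. The function name `push_with_retry`, the attempt count and the delay are illustrative choices, not values recommended in this report.

```python
import subprocess
import time


def push_with_retry(image, attempts=3, delay_seconds=5.0, run=subprocess.run):
    """Run `docker push`, retrying when the command exits with a non-zero status.

    `attempts` and `delay_seconds` are illustrative defaults. `run` is
    injectable so the retry loop can be exercised without Docker installed.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, attempts + 1):
        result = run(["docker", "push", image])
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Wait before the next try; no wait after the final failure.
            time.sleep(delay_seconds)
    raise RuntimeError(f"docker push {image} failed after {attempts} attempts")
```

The sketch retries on any non-zero exit status, not only on an Error 500, so a push that fails for another reason (for example, bad credentials) is also retried before the error is raised.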

Date: 2021-07-22 10:16:55 UTC
The fix is deployed and we are in a monitoring phase.
If you still encounter this issue, the workaround is to push the image again.

Date: 2021-07-19 16:42:57 UTC
A fix has been deployed, but it has not yet been tested; no feedback has been received so far.



Date: 2021-07-16 08:34:58 UTC
The investigation is still ongoing.
If you encounter this issue, the workaround is to try pushing the image again.

Date: 2021-07-07 16:51:32 UTC
The investigation is still ongoing.

Date: 2021-07-07 07:35:11 UTC
The investigation is still ongoing.
If you encounter this issue, the workaround is to try pushing the image again.

Date: 2021-07-01 12:37:28 UTC
The layer data is stored under the path `/docker/registry/v2/repositories/{my_project}/{my_repository}/_uploads/{layer_id}/data/XXXXXXXXXXXXXXX`

For an unknown reason, the `data` folder is sometimes not created, so the registry cannot get the status and the size of the folder.
The behavior depends on the layer size and therefore on the number of chunks.

The issue is hard to reproduce, so I am comparing existing occurrences to identify the common factor and find the bug.
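
As a rough illustration of the two observations above, the Python sketch below builds the upload path from the template, checks whether anything exists under it in a listing of object keys, and shows how the layer size determines the number of chunks. It does not reproduce the bug. All helper names are hypothetical, and the chunk size is an assumption because this update does not state which chunk size the registry uses.

```python
UPLOAD_DATA_TEMPLATE = (
    "/docker/registry/v2/repositories/"
    "{project}/{repository}/_uploads/{layer_id}/data"
)


def upload_data_path(project, repository, layer_id):
    """Build the `data` path of an in-progress layer upload."""
    return UPLOAD_DATA_TEMPLATE.format(
        project=project, repository=repository, layer_id=layer_id
    )


def has_upload_data(object_keys, data_path):
    """True if the `data` path itself or any object beneath it is listed."""
    prefix = data_path + "/"
    return any(key == data_path or key.startswith(prefix) for key in object_keys)


def chunk_count(layer_size_bytes, chunk_size_bytes):
    """Number of chunks needed for a layer (ceiling division)."""
    if chunk_size_bytes <= 0:
        raise ValueError("chunk_size_bytes must be positive")
    return (layer_size_bytes + chunk_size_bytes - 1) // chunk_size_bytes
```

With an assumed chunk size of 10 MiB, a 25 MiB layer needs 3 chunks while a 10 MiB layer needs only 1, which is one way in which layers of different sizes follow different upload sequences.
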
Posted Mar 29, 2021 - 09:15 UTC