Title: Still seeing the issue for endpoints staying out of sync · Issue #126578 · kubernetes/kubernetes · GitHub
Description: What happened? This issue #125638 was supposed to have fixed the issue where endpoints stay out of sync I0807 14:01:51.613700 2 endpoints_controller.go:348] "Error syncing endpoints, retrying" service="test1/test-qa" err="endpoints inform...
Opengraph URL: https://github.com/kubernetes/kubernetes/issues/126578
Author: kedar700 (https://github.com/kedar700)
Date Published: 2024-08-07T14:22:04.000Z
Comments: 16
URL: https://github.com/kubernetes/kubernetes/issues/126578

### What happened?

The fix for https://github.com/kubernetes/kubernetes/issues/125638 was supposed to resolve the problem where endpoints stay out of sync, but I am still seeing it:

```
I0807 14:01:51.613700 2 endpoints_controller.go:348] "Error syncing endpoints, retrying" service="test1/test-qa" err="endpoints informer cache is out of date, resource version 10168236546 already processed for endpoints test1/test-qa"
I0807 14:01:51.624576 2 endpoints_controller.go:348] "Error syncing endpoints, retrying" service="test1/test-qa" err="endpoints informer cache is out of date, resource version 10168236546 already processed for endpoints test1/test-qa"
I0807 14:01:51.645704 2 endpoints_controller.go:348] "Error syncing endpoints, retrying" service="test1/test-qa" err="endpoints informer cache is out of date, resource version 10168236546 already processed for endpoints test1/test-qa"
I0807 14:01:51.686942 2 endpoints_controller.go:348] "Error syncing endpoints, retrying" service="test1/test-qa" err="endpoints informer cache is out of date, resource version 10168236546 already processed for endpoints test1/test-qa"
I0807 14:01:51.768648 2 endpoints_controller.go:348] "Error syncing endpoints, retrying" service="test1/test-qa" err="endpoints informer cache is out of date, resource version 10168236546 already processed for endpoints test1/test-qa"
I0807 14:01:51.808043 2 endpoints_controller.go:348] "Error syncing endpoints, retrying" service="test1/test2-qa" err="endpoints informer cache is out of date, resource version 10168250766 already processed for endpoints test1/test2-qa"
I0807 14:01:51.930345 2 endpoints_controller.go:348] "Error syncing endpoints, retrying" service="test1/test-qa" err="endpoints informer cache is out of date, resource version 10168236546 already processed for endpoints test1/test-qa"
```

I also wrote a small script that lists the Endpoints whose addresses are out of sync with the corresponding EndpointSlices:

```
from kubernetes.client import CoreV1Api, DiscoveryV1Api
from hubspot_kube_utils.client import build_kube_client
import json
from datetime import datetime


def extract_ips_from_endpoint(endpoint):
    # Collect both ready and not-ready addresses from every subset.
    ips = set()
    if endpoint.subsets:
        for subset in endpoint.subsets:
            if subset.addresses:
                ips.update(addr.ip for addr in subset.addresses)
            if subset.not_ready_addresses:
                ips.update(addr.ip for addr in subset.not_ready_addresses)
    return ips


def extract_ips_from_endpoint_slice(endpoint_slice):
    # Each endpoint in a slice carries a list of address strings.
    if not endpoint_slice.endpoints:
        return set()
    return set(address for endpoint in endpoint_slice.endpoints
               for address in (endpoint.addresses or []))


def compare_endpoints_and_slices(core_client, discovery_client):
    all_mismatches = []

    try:
        namespaces = core_client.list_namespace()
    except Exception as e:
        print(f"Error listing namespaces: {e}")
        return all_mismatches

    for ns in namespaces.items:
        namespace = ns.metadata.name
        print(f"Processing namespace: {namespace}")

        try:
            endpoints = core_client.list_namespaced_endpoints(namespace)
        except Exception as e:
            print(f"Error listing endpoints in namespace {namespace}: {e}")
            continue

        for endpoint in endpoints.items:
            name = endpoint.metadata.name

            # EndpointSlices are tied to their Service through this label.
            try:
                slices = discovery_client.list_namespaced_endpoint_slice(
                    namespace, label_selector=f"kubernetes.io/service-name={name}")
            except Exception as e:
                print(f"Error listing endpoint slices for service {name} in namespace {namespace}: {e}")
                continue

            endpoint_ips = extract_ips_from_endpoint(endpoint)
            slice_ips = set()

            for endpoint_slice in slices.items:
                slice_ips.update(extract_ips_from_endpoint_slice(endpoint_slice))

            # Record the service when the two address sets differ.
            if endpoint_ips != slice_ips:
                mismatch = {
                    "namespace": namespace,
                    "service_name": name,
                    "endpoint_ips": list(endpoint_ips),
                    "slice_ips": list(slice_ips),
                    "missing_in_endpoint": list(slice_ips - endpoint_ips),
                    "missing_in_slice": list(endpoint_ips - slice_ips)
                }
                all_mismatches.append(mismatch)

        print(f"Completed processing namespace: {namespace}")
        print("---")

    return all_mismatches


def save_to_json(data, cluster_name):
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    filename = f"{cluster_name}_mismatches_{timestamp}.json"

    with open(filename, 'w') as f:
        json.dump(data, f, indent=2)

    print(f"Mismatch data for cluster {cluster_name} saved to {filename}")


def main():
    clusters = ["test"]
    all_cluster_mismatches = {}

    for cluster_name in clusters:
        print(f"Processing cluster: {cluster_name}")

        try:
            kube_client = build_kube_client(host="TEST",
                                            token="TOKEN")

            core_client = CoreV1Api(kube_client)
            discovery_client = DiscoveryV1Api(kube_client)

            mismatches = compare_endpoints_and_slices(core_client, discovery_client)

            all_cluster_mismatches[cluster_name] = mismatches

            save_to_json(mismatches, cluster_name)

            print(f"Completed processing cluster: {cluster_name}")
            print(f"Total mismatches found in this cluster: {len(mismatches)}")
        except Exception as e:
            print(f"Error processing cluster {cluster_name}: {e}")


if __name__ == "__main__":
    main()
```

### What did you expect to happen?

I expect the endpoints to eventually sync and reflect the most up-to-date information.

### How can we reproduce it (as minimally and precisely as possible)?

I have just deployed the newer patch to our cluster, and the result is that endpoints are never updated once their status goes out of sync.

### Anything else we need to know?

_No response_

### Kubernetes version

Client Version: v1.29.7
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.7

### Cloud provider

<details>

</details>

### OS version

almalinux-9

### Install tools

<details>

</details>

### Container runtime (CRI) and version (if applicable)

cri-o

### Related plugins (CNI, CSI, ...) and versions (if applicable)

_No response_
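A rough way to confirm that the controller keeps retrying on the same resource version is to tally the "Error syncing endpoints, retrying" lines from the log excerpt under "What happened?" per service and resource version. The sketch below is not part of the original report: the function name `count_stale_retries`, the `STALE_RE` pattern, and the `SAMPLE_LINES` list are illustrative assumptions, and the pattern assumes the log format stays exactly as shown in the excerpt.

```python
import re
from collections import Counter

# Assumption: log lines follow the format quoted in the report.
STALE_RE = re.compile(
    r'service="(?P<service>[^"]+)"\s+'
    r'err="endpoints informer cache is out of date, '
    r'resource version (?P<rv>\d+) already processed'
)

# Three lines copied from the report, used only as sample input.
SAMPLE_LINES = [
    'I0807 14:01:51.613700 2 endpoints_controller.go:348] "Error syncing endpoints, retrying" service="test1/test-qa" err="endpoints informer cache is out of date, resource version 10168236546 already processed for endpoints test1/test-qa"',
    'I0807 14:01:51.624576 2 endpoints_controller.go:348] "Error syncing endpoints, retrying" service="test1/test-qa" err="endpoints informer cache is out of date, resource version 10168236546 already processed for endpoints test1/test-qa"',
    'I0807 14:01:51.808043 2 endpoints_controller.go:348] "Error syncing endpoints, retrying" service="test1/test2-qa" err="endpoints informer cache is out of date, resource version 10168250766 already processed for endpoints test1/test2-qa"',
]


def count_stale_retries(lines):
    """Count retry errors per (service, resource version) pair."""
    counts = Counter()
    for line in lines:
        match = STALE_RE.search(line)
        if match:
            counts[(match.group("service"), match.group("rv"))] += 1
    return counts


if __name__ == "__main__":
    # Print the most frequently retried pairs first.
    for (service, rv), n in count_stale_retries(SAMPLE_LINES).most_common():
        print(f"{n}\t{service}\t{rv}")
```

A count that keeps growing for one service while its resource version stays fixed would be consistent with the Endpoints object never being resynced; it does not by itself identify the cause.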