chore(deps): update dependency ray to v2.54.0 [security] #6633

Merged
renovate[bot] merged 3 commits into develop from renovate/pypi-ray-vulnerability
Mar 2, 2026

renovate bot (Contributor) commented Feb 20, 2026

ℹ️ Note

This PR body was truncated due to platform limits.

This PR contains the following updates:

Package   Change
ray       2.53.0 → 2.54.0

GitHub Vulnerability Alerts

CVE-2026-27482

Summary

Ray’s dashboard HTTP server blocks browser-origin POST/PUT requests but does not cover DELETE, and key DELETE endpoints are unauthenticated by default. If the dashboard/agent is reachable (e.g., --dashboard-host=0.0.0.0), a web page using DNS rebinding or same-network access can issue DELETE requests that shut down Serve or delete jobs without user interaction. The result is a drive-by availability impact.

Details

  • Middleware: python/ray/dashboard/http_server_head.py#get_browsers_no_post_put_middleware only checks POST/PUT via is_browser_request (UA/Origin/Sec-Fetch heuristics). DELETE is not gated.
  • Endpoints lacking browser protection/auth by default:
    • python/ray/dashboard/modules/serve/serve_head.py: @routes.delete("/api/serve/applications/") calls serve.shutdown().
    • python/ray/dashboard/modules/job/job_head.py: @routes.delete("/api/jobs/{job_or_submission_id}").
    • python/ray/dashboard/modules/job/job_agent.py: @routes.delete("/api/job_agent/jobs/{job_or_submission_id}") (not wrapped with deny_browser_requests either).
  • Dashboard token auth is optional and off by default; binding to 0.0.0.0 is common for remote access.
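
The gap described above comes from gating a fixed list of methods. The following is a minimal Python sketch of the difference between that deny-list and an allow-list of read-only methods. The names `looks_like_browser`, `blocked_by_deny_list`, and `blocked_by_allow_list` are hypothetical, and the header heuristic is a simplified assumption, not Ray's actual `is_browser_request` logic.

```python
from typing import Mapping

# Methods treated as read-only; everything else counts as a mutation.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


def looks_like_browser(headers: Mapping[str, str]) -> bool:
    """Simplified stand-in for a UA/Origin/Sec-Fetch heuristic (assumption)."""
    lowered = {key.lower(): value for key, value in headers.items()}
    if lowered.get("user-agent", "").startswith("Mozilla"):
        return True
    return "origin" in lowered or any(k.startswith("sec-fetch-") for k in lowered)


def blocked_by_deny_list(method: str, headers: Mapping[str, str]) -> bool:
    """Deny-list: only POST and PUT from browsers are rejected."""
    return method.upper() in {"POST", "PUT"} and looks_like_browser(headers)


def blocked_by_allow_list(method: str, headers: Mapping[str, str]) -> bool:
    """Allow-list: browsers may only use read-only methods."""
    return method.upper() not in SAFE_METHODS and looks_like_browser(headers)


browser_headers = {"User-Agent": "Mozilla/5.0", "Origin": "http://attacker.example"}
print(blocked_by_deny_list("DELETE", browser_headers))   # False: DELETE slips through
print(blocked_by_allow_list("DELETE", browser_headers))  # True
print(blocked_by_allow_list("DELETE", {"User-Agent": "python-requests/2.32"}))  # False
```

With an allow-list, PATCH and any other mutating method are covered without listing them individually, while non-browser clients such as CLI tools are unaffected by this particular check.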

PoC

Prereqs: dashboard reachable (e.g., ray start --head --dashboard-host=0.0.0.0), no token auth.

  1. Start Serve (or have jobs present).
  2. From any browser-reachable origin (DNS rebinding or same-LAN page), issue a DELETE fetch:

     fetch("http://<dashboard-host>:8265/api/serve/applications/", {
       method: "DELETE",
       headers: { "User-Agent": "Mozilla/5.0" }  // browsers set this automatically
     });

     Result: Serve shuts down.
  3. Similarly, delete jobs:

     fetch("http://<dashboard-host>:8265/api/jobs/<job_or_submission_id>", { method: "DELETE" });
     fetch("http://<dashboard-agent>:52365/api/job_agent/jobs/<job_or_submission_id>", { method: "DELETE" });

Browsers will send the Mozilla UA and Origin/Sec-Fetch headers, but DELETE is not blocked by the middleware, so the requests succeed.

Impact

  • Availability loss: Serve shutdown; job deletion. Triggerable via drive-by browser requests if the dashboard/agent ports are reachable and auth is disabled (default).
  • No code execution from this vector, but breaks isolation/trust assumptions for “developer-only” endpoints.

Fix

The fix for this vulnerability is to update to Ray 2.54.0 or higher.

Fix PR: https://github.com/ray-project/ray/pull/60526
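
To confirm that an environment has picked up the fixed release, a small check along the following lines may help. This is a sketch, not part of Ray or Renovate: `parse_release` and `is_patched` are hypothetical helpers, and the parser only reads the leading X.Y.Z part, so pre-release or local suffixes are ignored and should be checked by hand.

```python
import re
from importlib import metadata

PATCHED = (2, 54, 0)  # first release containing the fix, per the advisory


def parse_release(version: str) -> tuple:
    """Extract the leading numeric X.Y.Z part; suffixes such as 'rc1' are ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version: str) -> bool:
    """Return True when the version is at or above the patched release."""
    return parse_release(version) >= PATCHED


if __name__ == "__main__":
    try:
        installed = metadata.version("ray")
    except metadata.PackageNotFoundError:
        print("ray is not installed in this environment")
    else:
        status = "patched" if is_patched(installed) else "vulnerable, upgrade to >= 2.54.0"
        print(f"ray {installed}: {status}")
```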


Release Notes

ray-project/ray (ray)

v2.54.0

Compare Source

Ray Data

🎉 New Features

  • Add checkpointing support to Ray Data (#​59409)
  • Compute Expressions: list operations (#​59346), fixed-size arrays (#​58741), string padding (#​59552), logarithmic (#​59549), trigonometric (#​59712), arithmetic (#​59678), and rounding (#​59295)
  • Add sql_params support to read_sql (#​60030)
  • Add AsList aggregation (#​59920)
  • Support CountDistinct aggregate (#​59030)
  • Add credential provider abstraction for Databricks UC datasource (#​60457)
  • Support callable classes for UDFExpr (#​56725)
  • Add autoscaler metrics to Data Dashboard (#​60472)
  • Add optional filesystem parameter to download expression (#​60677)
  • Allow specifying partitioning style or flavor in write_parquet() (#​59102)
  • New cluster autoscaler enabled by default (#​60474)

💫 Enhancements

  • Improve numerical stability in scalers by handling near-zero values (#​60488)
  • Export dataset operator output schema to event logger (#​60086)
  • Iceberg: add retry policy for Storage + Catalog writes (#​60620)
  • Iceberg: remove calls to Catalog Table in write tasks (#​60476)
  • Expose logical operators and rules via package exports (#​60297, #​60296)
  • Demote Sort from requiring preserve_order (#​60555)
  • Improve appearance of repr(dataset) (#​59631)
  • Allow configuring DefaultClusterAutoscalerV2 thresholds via env vars (#​60133)
  • Use Arrow IPC for Arrow Schema serialization/deserialization (#​60195)
  • Store _source_paths in object store to prevent excessive spilling during read task serialization (#​59999)
  • Add more shuffle fusion rules (#​59985)
  • Enable and tune DownstreamCapacityBackpressurePolicy (#​59753)
  • Enable concurrency cap backpressure with tuning (#​59392)
  • Set default actor pool scale up threshold to 1.75 (#​59512)
  • Don't downscale actors if the operator hasn't received any inputs (#​59883)
  • Don't reserve GPU budget for non-GPU tasks (#​59789)
  • Only return selected data columns in hive-partitioned Parquet files (#​60236)
  • Ordered + FIFO bundle queue (#​60228)
  • Add node_id, pid, attempt number for hanging tasks (#​59793)
  • Revise resource allocator task scheduling to factor in pending task outputs (#​60639)
  • Track block serialization time (#​60574)
  • Use metrics from OpRuntimeMetrics for progress (#​60304)
  • Tabular form for streaming executor op metrics (#​59774)
  • Info-log cluster scale-up decisions (#​60357)
  • Use plain mode instead of grid mode for OpMetrics logging (#​59907)
  • Progress reporting refactors (#​59350, #​59629, #​59880)
  • Remove deprecated TENSOR_COLUMN_NAME constant (#​60573)
  • Remove meta_provider parameter (#​60379)
  • Decouple Ray Train from Ray Data by removing top-level ray.data imports (#​60292)
  • Move extension types to ray.data (#​59420)
  • Skip upscaling validation warning for fixed-size actor pools (#​60569)
  • Make StatefulShuffleAggregation.finalize allow incremental streaming (#​59972)
  • Revisit OutputSplitter semantics to avoid unnecessary buffer accumulation (#​60237)
  • Update to PyArrow 23 (#​60739, #​59489)
  • Add BackpressurePolicy to streaming executor progress bar (#​59637)
  • Support Arrow-based transformations for preprocessors (#​59810)
  • StandardScaler preprocessor with Arrow format (#​59906)
  • OneHotEncoder with Arrow format (#​59890)

🔨 Fixes

  • Fuse MapBatches even if they modify the row count (#​60756)
  • Don't push limit past map_batches by default (#​60448)
  • Fix wrong type hint of other dataset in zip and union (#​60653)
  • Fix ActorPoolMapOperator to guarantee dispatch of all given inputs (#​60763)
  • Fix ArrowInvalid error when backfilling missing fields from map tasks (#​60643)
  • Fix attribute error in UnionOperator.clear_internal_output_queue (#​60538)
  • Fix DefaultClusterAutoscalerV2 raising KeyError: 'CPU' (#​60208)
  • Fix ReorderingBundleQueue handling of empty output sequences (#​60470)
  • Fix task completion time without backpressure grafana panel metric name (#​60481)
  • Fix Union operator blocking when preserve_order is set (#​59922)
  • Fix autoscaler requesting empty resources instead of previous allocation when not scaling up (#​60321)
  • Fix autoscaler not respecting user-configured resource limits (#​60283)
  • Fix DefaultAutoscalerV2 not scaling nodes from zero (#​59896)
  • Fix Iceberg warning message (#​60044)
  • Fix Parquet datasource path column support (#​60046)
  • Fix ProgressBar with use_ray_tqdm (#​59996)
  • Fix stale stats on refit for preprocessors (#​60031)
  • Fix StreamingRepartition hang with empty upstream results (#​59848)
  • Fix operator fusion bug to preserve UDF modifying row count (#​59513)
  • Fix AutoscalingCoordinator double-allocating resources for multiple datasets (#​59740)
  • Fix DownstreamCapacityBackpressurePolicy issues (#​59990)
  • Fix AutoscalingCoordinator crash when requesting 0 GPUs on CPU-only cluster (#​59514)
  • Fix TensorArray to Arrow tensor conversion (#​59449)
  • Fix resource allocator not respecting max resource requirement (#​59412)
  • Fix GPU autoscaling when max_actors is set (#​59632)
  • Fix checkpoint filter PyArrow zero-copy conversion error (#​59839)
  • Restore class aliases to fix deserialization of existing datasets (#​59828, #​59818)
  • Fix DataContext deserialization issue with StatsActor (#​59471)

📖 Documentation

  • Sort references in "Loading data and Saving data" pages (#​60084)
  • Fix inconsistent heading levels in "How to write tests" guide (#​60706)
  • Clarify resource_limits refers to logical resources (#​60109)
  • Update read_lance doc (#​59673)
  • Fix broken link in read_unity_catalog docstring (#​59745)
  • Fix bug in docs for enable_true_multi_threading (#​60515)
  • Add more education around transformations (#​59415)

Ray Serve

🎉 New Features

  • Queue-based autoscaling for TaskConsumer deployments (phase 1). Introduces a QueueMonitor actor that queries message brokers (Redis, RabbitMQ) for queue length, enabling TaskConsumer scaling based on pending tasks rather than HTTP load. (#​59430)
  • Default autoscaling parameters for custom policies. New apply_autoscaling_config decorator allows custom autoscaling policies to automatically benefit from Ray Serve's standard parameters (delays, scaling factors, bounds) without reimplementation. (#​58857)
  • label_selector and bundle_label_selector in Serve deployments. Deployments can now specify node label selectors for scheduling and bundle-level label selectors for placement groups, useful for targeting specific hardware (e.g., TPU topologies). (#​57694)
  • Deployment-level autoscaling observability. The controller now emits a structured JSON serve_autoscaling_snapshot log per autoscaling-enabled deployment each control-loop tick, with an event summarizer that reduces duplicate logs. (#​56225)
  • Batching with multiplexing support. Batching now guarantees each batch contains requests for the same multiplexed model, enabling correct multiplexed model serving with @serve.batch. (#​59334)
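
The batching guarantee in the last item can be pictured as grouping pending requests by model ID before cutting batches. The sketch below illustrates that idea in plain Python; it is not Ray Serve's implementation, and the `Request` shape and `batches_per_model` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Request = Tuple[str, str]  # (model_id, payload): hypothetical request shape


def batches_per_model(pending: Iterable[Request], max_batch_size: int) -> List[List[Request]]:
    """Split pending requests into batches that never mix model IDs."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    by_model: Dict[str, List[Request]] = defaultdict(list)
    for request in pending:
        by_model[request[0]].append(request)  # preserve arrival order per model

    batches: List[List[Request]] = []
    for requests in by_model.values():
        for start in range(0, len(requests), max_batch_size):
            batches.append(requests[start:start + max_batch_size])
    return batches


pending = [("model_a", "r1"), ("model_b", "r2"), ("model_a", "r3"), ("model_a", "r4")]
for batch in batches_per_model(pending, max_batch_size=2):
    print(batch)  # each printed batch contains a single model ID
```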

💫 Enhancements

  • Replica routing data structure optimizations. O(1) pending-request lookups, cached replica lists, lazy cleanup, optimized retry insertion, and metrics throttling yield significant routing performance improvements. (#​60139)
  • New operational metrics suite. Added long-poll metrics, replica lifecycle metrics, app/deployment status metrics, proxy health and request routing delay metrics, event loop utilization metrics, and controller health metrics — greatly improving monitoring and debugging capabilities. (#​59246, #​59235, #​59244, #​59238, #​59535, #​60473)
  • Autoscaling config validation. lookback_period_s must now be greater than metrics_interval_s, preventing silent misconfigurations. (#​59456)
  • Cross-version root_path support for uvicorn. root_path now works correctly across all uvicorn versions, including >=0.26.0 which changed how root_path is processed. (#​57555)
  • Preserve user-set gRPC status codes. When deployments raise exceptions after setting a gRPC status code on the context, that code is now correctly propagated to the client instead of being overwritten with INTERNAL. Error messages are truncated to 4 KB to respect HTTP/2 trailer limits. (#​60482)
  • Replica ThreadPoolExecutor capped to num_cpus. The user-code event loop's default ThreadPoolExecutor is now limited to the deployment's num_cpus, preventing oversubscription when using asyncio.to_thread. (#​60271)
  • Generic actor registration API for shutdown cleanup. Deployments can register auxiliary actors (e.g., PrefixTreeActor) with the controller for automatic cleanup on serve.shutdown(), eliminating cross-library import dependencies. (#​60067)
  • Deployment config logging in controller. Deployment configurations are now logged in the controller for easier debugging and auditability. (#​59222, #​59501)
  • Pydantic v1 deprecation warning. A FutureWarning is now emitted at ray.init() when Pydantic v1 is detected, as support will be removed in Ray 2.56. (#​59703)
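
The ThreadPoolExecutor item above relies on the fact that `asyncio.to_thread` submits work to the event loop's default executor. The following sketch shows that mechanism with the standard library only; it is not Ray Serve code. The `run_capped` helper is hypothetical, and rounding a fractional `num_cpus` up with a floor of one worker is an assumption.

```python
import asyncio
import math
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_capped(num_cpus: float, num_jobs: int = 8) -> int:
    """Run blocking jobs via asyncio.to_thread and return the peak thread concurrency."""
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def blocking_job() -> None:
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.05)  # simulate blocking work
        with lock:
            state["active"] -= 1

    async def main() -> None:
        # Assumption: round fractional CPU counts up, with a floor of one worker.
        max_workers = max(1, math.ceil(num_cpus))
        executor = ThreadPoolExecutor(max_workers=max_workers)
        # asyncio.to_thread uses the loop's default executor, so capping it here
        # bounds how many blocking jobs run at the same time.
        asyncio.get_running_loop().set_default_executor(executor)
        await asyncio.gather(*(asyncio.to_thread(blocking_job) for _ in range(num_jobs)))

    asyncio.run(main())
    return state["peak"]


print(run_capped(num_cpus=2))  # peak concurrency never exceeds 2
```

Without the cap, the default executor sizes itself from the machine's CPU count rather than the deployment's allocation, which is the oversubscription the release note refers to.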

🔨 Fixes

  • Fixed tracing signature mismatch across processes. Resolved TypeError: got an unexpected keyword argument _ray_trace_ctx when calling actors from a different process than the one that created them (e.g., serve start + dashboard interaction). (#​59634)
  • Fixed ingress deployment name collision. Ingress deployment name was incorrectly modified when a child deployment shared the same name, causing routing failures. (#​59577)
  • Fixed downstream deployment over-provisioning. Downstream deployments no longer over-provision replicas when receiving DeploymentResponse objects. (#​60747)
  • Fixed replicas hanging forever during draining. Replicas no longer hang indefinitely when requests are stuck during the draining phase. (#​60788)
  • Fixed TaskProcessorAdapter shutdown during rolling updates. Removed shutdown() from __del__, which was broadcasting a kill signal to all Celery workers instead of just the local one, breaking rolling updates. (#​59713)
  • Fixed Windows test failures. Resolved tracing file handle cleanup on Windows, skipped incompatible gRPC and tracing tests on Windows. (#​60078, #​60356, #​60393, #​59771)
  • Fixed flaky tests. Addressed gauge throttling race in test_router_queue_len_metric, ensured proxy replica queue cache is populated before GCS failure tests, and added metrics server readiness checks. (#​60333, #​60466, #​60468)
  • Fixed distilbert test segfault. Worked around a pyarrow/jemalloc crash triggered by specific import ordering of FastAPI, torch, and TensorFlow. (#​60478)

📖 Documentation

  • Improved autoscaling documentation. Clarified the relationship between delays, metric push intervals, and the autoscaling control loop. (#​59475)
  • New example: video analysis inference. End-to-end notebook demonstrating a Serve application for scene change detection, tagging, and video description. (#​59859)
  • New examples: model multiplexing and model composition. Published workload-based examples for forecasting with model multiplexing and recommendation systems with model composition. (#​59166)
  • Model registry integration guide. Added documentation for integrating Serve with model registries (e.g., MLflow). (#​59080)
  • Fixed broken documentation links. Resolved 404 errors for async inference, MLflow registry example, and LLM code examples. (#​59917, #​60071, #​59520, #​59521, #​60181)
  • Fixed monitoring docs. Corrected target replicas metric emission to enable time-series comparison with actual replicas. (#​59571)
  • Async inference template. Added an end-to-end template for building asynchronous inference applications with Ray Serve. (#​58393, #​59926)

🏗 Architecture refactoring

  • Environment variable cleanup (5-part series). Removed deprecated and redundant env vars (RAY_SERVE_DEFAULT_HTTP_HOST, RAY_SERVE_DEFAULT_HTTP_PORT, RAY_SERVE_DEFAULT_GRPC_PORT, RAY_SERVE_HTTP_KEEP_ALIVE_TIMEOUT_S, RAY_SERVE_REQUEST_PROCESSING_TIMEOUT_S, RAY_SERVE_ENABLE_JSON_LOGGING, RAY_SERVE_ALWAYS_RUN_PROXY_ON_HEAD_NODE), cleaned up legacy constant fallbacks, and added documentation for previously undocumented env vars (e.g., RAY_SERVE_CONTROLLER_MAX_CONCURRENCY, RAY_SERVE_ROOT_URL, proxy health check settings, and fault tolerance params). Users relying on removed env vars should migrate to the Serve config API (http_options, grpc_options, LoggingConfig). (#​59470, #​59619, #​59647, #​59963, #​60093)

Ray Train

🎉 New Features

  • Add TPU multi-slice support to JaxTrainer (#​58629)
  • Update async validation API (#​59428)
  • Add a CallbackManager and guardrail some callback hooks (#​60117)
  • Add inter-execution file shuffling for deterministic multi-epoch training (#​59528)
  • Resume validations on driver restoration (#​59270)

💫 Enhancements

  • Pass ray remote args to validation task (#​60203)
  • Deprecate Predictor API (#​60305)
  • Increase worker group start default timeout to 60s (#​60376)
  • Unify PlacementGroup and SlicePlacementGroup interface in WorkerGroup (#​60116)
  • Cleanup zombie RayTrainWorker actors (#​59872)
  • Add usage telemetry for checkpointing and validation (#​59490)
  • Validate that validation is called with a checkpoint (#​60548)
  • Replace pg.ready() with pg.wait() in worker group (#​60568)
  • Rename DatasetsSetupCallback to DatasetsCallback (#​59423)
  • Update "Checkpoint Report Time" metric title to "Cumulative Checkpoint Report Time" (#​58470)
  • Add training failed error back to failure policy log (#​59957)
  • Decouple Ray Train from Ray Data by removing top-level imports (#​60292)

🔨 Fixes

  • Add try-except for pg.wait() (#​60743)
  • TrainController reraises AsyncioActorExit (#​59461)

📖 Documentation

  • Add a JaxTrainer template (#​59842)
  • Update Jax doc to include GPU and multi-slice TPU support (#​60593)
  • Document checkpoint_upload_fn backend and cuda:nccl backend support (#​60541)
  • Rename checkpoint_upload_func to checkpoint_upload_fn in docs (#​60390)
  • Fix Ray Train workloads and PyTorch with ASHA templates (#​60537)
  • Publish Ray Train workload example (#​58936)

Ray Tune

🔨 Fixes

  • Avoid file deletion race by using unique tmp file names (#​60556)

Ray LLM

🎉 New Features

  • Add /tokenize and /detokenize endpoints (#​59787)
  • Add /collective_rpc endpoint for RLHF weight synchronization (#​59529)
  • Add Control Plane API for Sleep/Wakeup (#​59455)
  • Add Pause/Resume Control Plane API (#​59523)
  • Add support for classification and scoring models (#​59499)
  • Add pooling parameter (#​59534)
  • Support vLLM structured outputs with backward-compat for guided_decoding (#​59421)
  • Add CPU support to Ray Serve LLM (#​58334)
  • Add should_continue_on_error support for ServeDeploymentStage (#​59395)
  • Support configuring HttpRequestUDF resources (#​60313)

💫 Enhancements

  • Upgrade vLLM to 0.15.0 (#​60679)
  • Unify schema of success and failure rows (#​60572)
  • Prefer uniproc executor over mp executor when world_size==1 (#​60403)
  • Use compute instead of concurrency to specify ActorPool size (#​59645)
  • Remove DataContext overrides in Ray Data LLM Processor (#​60142)
  • Use numpy arrays for embeddings to avoid torch.Tensor serialization overhead (#​59919)
  • Make PrefixCacheAwareRouter imbalance threshold less surprising (#​59390)
  • Allow tokenized_prompt without prompt in vLLMEngineStage (#​59801)
  • Avoid passing enums through fn_constructor_kwargs (#​59806)
  • Refactor Control Plane endpoints into mixins (#​59502)
  • Remove CUDA_VISIBLE_DEVICES deletion workaround (#​60502)

🔨 Fixes

  • Fix nested dict to Namespace conversion in vLLM engine initialization (#​60380)
  • Fix JSON non-serializable ndarray exception in http_request_stage (#​60299)
  • Exit actor on EngineDeadError to enable recovery (#​60145)
  • Fix NIXL port conflict in prefill-decode disaggregation test (#​60057)

📖 Documentation

  • Batch inference docs reorg and update to reflect per-stage config refactor (#​59214)
  • Add resiliency section and refine doc code (#​60594)
  • Add video/audio examples for vLLMEngineProcessor (#​59446)
  • Add SGLang integration example (#​58366)
  • Remove inaccurate statement in docs (#​60425)

Ray RLlib

🎉 New Features

  • Add TQC (Truncated Quantile Critics) algorithm implementation (#​59808)
  • Add LR scheduling ability to BC and MARWIL (#​59067)
  • RLlib and Ray Tune: Hyperparameter Optimisation example (#​60182)

💫 Enhancements

  • 🔥 APPO improvements: learner pipeline performance improvements (#​59544)
  • Improve stateful model training on offline data (#​59345)
  • Create resource bundle per learner (#​59620)
  • Improve env runner sampling by replacing recursive solution with iterative solution (#​56082)
  • Improve IMPALA examples and premerge (#​59927)
  • Remove MLAgents dependency (#​59524)
  • Upgrade to gymnasium v1.2.2 (#​59530)
  • Decrease log quantity for learning tests (#​59005)
  • Update learner state warnings to the debug level (#​60178)
  • Don't log np.nanmean warnings in EMA stats (#​60408)

🔨 Fixes

  • Fix DQN RLModule forward methods to handle dict spaces (#​60451)
  • Fix LearnerGroup.load_module_state() and mark as deprecated (#​60354)
  • Fix static dimension issue in ONNX export of Torch attention models (#​60102)
  • Fix Multi-Agent Episode concatenation for sequential environments (#​59895)
  • Fix module episode returns metrics accumulation for shared module IDs (#​60234)
  • Fix rollout fragment length calculation in AlgorithmConfig (#​59438)
  • Fix checkpointable issues with cloud storages (#​60440)
  • Update flatten_observations.py for nested spaces for ignored multi-agent (#​59928)

Ray Core

🎉 New Features

  • Resource Isolation: unify config construction, add public docs, and expose cgroup_path in ray.init() (#​59372, #​60183, #​60726)
  • Support tensor-level deduplication for NIXL (#​60509)
  • Add CUDA IPC transport for RDT (#​59838)
  • Register custom transport at runtime for RDT (#​59255)
  • Support TPU v7x accelerator type for device discovery (#​60338)
  • Introduce local port service discovery (#​59613)
  • Cancel sync actor by checking is_canceled() (#​58914)
  • Support labels for ray job submit --entrypoint-resource (#​59735)
  • Add --ip option in ray attach (#​59931)
  • Add bearer token support for remote URI downloads (#​60050)
  • Support HTTP redirection download (#​59384)
  • Add ray kill-actor --name/--namespace for force/graceful shutdown (#​60258)

💫 Enhancements

  • Bound object spilling file size to avoid disk increase pressure (#​60098)
  • Replace SHA-1 with SHA-256 for internal hash operations (#​60242)
  • Use whitelist approach to block mutation requests from browser (#​60526)
  • Pass authentication headers to WebSocket connections in tail_job_logs (#​60346)
  • Add auth to Dashboard HTTP agent and client (#​59891)
  • Use dedicated service account path for Ray auth tokens (#​60409)
  • Update Kubernetes token auth verb to ray:write (#​60411)
  • Replace RAY_AUTH_MODE=k8s with separate config for Kubernetes token auth (#​59621)
  • Optimize token auth: use shared_ptr caching and avoid per-RPC construction (#​59500)
  • Optimize OpenTelemetry metric recording calls (#​59337)
  • Throttle infeasible resource warning (#​59790)
  • Add default excludes for working_dir uploads (#​59566)
  • Tell users why objects cannot be reconstructed (#​59625)
  • Extend instance allocation timeout in autoscaler v2 (#​60392)
  • Remove GCS centralized scheduling (#​59979, #​60121, #​60188)
  • Demote stale sync message drop log to DEBUG in RaySyncer (#​59616)
  • Migrate remaining std::unordered_map to absl::flat_hash_map (#​59921)
  • Add missing fields to NodeDefinitionEvent proto (#​60314)
  • Add actor and task event missing fields (#​60287)
  • Add node id to the base event (#​59242)
  • Add repr_name to actor_lifecycle_event (#​59925)
  • Support ALL in exposable event config (#​59878)
  • Support publishing events from aggregator to GCS (#​55781)
  • Update the attempt number of actor creation task when actor restarts (#​58877)
  • Unify node feasibility and availability checking for GPU fractions (#​59278)
  • Update TPU utils for multi-slice compatibility (#​59136)
  • Improve SubprocessModuleHandle.destroy_module() resource cleanup (#​60172)
  • Support viewing PIDs for Dashboard and Runtime Env Agent (#​58701)
  • Optimize autoscaler monitor by moving resource demand parsing outside loop (#​59190)
  • Avoid GCS query for is_head in dashboard agent startup (#​59378)
  • Skip reporter and event aggregator client creation in minimal mode (#​59846)
  • Support out-of-order actors by extracting metadata when creating (RDT) (#​59610)
  • Synchronize CUDA stream before registering for NIXL (#​60072)
  • Atomically send/recv for two-sided ordering (RDT) (#​60202)
  • Add get_session_name() to RuntimeContext (#​59469)
  • Make MAX_APPLICATION_ERROR_LEN configurable via env var (#​59543)
  • Preserve function signatures through Ray decorators (#​60479)

🔨 Fixes

  • Fix idle_time_ms resetting for nodes not running tasks (#​60581)
  • Fix task event loss during shutdown (#​60247)
  • Filter bad subscriber messages from taking down GCS publisher (#​60252)
  • Fix RAY_EXPERIMENTAL_NOSET_* environment variable parsing in accelerator managers (#​60577)
  • Fix ray start --no-redirect-output crash (#​60394)
  • Fix drain state propagation race condition (#​59536)
  • Fix use-after-free race condition in OpenTelemetry gauge metric callback during shutdown (#​60048)
  • Fix PSUTIL_PROCESS_ATTRS returning empty list on Windows (#​60173)
  • Fix deadlock in garbage collection when holding lock (#​60014)
  • Fix incorrect error handling in autoscaler for available_node_types on on-prem clusters (#​60184)
  • Fix invalid status transitions in autoscaler v2 (#​60412, #​59550)
  • Fix GCS crash from race condition in MetricsAgentClient exporter initialization (#​59611)
  • Fix tracing signature mismatch when calling actors from different processes (#​59634)
  • Fix crash when killing actor handle from previous session (#​59425)
  • Fix multiple deployment same name resolve (#​59577)
  • Handle dual task errors with read-only args (#​59507)
  • Handle exceptions raised by internal_ip() within StandardAutoscaler (#​57279)
  • Fix uv_runtime_env_hook.py to pin worker Python version (#​59768)
  • Fix STRICT_PACK placement groups ignoring bundle label selectors (#​60170)
  • Fix logging bug when log value is an empty string (#​59434)
  • Fix aggregator-to-GCS event conversion (#​59783)
  • Raise error on tail log job error in newer Ray versions (#​59506)
  • Fix num retries left message (#​59829)
  • Fix psutil internal API usage in dashboard disk usage reporting (#​59659)
  • Fix event exporter init ray check (#​60073)
  • Prevent use-after-free error in core worker shutdown (#​58435)
  • Fix task name inconsistency in RUNNING vs FINISHED metrics (#​59893)
  • Fix symmetric_run using wrong condition to check GCS readiness (#​59794)
  • Preserve Pydantic details when serialization fails (#​59401)
  • Retry GCP project metadata updates on HTTP 412 errors (#​60429)
  • Fix v1 autoscaler TypeError when using bundle_label_selectors (#​59850)
  • Shorten SHA-256 hex with base32 to comply with GCP label limits (#​60722)
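
The last item follows from simple arithmetic: a SHA-256 digest is 64 characters in hex, while GCP labels allow at most 63 characters, and base32 encodes the same 32 bytes in 52 characters. The sketch below shows that encoding with the standard library; `short_label_hash` is a hypothetical helper, and how Ray applies the result is not stated in the notes.

```python
import base64
import hashlib

GCP_LABEL_MAX_LEN = 63  # GCP label keys and values are limited to 63 characters


def short_label_hash(text: str) -> str:
    """Return a lowercase, unpadded base32 encoding of the SHA-256 digest."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()  # 32 raw bytes
    encoded = base64.b32encode(digest).decode("ascii")       # 56 chars including '=' padding
    return encoded.rstrip("=").lower()                       # 52 chars drawn from a-z and 2-7


hex_form = hashlib.sha256(b"example-launch-config").hexdigest()
label = short_label_hash("example-launch-config")
print(len(hex_form), len(label))  # 64 52
```

Lowercasing matters because GCP label values only accept lowercase letters, digits, underscores, and dashes.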

📖 Documentation

  • Add initial user guide for Ray resource isolation with writable cgroups (#​59051)
  • Add token authentication internals documentation (#​59299)
  • Update metric exporter docs (#​59874)
  • Add internal documentation for Port Service Discovery (#​59844)
  • Update misleading Ray job diagram (#​59940)
  • Add debugging logs related to pinned argument size limit (#​60175)
  • Add slow startup tip to podman troubleshooting docs (#​59942)
  • Clarify ray.shutdown() behavior for local vs remote clusters (#​59845)
  • Improve placement group fault tolerance doc (#​59830)
  • Add head-node memory growth and OOM guidance (#​58695)
  • Add documentation for RAY_RUNTIME_ENV_BEARER_TOKEN env var (#​60136)

Dashboard

💫 Enhancements

  • Support more panels in dashboard (#​60018)
  • Add autoscaler metrics to Data Dashboard (#​60472)
  • Support viewing PIDs for Dashboard and Runtime Env Agent (#​58701)

🔨 Fixes

  • Update total for dark mode color (#​60106)

Ray Wheels and Images

Documentation

  • Add committership documentation (#​60069)
  • Update contribution guide with common labels (#​59473)
  • Add KubeRay & Volcano integration docs update (#​59636)
  • Add RayJob InTreeAutoscaling with Kueue docs after Kueue 0.16.0 release (#​59648)
  • Refactor LLM batch inference template (#​59897)
  • Add async inference template (#​58393)
  • Add RunLLM chat widget for Ray docs (#​59126)
  • Fix various typos and broken links (#​60249, #​59901, #​60181)
  • Replace Ray Tune + Train example with vanilla Ray Tune in homepage (#​60229)
  • Add Ray technical charter (#​60068)

Thanks

Thank you to everyone who contributed to this release!
@KaisennHu, @MiXaiLL76, @slfan1989, @krisselberg, @JasonLi1909, @Priya-753, @pseudo-rnd-thoughts, @zzchun, @ZacAttack, @pushpavanthar, @jjyao, @ryanaoleary, @pcmoritz, @akshay-anyscale, @HassamSheikh, @yurekami, @Hyunoh-Yeo, @ruoliu2, @nrghosh, @wxwmd, @myandpr, @J-Meyers, @trilamsr, @kouroshHakha, @limarkdcunha, @manhld0206, @jreiml, @preneond, @yuchen-ecnu, @Yicheng-Lu-llll, @AchimGaedkeLynker, @vaishdho1, @israbbani, @OneSizeFitsQuorum, @Sathyanarayanaa-T, @nadongjun, @xinyuangui2, @Rob12312368, @as-jding, @lee1258561, @popojk, @coqian, @rajeshg007, @jeffreywang-anyscale, @kamil-kaczmarek, @alexeykudinkin, @Aydin-ab, @mgchoi239, @dragongu, @edoakes, @smortime, @tk42, @abrarsheikh, @jakubzimny, @Future-Outlier, @axreldable, @owenowenisme, @g199209, @cem-anyscale, @dayshah, @akelloway, @daiping8, @dlwh, @robertnishihara, @400Ping, @matthewdeng, @antoine-galataud, @cristianjd, @Partth101, @goutamvenkat-anyscale, @codope, @seanlaii, @andrew-anyscale, @andrewsykim, @liulehui, @simonsays1980, @Sparks0219, @yifanmai, @landscapepainter, @win5923, @kangwangamd, @srinarayan-srikanthan, @KeeProMise, @srinathk10, @my-vegetable-has-exploded, @MengjinYan, @yancanmao, @yuhuan130, @ArturNiederfahrenhorst, @akyang-anyscale, @rushikeshadhav, @kongjy, @harshit-anyscale, @justinvyu, @dancingactor, @Vito-Yang, @cr7258, @marwan116, …


Configuration

📅 Schedule: Branch creation - "" (UTC), Automerge - At any time (no schedule defined).

🚦 Automerge: Enabled.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about these updates again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

renovate bot added the changelog/chore (A trivial change) label on Feb 20, 2026
renovate bot enabled auto-merge (squash) on February 20, 2026 21:40
renovate bot changed the title from "chore(deps): update dependency ray to v2.54.0 [security]" to "Update dependency ray to v2.54.0 [SECURITY]" on Feb 26, 2026
renovate bot changed the title from "Update dependency ray to v2.54.0 [SECURITY]" to "chore(deps): update dependency ray to v2.54.0 [security]" on Feb 27, 2026
Signed-off-by: Robert Kruszewski <github@robertk.io>
renovate bot (Contributor, Author) commented Mar 2, 2026

Edited/Blocked Notification

Renovate will not automatically rebase this PR, because it does not recognize the last commit author and assumes somebody else may have edited the PR.

You can manually request rebase by checking the rebase/retry box above.

⚠️ Warning: custom changes will be lost.

Signed-off-by: Robert Kruszewski <github@robertk.io>
renovate bot merged commit e98d6b4 into develop on Mar 2, 2026
49 checks passed
renovate bot deleted the renovate/pypi-ray-vulnerability branch on March 2, 2026 12:05
Labels

changelog/chore A trivial change
