Bound HTTP client timeouts and reclaim pooled connections #317

Open
namedgraph wants to merge 1 commit into master from fix-client-conn-timeouts
@namedgraph
Member

Fixes connection-pool exhaustion. The pooled HTTP clients had no socket/read timeout (the Apache default SO_TIMEOUT of 0 means an infinite wait), so a stalled read on the end-user backend route held its connection indefinitely. The worker thread blocked inside the read and never reached Response.close(), so the leased connection was never returned to the pool. With connectionRequestTimeout also unset, new requests blocked forever waiting for a lease instead of failing fast, which wedged the listener.
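
For reviewers, the failure mode can be modelled with a toy pool. This is only a sketch using a `Semaphore`, not the actual Apache connection manager: `LeasePoolSketch` and its methods are hypothetical names, and each permit stands in for one pooled connection. It shows why a bounded lease wait fails fast while a connection that is never released starves later requests.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of a connection pool: each permit stands for one pooled connection. */
class LeasePoolSketch {
    private final Semaphore permits;

    LeasePoolSketch(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    /** Tries to lease a connection, giving up after the timeout; returns true if leased. */
    boolean lease(long timeoutMillis) {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    /** Returns a leased connection to the pool (the role Response.close() plays). */
    void release() {
        permits.release();
    }

    public static void main(String[] args) {
        LeasePoolSketch pool = new LeasePoolSketch(1);
        boolean first = pool.lease(50);  // held by a request whose read stalls
        boolean second = pool.lease(50); // bounded wait: gives up instead of blocking forever
        System.out.println(first + " " + second); // prints "true false"
        pool.release();                  // the stalled request finally times out and closes
        System.out.println(pool.lease(50)); // prints "true"
    }
}
```

Without the timeout on the second lease, that call would block until the first connection is released, which in the stalled-read case never happens.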

Adds socket timeout, connect timeout, connection TTL and validate-after-inactivity settings to the pooled HTTP clients; each one is applied only when configured. Values are supplied via env vars through the existing CATALINA_OPTS -> system property -> JAX-RS app constructor convention (the same path connectionRequestTimeout already uses), with image defaults in the Dockerfile, so no values are hardcoded in Java:

CLIENT_SOCKET_TIMEOUT -> ...linkeddatahub.socketTimeout
CLIENT_CONNECT_TIMEOUT -> ...linkeddatahub.connectTimeout
CLIENT_CONNECTION_TIME_TO_LIVE -> ...linkeddatahub.connectionTimeToLive
CLIENT_VALIDATE_AFTER_INACTIVITY -> ...linkeddatahub.validateAfterInactivity
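
The "applied only when configured" rule could look roughly like the sketch below. It is illustrative only: `OptionalTimeoutSketch`, `readTimeout` and the `example.*` property keys are hypothetical stand-ins (the real keys are the elided `...linkeddatahub.*` names above), and the actual wiring into the client builder is not shown.

```java
import java.util.Optional;
import java.util.Properties;

/** Reads optional timeout settings so that unset values leave client defaults untouched. */
class OptionalTimeoutSketch {

    /** Returns the property parsed as an int, or empty when it is unset or blank. */
    static Optional<Integer> readTimeout(Properties props, String key) {
        String raw = props.getProperty(key);
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(Integer.parseInt(raw.trim()));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("example.socketTimeout", "5000"); // hypothetical key, value in ms

        // Only the configured setting is applied; the unset one is skipped entirely.
        readTimeout(props, "example.socketTimeout")
            .ifPresent(ms -> System.out.println("socket timeout: " + ms + " ms"));
        readTimeout(props, "example.connectTimeout")
            .ifPresent(ms -> System.out.println("connect timeout: " + ms + " ms"));
    }
}
```

In the real code the `Properties` argument would be the JVM system properties populated from CATALINA_OPTS; a non-numeric value fails loudly with a `NumberFormatException` in this sketch rather than being silently ignored.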

SignUp: close the PublicKey/Agent/Authorization client Responses on all paths (defense-in-depth; these leaked on the admin signup path).
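
The "close on all paths" pattern can be sketched with try-with-resources. `FakeResponse` below is a hypothetical stand-in for a client response that holds a pooled connection until it is closed; it is not the SignUp code itself, only an illustration that the resource is released on both the normal and the exceptional path.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Shows that try-with-resources releases a response on every exit path. */
class CloseOnAllPathsSketch {

    /** Hypothetical response that counts as "open" until close() is called. */
    static final class FakeResponse implements AutoCloseable {
        private final AtomicInteger openCount;

        FakeResponse(AtomicInteger openCount) {
            this.openCount = openCount;
            openCount.incrementAndGet();
        }

        @Override
        public void close() {
            openCount.decrementAndGet();
        }
    }

    /** Handles one response; it is closed whether or not processing fails. */
    static void handle(AtomicInteger openCount, boolean fail) {
        try (FakeResponse response = new FakeResponse(openCount)) {
            if (fail) {
                throw new IllegalStateException("simulated failure while reading the entity");
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger open = new AtomicInteger();
        handle(open, false);
        try {
            handle(open, true);
        } catch (IllegalStateException expected) {
            // the failure path still closed the response
        }
        System.out.println("open responses: " + open.get()); // prints "open responses: 0"
    }
}
```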

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>