Connectedness

Introduction Connected Sets Intermediate Value Theorem Components Path-Connectedness

Introduction

A metric space is connected if it cannot be split into two disjoint non-empty open sets. Intuitively, a connected space is "all in one piece"—you cannot separate it into isolated regions.

Why does this matter? In optimization, connectedness rules out the presence of isolated "islands" in the feasible region — it ensures that the space is topologically one piece rather than two or more separated components. The stronger property of path-connectedness, introduced later, goes further: it guarantees that any two feasible points can be joined by a continuous curve inside the feasible set, so that an optimization algorithm can in principle move from any starting point to any other. For open subsets of \(\mathbb{R}^n\), the two notions coincide, and in most practical machine-learning settings they can be used interchangeably.

Connectedness is also foundational for more advanced structures like manifolds, which we will explore in later sections. In those contexts, we often study a space by analyzing its individual connected parts. Furthermore, we will later introduce path-connectedness - a stronger property that ensures any two points can be joined by a continuous curve, which is essential for defining how we measure distances along a surface.

Connected Sets

Formally, the notion of a space being "in one piece" is captured by the absence of a separation. A separation of \(X\) is a pair of disjoint non-empty open sets whose union is \(X\). If no such separation exists, the space is rigid enough to support global continuity properties, such as the Intermediate Value Theorem.

Definition: Connected Metric Space

A metric space \(X\) is called a connected metric space if and only if \(X\) cannot be expressed as the union of two disjoint non-empty open subsets of itself.

While the definition above provides the core intuition, connectedness is often easier to analyze through its equivalent topological properties. The following theorem provides several tests for determining if a space is truly in one piece.

Theorem: Criteria for Connectedness

Suppose \(X\) is a metric space. The following statements are equivalent, and \(X\) is connected if and only if it satisfies any one of the following statements.

  1. Boundary Criterion
    Every proper non-empty subset of \(X\) has non-empty boundary in \(X\).
  2. Open-Closed Criterion
    No proper non-empty subset of \(X\) is both open and closed in \(X\).
  3. Open Union Criterion
    \(X\) is not the union of two disjoint non-empty open subsets of itself.
  4. Closed Union Criterion
    \(X\) is not the union of two disjoint non-empty closed subsets of itself.
  5. Continuity Criterion
    Either \(X = \emptyset\) or the only continuous functions from \(X\) to the discrete space \(\{0, 1\}\) are the two constant functions.
Proof Sketch:

All five criteria say the same thing in different dialects. A separation of \(X\) is a partition \(X = U \cup V\) into two disjoint non-empty open sets — exactly what (3) forbids. In such a partition, each piece is also closed, being the complement of the other open piece; so (3) and (4) are the same statement with the roles of open and closed interchanged. A subset that is both open and closed (clopen), proper and non-empty, is precisely one half of a separation, giving (2). And for the boundary \(\partial U\): the conditions "\(U\) is open" and "\(U\) is closed" together force \(\partial U = \emptyset\), and conversely \(\partial U = \emptyset\) makes \(U\) clopen; so having a proper non-empty subset with empty boundary is the same as having a clopen proper non-empty subset — giving (1).

The continuity criterion (5) is the same idea through a different lens. A continuous function \(f: X \to \{0, 1\}\) into the two-point discrete space (where every subset is clopen, since the metric \(d(0,1) = 1\) makes each singleton open) is determined by the clopen preimages \(f^{-1}(\{0\})\) and \(f^{-1}(\{1\})\), which partition \(X\). Constant functions correspond to the trivial case where one preimage is empty; non-constant continuous functions correspond to genuine separations. So the absence of non-constant continuous \(\{0,1\}\)-valued functions is the absence of clopen partitions, which is (2).

Connectedness is not just a static property of a space; it is also remarkably stable under transformations. Specifically, continuous functions cannot "break" a connected space into pieces.

Theorem: Continuity Preserves Connectedness

Every continuous image of a connected metric space is connected. That is, if \(X\) is connected and \(f: X \to Y\) is continuous, then \(f(X) \subseteq Y\) is connected.

Proof:

We use the Continuity Criterion (item 5 of the previous theorem). Let \(g: f(X) \to \{0,1\}\) be any continuous function into the discrete two-point space. Then the composition \(g \circ f : X \to \{0,1\}\) is continuous as a composition of continuous maps. Since \(X\) is connected, the Continuity Criterion forces \(g \circ f\) to be constant on \(X\), say \((g \circ f)(x) = c\) for all \(x \in X\). Every \(y \in f(X)\) has the form \(y = f(x)\) for some \(x \in X\), so \(g(y) = g(f(x)) = c\); thus \(g\) is constant on \(f(X)\). Applying the Continuity Criterion in the reverse direction, \(f(X)\) is connected. \(\square\)

When we apply these general topological principles to the familiar setting of the real line, we arrive at a definitive characterization of connectivity in one dimension.

Corollary: Connected Subsets of \(\mathbb{R}\)

Suppose \(S\) is a non-empty subset of \(\mathbb{R}\). Then \(S\) is connected if and only if \(S\) is an interval.

Proof:

(\(\Leftarrow\)) Suppose \(S\) is an interval, and for contradiction suppose \(S\) admits a separation \(S = U \cup V\) with \(U, V\) disjoint non-empty open subsets of \(S\). Pick \(a \in U\) and \(b \in V\); by swapping \(U\) and \(V\) if necessary, assume \(a < b\). Because \(S\) is an interval containing both \(a\) and \(b\), the entire closed interval \([a, b]\) lies in \(S\).

The idea is to walk from \(a\) toward \(b\) staying inside \(U\) and ask how far we can go. Formally, set \[ c \;=\; \sup\{\, t \in [a,b] : [a,t] \subseteq U \,\}. \] The set on the right is non-empty (it contains \(t = a\), since \([a,a] = \{a\} \subseteq U\)) and bounded above by \(b\), so the supremum exists and \(a \le c \le b\). Since \([a,b] \subseteq S\), we have \(c \in S\), and since \(S = U \cup V\) is a disjoint union, either \(c \in U\) or \(c \in V\). Each case leads to a contradiction.

We will use the fact that a subset \(W \subseteq S\) is open in \(S\) precisely when \(W = S \cap \tilde{W}\) for some open \(\tilde{W} \subseteq \mathbb{R}\), so that for any \(x \in W\) there exists \(\delta > 0\) with \((x - \delta, x + \delta) \cap S \subseteq W\).

Case \(c \in U\). Openness of \(U\) in \(S\) gives \(\delta > 0\) with \((c - \delta, c + \delta) \cap S \subseteq U\). By the definition of \(c\) as a supremum, there exists \(t_0 \in (c - \delta, c]\) with \([a, t_0] \subseteq U\). Combining this with \([t_0, c] \subseteq (c - \delta, c + \delta) \cap S \subseteq U\) gives \[ [a, c] \;=\; [a, t_0] \cup [t_0, c] \;\subseteq\; U. \] If \(c = b\), then \(b \in U\); but \(b \in V\) by choice, contradicting \(U \cap V = \emptyset\). If \(c < b\), shrink \(\delta\) if needed so that \(c + \delta \le b\); then \([c, c + \delta/2] \subseteq (c - \delta, c + \delta) \cap S \subseteq U\), and together with \([a, c] \subseteq U\) we obtain \([a, c + \delta/2] \subseteq U\). This makes \(c + \delta/2\) an element of \(\{t \in [a,b] : [a,t] \subseteq U\}\) strictly larger than \(c\), contradicting the fact that \(c\) is the supremum.

Case \(c \in V\). Openness of \(V\) in \(S\) gives \(\delta > 0\) with \((c - \delta, c + \delta) \cap S \subseteq V\). Note \(a \ne c\) (since \(a \in U\), \(c \in V\), and \(U \cap V = \emptyset\)), so \(a < c\). By the definition of \(c\) as a supremum, there exists \(t_0 \in (c - \delta, c]\) with \([a, t_0] \subseteq U\); in particular \(t_0 \in U \subseteq S\). Combining \(t_0 \in (c - \delta, c]\) with \(t_0 \in S\) gives \(t_0 \in (c - \delta, c + \delta) \cap S \subseteq V\), so \(t_0 \in U \cap V = \emptyset\), a contradiction.

Both cases are impossible, so no such separation exists and \(S\) is connected.

(\(\Rightarrow\)) Conversely, suppose \(S\) is connected, and for contradiction suppose \(S\) is not an interval. Then there exist \(a, b \in S\) and \(c \in \mathbb{R} \setminus S\) with \(a < c < b\). The sets \(U = S \cap (-\infty, c)\) and \(V = S \cap (c, \infty)\) are open in \(S\) (intersections of \(S\) with open subsets of \(\mathbb{R}\)), disjoint, and non-empty (\(a \in U\), \(b \in V\)). Their union is \(S \setminus \{c\} = S\) since \(c \notin S\). This is a separation, contradicting connectedness of \(S\). \(\square\)

Insight: Continuity and Interval Preservation

The fact that connected subsets of \(\mathbb{R}\) are exactly intervals is the fundamental reason why the Intermediate Value Theorem holds. Since continuity preserves connectedness, the continuous image of an interval must also be an interval. This ensures that a continuous function cannot "skip" values, providing the theoretical foundation for root-finding methods such as the bisection algorithm: whenever a continuous function changes sign on an interval, a zero is guaranteed to exist inside.

Intermediate Value Theorem

The Intermediate Value Theorem (IVT), a staple of calculus, can be viewed as a natural result of topology. If a space is connected, a continuous function cannot skip any values as it moves across that space.

Theorem: Intermediate Value Theorem

Suppose \(X\) is a connected metric space and \(f: X \to \mathbb{R}\) is continuous. Suppose \(\alpha \in (\inf f(X), \sup f(X))\). Then there exists \(z \in X\) such that \(f(z) = \alpha\).

Proof:

By the theorem that continuity preserves connectedness, \(f(X) \subseteq \mathbb{R}\) is connected. By the corollary classifying connected subsets of \(\mathbb{R}\) as intervals, \(f(X)\) is an interval. Since \(\alpha \in (\inf f(X), \sup f(X))\), the defining properties of infimum and supremum give points \(a, b \in f(X)\) with \(a < \alpha < b\); the interval \(f(X)\) then contains \(\alpha\), so there exists \(z \in X\) with \(f(z) = \alpha\). \(\square\)

This proof shows that the IVT is not just a special rule for real numbers; it is a fundamental property of how continuity and connectivity work together. This provides the logical guarantee for algorithms like the bisection method, where we must be certain that a zero exists if the function changes sign.

Insight: Topological Foundation of the IVT

The Intermediate Value Theorem is a direct consequence of the fact that continuity preserves connectedness. In optimization, this ensures that the image of any continuous objective function over a connected feasible region is itself an interval: no intermediate value can be "missed." This is the logical basis for root-finding algorithms like the bisection method, where the existence of a zero is guaranteed within any interval where the function changes sign.

Connected Components

When a space is not connected, we can decompose it into its maximal connected subsets. These "islands" of connectivity allow us to analyze the global structure of a space by studying its individual parts.

Definition: Connected Component

Suppose \(X\) is a metric space. A subset \(U\) of \(X\) is called a connected component of \(X\) if and only if \(U\) is connected and there is no proper superset of \(U\) in \(X\) that is connected.

While the definition identifies a single component, the global structure of a space is understood by how these 'islands' of connectivity are partitioned.

Theorem: Components Partition a Metric Space

Suppose \(X\) is a metric space. Then the connected components of \(X\) are mutually disjoint and all closed in \(X\). Moreover, \(X\) is the union of its connected components.

Proof Sketch:

The argument rests on one key fact, which we record for reuse:

Union Lemma. If \(\{C_i\}_{i \in I}\) is a family of connected subsets of \(X\) sharing a common point \(p\), then \(\bigcup_{i} C_i\) is connected.

Why it holds: suppose for contradiction that the union admits a separation \(\bigcup_i C_i = U \sqcup V\) into disjoint non-empty open sets. The common point \(p\) lies in exactly one piece, say \(p \in U\). Since \(V\) is non-empty, pick \(q \in V\); then \(q \in C_{i_0}\) for some index \(i_0\), so \(C_{i_0} \cap V\) is non-empty, and \(C_{i_0} \cap U\) contains \(p\) and is also non-empty. The pair \((C_{i_0} \cap U,\, C_{i_0} \cap V)\) is therefore a separation of \(C_{i_0}\), contradicting the connectedness of \(C_{i_0}\).

Cover. For each \(x \in X\), let \(C(x)\) denote the union of all connected subsets of \(X\) containing \(x\). This family is non-empty since \(\{x\}\) is trivially connected, and all its members share the common point \(x\), so the Union Lemma gives \(C(x)\) connected. By construction, every connected set containing \(x\) is a subset of \(C(x)\); hence \(C(x)\) admits no strictly larger connected superset, making it a connected component. Every point lies in its own component, so \(X = \bigcup_{x \in X} C(x)\).

Disjointness. Suppose two components \(C(x)\) and \(C(y)\) intersect at a point \(p\). Both contain \(p\), so the Union Lemma applies to \(\{C(x), C(y)\}\): the union \(C(x) \cup C(y)\) is connected and contains \(C(x)\). Maximality of \(C(x)\) as a component forbids a strictly larger connected superset, so \(C(x) \cup C(y) = C(x)\), i.e. \(C(y) \subseteq C(x)\). The symmetric argument gives \(C(x) \subseteq C(y)\), hence \(C(x) = C(y)\).

Closedness. We claim the closure of any connected set is connected. Suppose for contradiction that \(\overline{C} = U \sqcup V\) is a separation into non-empty disjoint sets open in \(\overline{C}\). Then \(C \cap U\) and \(C \cap V\) are disjoint and open in \(C\). Moreover, both are non-empty: given \(q \in U\), openness of \(U\) in \(\overline{C}\) yields \(\varepsilon > 0\) with \(B(q, \varepsilon) \cap \overline{C} \subseteq U\); since \(q \in \overline{C}\) means \(\text{dist}(q, C) = 0\), the ball \(B(q, \varepsilon)\) contains a point of \(C\), which then lies in \(C \cap U\). Symmetrically for \(V\). Their union is \(C\), giving a separation of \(C\) and contradicting connectedness. Applying this to a component \(C\): \(\overline{C}\) is connected and contains \(C\), so maximality forces \(\overline{C} = C\), making \(C\) closed. \(\square\)

This decomposition into components is a powerful tool for data analysis. It allows us to transition from abstract topology to practical algorithms that find structure in noise.

CS Insight: Clustering as Component Identification

In machine learning, clustering can be formally viewed as identifying the connected components of an underlying data manifold. Algorithms like DBSCAN define connectivity via density-reachable points, allowing for the discovery of clusters with arbitrary non-convex shapes—a feat that simple centroid-based approaches (like K-means) cannot achieve.

While the original DBSCAN is a foundational connectivity-based algorithm, its modern evolution, HDBSCAN, is widely used in contemporary data science. By analyzing the hierarchical structure of connected components across varying density levels, HDBSCAN eliminates the need for a fixed distance parameter, making it robust for datasets with non-uniform densities. This principle of "connectivity across scales" is a cornerstone of modern topological data analysis.

Furthermore, this topological concept is bridged to linear algebra through Spectral Clustering. By representing data as a graph, the number of connected components corresponds exactly to the multiplicity of the eigenvalue 0 in the Graph Laplacian matrix. This highlights how the topological structure of a space directly dictates the capability of the algorithms operating within it.

References

  • Ester, M., Kriegel, H. P., Sander, J., & Xu, X. (1996). "A density-based algorithm for discovering clusters in large spatial databases with noise." In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96). [PDF via AAAI]
  • Campello, R. J., Moulavi, D., & Sander, J. (2013). "Density-based clustering based on hierarchical density estimates." In Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, Berlin, Heidelberg. [Link via Springer]

Path-Connectedness

Path-connectedness is a more intuitive and often more useful property in optimization: it asks whether we can "walk" from one point to another along a continuous curve.

Definition: Path in a Metric Space

Suppose \(X\) is a metric space. Every continuous function \(f: [0, 1] \to X\) from the closed interval \([0, 1]\) into \(X\) is called a path in \(X\) from the point \(f(0)\) to the point \(f(1)\); the points \(f(0)\) and \(f(1)\) of \(X\) are called the endpoints of the path.

With the formal definition of a single path in place, we can now define a space where every point is continuously reachable from every other point.

Definition: Pathwise Connected

Suppose \(X\) is a metric space. Then \(X\) is said to be pathwise connected if and only if for each \(a, b \in X\), there is a path in \(X\) with endpoints \(a\) and \(b\).

Theorem: Path-Connected Implies Connected

Every pathwise connected metric space is connected.

Proof:

Suppose \(X\) is a pathwise connected metric space. We assume for the sake of contradiction that \(X\) is disconnected.

According to the Open Union Criterion, there exists a separation \(X = U \cup V\), where \(U\) and \(V\) are disjoint, non-empty open sets. Since \(U\) and \(V\) are non-empty, we can pick arbitrary points \(a \in U\) and \(b \in V\). Because \(X\) is pathwise connected, there exists a continuous path \(f: [0, 1] \to X\) such that \(f(0) = a\) and \(f(1) = b\).

The closed interval \([0,1]\) is connected (connected subsets of \(\mathbb{R}\) are intervals), and \(f\) is continuous, so by the theorem that continuity preserves connectedness, the image \(K = f([0,1])\) is connected. Therefore \(K\) admits no separation. In particular, it cannot be written as \[ K = (K \cap U) \cup (K \cap V) \] with \(K \cap U\) and \(K \cap V\) disjoint non-empty open subsets of \(K\).

However, since \(a \in K \cap U\) and \(b \in K \cap V\), both intersections are non-empty. Moreover \(K \cap U\) and \(K \cap V\) are open in \(K\) (intersections of \(K\) with open subsets of \(X\)) and disjoint. Their union is \(K\) since \(K \subseteq X = U \cup V\). This gives a separation of \(K\), contradicting the connectedness of \(K\) established above. Hence no separation of \(X\) can exist, and \(X\) must be connected. \(\square\)

While the converse is not true in general — the classical counterexample is the topologist's sine curve, discussed below — it does hold for open subsets of \(\mathbb{R}^n\). In the context of most optimization problems and machine learning models, these two concepts can be treated as effectively equivalent, ensuring that a "connected" feasible region is also one where we can "walk" between any two solutions.

Counterexample: The Topologist's Sine Curve

To see that connectedness does not imply path-connectedness, consider the subset of \(\mathbb{R}^2\) \[ T \;=\; \{(x, \sin(1/x)) : x \in (0, 1]\} \;\cup\; \bigl(\{0\} \times [-1, 1]\bigr). \] This is the graph of \(\sin(1/x)\) on \((0, 1]\), together with the vertical segment \(\{0\} \times [-1, 1]\) on the \(y\)-axis. As \(x \to 0^+\), the graph oscillates between \(-1\) and \(+1\) infinitely often, "accumulating" onto the entire vertical segment.

\(T\) is connected. The graph piece \(S = \{(x, \sin(1/x)) : x \in (0, 1]\}\) is path-connected (parametrize by \(x\)), hence connected. A direct computation shows that the closure of \(S\) in \(\mathbb{R}^2\) is exactly \(T\): every point \((0, y_0) \in \{0\} \times [-1, 1]\) is the limit of \((x_n, \sin(1/x_n))\) for a suitable sequence \(x_n \to 0^+\) with \(\sin(1/x_n) = y_0\). Since the closure of a connected set is connected (established in the components-partition proof above), \(T\) is connected.

\(T\) is not path-connected. Intuitively, any continuous path from a point on the vertical segment (e.g. the origin) to a point on the graph (e.g. \((1, \sin 1)\)) must pass near \(x = 0\), where the oscillation of \(\sin(1/x)\) forces the path to traverse the full vertical range \([-1, 1]\) arbitrarily close to the segment. Continuity at the endpoint on the segment cannot survive this, so no such path exists.

The topologist's sine curve shows that path-connectedness is strictly stronger than connectedness. The pathological behavior comes from the lack of local path-connectedness near the \(y\)-axis; once we additionally assume "locally path-connected" (which holds for open subsets of \(\mathbb{R}^n\), manifolds, and CW-complexes), the two notions coincide. This example will reappear when we generalize these ideas to arbitrary topological spaces.

Insight: Path-Connectedness in Loss Landscapes

While every path-connected space is connected, the converse is not true in general. However, in deep learning, we operate in the parameter space \(\mathbb{R}^n\), which is path-connected. This vital structural property implies that there are no topological barriers (like "holes" or separate components) preventing an optimizer from theoretically moving between any two points in the weight space.

The research into Mode Connectivity further suggests that the set of "good" solutions (local minima with low loss) often forms a path-connected structure. This allows different neural network models to be linearly or non-linearly interpolated along a low-loss path, implying that optimization is not just about finding isolated points, but navigating a continuous manifold of high-performing parameters.

References

  • Garipov, T., Izmailov, P., Podoprikhin, D., Vetrov, D. P., & Wilson, A. G. (2018). "Loss surfaces, mode connectivity, and fast ensembling of DNNs." In Advances in Neural Information Processing Systems (NeurIPS). [arXiv:1802.10026]

Looking Ahead: Toward Manifolds

Connectedness and path-connectedness are essential for manifold theory. A manifold is typically defined as a space that locally looks like \(\mathbb{R}^n\) (together with Hausdorff and second-countability conditions). Most manifolds in analysis and geometry are studied one connected component at a time, so connectedness plays a structural role: many theorems assume the manifold is connected to avoid component-wise bookkeeping. On manifolds, paths become smooth curves, and the study of shortest paths leads to geodesics — central to differential geometry and modern ML methods like manifold learning.