There are many examples of using Sylow $p$-subgroups to understand the structure of general finite groups. KConrad's answer indicates some of these, and others are mentioned in comments. A finite group $G$ with Sylow $p$-subgroup $P$ is said to have a normal $p$-complement if there is a normal subgroup $K$ (necessarily of order prime to $p$) with $G = PK$ and $P \cap K = 1.$ There are many theorems relating so-called $p$-local analysis and the existence of normal $p$-complements in finite groups. One of the earliest was Burnside's normal $p$-complement theorem, which states that if a finite group $G$ has an Abelian Sylow $p$-subgroup $S$ with $N_{G}(S) = C_{G}(S)$, then $G$ has a normal $p$-complement. Another powerful theorem due to G. Frobenius is that if a finite group $G$ has a Sylow $p$-subgroup $P$ such that $N_{G}(Q)/C_{G}(Q)$ is a $p$-group for each subgroup $Q$ of $P$, then $G$ has a normal $p$-complement. Other so-called transfer theorems emerged in finite group theory in the early to mid 20th century: these are theorems which used the structure of the normalizers of non-trivial $p$-subgroups of $G$ to demonstrate the existence of non-trivial Abelian homomorphic images of $G$ in many circumstances.
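As a small sanity check of the definition and of Burnside's hypothesis $N_{G}(S) = C_{G}(S)$, here is a minimal brute-force sketch in Python for $G = S_{3}$. The helper names (`compose`, `generate`, `normalizer`, `centralizer`, `is_normal_p_complement`) are my own illustrative choices and do not come from any library. The approach is only feasible for very small groups.

```python
from itertools import permutations

def compose(a, b):
    """Return the permutation 'apply b first, then a' (tuples with i -> a[i])."""
    return tuple(a[b[i]] for i in range(len(b)))

def inverse(a):
    """Return the inverse permutation of a."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)

def generate(gens, identity):
    """Subgroup generated by gens (closure under products suffices in a finite group)."""
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)

def normalizer(G, S):
    """N_G(S): elements g of G with g S g^{-1} = S."""
    return frozenset(
        g for g in G
        if frozenset(compose(compose(g, s), inverse(g)) for s in S) == S
    )

def centralizer(G, S):
    """C_G(S): elements of G commuting with every element of S."""
    return frozenset(g for g in G if all(compose(g, s) == compose(s, g) for s in S))

def is_normal_p_complement(G, P, K, identity):
    """K is normal in G, P and K meet trivially, and |P||K| = |G| (so G = PK)."""
    return (
        normalizer(G, K) == G
        and P & K == frozenset({identity})
        and len(P) * len(K) == len(G)
    )

e = (0, 1, 2)
G = frozenset(permutations(range(3)))   # the symmetric group S_3
P2 = generate([(1, 0, 2)], e)           # Sylow 2-subgroup, generated by a transposition
P3 = generate([(1, 2, 0)], e)           # Sylow 3-subgroup, generated by a 3-cycle

# p = 2: Burnside's hypothesis holds, and P3 is a normal 2-complement.
print(normalizer(G, P2) == centralizer(G, P2))
print(is_normal_p_complement(G, P2, P3, e))

# p = 3: the hypothesis fails, and no subgroup of order 2 is normal in G.
print(normalizer(G, P3) == centralizer(G, P3))
order_two = [generate([t], e) for t in G if t != e and compose(t, t) == e]
print(any(is_normal_p_complement(G, P3, K, e) for K in order_two))
```

For $p = 2$ the Sylow subgroup is self-normalizing and Abelian, so Burnside's theorem applies and the subgroup of order $3$ is the normal $2$-complement. For $p = 3$ the normalizer of the Sylow subgroup is all of $G$ while its centralizer has order $3$, and indeed $S_{3}$ has no normal $3$-complement.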

Such theorems were taken to new heights in the late 1950s and the 1960s, in particular by work of J.G. Thompson, G. Glauberman and J.L. Alperin. For example, the work of Glauberman and Thompson demonstrated (for odd $p$) the existence of a non-trivial characteristic $p$-subgroup $C(P)$ of the Sylow $p$-subgroup $P$ of $G$ such that $G$ has a normal $p$-complement if and only if $N_{G}(C(P))$ has a normal $p$-complement. The use of local analysis by Thompson in his $N$-group paper formed a template for the later completion of the classification of finite simple groups (some refinements were necessary).

Another use of Sylow $p$-subgroups there was in the development of signalizer functor theory. This is an over-simplification, but the general idea here is, given a finite group $G$, to build a non-trivial subgroup $L$ of order prime to $p$ which is normalized by a Sylow $p$-subgroup $P$ of $G$, and then by $N_{G}(Q)$ for many non-trivial $p$-subgroups $Q$ of $P$, and finally by $G$ itself (so that $G$ is not simple if $P \neq 1$). This line of development again has origins in work of Thompson, later refined by others, such as Gorenstein, Goldschmidt and Glauberman.

The work of R. Brauer relates the $p$-local structure of finite groups to their representation theory in characteristic $p$, and draws many new conclusions about complex characters of finite groups. Here, defect groups play an important role. These are $p$-subgroups whose order depends on representation-theoretic properties of $G$, and these behave like Sylow $p$-subgroups in the context of Brauer's block theory. Properties of normalizers of non-trivial subgroups of the defect group determine many representation-theoretic invariants of $G$ (and are conjectured to determine many more).

So, in the context of finite group theory, Sylow's theorem is an indispensable tool whose use goes far beyond counting theorems.

Later edit: Regarding the last question, the idea of "pushing-up" is very important in the classification of finite simple groups. The idea here is that we have a putative finite simple group $G$, and a maximal subgroup $M$ of $G$ with Sylow $p$-subgroup $P$ and with $C_{M}(O_{p}(M)) \subseteq O_{p}(M)$. The question is to determine whether $P$ is also a Sylow $p$-subgroup of $G$. Again, what follows is an over-simplification of the necessary analysis, but the overall goal is to get many $p$-local subgroups "into one place", that is to say, into a single maximal subgroup which contains a given Sylow $p$-subgroup $P$ and many normalizers $N_{G}(R)$ for $R$ non-trivial subgroups of $P$.

The answer is yes if there is a non-identity characteristic subgroup of $P$ which is normal in $M$. For if $1 \neq C(P)$ is a characteristic subgroup of $P$ which is normal in $M$, take a Sylow $p$-subgroup $Q$ of $G$ which contains $P$. If $Q \neq P$, then $N_{Q}(P) > P$ (a proper subgroup of a finite $p$-group is properly contained in its normalizer) and $C(P) {\rm char} P \lhd N_{Q}(P)$, so $C(P) \lhd N_{Q}(P)$, that is, $N_{Q}(P) \leq N_{G}(C(P))$. But $N_{G}(C(P)) \geq M$ since $C(P) \lhd M.$ Since $G$ is simple and $C(P) \neq 1$, $N_{G}(C(P))$ is a proper subgroup of $G$ containing $M$, so that $N_{G}(C(P)) = M$ as $M$ is maximal. But then $P < N_{Q}(P) \leq N_{G}(C(P)) = M$, and $N_{Q}(P)$ is a $p$-group, contrary to the fact that $P$ is a Sylow $p$-subgroup of $M$. Hence $Q = P$, so $P$ is a Sylow $p$-subgroup of $G$.

If no such characteristic subgroup $C(P)$ exists, then more delicate analysis is necessary. This is a crucial dichotomy, which emerged in the 1960s, and was further pursued by many group theorists, including Aschbacher, Baumann, Glauberman, Niles and Thompson.
