A divider is a logic module that takes two binary numbers and produces their quotient (and, optionally, the remainder). The basic structure is a chain of subtract-and-multiplex stages: each stage subtracts the divisor from the current partial remainder, and the multiplexer uses the sign of that result to select either the difference or the unmodified value to pass to the next stage. The quotient is formed from the bits that control the multiplexers, and the remainder is the output of the last multiplexer. If the divider is implemented purely combinationally, the critical path through all of this logic is quite long (even with carry-lookahead in the subtractors), so the clock period must be very long.
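The chain of subtract-and-multiplex stages can be sketched as a small behavioral model. This is only an illustration of unsigned restoring division, not a hardware description; the function name `restoring_divide` and the `width` parameter are assumptions made for the example. Each loop iteration stands for one stage in the combinational chain.

```python
from typing import Tuple


def restoring_divide(dividend: int, divisor: int, width: int = 8) -> Tuple[int, int]:
    """Behavioral model of an unrolled divider: one subtract-mux stage per quotient bit."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    limit = 1 << width
    if not (0 <= dividend < limit and 0 < divisor < limit):
        raise ValueError("operands must be unsigned and fit in 'width' bits")

    remainder = 0
    quotient = 0
    # Process the dividend most significant bit first, one stage per bit.
    for i in range(width - 1, -1, -1):
        # Bring the next dividend bit into the partial remainder.
        shifted = (remainder << 1) | ((dividend >> i) & 1)
        # Subtractor: trial subtraction of the divisor.
        diff = shifted - divisor
        # A non-negative difference means the divisor "fits", so this quotient bit is 1.
        fits = diff >= 0
        # Multiplexer: keep the difference if it fits, otherwise pass the old value on.
        remainder = diff if fits else shifted
        # The multiplexer select bits, collected in order, form the quotient.
        quotient = (quotient << 1) | int(fits)
    return quotient, remainder


print(restoring_divide(200, 7))  # (28, 4)
```

In hardware, all `width` stages would exist side by side, which is why the delay through the chain grows with the operand width.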
What could be done to reduce the amount of logic required for the divider, at the cost of no longer producing a result on every clock cycle?
If you don’t need the level of performance provided by a pipelined divider, you can compute the quotient serially, one bit at a time. You would need just one subtractor and one multiplexer, along with registers to hold the input values, the quotient bits, and the intermediate result.
You could potentially use additional subtract-mux stages to compute more than one bit per clock period. This gives you the flexibility to trade off space and time as needed for a particular application.
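As a rough illustration of the space/time trade-off, the following sketch models a clocked serial divider in which the number of subtract-mux stages evaluated per clock is configurable. The class name `SerialDivider`, its register names, and the `stages_per_clock` parameter are hypothetical, chosen only for this example. With `stages_per_clock=1` it behaves like the one-bit-per-clock design; larger values use more logic but finish in fewer cycles.

```python
class SerialDivider:
    """Cycle-level model of an unsigned serial restoring divider (illustrative only)."""

    def __init__(self, width: int = 8, stages_per_clock: int = 1):
        if width < 1 or stages_per_clock < 1:
            raise ValueError("width and stages_per_clock must be positive")
        self.width = width
        self.stages_per_clock = stages_per_clock

    def start(self, dividend: int, divisor: int) -> None:
        """Load the input registers and clear the result registers."""
        if divisor == 0:
            raise ZeroDivisionError("divisor must be nonzero")
        limit = 1 << self.width
        if not (0 <= dividend < limit and 0 < divisor < limit):
            raise ValueError("operands must be unsigned and fit in 'width' bits")
        self.dividend_reg = dividend   # shifted left as bits are consumed
        self.divisor_reg = divisor
        self.remainder_reg = 0         # intermediate result
        self.quotient_reg = 0
        self.bits_left = self.width
        self.cycles = 0

    def clock(self) -> None:
        """One clock edge: evaluate up to 'stages_per_clock' subtract-mux stages."""
        mask = (1 << self.width) - 1
        for _ in range(min(self.stages_per_clock, self.bits_left)):
            msb = (self.dividend_reg >> (self.width - 1)) & 1
            self.dividend_reg = (self.dividend_reg << 1) & mask
            shifted = (self.remainder_reg << 1) | msb
            diff = shifted - self.divisor_reg              # subtractor
            fits = diff >= 0
            self.remainder_reg = diff if fits else shifted  # multiplexer
            self.quotient_reg = (self.quotient_reg << 1) | int(fits)
            self.bits_left -= 1
        self.cycles += 1

    def run(self, dividend: int, divisor: int):
        """Clock until all bits are done; return (quotient, remainder, cycles)."""
        self.start(dividend, divisor)
        while self.bits_left > 0:
            self.clock()
        return self.quotient_reg, self.remainder_reg, self.cycles


print(SerialDivider(width=8, stages_per_clock=1).run(200, 7))  # (28, 4, 8)
print(SerialDivider(width=8, stages_per_clock=2).run(200, 7))  # (28, 4, 4)
```

The model lets the final clock evaluate fewer stages when `width` is not a multiple of `stages_per_clock`; a real design would more likely choose values that divide evenly.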