Pipelining is used to increase the performance of a processor by dividing an operation into smaller stages and executing those stages in an overlapped fashion, which reduces the overall time required to complete a set of operations. However, although pipelining improves performance, a very deep pipeline structure complicates programming: a deeper pipeline lets the processor run faster, but it makes the processor harder to program and optimize for. Choosing the pipeline depth is therefore a trade-off between efficiency and ease of use.
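To make the efficiency side of that trade-off concrete, here is a minimal back-of-the-envelope sketch (my own illustration, not from the answer above). It assumes a fixed total logic delay that is split evenly across the stages and a fixed latch overhead per pipeline register; the numbers are purely illustrative.

```python
# Rough sketch of why deeper pipelines give diminishing returns.
# Assumptions (illustrative only): the total combinational logic delay is fixed,
# and every pipeline register adds a fixed setup/clock-to-Q overhead.

TOTAL_LOGIC_DELAY_NS = 10.0   # unpipelined datapath delay (assumed)
REGISTER_OVERHEAD_NS = 0.2    # per-stage latch overhead (assumed)

def clock_period(stages: int) -> float:
    """Cycle time if the logic is split evenly across `stages` stages."""
    return TOTAL_LOGIC_DELAY_NS / stages + REGISTER_OVERHEAD_NS

def ideal_speedup(stages: int) -> float:
    """Throughput gain over the unpipelined design, ignoring hazards."""
    return TOTAL_LOGIC_DELAY_NS / clock_period(stages)

for s in (1, 5, 10, 20, 50):
    print(f"{s:3d} stages: period = {clock_period(s):.2f} ns, "
          f"ideal speedup = {ideal_speedup(s):.1f}x")
```

Under these assumptions the speedup flattens out well before the stage count gets large, because the register overhead starts to dominate the cycle time.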
I do not think that even a hypothetical machine that directly executes a high-level instruction would break it into 1000 micro-operations, so it could not take advantage of such a long pipeline. For operations on large data sets, vector processors are used instead, in applications such as speech coding and 4G signal processing.
In addition, if one complete instruction execution can be divided into at most 10 machine cycles, then a pipeline with more than 10 stages requires additional hardware for the same work: each extra stage boundary adds pipeline registers and control logic, so the processing unit grows without a matching gain in useful work per stage, as the sketch below illustrates.
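As a rough hardware-cost sketch (again my own illustration, with assumed figures): suppose each stage boundary needs a full pipeline register bank for the datapath state it carries, and in the worst case every later stage that can consume a forwarded result needs a comparison against every earlier stage that can produce one.

```python
# Rough hardware-cost sketch under the assumptions stated above.

DATAPATH_STATE_BITS = 150     # assumed bits carried between adjacent stages

def pipeline_register_bits(stages: int) -> int:
    """Total latch bits: one register bank per stage boundary."""
    return (stages - 1) * DATAPATH_STATE_BITS

def forwarding_comparisons(stages: int) -> int:
    """Worst case: every (producer stage, later consumer stage) pair."""
    return stages * (stages - 1) // 2

for s in (5, 10, 20):
    print(f"{s:2d} stages: {pipeline_register_bits(s)} register bits, "
          f"up to {forwarding_comparisons(s)} forwarding comparisons")
```

The point is only that both costs grow with depth, the latter faster than linearly, so stages beyond what the instruction actually needs buy hardware without buying work.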
Pipelining improves the system's throughput by allowing multiple instructions to be overlapped and executed in parallel. However, we cannot increase the number of stages indefinitely, because each pipeline register we add introduces its own intrinsic delay, which erodes the cycle-time gain. Further, if data dependencies cannot be resolved by bypassing/forwarding, we must insert bubbles (stall the pipeline) to continue execution, which reduces throughput. Also, if we mispredict a branch, all instructions from the IF stage up to the point of misprediction must be flushed from the pipeline. This is detrimental to performance when the pipeline has a large number of stages.
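A short sketch of that last point (assumed workload mix and penalties, not measured data): if the branch misprediction flush penalty is roughly proportional to the number of front-end stages, the effective CPI rises with pipeline depth even though the clock gets faster.

```python
# Illustrative only: effective CPI vs. pipeline depth, with the flush penalty
# assumed to be proportional to the number of stages before resolution.

BRANCH_FRACTION = 0.20        # assumed fraction of instructions that are branches
MISPREDICT_RATE = 0.10        # assumed misprediction rate
LOAD_USE_STALLS = 0.05        # assumed stall cycles per instruction from data hazards

def effective_cpi(stages: int) -> float:
    """Base CPI of 1, plus hazard stalls, plus depth-dependent flush cost."""
    flush_penalty = stages - 1                  # cycles lost per mispredict (assumed)
    return 1.0 + LOAD_USE_STALLS + BRANCH_FRACTION * MISPREDICT_RATE * flush_penalty

for s in (5, 10, 20, 40):
    print(f"{s:2d} stages: effective CPI ~ {effective_cpi(s):.2f}")
```

With these assumed numbers the misprediction cost alone roughly doubles the CPI as the depth goes from 5 to 40 stages, which is exactly the penalty the answer above is describing.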