There are a number of reasons why packets are fragmented at the MAC layer and organized differently from the way they are delivered by the upper (network) layer. Generally, fragmentation can improve performance. For example, see my paper "PHY-MAC cross-layer approach to energy-efficiency improvement in low-power communications", N. Zogovic, G. Dimic, D. Bajic, 8th International Symposium on Wireless Communication Systems (ISWCS 2011), Aachen, Germany, November 6-9, 2011, which shows how packets can be organized into aggregated fragments to improve energy efficiency. Another example is "Aggregation with fragment retransmission for very high-speed WLANs", Li, Tianji, et al., IEEE/ACM Transactions on Networking (TON) 17.2 (2009): 591-604, where the authors aim for high efficiency in next-generation very high-speed WLANs.
In addition to improving performance (as discussed above), we are sometimes forced to fragment because of the nature of the network. Take a WSN (Wireless Sensor Network) as an example. The devices are expected to run at a very low duty cycle (under 1% according to the IEEE 802.15.4 standard), and the data rate is low (at most 250 kbps in the standard). Since there is not enough bandwidth or time to send large packets, the standard limits the maximum PHY payload to 127 bytes, which leaves a MAC payload (MTU) of at most 127 - 9 = 118 bytes (without security, using short addresses, intra-PAN communication, etc.). If a packet is larger than this MTU, fragmentation has to be performed by a layer above the MAC, such as IP or the 6LoWPAN adaptation layer in the context of WSNs.
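To make the arithmetic concrete, here is a minimal Python sketch. The 127-byte and 9-byte figures are the ones quoted above; the function names and the per_fragment_header parameter are my own illustrative assumptions, and the real 6LoWPAN fragment format (header sizes, offset alignment) is not modelled.

```python
import math

MAX_PHY_PAYLOAD = 127   # bytes, maximum PHY payload quoted above
MIN_MAC_OVERHEAD = 9    # bytes, MAC overhead figure used above


def mac_payload_limit(phy_max=MAX_PHY_PAYLOAD, mac_overhead=MIN_MAC_OVERHEAD):
    """Largest MAC payload (the MTU seen by the layer above the MAC)."""
    return phy_max - mac_overhead


def fragments_needed(packet_len, per_fragment_header=0):
    """Number of frames needed to carry packet_len bytes.

    per_fragment_header is a hypothetical allowance for whatever
    fragmentation header the upper layer adds to each frame.
    """
    room = mac_payload_limit() - per_fragment_header
    if room <= 0:
        raise ValueError("fragment header leaves no room for data")
    return math.ceil(packet_len / room)


if __name__ == "__main__":
    print(mac_payload_limit())        # MAC payload limit in bytes
    print(fragments_needed(1280))     # frames for a 1280-byte packet
```

Any per-fragment header reduces the room in each frame, so the fragment count can only grow once real headers are taken into account.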
Fragmentation may also be used on low-bandwidth links to meet real-time traffic requirements: it allows delay-sensitive packets to be queued between the fragments of large packets. This is called "link fragmentation and interleaving" (LFI).
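A rough sketch of why LFI helps, in Python. The 64 kbps link, the packet sizes and the function names are illustrative assumptions of mine, not taken from any particular router implementation; the interleaving rule (one waiting urgent packet after each fragment) is deliberately simplistic.

```python
def serialization_delay_ms(nbytes, link_bps):
    """Time needed to put nbytes on a link of link_bps bits per second."""
    return nbytes * 8 * 1000 / link_bps


def interleave(large_len, fragment_len, urgent):
    """Split one large packet into fragments and slot one waiting
    delay-sensitive packet after each fragment.

    urgent is a list of sizes (bytes) of delay-sensitive packets that
    are already queued. Returns the transmission order as a list of
    (kind, size) tuples.
    """
    order = []
    waiting = list(urgent)
    offset = 0
    while offset < large_len:
        size = min(fragment_len, large_len - offset)
        order.append(("frag", size))
        offset += size
        if waiting:
            order.append(("urgent", waiting.pop(0)))
    # Anything still waiting goes out after the last fragment.
    order.extend(("urgent", size) for size in waiting)
    return order


if __name__ == "__main__":
    # Worst-case wait behind an unfragmented packet vs. behind one fragment.
    print(serialization_delay_ms(1500, 64_000))
    print(serialization_delay_ms(80, 64_000))
    print(interleave(200, 80, [20, 20]))
```

Without fragmentation a delay-sensitive packet may have to wait for the whole large packet to be serialized; with LFI it waits at most for one fragment.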
Packet fragmentation is done to allow packets to be transferred over networks with a given Maximum Transmission Unit (MTU). If the application data is bigger than the MTU supported by the network, it must be fragmented before it is transmitted. If a packet is bigger than the MTU supported by the network and fragmentation is not allowed, routers must drop the packet. To avoid this, many protocols support Path MTU (PMTU) discovery and fragment the data accordingly so that packets are not dropped. Apart from that, some protocols support data fragmentation to improve performance in noisy environments or in the presence of narrowband interference.
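The fragment-or-drop decision described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: headers are ignored, sizes are plain byte counts, and the names forward, path_mtu and dont_fragment are hypothetical rather than taken from any protocol stack.

```python
def forward(packet_len, link_mtu, dont_fragment=False):
    """Return the sizes sent on the next link, or None if the packet is dropped."""
    if packet_len <= link_mtu:
        return [packet_len]            # fits as-is
    if dont_fragment:
        return None                    # too big and fragmentation not allowed
    full, rest = divmod(packet_len, link_mtu)
    return [link_mtu] * full + ([rest] if rest else [])


def path_mtu(link_mtus):
    """The path MTU is the smallest MTU of any link along the path."""
    return min(link_mtus)


if __name__ == "__main__":
    links = [1500, 1280, 9000]         # hypothetical per-link MTUs
    pmtu = path_mtu(links)
    # A sender that knows the path MTU can size its packets up front,
    # so nothing needs to be dropped along the way.
    print(pmtu, forward(3000, pmtu))
    print(forward(3000, 1500, dont_fragment=True))
```

A sender that learns the path MTU beforehand never hits the drop branch, which is the point of PMTU discovery.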
There is also another reason for fragmentation: ensuring an acceptable frame error rate. Take a look, for example, at the IEEE 802.15.4k-2013 standard, published in June 2013. It uses link-level fragmentation to transmit very short fragments over communication links with more than 100 dB of signal attenuation. Along similar lines, fragmentation, or rather frame trains, are used in, e.g., my nanoMAC (Jussi Haapola, "NanoMAC: A Distributed MAC Protocol for Wireless Sensor Networks") and Ye's S-MAC protocol.
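As a back-of-the-envelope illustration of the frame error rate argument, here is a short Python sketch. It assumes independent bit errors at a fixed bit error rate (BER) and no error correction, which is a textbook simplification and not the channel model of 802.15.4k or of any of the protocols mentioned; the function name and the example numbers are mine.

```python
def frame_error_rate(ber, frame_bytes):
    """Probability that at least one bit of the frame is received in error,
    assuming independent bit errors with probability ber."""
    bits = 8 * frame_bytes
    return 1.0 - (1.0 - ber) ** bits


if __name__ == "__main__":
    ber = 1e-3                         # hypothetical, very poor link
    for size in (16, 32, 64, 127):     # frame sizes in bytes
        print(size, round(frame_error_rate(ber, size), 3))
```

Under this assumption the frame error rate grows quickly with frame length, so on a very poor link short fragments are far more likely to get through, and only the damaged fragments need to be retransmitted.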
Every technology has its rules, and one of these is the MTU (maximum transmission unit). To make sure this rule is satisfied, the MAC protocol sometimes needs to fragment bigger packets.
Packet fragmentation is also done to speed up data flow within the network. Large packets would add congestion in a slow (low-bandwidth) network, so the gateway router to such a network sets a lower MTU value to keep large packets from entering it.