I work at a broadcasting company. We have different types of data, especially audio and video, and other unstructured data. I want to know which hardware requirements are suitable for our company to set up a Big Data project.
For the full setup you have to focus on these four things:
1. STORAGE
Often, organizations already possess enough storage in-house to support a Big Data initiative. (After all, the data that will be processed and analyzed via a Big Data solution is already living somewhere.) However, agencies may decide to invest in storage solutions that are optimized for Big Data.
Large users of Big Data — companies such as Google and Facebook — utilize hyperscale computing environments, which are made up of commodity servers with direct-attached storage, run frameworks like Hadoop or Cassandra and often use PCIe-based flash storage to reduce latency. Smaller organizations, meanwhile, often utilize object storage or clustered network-attached storage (NAS).
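To make the framework mention above concrete: Hadoop's core programming model is map/reduce, and the pattern it distributes across commodity servers can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Hadoop's API; a real cluster would run each phase on many nodes, each over a shard of the data.

```python
from collections import defaultdict

def map_phase(records):
    # Emit (key, 1) pairs, as a Hadoop mapper would for a word count.
    for record in records:
        for word in record.split():
            yield word, 1

def reduce_phase(pairs):
    # Sum counts per key, as a Hadoop reducer would.
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

logs = ["error disk full", "error network down", "info disk ok"]
counts = reduce_phase(map_phase(logs))
print(counts["error"])  # → 2
```

On a real cluster the framework handles the shuffling of mapper output to reducers, fault tolerance, and data locality; that orchestration, not the per-record logic, is what the hyperscale environments above provide.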
Cloud storage is an option for disaster recovery and backups of on-premises Big Data solutions. While the cloud is also available as a primary source of storage, many organizations — especially large ones — find that the expense of constantly transporting data to the cloud makes this option less cost-effective than on-premises storage.
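The cost trade-off above is easy to estimate on the back of an envelope. The sketch below compares cloud and on-premises monthly storage costs; every dollar figure in it is an illustrative assumption, not a vendor quote, so substitute your own numbers.

```python
# Back-of-envelope: monthly cost of cloud primary storage vs. on-premises.
# All prices are illustrative assumptions, not real vendor quotes.
DATA_TB = 100
CLOUD_STORAGE_PER_TB = 23.0   # assumed $/TB-month for object storage
CLOUD_EGRESS_PER_TB = 90.0    # assumed $/TB transferred back out
MONTHLY_EGRESS_TB = 20        # analytics jobs pulling data on-premises
ONPREM_PER_TB = 10.0          # assumed amortized $/TB-month (hardware, power)

cloud = DATA_TB * CLOUD_STORAGE_PER_TB + MONTHLY_EGRESS_TB * CLOUD_EGRESS_PER_TB
onprem = DATA_TB * ONPREM_PER_TB
print(f"cloud: ${cloud:,.0f}/month, on-prem: ${onprem:,.0f}/month")
```

Under these assumed figures the egress charges alone dominate, which is exactly the "expense of constantly transporting data" the paragraph above refers to.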
2. PROCESSING
Servers intended for Big Data analytics must have enough processing power to support this application. Some analytics vendors, such as Splunk, offer cloud processing options, which can be especially attractive to agencies that experience seasonal peaks. If an agency has quarterly filing deadlines, for example, that organization might securely spin up on-demand processing power in the cloud to process the wave of data that comes in around those dates, while relying on on-premises processing resources to handle the steadier, day-to-day demands.
3. ANALYTICS SOFTWARE
Agencies must select Big Data analytics products based not only on what functions the software can complete, but also on factors such as data security and ease of use. One popular function of Big Data analytics software is predictive analytics — the analysis of current data to make predictions about the future. Predictive analytics are already used across a number of fields, including actuarial science, marketing and financial services. Government applications include fraud detection, capacity planning and child protection, with some child welfare agencies using the technology to flag high-risk cases.
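As a toy illustration of the predictive-analytics idea, the snippet below forecasts the next value of a series as a moving average of recent observations. Real analytics products use far richer models; the ingest figures here are made up.

```python
def moving_average_forecast(history, window=3):
    # Predict the next value as the mean of the last `window` observations --
    # the simplest form of analyzing current data to predict the future.
    recent = history[-window:]
    return sum(recent) / len(recent)

# e.g. daily hours of video ingested by a broadcaster (made-up numbers)
ingest_hours = [40, 42, 41, 45, 44, 46]
print(moving_average_forecast(ingest_hours))  # → 45.0
```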
4. NETWORKING
The massive quantities of information that must be shuttled back and forth in a Big Data initiative require robust networking hardware. Many organizations are already operating with networking hardware that facilitates 10-gigabit connections, and may have to make only minor modifications — such as the installation of new ports — to accommodate a Big Data initiative. Securing network transports is an essential step in any upgrade, especially for traffic that crosses network boundaries.
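A quick way to sanity-check whether 10-gigabit networking is enough is to compute how long a day's footage takes to move across one link. The efficiency factor and data volume below are assumptions to replace with your own measurements.

```python
# How long to move a day's raw footage over a single 10-gigabit link?
def transfer_seconds(data_tb, link_gbps, efficiency=0.7):
    # efficiency accounts for protocol overhead and contention (assumed 70%)
    bits = data_tb * 8 * 10**12
    return bits / (link_gbps * 10**9 * efficiency)

hours = transfer_seconds(data_tb=5, link_gbps=10) / 3600
print(f"{hours:.1f} hours")  # ~1.6 hours for 5 TB
```

If the result is longer than your processing window, that is the signal to bond links, upgrade ports, or keep compute close to the data.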
For setting up a Big Data analytics environment you may need several server computers connected over a network, with Hadoop installed as the software framework.
The hardware requirements vary with your processing and storage needs, but an example configuration could be a compute grid with 20 CPU cores, 40 GB of RAM and some 10 terabytes of HDD, depending on your data size.
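To turn a raw data volume like the 10 TB above into a node count, you have to account for replication and working space. The sketch below does that arithmetic; the replication factor matches HDFS's default of 3, while the overhead and per-node disk figures are assumptions to adjust for your workload.

```python
import math

# Rough cluster sizing from raw data volume. The constants are assumptions
# to adjust for your own workload.
raw_tb = 10            # current data size, as in the example above
replication = 3        # HDFS default replication factor
overhead = 1.3         # temp/shuffle working space (assumed 30%)
disk_per_node_tb = 8   # usable disk per server (assumed)

needed_tb = raw_tb * replication * overhead
nodes = math.ceil(needed_tb / disk_per_node_tb)
print(needed_tb, nodes)  # → 39.0 5
```

So even a modest 10 TB corpus can call for roughly 40 TB of raw disk across about five nodes once replication and scratch space are included.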
It can be a better solution to use cloud services, so that the hardware can grow as your needs grow. In Iran, several companies and academic institutions provide such solutions; as a known example, IPM's grid computing facility can help with that.
We have a specialized group working on Big Data at Imam Khomeini International University, Qazvin. Please get in touch if you have any further questions.
First, I don't believe you are in the range of Big Data yet; that can be identified by your data growth rate and processing needs.
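The "growth rate and processing need" test can be written down as a crude heuristic. The thresholds in the sketch below are illustrative assumptions, not industry standards; the point is only that you should measure before buying.

```python
def looks_like_big_data(daily_growth_gb, total_tb,
                        growth_threshold_gb=500, total_threshold_tb=50):
    # Crude heuristic: sustained fast growth OR a large total corpus.
    # Both thresholds are illustrative assumptions, not industry standards.
    return daily_growth_gb >= growth_threshold_gb or total_tb >= total_threshold_tb

print(looks_like_big_data(daily_growth_gb=30, total_tb=12))   # modest archive
print(looks_like_big_data(daily_growth_gb=800, total_tb=12))  # fast-growing ingest
```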
My advice is to use MongoDB and store files larger than 16 MB in GridFS, which is MongoDB's mechanism for storing files above the 16 MB BSON document limit, and then use Node.js, Python, or even C# or PHP to build your front-end web UI for queries and analytics. Please note that MongoDB GridFS can store petabytes of data without any issue.
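The 16 MB figure is MongoDB's per-document BSON limit; GridFS works around it by splitting a file into fixed-size chunk documents (255 KiB by default in current drivers). The sketch below shows the decision rule and the chunk arithmetic in plain Python; the actual pymongo upload calls are shown only as comments, since they need a running MongoDB server.

```python
BSON_LIMIT = 16 * 1024 * 1024   # MongoDB's per-document size limit
CHUNK_SIZE = 255 * 1024         # GridFS default chunk size (current drivers)

def needs_gridfs(file_bytes):
    # Files over the 16 MB BSON limit must go through GridFS.
    return file_bytes > BSON_LIMIT

def gridfs_chunks(file_bytes):
    # GridFS splits the file into fixed-size chunk documents.
    return -(-file_bytes // CHUNK_SIZE)  # ceiling division

video = 2 * 1024**3             # a 2 GB broadcast clip
print(needs_gridfs(video))      # → True
print(gridfs_chunks(video))
# With a MongoDB server running, the upload itself is roughly:
#   import gridfs, pymongo
#   fs = gridfs.GridFS(pymongo.MongoClient().media)
#   fs.put(open("clip.mp4", "rb"), filename="clip.mp4")
```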
But if you need another solution, go with GlusterFS or Ceph; both can handle petabytes of data.
Furthermore, companies that don't have skilled people to manage MongoDB, Gluster or Ceph typically use a NAS, where 72 TB of capacity can cost you around $15,000.
But you can always reach us at Platform company, and we can offer a solution suitable for your budget.