Difference between distributed databases and distributed file system?

Samir G Pandya

Dear Paramjit Kour,

The differences between distributed databases and distributed file systems are outlined below:

01. Distributed databases:

In a distributed database, there are a number of databases that may be geographically distributed all over the world.
A distributed DBMS manages the distributed database in a manner so that it appears as one single database to users.
A distributed database is a collection of multiple interconnected databases, which are spread physically across various locations that communicate via a computer network.
A distributed database is a database that consists of two or more files located in different sites either on the same network or on entirely different networks. Portions of the database are stored in multiple physical locations and processing is distributed among multiple database nodes.

Features:

Databases in the collection are logically interrelated with each other. Often they represent a single logical database.
Data is physically stored across multiple sites. Data in each site can be managed by a DBMS independent of the other sites.
The processors in the sites are connected via a network. They do not have any multiprocessor configuration.
A distributed database is not a loosely connected file system.
A distributed database incorporates transaction processing, but it is not synonymous with a transaction processing system.
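To make the "appears as one single database" idea concrete, here is a minimal sketch. It assumes two in-memory SQLite databases stand in for geographically separate sites; the class name TinyDistributedDB, the site names and the customers table are hypothetical and only for illustration. A real distributed DBMS also handles distributed transactions, query optimization and failures, none of which is shown here.

```python
import sqlite3


class TinyDistributedDB:
    """Hypothetical facade: several independent SQLite databases presented as one."""

    def __init__(self, site_names):
        # Each site is its own database, managed independently of the others.
        self.sites = {name: sqlite3.connect(":memory:") for name in site_names}
        for conn in self.sites.values():
            conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

    def insert(self, site, cust_id, name):
        # A row is physically stored at exactly one site.
        conn = self.sites[site]
        conn.execute("INSERT INTO customers VALUES (?, ?)", (cust_id, name))
        conn.commit()

    def all_customers(self):
        # The user sees one logical table; the facade gathers rows from every site.
        rows = []
        for conn in self.sites.values():
            rows.extend(conn.execute("SELECT id, name FROM customers").fetchall())
        return sorted(rows)


db = TinyDistributedDB(["london", "mumbai"])
db.insert("london", 1, "Alice")
db.insert("mumbai", 2, "Bhavna")
print(db.all_customers())
```

The caller never names a site when reading, which is the point: the physical distribution is hidden behind one logical interface.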

Advantages of Distributed databases:

Following are the advantages of distributed databases.

Modular Development − If the system needs to be expanded to new locations or new units, a centralized database system requires substantial effort and disrupts existing operations. In a distributed database, however, the work simply requires adding new computers and local data at the new site and then connecting them to the distributed system, with no interruption to current functions.
More Reliable − When a centralized database fails, the whole system comes to a halt. In a distributed system, however, when one component fails the system continues to function, possibly at reduced performance. Hence a DDBMS is more reliable.
Better Response − If data is distributed in an efficient manner, then user requests can be met from local data itself, thus providing faster response. On the other hand, in centralized systems, all queries have to pass through the central computer for processing, which increases the response time.
Lower Communication Cost − In distributed database systems, if data is located locally where it is mostly used, then the communication costs for data manipulation can be minimized. This is not feasible in centralized systems.

02. Distributed file system:

In Microsoft Windows, the Distributed File System (DFS) functions provide the ability to logically group shares on multiple servers and to transparently link shares into a single hierarchical namespace. DFS organizes shared resources on a network in a treelike structure.
DFS supports stand-alone DFS namespaces, which have one host server, and domain-based namespaces, which have multiple host servers and high availability. The DFS topology data for domain-based namespaces is stored in Active Directory. The data includes the DFS root, DFS links, and DFS targets.
More generally, a distributed file system (DFS) is a file system with data stored on a server. The data is accessed and processed as if it were stored on the local client machine. The DFS makes it convenient to share information and files among users on a network in a controlled and authorized way. The server allows client users to share files and store data just as if they were storing the information locally. However, the servers keep full control over the data and decide which clients are granted access.
Distributed file system (DFS) is a method of storing and accessing files based in a client/server architecture. In a distributed file system, one or more central servers store files that can be accessed, with proper authorization rights, by any number of remote clients in the network.
Each DFS tree structure has one or more root targets. The root target is a host server that runs the DFS service. A DFS tree structure can contain one or more DFS links. Each DFS link points to one or more shared folders on the network. You can add, modify and delete DFS links from a DFS namespace. When you remove the last target associated with a DFS link, DFS deletes the DFS link in the DFS namespace. (In earlier documentation, DFS links were called junction points.)
A DFS link can point to one or more shared folders; the folders are called targets. When users access a DFS link, the DFS server selects a set of targets based on a client's site information. The client accesses the first available target in the set. This helps to distribute client requests across the possible targets and can provide continued accessibility for users even when some servers fail.
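The target selection described above can be sketched roughly as follows. This is a simplified illustration, not the actual DFS referral algorithm: the function names, the server paths and the is_up check are all hypothetical, and real DFS ordering involves site costs and other settings that are ignored here.

```python
def order_targets(targets, client_site):
    """Put targets in the client's own site first; keep the original order otherwise."""
    # sorted() is stable, and False sorts before True, so same-site targets lead.
    return sorted(targets, key=lambda t: t["site"] != client_site)


def first_available(ordered_targets, is_up):
    """Return the path of the first target that responds, or None if all are down."""
    for target in ordered_targets:
        if is_up(target["path"]):
            return target["path"]
    return None


# Hypothetical targets behind one DFS link.
targets = [
    {"path": r"\\srv-ny\docs", "site": "NY"},
    {"path": r"\\srv-ldn\docs", "site": "LDN"},
]

referral = order_targets(targets, client_site="LDN")
# Pretend the London server is down: the client falls through to New York.
chosen = first_available(referral, is_up=lambda path: path != r"\\srv-ldn\docs")
print(chosen)
```

The same link name keeps working for the user even though the request ends up on a different server.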

Features:

1. Transparency:

Four types of transparency are desirable:

Structure Transparency: Although not strictly necessary, a distributed file system normally uses multiple file servers for performance, scalability and reliability reasons. Each file server is normally a user process, or sometimes a kernel process, that is responsible for controlling a set of secondary storage devices on the node on which it runs. When there are multiple file servers, their multiplicity should be transparent to the clients of a distributed file system.
Access Transparency: Both local and remote files should be accessible in the same way. That is, the file system interface should not distinguish between local and remote files, and the file system should automatically locate an accessed file and arrange for the transport of the data to the client’s site.
Naming Transparency: The name of a file should give no hint as to where the file is located. Furthermore, a file should be allowed to move from one node to another in a distributed system without having to change the name of the file.
Replication Transparency: If a file is replicated on multiple nodes, both the existence of multiple copies and their locations should be hidden from the clients.
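As a rough illustration of naming and replication transparency, the toy class below keeps a table from logical file names to the nodes holding copies. Everything here (the class TransparentFS, the node names, the dictionaries used as storage) is a hypothetical in-memory stand-in, not how any particular distributed file system is implemented.

```python
class TransparentFS:
    """Toy namespace: clients use logical names and never see nodes or replicas."""

    def __init__(self, nodes):
        self._nodes = {node: {} for node in nodes}  # node -> {name: data}
        self._replicas = {}                         # logical name -> list of nodes

    def write(self, name, data, replicas=2):
        # Replication transparency: the caller does not choose or see the copies.
        chosen = list(self._nodes)[:replicas]
        for node in chosen:
            self._nodes[node][name] = data
        self._replicas[name] = chosen

    def read(self, name):
        # Any replica will do; the client only supplies the logical name.
        node = self._replicas[name][0]
        return self._nodes[node][name]

    def migrate(self, name, src, dst):
        # Naming transparency: the file moves between nodes, its name does not change.
        self._nodes[dst][name] = self._nodes[src].pop(name)
        self._replicas[name] = [dst if n == src else n for n in self._replicas[name]]


fs = TransparentFS(["node-a", "node-b", "node-c"])
fs.write("/docs/report.txt", b"hello")
fs.migrate("/docs/report.txt", "node-a", "node-c")
print(fs.read("/docs/report.txt"))
```

The client code after the migration is identical to the code before it, which is what the two transparency properties ask for.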

2. User Mobility:

In a distributed system, a user should not be forced to work on a specific node, but should have the flexibility to work on different nodes at different times. Furthermore, the performance characteristics of the file system should not discourage users from accessing their files from workstations other than the one at which they usually work.

Advantages of Distributed file systems:

Following are the advantages of distributed file systems:

Distributed file systems can be advantageous because they make it easier to distribute documents to multiple clients and they provide a centralized storage system so that client machines are not using their resources to store files.
There are a number of potential advantages to using a distributed system. One of the easiest to understand is redundancy and resiliency.
If a company is serving its website from a distributed set of servers, rather than a single server, it may be able to stay up even if one server physically fails.
If data is distributed between multiple servers or disks, a common occurrence in modern distributed systems, there may not be any data loss even if a storage device ceases to work.

I hope I have answered your question.

With Best Wishes,

Samir G. Pandya

Jörg Domaschka

To me, this is a very strange question that boils down to whether or not there is a difference between file systems and databases. And certainly there is. This is my very high level view on that matter: file systems deal with hierarchically grouped, named chunks of binary data (files) while databases (or rather database management systems, DBMS) operate on named and typed data items. This leads to very different APIs for accessing the data within a DBMS (e.g. SQL) and a file system (e.g. POSIX). Now, because DBMS have more information available about the data they store, they can do more sophisticated things with it, such as indexing.
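A small sketch of that API difference, assuming Python's built-in file I/O stands in for the POSIX-style interface and the standard sqlite3 module stands in for a DBMS. The file name, table and column names are made up for the example.

```python
import os
import sqlite3
import tempfile

# File system view: a named chunk of bytes. The application has to know the
# layout and parse it itself.
path = os.path.join(tempfile.mkdtemp(), "people.csv")
with open(path, "wb") as f:
    f.write(b"alice,30\nbob,25\n")
with open(path, "rb") as f:
    raw = f.read()
ages_from_file = [int(line.split(",")[1]) for line in raw.decode().splitlines()]

# DBMS view: named and typed data items. The system knows the schema, so it
# can filter and index on a column without the application parsing anything.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)", [("alice", 30), ("bob", 25)])
conn.execute("CREATE INDEX idx_people_age ON people(age)")
older = conn.execute("SELECT name FROM people WHERE age > 26").fetchall()

print(ages_from_file)
print(older)
```

In the first half the meaning of the bytes lives in the application; in the second half it lives in the DBMS, which is why the DBMS can build an index on age.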

The distribution aspect in both domains has undergone slight changes in the last decade. While earlier a distributed file system was more or less a synonym for a remote file system (a file system on a different host), I don't think this holds any more. To me at least, distributed file systems are file systems that potentially span more than one (server) host. This includes e.g. Ceph and HDFS, but not NFS (ignoring things like pNFS). Similarly, IMHO distributed DBMS are DBMS that span several servers.

Due to the differentiation we have been seeing in the last 10-15 years in the field of distributed DBMS, cf. the rise of NoSQL DBMS, the difference between DBMS and file systems has been blurred in some places. This is particularly true for pure key-value stores that do not provide more functionality than a simple file system.
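To illustrate how thin that line can be, here is a minimal sketch of a pure key-value interface implemented directly on top of a file system. The class FileBackedKV is hypothetical, and it assumes keys are safe to use as plain file names.

```python
import os
import tempfile


class FileBackedKV:
    """Hypothetical key-value store: each key is a file, each value its contents."""

    def __init__(self, root):
        self.root = root

    def put(self, key, value):
        # Assumes the key is a valid, safe file name.
        with open(os.path.join(self.root, key), "wb") as f:
            f.write(value)

    def get(self, key):
        with open(os.path.join(self.root, key), "rb") as f:
            return f.read()

    def delete(self, key):
        os.remove(os.path.join(self.root, key))


kv = FileBackedKV(tempfile.mkdtemp())
kv.put("user42", b'{"name": "alice"}')
print(kv.get("user42"))
```

put, get and delete map one-to-one onto writing, reading and removing a file, so a store that offers nothing beyond these three operations adds little over the file system underneath it.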

Final note: DBMS do not necessarily require files (cf. caches and in-memory databases) and file systems do not necessarily require disks (cf. temp file systems).
