In this article we will talk about distributed systems: what they are and what they involve. Keep in mind that there are alternative definitions, but in the end all of them serve the same purpose.
[ Definition ]
Also known as distributed computing, a distributed system is a system with multiple components located on different machines that communicate and coordinate their actions in order to appear as a single machine to the end-user.
The machines can be of any type: computers, servers, virtual machines or any other node that can be connected to a network. Distributed systems usually have three primary characteristics: all components run concurrently, there is no global clock, and components fail independently of each other.
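Since there is no global clock, nodes cannot rely on wall-clock timestamps to agree on the order of events. One well-known way around this is a logical (Lamport) clock. The article does not name this technique, so the sketch below is only an illustration, and the class and method names (`LamportClock`, `tick`, `send`, `receive`) are assumptions made for the example.

```python
class LamportClock:
    """Logical clock: a counter that orders events without a shared wall clock."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # A local event advances the node's own counter.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending counts as an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, message_time: int) -> int:
        # On receipt, move past both the local counter and the sender's timestamp.
        self.time = max(self.time, message_time) + 1
        return self.time


node_a = LamportClock()
node_b = LamportClock()

node_a.tick()            # local event on A
stamp = node_a.send()    # A sends a message carrying its timestamp
node_b.receive(stamp)    # B's clock jumps ahead of the timestamp it received
```

After the exchange, the receive event on `node_b` carries a larger timestamp than the send event on `node_a`, so the two nodes agree that the send happened first even though neither consulted a shared clock.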
[ Benefits / Advantages ]
There are many reasons to implement a distributed system, such as:
- Resource sharing — whether it’s the hardware, software or data that can be shared;
- Openness — how open the software is designed to be, so that it can be developed and shared with others;
- Concurrency — multiple machines can process the same function at the same time;
- Reliability — the system generally suffers no disruption if a single machine fails, since it can be made up of hundreds of nodes working together. For this reason, most distributed systems are fault-tolerant;
- Performance — with multiple machines, workloads can be broken up and sent to each one, making the system very efficient;
- Horizontal scalability — as each machine works independently, it is easy and generally inexpensive to add more functionality and nodes as necessary;
- Transparency — how easily one node can locate and communicate with other nodes in the system.
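The performance point, breaking a workload up and sending a piece to each machine, can be sketched in a few lines. This is a minimal illustration and not a real deployment: threads stand in for machines, and the function names (`split_into_chunks`, `process_chunk`, `distributed_sum_of_squares`) are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items: list[int], parts: int) -> list[list[int]]:
    """Split items into `parts` roughly equal slices, one per worker."""
    return [items[i::parts] for i in range(parts)]


def process_chunk(chunk: list[int]) -> int:
    # Stand-in for the real work done on one machine: sum the squares.
    return sum(x * x for x in chunk)


def distributed_sum_of_squares(items: list[int], workers: int = 4) -> int:
    chunks = split_into_chunks(items, workers)
    # Each thread plays the role of one node; a real system would send
    # the chunks over the network instead.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(process_chunk, chunks))
    # Combine the partial results into the final answer.
    return sum(partial_results)


total = distributed_sum_of_squares(list(range(1, 101)))
```

The result is the same as computing the sum on one machine. What changes is that each piece can be processed independently, which is also what makes adding more nodes straightforward.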
[ Challenges / Disadvantages ]
Any distributed system can be overwhelming: creating an effective one requires complex architectural design, construction and debugging processes.
Some challenges that can be encountered are:
- Latency — the more machines we have, the more latency we can experience in communication between them;
- Scheduling — A distributed system has to decide which jobs need to run, when they should run, and where they should run. Schedulers ultimately have limitations, leading to underutilized hardware and unpredictable runtimes;
- Security — It is difficult to provide adequate security in distributed systems because the nodes as well as the connections need to be secured;
- Overloading — Overloading may occur in the network if all the nodes of the distributed system try to send data at once;
- Database — a database connected to a distributed system is quite complicated and difficult to handle compared to a single-user system;
- Loss of data — some messages and data can be lost in the network while moving from one node to another.
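Loss of data in transit is commonly handled by resending a message until the receiver acknowledges it. The sketch below illustrates that idea under simplifying assumptions: `LossyChannel` is a hypothetical in-memory link that deterministically drops its first few messages, and `send_with_retry` is an illustrative helper, not part of any library.

```python
class LossyChannel:
    """Hypothetical network link that drops the first `drops` messages."""

    def __init__(self, drops: int) -> None:
        self.remaining_drops = drops
        self.delivered: list[str] = []

    def send(self, message: str) -> bool:
        # Return True when the receiver acknowledges the message.
        if self.remaining_drops > 0:
            self.remaining_drops -= 1
            return False  # message lost, no acknowledgement
        self.delivered.append(message)
        return True


def send_with_retry(channel: LossyChannel, message: str, max_attempts: int = 5) -> int:
    """Resend until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if channel.send(message):
            return attempt
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


channel = LossyChannel(drops=2)
attempts = send_with_retry(channel, "update balance")
```

Here the message gets through on the third attempt. A real system would usually wait between attempts and guard against duplicates, because an acknowledgement can be lost just like a message.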
[ How It Works ]
Everything must be interconnected: CPUs via the network and processes via the communication system. Both hardware and software architectures are used to maintain a distributed system.
[ Types of Distributed Systems ]
In general, distributed systems fall into one of four architecture models:
- Client-server — Clients contact the server for data, then format it and display it to the end-user. The end-user can also make a change from the client-side and commit it back to the server to make it permanent.
- Three-tier — Information about the client is stored in a middle tier rather than on the client to simplify application deployment. This architecture model is most common for web applications.
- n-tier — Generally used when an application or server needs to forward requests to additional enterprise services on the network.
- Peer-to-peer — There are no additional machines used to provide services or manage resources. Responsibilities are uniformly distributed among machines in the system, known as peers, which can serve as either client or server.
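The client-server model described above can be sketched without any networking at all. In this minimal illustration a plain dictionary stands in for a network message, and the `Server` and `Client` classes, their methods and the request fields are all assumptions made for the example.

```python
class Server:
    """Holds the authoritative data; request shapes here are hypothetical."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {"greeting": "hello"}

    def handle(self, request: dict) -> dict:
        # A request is a plain dict standing in for a network message.
        if request["op"] == "get":
            value = self._store.get(request["key"])
            return {"ok": value is not None, "value": value}
        if request["op"] == "put":
            self._store[request["key"]] = request["value"]
            return {"ok": True, "value": request["value"]}
        return {"ok": False, "value": None}


class Client:
    """Contacts the server, then formats the reply for the end-user."""

    def __init__(self, server: Server) -> None:
        self._server = server

    def show(self, key: str) -> str:
        reply = self._server.handle({"op": "get", "key": key})
        return f"{key} = {reply['value']}" if reply["ok"] else f"{key} not found"

    def commit(self, key: str, value: str) -> bool:
        # Changes made on the client side become permanent only on the server.
        return self._server.handle({"op": "put", "key": key, "value": value})["ok"]


server = Server()
client = Client(server)
before = client.show("greeting")
client.commit("greeting", "bonjour")
after = client.show("greeting")
```

Because the data lives only on the server, any other client connected to the same server sees the committed change. In a peer-to-peer design, by contrast, each node would play both the `Client` and the `Server` role.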
[ Examples of Distributed Systems ]
Some examples of distributed systems are:
- Networks — The earliest example of a distributed system appeared in the 1970s, when Ethernet was invented and LANs (local area networks) were created. For the first time, computers were able to send messages to other systems with a local IP address. Peer-to-peer networks evolved, and e-mail and then the Internet as we know it continue to be the biggest, ever-growing examples of distributed systems. As the Internet changed from IPv4 to IPv6, distributed systems have evolved from “LAN” based to “Internet” based.
- Telecommunication networks — Telephone networks have been around for over a century and started as an early example of a peer-to-peer network. Cellular networks are distributed networks with base stations physically distributed in areas called cells. As telephone networks have evolved to VoIP (voice over IP), they continue to grow in complexity as distributed networks.
- Distributed artificial intelligence — Distributed artificial intelligence is a way to use large-scale computing power and parallel processing to learn from and process very large data sets using multiple agents.
[ Examples of Distributed Systems Platforms ]
Some examples of distributed systems platforms are:
- GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data;
- A Web-based Distributed Voluntary Computing Platform for Large Scale Hydrological Computations;
- CBRAIN: a web-based, distributed computing platform for collaborative neuroimaging research.
In the end, distributed systems have endless use cases, a few being electronic banking systems, massively multiplayer online games, and sensor networks.
Hope you liked this small article, best regards, Ricardo Costa (Richards).
This article was created in the context of the Distributed Systems class 2020–21, ESTG-IPG.