Bachelor
2022/2023
Distributed Computing
Type:
Elective course (Software Engineering)
Area of studies:
Software Engineering
Delivered by:
School of Software Engineering
Where:
Faculty of Computer Science
When:
3rd year, modules 3 and 4
Mode of studies:
offline
Open to:
students of one campus
Instructors:
Petr Panfilov
Language:
English
ECTS credits:
5
Contact hours:
60
Course Syllabus
Abstract
Distributed computing has become central to how computers are used, from web applications to e-commerce to content distribution. It helps programmers aggregate the resources of many networked computers to construct highly available and scalable services. This course teaches the abstractions, design, and implementation techniques that enable the building of fast, scalable, fault-tolerant distributed computing systems, including client-server computing, the web, cloud computing, peer-to-peer systems, and distributed storage systems. Topics include remote procedure call, preventing and finding errors in distributed programs, maintaining consistency of distributed state, fault tolerance, and high availability. Multithreading, network programming, and several case studies of distributed computing systems are also covered.
Learning Objectives
- To introduce students to the fundamental problems, concepts, and approaches in the design and analysis of distributed computing systems.
- To familiarize students with the stages of the distributed system design cycle, including system architecture, the arrangement of data and processes, naming, communication and coordination issues, existing distributed computing paradigms, techniques, and tools, and the evaluation of the effectiveness of distributed application systems for specific data, task, and user types.
Expected Learning Outcomes
- understand the distinction between distributed computing systems, distributed information systems and pervasive systems
- understand the evolution of distributed computing from its early beginnings as multi-processor and multi-computer systems, to computer networks, to the emerging cloud, edge (fog, dew, mist) and heterogeneous computing environments
- discuss and explain the difference between data-centric and client-centric consistency models
- discuss the use of publish-subscribe systems for coordination in distributed event matching
- explain and discuss basic principles and typical examples of real-world distributed systems such as the NFS file-sharing system and the web
- explain caching and replication in Web-based systems
- know about the security policy that is to be enforced and the design issues for mechanisms that help enforce such policies
- know basic principles and key issues of actual implementation of consistency models
- know basic principles of process synchronization based on actual time
- know basic principles of recovery from a failure in distributed systems
- know basic principles of the RPC model and problems with achieving distribution transparency
- know basic principles of virtualization, which lets applications run concurrently and independently of the underlying hardware and platforms
- know the basics of security management, including mechanisms to distribute cryptographic keys, add and remove users from a system, prove ownership to access specified resources, etc.
- know consistency models for shared data and their implementation
- know election algorithms for coordinating mutually exclusive access to a shared resource
- know how to ensure secure access control through authorization mechanisms
- know how to ensure secure communication between users or processes, possibly residing on different machines
- know how to set up multicast facilities for data dissemination in distributed systems
- know naming approaches ranging from chains of forwarding links, to distributed hash tables, to hierarchical location services
- know the alternatives for implementing strong consistency for replicas
- know the design goals of distributed computing systems
- know the design issues for servers including those used in object-based distributed systems
- know the Paxos algorithm for reaching consensus among group members
- know the role of the middleware layer in separating applications from underlying platforms
- know the use of Domain Name System (DNS)
- know how attributes assigned to an entity are used to resolve a description of that entity in a distributed system
- know the widely used models of communication: Remote Procedure Call (RPC), and Message-Oriented Middleware (MOM)
- know various types of distributed systems
- know what a flat-naming system is and what mechanisms are needed to trace the location of entities in a distributed system
- know what application-level routing means for message-oriented communication
- understand the importance of data replication in distributed systems
- understand client-server organizations in distributed systems
- understand coordination of a group of processes by means of election algorithms
- understand the difference between process synchronization and data synchronization
- understand general principles and scalability issues of structured name systems
- understand how caching protocols can be used as a special case of consistency protocols
- understand process migration, or more specifically code migration, and its role in achieving scalability of a distributed system
- understand protocols or rules that communicating processes must adhere to
- understand the relation between fault tolerance and reliable communication
- understand some commonly applied architectural styles toward organizing distributed computing systems
- understand the concept of processes and how the different types of processes play a crucial role in distributed systems
- understand the difference between centralized and decentralized architectures
- understand the difference between implementing a naming system in distributed and nondistributed systems
- understand the distinction and relation between the logical organization of a collection of software components and the actual physical realization of a distributed system
- understand the existing distributed computing paradigms and systematic issues
- understand the goal of process coordination, coordination problems and solutions in distributed systems
- understand the importance of cooperation and synchronization of actions between processes
- understand the issue of managing replica servers
- understand the notion of partial failure in a distributed system and the issue of recovery from partial failures
- understand the peculiarities of the high-level message-queuing model of process communication
- understand the practical issues and choices involved in instantiating and placing software components on real machines
- understand process resilience through process groups
- understand the usage of names in resource sharing, identifying entities, referring to locations, and other uses in distributed systems
- understand the ways in which processes on different machines in a distributed system can exchange information
- understand threads and their role in obtaining performance in multicore and multiprocessor environments and in structuring clients and servers
- understand traditional deterministic means of multicasting as well as probabilistic approaches
- understand typical organizations of both clients and servers
- understand various mechanisms that are generally incorporated in distributed systems to support security
Course Contents
- Introduction: Design goals
- Introduction: Types of systems
- Architectures: Architectural styles. Middleware
- Architectures: System architecture. Example
- Processes: Threads. Virtualization
- Processes: Clients. Servers
- Communication: Foundations. RPC
- Communication: Message-oriented & Multicast communication
- Naming: Names, IDs. Flat naming
- Naming: Structured naming. Attribute-based naming
- Coordination: Clock synchronization
- Coordination: Mutual exclusion. Election algorithms
- Consistency and replication: Data-centric & Client-centric models
- Consistency and replication: Replica management. Consistency protocols
- Fault tolerance
- Security
Assessment Elements
- InClass Activity
- Homeworks
- Referate (Individual Study)
- Home Assignment (Group Project)
- Final Examination
Interim Assessment
- 2022/2023 4th module: 0.2 * Homeworks + 0.2 * Final Examination + 0.1 * InClass Activity + 0.2 * Referate (Individual Study) + 0.3 * Home Assignment (Group Project)
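As a worked illustration of the interim assessment formula above, the following minimal sketch computes the weighted sum for one student. The weights come from the formula; the function name `interim_grade`, the sample scores, and the 10-point scale are assumptions made for illustration only. The syllabus does not state a rounding rule, so none is applied.

```python
# Weights taken from the interim assessment formula.
WEIGHTS = {
    "Homeworks": 0.2,
    "Final Examination": 0.2,
    "InClass Activity": 0.1,
    "Referate (Individual Study)": 0.2,
    "Home Assignment (Group Project)": 0.3,
}


def interim_grade(scores):
    """Return the weighted sum of the element scores (no rounding applied)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, assumed to be on a 10-point scale.
example_scores = {
    "Homeworks": 8,
    "Final Examination": 7,
    "InClass Activity": 10,
    "Referate (Individual Study)": 9,
    "Home Assignment (Group Project)": 8,
}

print(round(interim_grade(example_scores), 2))
```

With these sample scores the terms are 1.6 + 1.4 + 1.0 + 1.8 + 2.4. The group project carries the largest single weight, so it moves the result more than any other element.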
Bibliography
Recommended Core Bibliography
- Distributed Systems. (2017). Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsnar&AN=edsnar.oai.ris.utwente.nl.publications.db6a761f.b353.419e.b65a.81e3740bbe53
- Tanenbaum, A. S., & Steen, M. van. (2014). Distributed Systems: Principles and Paradigms (2nd ed., Pearson New International Edition). Harlow, Essex: Pearson. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1418515
Recommended Additional Bibliography
- Steen, M., & Tanenbaum, A. (2016). A brief introduction to distributed systems. Computing, 98(10), 967–1009. https://doi.org/10.1007/s00607-016-0508-7