Notes on Qualification Certificate of Computer and Software Technology Proficiency - System Architect
Software Architecture Design
Definition
Definition: the fundamental structures of a software system and the discipline of creating such structures and systems
Software architecture deals with the design and implementation of the high level structure of the software. It is the result of assembling a certain number architectural elements in some well-chosen forms to satisfy the major functionality and performance requirements of the system, as well as some other, non-functional requirements such as reliability, scalability, portability and availability.
Software architecture = {Elements, Forms, Rationale/Constraints}
How developers solve problems before 1970s
Data Structure
Algorithms
Separation of Concerns
During 1990s
Architecture styles
Architecture description languages
Architecture documentation
Formal methods
1996 book "Software Architecture: Perspectives on an emerging discipline" promote architecture concepts such as Components, connectors and styles
Architecture Evaluation
Architecture trade-off analysis method (ATAM)
References
Risk, sensitivity points, tradeoff points
Risk
Architecturally important decision that have not been made
Or decisions have been made but whose consequences have not been fully understood.
Sensitivity points
Parameters in an architecture to which some measurable quality attribute response is highly correlated.
Tradeoff points
A parameter of an architectural construct is host to more than one sensitivity points where the measurable quality attributes are affected differently by changing that parameter.
Elicit
Driving quality attributes
Architectural design decisions
Evaluate
To determine if the design decisions satisfactorily address the quality requirements
Three concepts
Quality attribute characterizations
Stimuli
Events that cause the architecture to response or change
Response
Concrete and measurable or observable quantities of quality requirements
Architecture decisions
Components, connectors and their properties of the architecture
Scenarios
Types
Use case scenarios
Growth scenarios
Exploratory scenarios
Eliciting and prioritizing
Utility truee
Facilitated brainstorming
Attribute based architecture style (ABAS)
A style can be thought as a set of constraints on an architecture -- constraints on component types and their interactions -- and these constraints define the set or family of architectures that satisfy them.
Analytic approaches
Markov model
Rate monotonic analysis (RMA)
Queuing analysis
Outputs -- uncover key architectural decisions
Risks and non-risks
Risks are architecturally important decisions that have not been made.
Examples
The architecture team has not decided what scheduling discipline they will use.
The team has not decided whether they will use a rational or object-oriented database.
Sensitivity points
Sensitivity points are parameters in the architecture to which some measurable quality attribute response is highly correlated.
Examples
The overall throughput in the system is highly correlated to the throughput of particular communication channel.
The availability in the system is highly correlated to the reliability of that same communication channel.
Trade-off points
Trade-off points are parameters in the architecture, any of which hosts to more than one sensitivity point where the measurable quality attributes are affected differently by changing it.
Ex., if changing the speed of the communication channel mentioned above increases the throughput but reduces its reliability, then the speed of that communication channel is a trade off point.
Steps
Presentation
1. Present the ATM
2. Present business drivers
3. Present architecture
Investigation and analysis
4. Identify architectural approaches
5. Generate quality attribute utility tree
6. Analysis architectural approaches
Testing
7. Brainstorm and prioritize scenarios
8. Analyze architectural approaches
Reporting
9. Present results
Software Architecture Analysis Method (SAAM)
Cost Benefit Analysis Method (CBAM)
Reference
The idea is that the ASs effect QAs of the system and in turn provide the stakeholders of the system with some level of utility. However, each AS also has cost associated with it and takes time to implement. Given this information, the CBAM can aid the stakeholders in choosing ASs based on the ROI they provide.
Supporting Activities
Knowledge management
Design reasoning & decision making
Documentation
Architecture Description
Architecture description language
4+1 view modal
Views
Logical view
Logical view example: a. PABX system, b. Air traffic control system
Style: object-oriented
Process view
Process view example: PABX
Style: pipes or filters, client/server, etc.
Development view
Development view example: Air control system
Style: layered
Physical view
PABX
A small PABX with process allocation
A large PABX with process allocation
Scenarios
Scenarios: Embryo a scenario for a local call -- selection phase
Mapping between views
To map classes (and their objects) in the logical architecture onto a set of tasks and processes of the process architecture
Various
1. An agent task for an active class
2. Several agents for a given class to increase throughput
3. Several classes map onto a single agent because their operations are infrequently invoked or to guarantee sequential execution.
Example
Classes
Mapping to process view
Logical to development
The logical and development views are very close, but address very different concerns. The larger the project, the greater the distance between these views.
Process to physical
Processes and process groups are mapped onto available physical hardware, in various configurations for testing or development.
A scenario-driven iterative process
Document
Architecture style and patterns
Definitions
A description of component types and their topology, a description of the pattern of data and control interaction among the components and the informal description of the benefits and drawbacks of using that style.
Definition of architecture pattern: A general and reusable solution to a commonly occurring problem in software architecture within a given context.
Definition of architecture style
1. A special method of construction, characterized by the features that make it notable
2. A family of systems in terms of pattern of structural organization; a vocabulary of components and connectors, with constraints on how they can be combined
3. Reusable 'packages' of design decisions and constraints that are applied to an architecture to induce chosen desirable qualities
Top level partitioning
Behind all the architecture styles, there is a more important aspect: how is an architecture partitioned at it's top level, according to technical or according to domain or business.
Technical partitioning
Domain partitioning
Illustration
Recognized patterns and styles
Blackboard
Metaphor
A group of specialists are seated in a room with a large blackboard. They work as a team to brainstorm a solution to a problem, using the blackboard as the workplace for cooperatively developing the solution.
The session begins when the problem specifications are written onto the blackboard. The specialists all watch the blackboard, looking for an opportunity to apply their expertise to the developing solution. When someone writes something on the blackboard that allows another specialist to apply their expertise, the second specialist records their contribution on the blackboard, hopefully enabling other specialists to then apply their expertise. This process of adding contributions to the blackboard continues until the problem has been solved.
Framework
Components
1. The software specialist modules, which are called knowledge sources (KSs). Like the human experts at a blackboard, each knowledge source provides specific expertise needed by the application.
The domain knowledge needed to solve the problem is partitioned into knowledge sources
The objective of each KS is to contribute information that will lead to a solution to the problem
A KS takes the set of current state on the blackboard and update it as encoded in its specialized knowledge
The knowledge sources are represented as:
Procedures
Set of rules
Logic assertions
The knowledge sources modify only the blackboard or control data structures (that also might be on the blackboard), and only the knowledge sources modify the blackboard.
Each KS is responsible for knowing the conditions under which it can contribute a solution
It holds preconditions that indicate the condition on the blackboard which must exist before the body of the KS is activated.
2. The blackboard, a shared repository of problems, partial solutions, suggestions, and contributed information. The blackboard can be thought of as a dynamic "library" of contributions to the current problem that have been recently "published" by other knowledge sources.
The purpose of the blackboard is to hold computational and solution-state data needed by and produced by the knowledge sources.
The blackboard consists objects from the solution space.
The objects are hierarchically organized into levels of analysis
The objects and their properties define the vocabulary of the solution space.
The relationship between objects are denoted by named links.
The blackboard can have multiple panels
3. The control shell, which controls the flow of problem-solving activity in the system. Just as the eager human specialists need a moderator to prevent them from trampling each other in a mad dash to grab the chalk, KSs need a mechanism to organize their use in the most effective and coherent fashion. In a blackboard system, this is provided by the control shell.
The knowledge sources respond opportunistically to changes on the blackboard.
This is a set of control modules that monitor the changes on the blackboard and decide what actions to take next.
Various kinds of information are made globally available to the control modules
Can be on the blackboard or kept separately
The control information can be used by control modules to determine the focus of attention
The focus of attention indicates the next thing to do
The focus of attention can be:
1. knowledge sources
2. Blackboard objects (i.e., which solution island to pursue next)
3. The combination of the above both.
Behavior sequence
1. A KS makes change(s) to blackboard object(s).
A control record also is kept
2. Each KS indicates the contribution it can make to the new solution state
3. Using information from 1 and 2, a control module selects a focus of attention.
4. Depending on the information contained in the focus of attention, an appropriate control module prepares it for execution:
a. If the focus of attention is a knowledge source, then a blackboard object is chosen to sever as the context of its invocation (knowledge-scheduling approach).
b. If the focus of attention is a blackboard object, then a knowledge source is chosen which will process that object (event-scheduling approach).
c. If the focus of attention is knowledge source and an object, then that knowledge source is ready for execution. The knowledge source is executed together with the context, thus described.
Criteria are provided to determine when to terminate the process.
Implementations
Heresay II speech recolonization system
Adobe Acrobat OCR text recognization
References
Blackboard Systems - H. Ycnny Nii
Simple model
Jigsaw example
Koala example
Solution-space
Client-Server (2-tire, 3-tier, n-tier, cloud computing)
Definition
Client–server model is a distributed application structure that partitions tasks or workloads between the providers of a resource or service, called servers, and service requesters, called clients. Often clients and servers communicate over a computer network on separate hardware, but both client and server may reside in the same system. A server host runs one or more server programs, which share their resources with clients. A client usually does not share any of its resources, but it requests content or service from a server. Clients, therefore, initiate communication sessions with servers, which await incoming requests.
Implementations
World Wide Web
Network printing
Compare to Peer-to-Peer architecture
Component-based
Data-centric
Event-driven (or implicit invocation)
Definition
Event driven architecture (EDA) is a software architecture paradigm promoting the production, detection and consuming of, and reaction to events.
What is event?
A significant change in state
In EDA system, what is produced, published, propagated, detected or consumed is a (typically asynchronous) message called the event notification, and not the event itself.
Structure
Event header
Event body
Components
Event generator/emitter
Event channel
Event processing engine (Event sink)
Implementations
Java Swing
JavaScript
Pros and Cons
Procs
Decouple components
Reverse dependency
Cons
Performance
Complexity
Inconsistency
Resources
What is Event Driven Architecture? (EDA - part 1)
The Saga Pattern in Microservices (EDA - part 2)
The Saga Pattern in Microservices (EDA - part 2)
The Many Meanings of Event-Driven Architecture • Martin Fowler • GOTO 2017
Layered (or multilayered architecture)
Components
Presentation Layer
Service layer (Application layer)
Business layer
Persistent layer
Three-tier
Presentation tier
Logical tier
Data tier
Microservice architecture
Definition
What it is
1. Services in a microservice architecture are often processes that communicate over network to fulfill a goal using technology-agnostic protocols such as HTTP.
2. Services are organized around business capabilities
3. Services can be implemented using different programming languages, database, hardware and software environment depending on what fits best.
4. Services are small in size, messaging-enabled, bounded by context, autonomously developed, independently deployable, decentralized and built and released with automated processes.
What it is not
It's not a layer within a monolithic application
Service granularity
A key step in defining a microservice architecture is figuring out how big an individual microservice has to be.
There is no consensus or limits on the service granularity as the right answer depends on business and organizational context.
Services that are dedicated to a single task, such as calling a particular backend system or making a particular type of calculation, are called as atomic services. Similarly, services that call such atomic services in order to consolidate an output, are called as composite services.
It's bad practice to make services too small as then the runtime overhead and the operational complexity can overwhelm the benefits of the approach.
If domain-driven design is being employed in modeling the domain for which the system is being built, then a microservice could be as small as an aggregate or as large as an bounded context.
Compare to SOA
SOA and microservice architecture differ in the scope. SOA has an enterprise scope and microservice has an application scope.
Grain
Resources
Jame Lewis, Martin Fowler
Martin Fowler at GOTO 2014
Monolithic application
Implementations
Word processor
Peer-to-Peer
Definnition
Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the application. They are said to form a peer-to-peer network of nodes.
Peers make a portion of their resources, such as processing power, disk storage or network bandwidth, directly available to other network participants, without the need for central coordination by servers or stable hosts. Peers are both suppliers and consumers of resources, in contrast to the traditional client–server model in which the consumption and supply of resources is divided.
Compare to Client-Server architecture
A peer-to-peer network is designed around the notion of equal peer nodes simultaneously functioning as both "clients" and "servers" to the other nodes on the network. This model of network arrangement differs from the client–server model where communication is usually to and from a central server. A typical example of a file transfer that uses the client–server model is the File Transfer Protocol (FTP) service in which the client and server programs are distinct: the clients initiate the transfer, and the servers satisfy these requests.
Implementations
Bittorrent
eDonkey
Block diagram
Pipes and filters
Definition
In software engineering, a pipeline consists of a chain of processing elements (processes, threads, coroutines, functions, etc.), arranged so that the output of each element is the input of the next; the name is by analogy to a physical pipeline. Usually some amount of buffering is provided between consecutive elements. The information that flows in these pipelines is often a stream of records, bytes, or bits, and the elements of a pipeline may be called filters; this is also called the pipes and filters design pattern. Connecting elements into a pipeline is analogous to function composition.
Narrowly speaking, a pipeline is linear and one-directional, though sometimes the term is applied to more general flows. For example, a primarily one-directional pipeline may have some communication in the other direction, known as a return channel or backchannel, as in the lexer hack, or a pipeline may be fully bi-directional.
Block diagram
Plug-ins
Definition
Plug-in is a software component which adds specific feature to an existed computer program. When a program supports plug-ins, it enables customization.
Mechanism
The host application provides services which the plug-in can use, including a way for plug-ins to register themselves with the host application and a protocol for the exchange of data with plug-ins. Plug-ins depend on the services provided by the host application and do not usually work by themselves. Conversely, the host application operates independently of the plug-ins, making it possible for end-users to add and update plug-ins dynamically without needing to make changes to the host application.
Programmers typically implement plug-ins as shared libraries, which get dynamically loaded at run time.
Implementations
Email clients
Eclipse IDE
Media Players
Reactive architecture
Representational state transfer (REST)
Definition
REST is a software architectural style that was created to guide the design and development of the architecture for the World Wide Web. REST defines a set of constraints for how the architecture of an Internet-scale distributed hypermedia system, such as the Web, should behave. The REST architectural style emphasises the scalability of interactions between components, uniform interfaces, independent deployment of components, and the creation of a layered architecture to facilitate caching components to reduce user-perceived latency, enforce security, and encapsulate legacy systems.
Any web service that obeys the REST constraints is informally described as RESTful. Such a web service must provide its Web resources in a textual representation and allow them to be read and modified with a stateless protocol and a predefined set of operations. This approach allows the greatest interoperability between clients and servers in a long-lived Internet-scale environment which crosses organisational (trust) boundaries.
Architectural constraints
Client-server architecture
Statelessness
Cacheability
Layered system
Code on demand
Uniform interface
Implementations
Web Service
APIs
Base URI
Standard HTTP Methods
GET: Get the representation of the target resources's state
POST: Let the target resource process the representation enclosed in the request
PUT: Create or replace the state of the target resource with the state defined by the representation enclosed in the request
DELETE: delete the target resource's state
Media type
Rule-based
Service-oriented (SOA)
Definition
It promotes loose-coupling between services, separate functions into distinct units or service, which developers make accessible over a network in order to combine and reuse them in the production of applications. These services and their corresponding consumers communicate with each other by passing data in a well-defined, shared format, or by coordinating an activity between two or more services.
Came from
Distributed computing
Modular programming
Offspring
Cloud computer
Mashhups
SaaS
Components
Service provider
Service broker, register or repository
Service requester/consumer
Type of services
Functional service
Enterprise service
Application service
Infrastructure service
Service structure
Interface
Contract
Implementation
Key deference with Microservice
Has an ESB
References
Service oriented architecture
Shared-nothing architecture
Space-based architecture (SBA)
Space-based architecture (SBA) is a distributed-computing architecture for achieving linear scalability of stateful, high-performance applications using the tuple space paradigm. It follows many of the principles of representational state transfer (REST), service-oriented architecture (SOA) and event-driven architecture (EDA), as well as elements of grid computing. With a space-based architecture, applications are built out of a set of self-sufficient units, known as processing-units (PU). These units are independent of each other, so that the application can scale by adding more units.
Components
Processing unit
In-memory data grid cache
Data replication engine
Virtualized middleware
Messaging grid
Data grid
Processing grid
Deployment manager
Block Diagram
References
Apache Geode
Quality Attributes
List
Operational
Availability
Availability refers to a property of software, that is there and ready to carry out its task when you need it to be.
MTBF/(MTBF + MTTR)
The degree to which a system is in a specified operational and committable state at the start of a mission, when the mission is called for at an unknown, i.e., a random, time.
Continuity
Performance
Recoverability
Reliability/Safety
Robustness
Scalability
Elasticity
Structural
Configurability
Extensibility
Installability
Leveerageability/reuse
Localization
Maintainability
Portability
Supportability
Upgradability
Cross-cutting
Accessibility
Archivability
Legal
Privacy
Security
Security is a measure of a system's ability to protect data and information from unauthorized access while still providing access to people and systems that are authorized.
Confidentiality
Integrity
Availability
while being attacked
Supportability
Usability/Achievability
references
WiKi
Identifying
Extract architecture characteristics from domain concerns
Extract architecture characteristics from requirements
Measuring
...
Component
Definition
An individual software component is a software package, a web service, a web resource, or a module that encapsulates a set of related functions (or data).
Two characteristics
Independently replaable
Independently upgradable
Examples and varieties
A wrapper of a collection of code
A layer or subsystem
Event processor
Distributed service
Identifying
1. Identify initial components
2. Assign requirements to components
3. Analyze roles and responsibilities
4. Analyze architecture characteristics
5. Restructure components
6. Goto 3
Design: Discovering components
Actor/Actions
Event-storming
Workflow approach
Anti-pattern: Entity trap
Component-Based Software Engineering (CBSE, or CBD)
Component
Definition
An individual software component is a software package, a web service, a web resource or a module that encapsulates a set of related functions (or data). Components communicates with each other via interfaces.
Component models
A component model is a definition of properties that components must satisfy, methods and mechanisms for the composition of components
EJB
COM/DCOM
CORBA
Component architecture
Application Server
A computer running several software components is often called an application server.
The combination of application servers and software components is usually called distributed computing.
Distributed computing
Distributed computing is a field that studies distributed systems. A distributed system is a system whose components are located on different network computers, which communicate and coordinate their actions by passing messages to one another from any system. The components interact with one another in order to achieve a common goal.
Three characteristics
Concurrency of components
Lack of a global clock
Independent failure of components
Middleware
Middleware is a type of computer software which provides services to software applications beyond those available from the operation system. It can be described as 'software clue'.
Other definitions
IETF
those services found above the transport (i.e. over TCP/IP) layer set of services but below the application environment (i.e. below application-level APIs).
In this more specific sense middleware can be described as the dash ("-") in client-server, or the -to- in peer-to-peer.
Middleware includes web servers, application servers, content management systems, and similar tools that support application development and delivery.
ObjectWeb
The software layer that lies between the operating system and applications on each side of a distributed computing system in a network.
Services that can be regarded as middleware include:
Enterprise application integration (EAI)
Data integration
Message oriented middleware (MOM)
Object request broker (OBR)
Enterprise services bus
Database access services are often characterised as middleware. Some of them are language specific implementations and support heterogeneous features and other related communication features. Examples of database-oriented middleware include ODBC, JDBC and transaction processing monitors.
Categories
Human-time service
Machine-time service
Categories from QCCSTP
Layered categories
底层中间件
Technologies
JVM
CLR
ACE
Products
Sun JVM
Microsoft CLR
通用型中间件
Technologies
CORBA
EJB
COM/DCOM
Products
IONA Orbix
BEA WebLogic
IBM MQSeres
集成型中间件
Technologies
Workflow
EAI
Products
BEA WebLogic
IBM WebSphere
Communication based categories
RPC
MOM
Advantages
Asynchronicity
Routing
Transformation
ORB
IEEE-1471-2000
Conceptual model of AD
Viewpoints
Structural viewpoint
Perry and Wolf
Architecture = {elements, forms, rationals}
Elements classes
Processing elements
Data elements
Connecting elements
Behavioral viewpoint
Concerns
What are the dynamic behaviors of and within a system?
What are the kinds of actions the system produces and participates in?
How do these actions relate (ordering, synchronization, etc.)?
What are the behaviors of system components? How do they interact?
Modeling methods
Events
Processes
States
Operations on those entities
Analytic methods
Communication sequential processes
Pi-calculus
Partially ordered sets of events
Physical interconnect view point
Concerns
What are the physical communications interconnects and their layering among system components?
What is the feasibility of construction, compliance with standards, and evovability?
Viewpoint languages
Physical identifiable node
Point to point link
Shared link
Link bit error rate viewpoint
Concerns
What is the bit error rate on a communication link?
What is the feasibility of building and maintaining consistency with operational information flow?
Viewpoint language
Popular architectures in detail
Microservices
Key concepts
How to model microservices
Communication patterns
Communication implementation
Handling changes between microservices
Service discovery
Service mesh and API gateway
Workflow
Serverless
Serverless architectures are application designs that incorporate third-party “Backend as a Service” (BaaS) services, and/or that include custom code run in managed, ephemeral containers on a “Functions as a Service” (FaaS) platform. By using these ideas, and related ones like single-page applications, such architectures remove much of the need for a traditional always-on server component. Serverless architectures may benefit from significantly reduced operational cost, complexity, and engineering lead time, at a cost of increased reliance on vendor dependencies and comparatively immature supporting services
Mike Roberts' definition
1. No management of server hosts or server processors
2. Self auto-scale and provision based on load (demand)
3. Costs based on precise usage.
4. Performance capabilities defined in terms other than host size/count.
5. Implicit high availability
BaaS
FaaS
API Gateway
References
Sam's speech at goto; conference
Reliability Engineering
Rules of Probability
Failure distribution function f(t)
f(t): time to (first) failure distribution (i.e., the failure density function)
Cumulative distribution function F(t)
it describes the probability of failure (at least) up to and including time t
Survival function (S(t)) or Reliability Function (R(t))
Failure rate λ
Definition: The total number of failures within an item population, divided by the total time expended by that population, during a particular measurement interval under stated conditions.
Fault rate λ is the frequency with which an engineered system or component fails, expressed in failures per unit of time.
Failure rate is a conditional probability
Hazard rate h(t) is failure rate λ(t) when Δt tends to zero
For exponential failure distribution, the hazard rate h(t) equals λ
Proof
Example of establishing failure rate
10^-6 failures/hour is one of the mostly used unit of failure rate in engineering
Plot
In reality, λ is usually a much smaller number than 0.1 used here, it's just for the sake of ploting.
MTBF
Mean time between failures (MTBF) is the predicted elapsed time between inherent failures of a mechanical or electronic system, during normal system operation. MTBF can be calculated as the arithmetic mean (average) time between failures of a system.
The term is used for repairable systems
Calculation
From the definition
With reliability function R(t) or density function f(t)
Any practically-relevant calculation of MTBF or probabilistic failure prediction based on MTBF requires that the system is working within its "useful life period", which is characterized by a relatively constant failure rate (the middle part of the "bathtub curve") when only random failures are occurring.
Assuming a constant failure rate λ results in a failure density function as follows: f(t)=λ e^-λt, which, in turn, simplifies the above-mentioned calculation of MTBF to the reciprocal of the failure rate of the system:
MTBF = 1/λ
Proof
In practice, the mean time between failures (MTBF, 1/λ) is often reported instead of the failure rate. This is valid and useful if the failure rate may be assumed constant – only relate to the flat region of the bathtub curve, which is also called the "useful life period".
Bathtub curve: The 'bathtub curve' hazard function (blue, upper solid line) is a combination of a decreasing hazard of early failure (red dotted line) and an increasing hazard of wear-out failure (yellow dotted line), plus some constant hazard of random failure (green, lower solid line)
For network components
Calculation for in-series and in-parallel components
Intuitively, both these formulae can be explained from the point of view of failure probabilities. First of all, let's note that the probability of a system failing within a certain timeframe is the inverse of its MTBF. Then, when considering series of components, failure of any component leads to the failure of the whole system, so (assuming that failure probabilities are small, which is usually the case) probability of the failure of the whole system within a given interval can be approximated as a sum of failure probabilities of the components. With parallel components the situation is a bit more complicated: the whole system will fail if and only if after one of the components fails, the other component fails while the first component is being repaired; this is where MDT comes into play: the faster the first component is repaired, the less is the "vulnerability window" for the other component to fail.
MTTR
The average time required to repair a failed component or device
Availability: MTBF/(MTBF + MTTR)
MTTF
Definition from "Computer Architecture - A Quantitative Approach"
Calculation
An example from "Computer Architecture - A Quantitative Approach
Failure rate λ
Failure rate is a frequency with an engineered system or component fails, expressed in failures per unit of time.
Under certain engineering assumptions (e.g. besides the above assumptions for a constant failure rate, the assumption that the considered system has no relevant redundancies), the failure rate for a complex system is simply the sum of the individual failure rates of its components, as long as the units are consistent, e.g. failures per million hours. This permits testing of individual components or subsystems, whose failure rates are then added to obtain the total system failure rate
Distributions
Exponential distribution
f(t) =λ e^(-λt)
R(t) = 1- F(t) = 1 - (1 - e^(-λt)) = e^(-λt)
Normal distribution
Redundancy design
Series reliability mode
Active redundancy
Dual redundant system
General expression
Reliability assertment
FMEA
Terms
Failure: The lost of function under stated conditions
Failure mode: The specific manner or way by which a failure occurs in terms of failure of the part, component, function, equipment, subsystem, or system under investigation.
Failure effect: immediate consequence of a failure on operation, or more generally on the needs for the customer/user that should be fulfilled but now is not, or not fully, fulfilled.
Local effect: The failure effect as it applies to the item under analysis.
Probability (P): the likelihood of the failure occurring.
A: Extremely unlikely
B: Remote (relatively few failures)
C: Occasional
D: Reasonably possible
E: Frequent
Severity (S): The consequence of the failure mode.
I: No relevant effect on reliability or safety
II: Very minor, no damage, no injures
III: Minor, no damage, light injures.
IV: Critical
V: Catastrophic
The analysis should always be started by listing the functions that the design needs to fulfill.
Example FEMA worksheet
FTA: fault tree analysis
Unlike conventional logic gate diagrams, the gates in a fault tree output probabilities related to the set operations of boolean logic. The probability of gate's output event depends on the input event probabilities.
Failure probability (at input events)
Software reliability
Hardware unreliability is a result of component or material failure that results in the system not performing its intended function. Repairing or replacing the hardware component restores the system to original operating state.
Software unreliability is the result of unanticipated results of software operations.
Metric
The number of software faults, expressed as faults per thousand lines of code
Object-Oriented Design
An object contains encapsulated data and procedures grouped together to represent an entity. The 'object interface' defines how the object can be interacted with. An object-oriented program is described by the interaction of these objects.
Object-oriented design is the discipline of defining the objects and their interactions to solve a problem that was identified and documented during object-oriented analysis.
Software design pattern
Software design pattern is a general, reusable solution to a commonly occurring problem with a given context in software design.
Design pattern is more general than software design pattern
Design pattern is a reusable form of solution to a design problem.
Creational patterns
Abstract factory
Intent
Provide an interface for creating families of related or dependent objects without specifying their concrete classes.
Motivation
Structure
Sample
C++
Python
Use the class itself as factory
Builder
Intent
Separate the construction of a complex object from its representation, allowing the same construction process to create various representation.
Motivation
Each converter class is called a builder in the pattern, and the reader is called the director. Applied to this example, the Builder pattern separates the algorithm for interpreting a textual format (that is, the parser for RTF documents) from how a converted format gets created and represented. This lets us reuse the RTFReader's parsing algorithm to create different text representations from RTF documents—just configure the RTFReader with different subclasses of TextConverter.
Structure
Collaborations
Sample
Maze
Part1
Part2
Part3
Bicycle
Dependency injection
A class accepts objects it requires from an injector instead of creating the objects directly.
Structure and collaborations
Sample
1. Without the injection, the Client directly depends on the Service.
2. Pass the dependency via the Client's constructor. Drawback: hard to change the Service later.
3. Pass in the dependency via Setter method. Drawback: how to ensure the Setter was correctly called?
4. Full size implementation: The client implemented the injection interface and the injector class takes care the dependencies setup and switching.
5. The simplest implementation: manually assembly services and clients in the root place.
Factory method
Intent
Define an interface for creating an object, but let subclasses decide which class to instantiate. Factory method lets a class defer instantiation to subclasses.
Motivation
Structure
Sample
Part 1
Part 2
Prototype
Intent
Specify the kinds of objects to create using a prototypical instance, and create new objects by copying this prototype.
Motivation
Structure
Collaborations
A client asks a prototype to clone itself.
Sample
Part 1
Part 2
Part 3
Singleton
Intent
Ensure a class has only one instance and provide a global point of access to it.
Motivation
Structure
Collaborations
Clients access a Singleton instance solely through Singleton's Instance() operation.
Sample
C++
Structural patterns
Adapter
Intent
Convert the interface of a class into another interface clients expect. Adapter lets classes work together that couldn't otherwise because of incompatible interfaces.
Motivation
Structure
Collaborations
Client call operations on an Adapter instance. In turn, the adapter calls Adaptee operations that carry out the request.
Sample
C++
Bridge
Intent
Decouple an abstraction from its implementation, so the two can vary independently.
Motivation
Structure
Collaborations
Abstraction forwards client requests to its implementor object.
Composite
Intent
Compose objects into tree structures to represent parts-whole hierarchies. Composite lets client treat individual objects and compositions of objects uniformly.
Motivation
Structure
Collaborations
Clients use the Composite class interface to interact with objects in the composite structure. If the recipient is a Leaf, then the request is handled directly. If the recipient is a Composite, then it usually forward requests to its child components, possibly performing additional operations before and/or after formwarding.
Sample
part 1
Part 2
Part 3
Decorator
Intent
Attach additional responsibilities to an object dynamically. Decorators provide a flexibly alternative to subclassing for extending functionality.
Motivation
Structure
Collaborations
Decorator forwards requests to its Component. It may optionally perform additional operations before and/or after forwarding the request.
Behavioral patterns
Chain of Responsibility
Intent
Avoid coupling the sender of a request to its receiver by giving by more than one object a chance to handle the request. Chain the receiving objects and pass the request along the chain until an object handle it.
Motivation
Structure
Sample
Part 1
Part 2
Part 3
Interpreter
Intent
Given a language, define a representation for its grammar along with an interpreter that uses the representation to interpret sentences in the language.
Motivation
The regular expression language
Abstract syntax tree as an instance of the above class diagram
Collaborations
1. The client builds (or is given) the sentence as an abstract syntax tree of NonterminalExpression and TerminalExpression instances. Then the client initializes the Context and invoke the interpret operation.
2. Each NonterminalExpression node defines interpret() in terms of interpret on each subexpression. The interpret operation of each TerminalExpression defines the base case in the recursion.
3. The interpret operations at each node use the Context to store and access the state of the interpret.
Sample
Part 1
Part 2
Part 3
Part 4
Part 5
Strategy
Intent
Define a family of algorithms, encapsulate each one, and make them interchangeable. Strategy lets the algorithm vary independently from clients that use it.
Motivation
Structure
Collaborations
1. Strategy and Context interact to implement the chosen algorithm. A context may pass all data required by the algorithm to the strategy when the algorithm is called. Alternatively, the context can pass itself as an argument to Strategy operations. That lets the strategy call back on the context as required.
2. A context forwards requests from its clients to its strategy. Clients usually create and pass a ConcreteStrategy object to the context; thereafter, clients interact with the context exclusively. There is often a family of ConcreteStrategy classes
for a client to choose from.
Proxy
Intent
Provide a surrogate or placeholder for another object to control access to it.
Motivation
Structure
TODOs
Popular Web framework/architecture
Popular Java-based architectures
Readings
Domain Driven Design for Services Architecture
Networking
Three-tier hierarchical internetworking model
Access layer
Access layer devices are usually commodity switching platforms. May or may not provide layer 3 switching.
Distribution layer
Also called smart layer, workgroup layer. Routing, filtering, QoS
Core layer
Hi-speed forwarding services for different regions of the network
references
WikI
Structured cabling
references
WiKi
The six subsystems
IPv6
Ipv6 Header
IPv6 Address
Unicast
Link-local address
fe80::/10
fe80::/64 + EUI-64
RFC7217 address
Global unicast
Loopback
::1/128
Unique local
fc00::/7
Multicast
Well known
ff00::/12
Transient
ff10::/12
Solicited
ff02:0:0:0:0:1:ff00::/104
Cheetsheet
SLAAC: Stateless Address Autoconfiguration
RFC 4861 Neighbor Discovery - SLAAC - ICMPv6
Cheetsheet
IPv6 address configuration command line
IPv4
Classful network
DiffServ
DNS
Recursive query vs iterative query
DNS message
Network storage
DAS
SAN
NAS
FCASN
IPSAN
LAN
Collision domain
Broadcast domain
VLAN
Math
Linear programming
Simplex method
References
WiKi
Examples
Canonical
Ex.1,2: Minimize/maximize 2x3
Ex.3, Maximize 2x2
Ex.4 Maximize 2x2
Ex.5 Maximize 3x2
Noncanonical
Ex.1 Minimize, equality constraints
Ex.2 Minimize -- converted to the dual problem
Integral linear programming
Ex.1
Transportation problem
Stepping stone method
MODI method
Two-person zero-sum game theory
General design topics
Coupling and cohesion
Coupling
Cohesion
Types of cohesion (worst to best)
Coincidental cohesion
Parts of a module are grouped together by no reason
Ex: an utilities class
Logical cohesion
Parts of a module are grouped together because of they are logically categorized to do the same thing but are different by nature.
Ex.1: all mouse and keyboard inputs handling are grouped together
Ex.2: All M, V, C routines are in separated folders in a MVC pattern.
Temporal cohension
Grouped by when they are processed.
Procedural cohesion
Grouped because they always follow a certain sequence of execution.
Communicational/informational cohesion
Grouped because they operate on the same data
Sequential cohesion
Grouped together because the output of one part is input of another part.
Functional cohesion (best)
Parts of a module are grouped because they contribute to a well-defined task of the module.
Perfect cohesion (atomic)
Cannot be reduced anymore
Enterprise Application Integration (EAI)
Business system planning (BSP)
Steps
Preparation
Analysis
Strategy
Process
There are about 40-60 business processes in an organization (depending on its size), and it is important to choose the most profitable ones and the department responsible for a particular process.
Ex. processes
Transfer
Car rental
Invoicing
Contract creation
Data class
There are usually about 30–60 data classes, depending on the size of the organization. Future IS will use databases based on these classes
Ex. data classes
Corporation
Customer
Employee
Invoice
Supplier
Information support
Management discussion
Conclusion
Define information architecture
To define an organization's information architecture,[4] it is necessary to connect the information subsystems using matrix processes and data classes to find appropriate subsystems. The organization then reorders processes according to the product (or service) life cycle.
Establishing IS-development priorities
A number of criteria (costs and development time, for example) establish the best sequence of system implementation. High-priority subsystems may be analyzed more deeply. This information is given to the sponsor, who determines which information subsystems will be developed.
Verifying study impact
Proposals
Presentation
Final step
Criticism
Computer Science Fundamentals
Coding theory
Block coding
Hamming Distance
In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of substitutions required to change one string into the other, or the minimum number of errors that could have transformed one string into the other.
Rate
R = k/n, where k is message length, n is block length
Minimum distance d
Relative distance: d/n
Hamming code (SED)
Notation: (n, k), where n is block length, ke is message length, and n = 2^r - 1, where r is number of the parity bits
Example: Hamming(7,4) (with r=3)
Algorithm
Visual inspection
Steps
Any given bit (data or parity) is included in a unique set of parity bits. So if there is k parity bits, bits from 1 up to 2^k - 1 can be covered. After discounting the k parity bits, 2^k - k - 1 bits remain for use as data.
Minimal distance is 3
Means, if you flip one bit, you have to also flip at least two parity bits to make it a valid code.
Detect up to two-bits error and correct one-bit error
Hamming code with additonal parity (SECDED)
Minimal distance is 3 + 1 = 4
Can tell difference between one-bit error and two-bit error
Some two-bit error can have the same value as some one-bit errors
Stream coding
CRC
CRC-n
Polynomial has n degree, n + 1 items
Polynomial long division
1. Generator polynomial as divisor
2. Message as divident
3. Quotient discarded
4. Remainder as result
Parity bit: generator = x + 1, CRC-1
CRC is a linear function: CRC(x + y + z) = CRC(x) + CRC(y) + CRC(z)
Example: polynomial = x^3 + x + 1 (n = 3), message = 11010011101100
Calculation
Verification
Other examples
CRC-32 Table Lookup Algorithm
Computer architecture
Memory management
Fully associative cache and replacement algorithm
Directly mapped caches
Set associative caches
File Allocation
Contiguous
Linked
Index
Multi-core processors
AMP
SMP
BMP
Disk scheduling
SSFT
FCSF
Security
PKI
CA & RA
Kerberos
KDC
Taming Kerberos
Secure electronic transaction (SET) protocol
Auth0
Cyber threats
References
21 Top Cyber security Threats
WiKi
RTOS
Task scheduler
Preemptive time slicing
Video course
Round-robin (nonpreemptive)
Priority based preemptive scheduling
Video course
Rate monotonic scheduling
It's preemptive
Two types
Fixed priority
More frequency task (with shorter period) will always has a higher priority than a less frequency (with longer period) one.
Dynamic priority
Least upper bound
Example of possible schedule: below the least upper bound
Example of impossible schedule: above the least upper bound
LIM n(s^(1/n) - 1) = ln n = 0.693174...
For harmonic tasks set, the upper bound can be relaxed up to 1.0
Optimal
If a set of processes cannot be scheduled by RM scheduling, it cannot be scheduled by any other static priority based algorithm.
Database
Relational Database
Rational algebra
Select
Commutative
Project
Operations from Set theory
Union
Intersection
Set difference (or Minus)
Cartesian product (or cross product)
Join
Natural join
Natural join (⋈) is a binary operator that is written as (R ⋈ S) where R and S are relations. The result of the natural join is the set of all combinations of tuples in R and S that are equal on their common attribute names.
If the two relations have no attributes in common, then their natural join is simply their Cartesian product.
If the two relations have more than one attribute in common, then the natural join selects only the rows where all pairs of matching attributes match.
Relation calculus
Tuple calculus
Range relation R of tuple variable t: R(t). Ex. EMPLOYEE(e)
Existential quantifier: (∃t)(F)
If F is a formula, then so is (∃t)(F), where t is a tuple variable. The formula (∃t)(F) is TRUE if the formula F evaluates to TRUE for some (at least one) tuple assigned to free occurrences of t in F; otherwise, (∃t)(F) is FALSE.
Universal quantifer: (∀t)(F)
If F is a formula, then so is (∀t)(F), where t is a tuple variable. The formula (∀t)(F) is TRUE if the formula F evaluates to TRUE for every tuple (in the universe) assigned to free occurrences of t in F; otherwise, (∀t)(F) is FALSE.
Transforming the Universal and Existential Quantifiers
(∀x) (P(x)) ≡ NOT (∃x) (NOT (P(x)))
(∃x) (P(x)) ≡ NOT (∀x) (NOT (P(x)))
(∀x) (P(x) AND Q(x)) ≡ NOT (∃x) (NOT (P(x)) OR NOT (Q(x)))
(∀x) (P(x) OR Q(x)) ≡ NOT (∃x) (NOT (P(x)) AND NOT (Q(x)))
(∃x) (P(x)) OR Q(x)) ≡ NOT (∀x) (NOT (P(x)) AND NOT (Q(x)))
(∃x) (P(x) AND Q(x)) ≡ NOT (∀x) (NOT (P(x)) OR NOT (Q(x)))
(∀x)(P(x)) ⇒ (∃x)(P(x))
NOT (∃x)(P(x)) ⇒ NOT (∀x)(P(x))
Examples
Ex.1
Ex.2
Ex.3
Ex.4
Ex.5
Ex.6
Ex.7
Domain calculus
Universal schema notation: R = {A1, A2, ... , An}
Functional dependency (FD)
A functional dependency, denoted by X → Y, between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuples that can form a relation state r of R. That constraint is that, for any two tuples t1 and t2 in r that have t1[X] = t2[X], they must also have t1[Y] = t2[Y].
The dependency F as well as all the dependencies that can be inferred from F is called a closure of F.
Inference rules
IR1: Reflexive rule (trivial)
If X ⊇ Y, then X -> Y
IR2: augmentation rule
{X -> Y} |= XZ -> YZ
IR3: transitive rule
{X -> Y, Y -> Z} |= X -> Z
IR4: decomposition or projective rule
{X -> YZ} |= X -> Y
IR5: Union or additive rule
{X -> Y, X -> Z} |= X -> YZ
IR6: pseudotransitive rule
{X -> Y, WY -> Z} |= WX -> Z
Keys
Superkey
A set of attributes that uniquely identifies each tuple of a relation.
Because superkey values are unique, tuples with the same superkey values must have the same no-key attribute values. That is, non-key attributes are functionally dependent on the superkey.
A set of all attributes is always a superkey (the trivial superkey). Tuples in a relation are by definition unique.
Candidate key
A candidate key (or minimal superkey) is a superkey that cannot be reduced to a simpler superkey by removing an attribute.
Primary key
A choice of (or a designated) candidate key; any other candidate key is an alternate key
Minimal cover of a set of dependencies E
Definition: If F is a minimal cover of E, then
1. Every dependency in F has a single attribute on the right-hand side.
2. We cannot replace any dependency X -> A in F with a dependency Y -> A, where Y is a proper subset of X, and still have a set of dependencies that is equivalent to F.
No extraneous attribute
3. We cannot remove any dependency from F and still have a set of dependencies that is equivalent to F
Algorithm: finding a minimal cover F from a set of dependencies E
Relational decompositions
Decompose an universal relation schema R = {A1, A2, ..., An} into a set of relation schema D = {R1, R2, ..., Rm}
Attribute preservation
Dependency preservation
Each functional dependency X -> Y specified in F of R either appeared directly in one of the relation schemas Ri in the decomposition D or could be inferred from dependencies that appear in some Ri.
Lost dependency
If a decomposition is not dependency-preserving, some dependency is lost in the decomposition. To check that a lost dependency holds, we must take the JOIN of two or more relations in the decomposition to get a relation that includes all left- and right-hand-side attributes of the lost dependency, and then check that the dependency holds on the result of the JOIN—an option that is not practical.
Ex
It is always possible to find a dependency-preserving decomposition D with respect to F such that each relation Ri in D is in 3NF.
Lossless (Nonadditive) join property
Example of a loss join
Example of a lossless join
Informal conclusions of lossless join from binary decomposition
1. attr(R1) ⋃ attr(R2) = attr(R)
2. attr(R1) ⋂ attr(R2)≠ Φ (Must have common attributes)
3. The common attributes set (attr(R1) ⋂ attr(R2)) must be supper key of either R1 or R2.
Extend to triple decomposition
Video course (for binary decomposition)
Video course (for triple decomposition)
The word loss in lossless refers to loss of information, not to loss of tuples. If a decomposition does not have the lossless join property, we may get additional spurious tuples after the PROJECT (π) and NATURAL JOIN (*) operations are applied; these additional tuples represent erroneous or invalid information.
General testing algorithm
Testing algorithm for binary decomposition (NJB property test)
Ex. Only the decomposition 3 meets lossless joint property
Algorithm: Synthesis info 3NF
Normal forms
Normal forms cheat sheet
SQL
Inner join
Outer join
Left outter join
Right outer join
Full outer join
Database transaction, concurrent control and recovery mechanism
Transaction
A transaction is an executing program that forms a logical unit of processing
Insertion
Deletion
Modification (update)
Retrieval
Types
A
Embedded
Interactively
B
Read-only transaction
Read-write transaction
Either Committed or Aborted
Not allow some operations in a transaction T to be applied to the database while other operations of T are not.
State transition diagram of transaction
Transaction operations
Start: s
read_item:r
write_item: w
end: e
commit: c
abort: a
Transaction schedule (or history)
A schedule S of n transactions T1, T2, ..., Tn is the ordering of the operations of the transactions
Ex1
Ex2
Conflicting operations
Conditions
1. They belong to different transactions
2. They access the same item X
3. At least one of the operations is write_item(X)
Changing the order of conflicting operations in a schedule, will resulted in different outcome.
Types
Read-write conflict
Write-write conflict
Complete schedule
Conditions
1. Operations in S are exactly those in T1, T2, ..., Tn including a commit or abort operation as the last operation for each transaction in the schedule.
2. For any pair of operations from the same transaction Ti, their relative order of appearance in S is the same as the order of appearance in Ti.
3. For any two conflicting operations, one of the two must occur before the other in the schedule -- partially ordered
Committed projection C(S)
Desirable properties
ACID
Atomicity
Consistency perservation
Isolation
Durability or permanency
Schedule properties based on recoverability
Nonrecoverable and recoverable schedules
Nonrecoverable
A committed transaction may have to be rolled back during recovery
Recoverable
Condition
No transaction T in S commits until all transaction T' that have written some item X that T reads have committed.
Ex. recoverable
Sa: r1(X), r2(X), w1(X), r1(Y), w2(X), c2, w1(Y), c1
Sc: r1(X), w1(X), r2(X), r1(Y), w2(X), c1, c2
This is a fix of Sb below by committing T1 before committing T2
Se: r1(X), w1(X), r2(X), r1(Y), w2(X), a1, a2
This is another fix of Sb below by aborting T1 and then T2. Note T2 cannot commit in this case.
This is a cascading rollback because T2 has to be rolled back since it reads data from T1 and T1 aborted.
Ex. nonrecoverable
Sb: r1(X), w1(X), r2(X), r1(Y), w2(X), c2, a1
When T1 aborts, T2, that has committed, has to be rolled back since the value of X that T2 read is no longer valid.
Schedule properties based on serializability
Types
Serial schedule
Nonserial schedule
Conflict-serializable schedule
Serializable
Saying that a nonserial schedule is serializable is equivalent to saying that it is correct, because it is equivalent to a serial schedule, which is considered correct.
Equivalence
We don't consider result equivalence, it is unstable and depends on other internal status of the schedules.
S1 and S2 are result equivalent if the initial value of X is 100
For two schedules to be equivalent, the operations applied to each data item affected by the schedules should be applied to that item in both schedules in the same order.
Conflict-equivalence
Two schedules are said to be conflict equivalent if the relative order of any two conflicting operations is the same in both schedules.
Equivalence = Conflict-equivalence: We define a schedule S to be serializable if it is conflict equivalent to some serial schedule S'.
Examples
1. D is serializable because it conflict equivalent to A
2. C is not serializable because it is not conflict equivalent to either A or B
How to test?
Try to move operations in T2 of schedule C and D to make the schedule a serial schedule with either T1,T2 or T2,T1 and to check whether the order of any two conflicting operations got changed.
Database model
A collection of named data items
Examples of data item
A database record
A file
A disk block
A data item has a name
Record Id
Block address
Filename
Granularity: size of data time
Database operations
Read_item(X)
1. Find address of the disk block that contains the item X
2. Copy the disk block into main memory buffer. Buffer size = block size
3. Copy item X from the buffer to the program variable named X
Write_item(X)
1. Find address of the disk block that contains the item X
2. Copy the disk block into main memory buffer. Buffer size = block size
3. Copy item X from the program variable named X into its correct location in the buffer
4. Store the updated disk block from the buffer back disk (either immediately or at some late point in time)
Database cache
Buffer replacement policy
LRU
Log
A log is a sequential, append-only that is kept on disk
Log buffers
Hold the last part of log entries in main memory
Log entries are firstly added to the log buffer.
Log entries
[start_transaction, T]
[write_item, T, X, old_value, new_value]
[read_item, T, X]
[commit, T]
[abort, T]
T reaches commit point
1. All its operations that access the database have been executed successfully
2. Effect of all the transaction operations on the database have been recorded in the log file
Any portion of the log that is in the log buffer but has not been written to the disk yet must now be written to the disk -- force-writing
Concurrency control
Binary lock
A binary-valued variable LOCK, associated with each data item in the database
Data structure
Records in lock table: <Data_item_name, LOCK, Locking_trasaction>
A waiting queue of a data item
Only needs to maintain records for items that are currently locked in the lock table. Hence the value of the lock variables in lock table will always be 1.
Operations
lock_item(X)
unlock_item(X)
Shared/Exclusive lock (or read/write lock, multi-mode lock)
The LOCK variable has tree values: read-locked, write-locked, unlocked
Data structure
Record in lock table: <Data_item_name, LOCK, No_of_reads, Locking_transaction(s)>
Unlocked data items are not traced, so the value of the LOCK variable will be either read_locked or write_locked
A waiting queue associated to each locked data item
Operations
read_lock(X)
write_lock(X)
unlock(X)
Lock conversion
Upgrading
Downgrading
Nonserialized shedule
Caused by locks released too earlier
When T1, T2 serialized
When T1, T2 not serialized
Two phase lock protocol (2PL)
All locking operations, including the upgrading, must precede the first unlock operation in a transaction.
A transaction can be considered as being divided into two phases: expanding (acquisition) phase and shrinking (release) phase
If every transaction in a schedule follows the two-phase lock protocol, the schedule is guaranteed to be serializable, obviating the need to test for serializability of schedules.
T1' and T2', unlike T1 and T2 above, now follow the 2PL, can guarantee the serializability. But they can produce a deadlock.
2PL can introduce deadlock
Deadlock
Deadlock occurs when each transaction T in a set of two or more transactions is waiting for some item that is locked by some other transaction T'.
Example
The schedule
The wait-for graph
Deadlock prevention protocols
Not practical protocols
Lock all as a whole in advance
Lock in same order
Transaction timestamp based protocols
Wait-die
Wound-wait
Recovery
Updating policy
No-steal: Deferred update
No physical update of the database on disk until after a transaction commits
NO-UNDO/REDO
Steal: Immediate update
The database may be updated by some operations of a transaction before the transaction reaches its commit point.
UNDO/REDO
Variation
All updates are required to be recorded in the database on disk before a transaction commits
UNDO/NO-REDO
The UNDO and REDO operations are required to be idempotent
Disk block cache (buffer)
Cache directory
<disk page address, buffer location, dirty bit, transaction IDs, ...>
Pin/Unpin
Cache replace (flush)
Blocks
Data file blocks
Index file blocks
Log file blocks
BFIM (before image) and AFIM (after image)
In-place updating
Write-Ahead logging
The BFIM must be recorded in log entry and flushed to disk before the BFIM is overwritten with the AFIM in the database on disk.
Log entry types
REDO-type log entry
Includes the new value (AFIM)
UNDO-type log entry
Includes the old value (AFIM)
Shadowing
Log is not necessary
Steal/no-steal
steal: It happens when replace an existing page, which has been updated but whose transaction has not committed.
No-steal: a cache page updated by a transaction cannot be written to disk before the transaction commits
UNDO never needed
Force/No-force
Force: all pages updated by a transaction are immediately written to disk before the transaction commits
REDO never needed
No-force: a committed transaction can still has pages updated in the cache but have not written to the disk.
Compare policies
The 'fastest' mean highest performance
The 'slowest' one, No Steal/Force, is also the easiest one to implement
An example WAL (write-ahead logging) protocol with UNDO/REDO log using Steal/No-force approach
1. The BFIM of an item cannot be overwritten by its AFIM in the database on disk until all UNDO-type log records for the updating transaction up to this point have been force-written to disk.
means log will always be written before the data page
2. The committing operation of a transaction cannot be completed until all the REDO-type log records for that transaction have been force-written to disk.
All the logs need to be written before a committing complete
Housekeeping transactions
Active transaction list
Committed transactions list since last checkpoint
Aborted transaction list since last checkpoint
Two-phase commit protocol
Rollback
After whatever failure happened and before the transaction commits
Cascading rollback
If a transaction T is rolled back, any transaction S that has, in the interim, read the value of some data item X written by T must also be rolled back.
ARIES big picture
Distributed Database
Parallel vs Distributed
Multi-processor parallel
Shared memory
Shared disk
Shared-nothing
Heterogeneity
Conditions
Connected via computer network
Logical interrelation of connected databases
Possible absence of homogeneity among connected nodes
Most important advantages
Availability
Reliability
Properties
Transparency
Data organization transparency (distribution or network transparency)
Location transparency
Naming transparency
Replication transparency
Fragmentation transparency
Horizontal fragmentation
Vertical fragmentation
Scalability
Horizontal scalability
Vertical scalability
Autonomy
It determines the extend to which each individual notes or DBs in a connected DDB can operate independently.
Design autonomy
Communication Autonomy
Execution autonomy
Techniques
Fragmentation and sharding
Fragmentation types
Horizontal fragmentation
Vertical fragmentation
Mixed fragmentation
Fragmentation schema
Allocation schema
Replication
Types
No replication
Fully-replicated
System can continue to operate as long as at least one site is up
Partial replication
Some fragments of the database may be replicated whereas others may not.
Replication schema
Project optimization
Project crashing
Critical path method
Ex.1
Goal: finish the project in 16 weeks
Ex.2
Goal: minimum time
Ex.3
Project resource leveling
References
PERT / CPM Resource Leveling - Ed Dansereau
Ex.1
Software development life-cycle (SDLC)
Model
Waterfall
V-Shape
Spiral
Rapid Prototype
Iterative Incremental
Process
RUP
Four phases
1. Inception
Outputs
Business case
Business context
Success factor
Expected revenue
Market recoginizaton
A basic use case model
Project plan
Initial risk assessment
Project description
The core project requriements
Constraints
Key features
2. Elaboration
Outputs
Use case model (80% completion)
Description of software architecture
Description of software development process
An executable architecture that realizes the architectural significant use cases.
Revised business case and risk list
A development plan for the overall project
Prototypes that demonstrably mitigates each identified technical risks
A preliminary user manual (optional)
3. Construction
Outputs
The software system itself that is ready to be transferred to the end users
User manual
4. Transition
The released software product
Nigh disipline
Technical
1. Business modeling
2. Requirements
3. Analysis and design
4. Implementation
5. Testing
6. Deployment
Supporting
7. Configuration and change management
8. Project management
9. Environment
James Martin RAD
Agile
Joint application design (JAD)
Structured systems analysis and design method (SSADM)
SSADM is a waterfall method for the analysis and design of information systems. SSADM can be thought to represent a pinnacle of the rigorous document-led approach to system design, and contrasts with more contemporary agile methods such as DSDM or Scrum.
Builds on a set of methods
Peter Checkland's soft systems methodology
Larry Constantine's structured design
Yourdon Structured Method
Jackson Structured Programming
Tom DeMarco's structured analysis
Three most important techniques
Logical data modeling
Data flow modeling
Entity event modeling
Domain specific software architecture (DSSA)
Concept
Definitions
Formal: A context for patterns of problem elements, solution elements and situation that define mapping between them. -- Will Tracz
A domain specific software architecture is, in effect, a multiple-point solution to a set of application specific requirements (which define a problem domain). -- Will Tracz
The domain specific software architecture, which we call a reference architecture, is specified by reference requirements, the product of a domain analysis. Application systems are constructed by tailoring the reference architecture to meet the specific system requirements and populating the architecture with components from the DSSA library. -- James W. A.
What make DSSA distinct
The separation of user needs from system requirements and implementation constraints.
The separation of functional requirements and implementation constraints.
Separate problem-space analysis from solution-space analysis
Case-based reasoning and reverse engineering are no central mechanisms for identifying reusable resources, but rather existing applications are used as vehicle to validate the architectures that are derived, to-down, from generalized user requirements.
Five stages of DSSA process (Will Tracz)
DSSA process life cycle (James W. A.)
It's a software life cycle based on the development and use of domain-specific software architectures, components and tools. It is a process life cycle supported by a DSSA library and development environment.
Four phases
Domain library
Domain library is a library containing domain-specific software assets for reuse in DSSA process.
Reference requirements
A reference requirement is a generic requirement for the domain.
It defines the problem space.
Reference architecture
A reference architecture is a generic set of architecture component specification for a domain (and at least one instance).
A reference architecture is composed of component class specifications.
A component class specification is an element of the reference architecture that specifies what elements of the architecture do and what their interfaces are.
It defines the solution space.
System architecture
A system architecture is an instance of an architecture that meets the specifications in a reference architecture tailored to meet the requirement of a specific system.
This drawing depicts the DSSA concept. The component class specifications in the reference architecture are realized in multiple system architectures with existing and reengineered components from the DSSA library, generated components, and new components.
Distributed computing
Definition
It means using multiple (real or virtual) computers cooperatively together, thereby producing faster performance and more robust system than a single computer doing all the work.
Proxy and load balancer
HTTP proxy
4-layer proxy
Look into the IP header, modify the IP header (NAT)
One tcp connection
7-layer proxy
Look into the HTTP header. Also modify the IP header.
Smart balancing
Great for microservice
Forward proxy and reverse proxy
Distributed operating system
Definition
A system software over a collection of independent, networked, communicating, and physically separate computational nodes.
Each node holds a specific software subset
1. Microkernal
2. System management components
Coordinate the node's individual and collaborative activities
How to optimize distributed system?
Cluster
A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software.
Load balancing
Layer-4 balancing
Layer-7 balancing
Fail-over redundancy
VIP & VRRP
Cloud computing
Public cloud infrastructure
Splitting
Rejion
Availability zone
Data center
Gateways
Management gateway
Message gateway
Talks to hypervisors which manage VMs on physical computer
Failure in the cloud
Timeout tactic
Long tail latency
"Launch instance" requests to AWS
Two techniques to handle long tail latency
Hedged requests
Alternative requests
Load balancer
Increase the amount of work that can be handled
Message distributing algorithm
Round-robin
Hierarchy of load balancer
Increase the availability of services
Time coordination
NTP
GPS satellite
Vector clock
Order of events
Data coordination
Choices
Zookeeper
Consul
etcd
Domain Driven Design (DDD)
What is domain
It's the subject area to which the user applies the program or system.
It's a simplification
It's an interpretation of the reality and abstracts of aspects relevant to solving the problem at hand and ignores extraneous detail.
Architecture case-study
Redis
References
Redis设计与实现 第二版
Kafka
References
Understand Kafka as if you had designed it
Part 1
Part 2
Git
References
Git Objects
J2EE
Architecture
Model 2 architecture
Types of enterprise Java Beans
Entity bean
Session bean
Message driven bean
Domain-driven design
References
Book: Domain-Driven Design by Eric Evans
Domain model
Domain
The subject area to which the user applies the program
A domain model is not a particular diagram; it is the ideal that the diagram is intended to convey. it's not just the knowledge in a domain expert's head; it's a rigorously organized and selectively abstraction of that knowlege.
Utility of a model
1. The model dictates the form of design of the heart of the software.
2. The model is the backbone language.
3. The model is distilled knowledge.