Clustering of core services
Description:
Extended Architecture (EA) currently has between 100 and 120 core services. These are proprietary microservices that make up the EA product suite. EA currently runs a single instance of each service. The purpose of this requirement is to ensure that EA can scale reliably by running multiple instances of each service at any given time.
Business Case
- Being able to successfully scale EA will remove any current limitation on the size of call centres Redbox can service.
- Being able to successfully run multiple instances of EA services should improve the overall reliability and resilience of EA.
Personas affected
- John Williams - Head of Digital Transformation / CTO / Strategy
- Penn Gwynn - Head of Contact Centre
- Ron Westly - Head of Command and Control
- Toby Lerone - Head of IT Service Management
Epic
As an EA administrator, I would like to run 4 instances of each EA core service in a cluster, so that I can support an increase in the capacity of my call centre.
Functionality
- User Story 1: As an EA administrator running services in a Kubernetes (K8s) cluster, I would like EA to perform in conformance with business as usual (BAU), so that I can verify that EA can run successfully in a cluster.
  - Acceptance Criteria 1: Given that EA is running in a K8s cluster consisting of 4 instances of each service, when calls are executed, then all call events are processed in conformance with BAU.
- User Story 2: As an EA administrator running services in a K8s cluster, I would like EA to perform in conformance with BAU when there is a failure of services, so that I can verify the reliability of the clustered system.
  - Acceptance Criteria 1: Given that EA is running in a K8s cluster consisting of 4 instances of each service, when 1, 2 and 3 instances of each service are shut down successively, then all call events continue to be processed in conformance with BAU.
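The failure scenario in User Story 2 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the EA test harness: `ServiceInstance`, `Cluster` and `run_failure_scenario` are hypothetical names, round-robin dispatch is an assumed routing strategy, and an in-memory list stands in for real call-event processing. It shows the shape of the check: shut down 0, 1, 2 and then 3 of the 4 instances and confirm that every call event sent at each step is still processed.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceInstance:
    """Hypothetical stand-in for one running instance of an EA core service."""
    name: str
    healthy: bool = True
    processed: list = field(default_factory=list)

    def handle(self, event: str) -> None:
        # A shut-down instance cannot process anything.
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        self.processed.append(event)


class Cluster:
    """Round-robin dispatcher (an assumption) that skips shut-down instances."""

    def __init__(self, replicas: int = 4):
        self.instances = [ServiceInstance(f"svc-{i}") for i in range(replicas)]
        self._next = 0

    def shut_down(self, count: int) -> None:
        """Mark the first `count` instances as unavailable."""
        for inst in self.instances[:count]:
            inst.healthy = False

    def total_processed(self) -> int:
        return sum(len(inst.processed) for inst in self.instances)

    def dispatch(self, event: str) -> str:
        """Send the event to the next healthy instance and return its name."""
        for _ in range(len(self.instances)):
            inst = self.instances[self._next % len(self.instances)]
            self._next += 1
            if inst.healthy:
                inst.handle(event)
                return inst.name
        raise RuntimeError("no healthy instance available")


def run_failure_scenario(replicas: int = 4, events_per_step: int = 10) -> dict:
    """Shut down 0, 1, 2, ... replicas-1 instances in turn and record how
    many of the events sent at each step were processed."""
    cluster = Cluster(replicas)
    results = {}
    for down in range(replicas):
        cluster.shut_down(down)
        before = cluster.total_processed()
        for n in range(events_per_step):
            cluster.dispatch(f"call-{down}-{n}")
        results[down] = cluster.total_processed() - before
    return results


if __name__ == "__main__":
    for down, count in run_failure_scenario().items():
        print(f"{down} instance(s) down: {count} events processed")
```

In a real K8s environment the equivalent check would scale down or delete pods and compare processed call events against the BAU baseline; the simulation only demonstrates the pass condition, namely that the processed count equals the sent count at every step.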
Non Functional Requirements
Ref | Area | MoSCoW | Requirement | Comments |
---|---|---|---|---|
1 | Error handling | M | Ease with which the system can degrade gracefully if errors occur, e.g. does the entire system go down and lose data if the internet connection goes down | No data is lost when a microservice fails. |
2 | Legal and Regulatory | | Specific legal and regulatory requirements associated with the feature | NA |
3 | Licensing | | New/amended licensing requirements associated with the feature or with introduced 3rd party components | NA |
4 | Localizability | | Need to include localised features, e.g. currency, date formats | NA |
5 | Performance | M | Ability to meet specific performance standards/requirements | 4 instances of each microservice clustered and tested. |
6 | Concurrency | M | Specific concurrency requirements | As per Performance |
7 | Resilience | M | Ability to handle failure of an individual component within the system | As per Performance |
8 | Scalability | M | Requirements to support increasing numbers of users/concurrency without incurring significant cost | As per Performance |
9 | Security | | Adherence to defined/specified customer/industry security standards | NA |
10 | Storage | | Specific storage requirements/considerations | NA |
11 | Supportability | M | Ease with which Support could/need to access logs etc. to diagnose a problem | Error logging |
12 | Test requirements | | Ease with which the functionality could/should be supported by automated testing | NA |
13 | Training | | Specific training/installation/configuration documentation associated with this feature that needs to be created/updated | NA |
14 | User Experience | | Specific user experience requirements that would ensure the functionality is acceptable to customers, e.g. can complete action within x clicks | NA |
- Simon Jolly (Technical Architect): reviewed and signed off
- Sergey Shafiev (Team Lead) to review and sign off (signed off by Kirill Zotkin on behalf of Sergey)
- Vikash Mahabir (QA Manager) to review and sign off