ITS has extensive experience with Amazon, now the most widespread cloud provider, and has used Amazon Web Services (AWS) in its projects since the company was founded. This experience helps us implement, develop, and support projects of any complexity, up to high-load IoT systems that work with millions of devices simultaneously.
Our AWS Certified Solutions Architects can quickly prototype a future system to validate a business idea. Later, this prototype can be developed into a full-fledged high-load system.
When designing the architecture of a new project, we pay special attention not only to implementing the business logic but also to the reliability of the systems we create. To that end, we build the properties listed below into our projects and implement them with existing AWS services. This lets us rely on proven, widely used solutions instead of spending resources on building them from scratch.
Scalability is the ability of a system to handle a growing load.
Today hundreds of people use your service, tomorrow there are thousands of them, and six months later, you need to serve requests from millions of customers.
The load itself may be uneven. It may grow gradually, day by day, as the number of requests increases, or it may jump sharply at a certain time of day and subside within an hour or two. It all depends on the system’s business logic and on how the system is used.
Depending on how the load grows and on the characteristics of the system being designed, you can run the calculations and select the AWS services that deploy the business logic in the cloud at the optimal cost. These can be Amazon Elastic Compute Cloud (EC2), AWS Lambda, Amazon Elastic Container Service (ECS), or Amazon Elastic Kubernetes Service (EKS).
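As a rough illustration of the kind of calculation mentioned above, the sketch below estimates how many instances a given peak load would need. The function name, the per-instance throughput, and the 70% target utilization are assumptions made for this example, not figures from a real project.

```python
import math


def instances_needed(peak_rps: float, rps_per_instance: float,
                     target_utilization: float = 0.7) -> int:
    """Return how many instances keep each one at or below the target utilization."""
    if peak_rps < 0 or rps_per_instance <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("invalid capacity parameters")
    usable_rps = rps_per_instance * target_utilization
    # Always keep at least one instance, even with no traffic.
    return max(1, math.ceil(peak_rps / usable_rps))


# Hypothetical loads: a quiet baseline and a short evening spike.
baseline = instances_needed(peak_rps=200, rps_per_instance=100)
spike = instances_needed(peak_rps=5000, rps_per_instance=100)
print(baseline, spike)
```

A large gap between the baseline and the spike suggests that paying for peak capacity around the clock would be wasteful, which is one reason to compare an always-on fleet with on-demand options.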
Fault tolerance is the ability of a system to remain functional when one or more of its components fail.
Here Elastic Load Balancing (ELB) helps, with several EC2 instances behind each load balancer, together with using AWS Lambda and deploying the infrastructure in several Availability Zones of one region, or even in several regions of the world.
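The idea of keeping a service available while one instance is down can be shown with a toy round-robin balancer. This is only a simplified sketch of the principle and not how ELB is implemented; the class and the instance names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy balancer: rotates over targets and skips the ones marked unhealthy."""

    def __init__(self, targets):
        self._targets = list(targets)
        self._healthy = set(self._targets)
        self._ring = cycle(self._targets)

    def mark_unhealthy(self, target):
        self._healthy.discard(target)

    def mark_healthy(self, target):
        if target in self._targets:
            self._healthy.add(target)

    def next_target(self):
        # One full pass over the ring is enough to find a healthy target, if any.
        for _ in range(len(self._targets)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy targets available")


# Hypothetical instance names spread over two availability zones.
balancer = RoundRobinBalancer(["az-a-1", "az-a-2", "az-b-1"])
balancer.mark_unhealthy("az-a-2")
served_by = [balancer.next_target() for _ in range(4)]
print(served_by)
```

Requests keep flowing to the two remaining instances, and the failed one can rejoin the rotation once it is marked healthy again.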
Network security is the provision of information security of a computer network and its resources.
In the modern world of global digitalization, network security can in no way be neglected. If your service is accessible over the Internet, expect network attacks. There are enough people in the world who are interested in testing themselves at hacking weakly protected resources. Therefore, the question is not whether there will be network attacks or not. The question is: “Are you ready to protect your resources and your users’ data when the attacks begin?”
The following services help you be prepared when they are correctly built into the project’s architecture and properly configured and used: Amazon Virtual Private Cloud (VPC), AWS WAF (Web Application Firewall), AWS Shield, AWS Identity and Access Management (IAM), AWS Audit Manager, etc.
Testability shows how easily and fully you can check the functionality of the developed system.
Testing is not only a functional or fault-based check. Testing is a tool that helps increase the number of satisfied users. We write easy-to-test code → we get the ability to easily make changes and add new functionality → we are not afraid of breaking something with new changes → we experience less stress → we make fewer mistakes and implement the project better → we get users who are more satisfied. The maximum effect in this chain comes from automated testing, which we use at all stages of development and deployment of our projects:
- During coding — unit and integration tests;
- During build and packaging, all tests written in the previous stage must run and pass;
- When the application is deployed to the AWS infrastructure, functional test scripts validate it;
- An additional stage is load testing, which verifies that the system keeps working under increased and even extreme load. For this, we use Apache JMeter scripts that run on EC2 instances in the same AWS region as the services under test. This minimizes the impact of network latency on the load test results.
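Load test results are usually judged by latency percentiles rather than averages. The sketch below shows one way to summarize response times collected during a run; the sample values, the function names, and the 300 ms limit are made up for illustration and do not come from a real JMeter report.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples or not 0 < pct <= 100:
        raise ValueError("need samples and 0 < pct <= 100")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_target(latencies_ms, p95_limit_ms):
    """True when the 95th-percentile latency stays within the agreed limit."""
    return percentile(latencies_ms, 95) <= p95_limit_ms


# Hypothetical response times in milliseconds from one load-test run.
latencies_ms = [120, 95, 101, 340, 110, 98, 105, 99, 130, 102]
print(percentile(latencies_ms, 50), percentile(latencies_ms, 95))
print(meets_target(latencies_ms, p95_limit_ms=300))
```

In this sample the median looks healthy, while the 95th percentile exposes the single slow response that an average would hide.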
Monitoring and observability
Modern applications running in the cloud have a complex distributed structure and are composed of many microservices. This complexity is the price of increased scalability and testability. To see and understand how, and how well, your service works internally, you need a mature monitoring system that includes logging and gathering various metrics (Amazon CloudWatch), tracing requests between microservices (AWS X-Ray), and a dashboard to view and analyze the collected metrics (Amazon Managed Grafana). To react to emergencies in time, you also need alerts that fire when a problem is about to occur (Amazon CloudWatch alarms).
Such a system will allow you to understand precisely where the problem emerged and provide enough data to find out what caused the problem and fix it.
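To make the alerting idea concrete, here is a simplified sketch of threshold-based alarm evaluation in the spirit of "M out of N datapoints" rules. It only imitates the concept and does not call the CloudWatch API; the function, its parameters, and the CPU values are assumptions for this example.

```python
def alarm_state(datapoints, threshold, evaluation_periods=3, datapoints_to_alarm=2):
    """Return "ALARM" when enough of the latest datapoints breach the threshold."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    breaches = sum(1 for value in recent if value > threshold)
    return "ALARM" if breaches >= datapoints_to_alarm else "OK"


# Hypothetical CPU utilization per minute, in percent.
cpu = [41, 45, 52, 88, 47, 91]
print(alarm_state(cpu, threshold=80))
print(alarm_state(cpu[:3], threshold=80))
print(alarm_state(cpu[:2], threshold=80))
```

Requiring several breaching datapoints instead of one reduces false alerts from short spikes, at the price of a slightly later notification.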
Over time, any service acquires new functionality and new business logic. This requires frequent uploads of new application versions to the cloud and changes to the cloud infrastructure itself. Therefore, it is important that system maintenance does not require much effort, so that the project cost does not grow. Here automation helps, just as it does in testing. AWS Certified DevOps Engineers can set up the CI/CD process and describe the entire cloud infrastructure designed by the architects as code, using AWS CloudFormation or Terraform. This greatly reduces the risk of human error when deploying and changing the several copies of the cloud infrastructure that any successful system needs: for development, testing, and production use.
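The benefit of describing infrastructure as code is that every environment is produced from the same definition. The sketch below generates a minimal CloudFormation-style template per environment as plain Python data; the resource, the bucket naming scheme, and the function name are hypothetical and far smaller than a real stack.

```python
import json


def build_template(environment: str) -> dict:
    """Return a minimal CloudFormation template for one environment."""
    if environment not in {"dev", "test", "prod"}:
        raise ValueError(f"unknown environment: {environment}")
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Example storage stack ({environment})",
        "Resources": {
            "ArtifactsBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Hypothetical naming scheme; bucket names must be globally unique.
                    "BucketName": f"example-artifacts-{environment}",
                },
            }
        },
    }


# The same function yields all three copies of the infrastructure.
templates = {env: build_template(env) for env in ("dev", "test", "prod")}
print(json.dumps(templates["dev"], indent=2))
```

Because the three templates differ only in the environment name, a change reviewed once is applied identically to development, testing, and production.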
Below you can see high-level architectural diagrams of two high-load projects we have implemented:
IoT + SSO platform for smart home (≈ 10 million devices online)
The task was to design and implement a platform for managing smart home devices. The platform had to be easily scalable and perform two main functions:
- Provide functionality for creating and managing user accounts — Single sign-on (SSO). Provide access to accounts over OAuth2 to such services as Amazon Alexa, Google Assistant, and others.
- Control smart devices remotely (Internet of Things, IoT). The main purpose is to deliver control messages from users to devices and to return device responses to users.
The entry point to the system, which accepts requests from mobile apps and voice assistants, is Amazon CloudFront, protected from network attacks by AWS Shield and AWS WAF. The core of the system is made up of microservices deployed on EC2 instances. The microservices provide the logic for registering user accounts and IoT devices in the system, identification and authentication, sending and receiving control messages, collecting statistics, etc. Elastic Load Balancing distributes the load between instances and makes the system easy to scale. Amazon Relational Database Service (RDS) is used as the primary storage, and Amazon ElastiCache for Redis is used as a cache.
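The text does not describe how the cache is filled in this project. One common option for pairing a primary database with a cache is the cache-aside pattern, sketched below with plain dictionaries standing in for the database and for Redis. The class name and the device keys are hypothetical.

```python
class CacheAsideStore:
    """Cache-aside helper: check the cache first, fall back to the primary storage."""

    def __init__(self, primary, cache):
        self._primary = primary   # stand-in for the relational database
        self._cache = cache       # stand-in for Redis
        self.primary_reads = 0    # counts how often the database was hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        self.primary_reads += 1
        value = self._primary.get(key)
        if value is not None:
            self._cache[key] = value  # populate the cache for later reads
        return value

    def put(self, key, value):
        self._primary[key] = value
        self._cache.pop(key, None)  # invalidate so the next read is fresh


store = CacheAsideStore(primary={"device-1": "on"}, cache={})
store.get("device-1")
store.get("device-1")
print(store.primary_reads)
store.put("device-1", "off")
print(store.get("device-1"), store.primary_reads)
```

Repeated reads of the same key reach the database only once, and a write invalidates the cached entry so that the next read returns the new value.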
A platform for unifying video advertising into a streaming format and seamlessly proxying it from advertising brokers to streaming services and IPTV
Hundreds of millions of people watch IPTV and various streaming services every day. Various businesses want to show advertisements for their goods and services to this huge audience. Streaming players must support the video advertising format, but advertising brokers do not always provide content in the required format. Therefore, one of the system’s most important functions is transcoding, for which AWS Elemental MediaConvert is used. The original and transcoded files are stored in Amazon Simple Storage Service (S3). Amazon DynamoDB is used to store and track transcoding statuses. The business logic is implemented by AWS Lambda and more than a dozen microservices deployed on Amazon Elastic Kubernetes Service (EKS). Amazon CloudFront serves as the CDN that delivers content to streaming players.
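Tracking transcoding statuses can be pictured as a small state machine per advertisement. The sketch below keeps the statuses in a dictionary standing in for the status table; the status names are borrowed from MediaConvert job statuses as an assumption, since the actual values stored by the project are not described, and the class and ad IDs are hypothetical.

```python
# Allowed transitions between statuses; terminal statuses have no successors.
ALLOWED = {
    "SUBMITTED": {"PROGRESSING", "CANCELED", "ERROR"},
    "PROGRESSING": {"COMPLETE", "CANCELED", "ERROR"},
    "COMPLETE": set(),
    "CANCELED": set(),
    "ERROR": set(),
}


class TranscodingTracker:
    """Keeps one status per ad and rejects transitions that make no sense."""

    def __init__(self):
        self._status = {}  # stand-in for a status table keyed by ad ID

    def submit(self, ad_id):
        if ad_id in self._status:
            raise ValueError(f"{ad_id} is already tracked")
        self._status[ad_id] = "SUBMITTED"

    def advance(self, ad_id, new_status):
        current = self._status[ad_id]
        if new_status not in ALLOWED[current]:
            raise ValueError(f"cannot move {ad_id} from {current} to {new_status}")
        self._status[ad_id] = new_status

    def status(self, ad_id):
        return self._status[ad_id]


tracker = TranscodingTracker()
tracker.submit("ad-42")
tracker.advance("ad-42", "PROGRESSING")
tracker.advance("ad-42", "COMPLETE")
print(tracker.status("ad-42"))
```

Rejecting invalid transitions protects the stored status from late or duplicated events, for example a progress update that arrives after the job has already completed.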
The system’s second important function is collecting and analyzing statistics on system performance and on ad sales and views. Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, and S3 are responsible for collecting and storing the data. Amazon API Gateway, AWS Lambda, and Amazon Athena receive and process analytical requests from the teams that sell advertising.
Finally, a few words about the other AWS services and tools that ITS specialists use in their work:
- Amazon Route 53 is a highly available and scalable cloud DNS service that we use to configure domain names for all network resources in our projects.
- Amazon SageMaker provides machine learning technologies that we use for machine vision systems and other similar tasks.
- Amazon Simple Queue Service (SQS) is a message queuing service that lets you send and receive messages between the components of large distributed systems.
- Amazon Simple Notification Service (SNS) is a messaging service that we mostly use to send SMS and mobile push notifications.
- AWS Compute Optimizer analyzes the AWS infrastructure and helps keep it sized optimally for its cost.
- AWS Tools and SDKs save developers the time and effort spent on accessing AWS resources while developing and debugging systems.