Wednesday 5 August 2020

How to Configure Terraform and The Benefits of Using Terraform as a Tool for Infrastructure-as-Code (IaC)

DevOps tools have enabled software engineers to build and deploy application source code in a faster, more repeatable way. Including Infrastructure-as-Code (IaC) in the DevOps cycle eases the transition to the cloud model, accommodating the shift from ‘static’ to ‘dynamic’ infrastructure.

The advent of DevOps reduced the dependence on sysadmins who used to set up infrastructure manually from some corner of the office. Managing servers and services by hand is feasible within a single data centre. But when we move to the cloud, scale up and start working with many resources from multiple providers (AWS, GCP, Azure, etc), manually setting up and configuring everything to achieve on-demand capacity slows things down. Being repetitive, the manual process is also error-prone, and it becomes very hard to manage and automate when resources from different service providers are used together.

Infrastructure-as-Code (IaC)
This approach is the management and configuration of infrastructure (virtual machines, databases, load balancers and connection topology) in a descriptive cloud operating model. Infrastructure can be maintained just like application source code, under the same version control. This lets engineers maintain, review, test, modify and reuse their infrastructure and avoid direct dependence on the IT team. Systems can be deployed, managed and delivered quickly, and automatically, through IaC. There are many tools available for IaC, such as CloudFormation (only for AWS), Terraform, Chef, Ansible and Puppet.


Figure 1: Life cycle of IaC using Terraform

Figure 2: Terraform template code for provisioning an EC2 instance

What is Terraform?
Terraform is an open source provisioning tool from HashiCorp (more can be read at http://terraform.io/) written in the Go language. It is used for building, changing and versioning infrastructure safely and efficiently. Provisioning tools are responsible for the creation of servers and associated services, rather than configuration management (installation and management of software) on existing servers. Terraform acts as a provisioner, and focuses on the higher abstraction level of setting up servers and their associated services.

The infrastructure Terraform can manage includes low level components such as compute instances, storage and networking, as well as high level components such as DNS entries, SaaS features, etc. It leaves configuration management (CM) to tools such as Chef that do that job better. It lays the foundation for automating infrastructure (both cloud and on-premise) using IaC. Its policy-as-code capability lets governance rules for the cloud operating model be written down and enforced, rather than remaining informal knowledge inside the IT team.

Terraform is cloud-agnostic and uses a high level declarative language called HashiCorp Configuration Language (HCL) for defining infrastructure in configuration files that are simple for humans to read. Organisations can use public templates and can also maintain a unique private registry. Templates are kept in maintained repositories containing pre-made modules for the infrastructure components needed, under version control systems (like Git); a module can then be consumed with only a few lines, as sketched below.
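As a hedged illustration (the module path, version and input below are placeholders, not taken from the article), a pre-made module from a public or private registry can be consumed in HCL like this:

module "web_server" {
  # Module path and version are illustrative; any registry module can be referenced here
  source  = "terraform-aws-modules/ec2-instance/aws"
  version = "~> 2.0"

  # Module inputs depend on the module being used
  instance_type = "t2.micro"
}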


Figure 3: The ‘Terraform init’ step initialises all the required resources and plugins

Figure 4: ‘Terraform plan’ dry runs the instantiation

Installation
Installing Terraform is very simple; just follow the steps mentioned below.
1. Download the archive (https://releases.hashicorp.com/terraform/${VER}/terraform_${VER}_linux_amd64.zip):

export VER="0.12.9"
wget https://releases.hashicorp.com/terraform/${VER}/terraform_${VER}_linux_amd64.zip

2. Once downloaded, extract the archive:

unzip terraform_${VER}_linux_amd64.zip

3. The last step extracted a terraform binary into the working directory. For Terraform to be accessible everywhere, move it to /usr/local/bin:

sudo mv terraform /usr/local/bin

4. Confirm the Terraform installation:

terraform -v    # Terraform v0.12.9

Life cycle
A Terraform template (code) is written in HCL and stored as a configuration file with a .tf extension. HCL is a declarative language, which means our goal is just to describe the end state of the infrastructure and Terraform will figure out how to create it. Terraform can be used to create and manage infrastructure across all major cloud platforms. These platforms are referred to as ‘providers’ in Terraform jargon, and cover AWS, Google Cloud, Azure, DigitalOcean, OpenStack and many others.
Let us now discuss each stage in the IaC lifecycle (Figure 1), which is managed using Terraform templates.


Figure 5: ‘Terraform apply’ instantiates the validated infrastructure in the planning step

Code
Figure 2 is sample code for starting an EC2 instance of type t2.micro on AWS. As is visible, only a few lines are needed to instantiate an instance on AWS. It is also implicit that the same code can be maintained under a VCS and be used to instantiate instances in various regions with different resource configurations, removing error-prone and time-consuming manual work.

  • Provider: All major cloud players such as AWS, Azure, GCP and OpenStack have provider plugins for Terraform. These plugins are maintained by HashiCorp and the community.

Access key (username): key given by the provider
Secret key (password): key given by the provider
Region: specify the region of deployment

  • Resource: There are many kinds of resources, such as an OpenStack basic instance, an AWS EC2 instance, a DigitalOcean Droplet or an Azure VM, which can be declared as follows:

resource "<provider>_<resource_type>" "<identifier>" { ... }

  • Image ID: This identifies the machine image the instance will boot from, i.e., a tag for the image we need to install (an Ubuntu or Windows image, for example).
  • Flavour type: This is the type of instance, governing the CPU, memory and disk space.

Here, the template defines the provider as AWS, and supplies the access key, secret key and region needed to connect to AWS. After that, the resource to be created is specified, i.e., an aws_instance, named “example”. The count and instance type (size of instance) are also expressed as code, as can be seen in Figure 2; a minimal sketch of such a template is given below.
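The sketch below is only illustrative of the structure described above; the credentials, AMI ID and region are placeholder values, not those used in Figure 2:

provider "aws" {
  access_key = "YOUR_ACCESS_KEY"          # key given by the provider
  secret_key = "YOUR_SECRET_KEY"          # key given by the provider
  region     = "us-east-1"                # region of deployment
}

resource "aws_instance" "example" {
  count         = 1                       # number of instances to create
  ami           = "ami-0abcdef1234567890" # machine image ID (illustrative)
  instance_type = "t2.micro"              # flavour/size of the instance
}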

The same code can be used to set up an instance in another region. Terraform also gives users the power of variables and other logical constructs, such as conditionals and for expressions, which can streamline the setting up of infrastructure even further; for instance, the region can be pulled out into a variable, as sketched below.
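As a rough sketch (the variable name and default value are assumptions, not taken from Figure 2), a region variable lets one template serve multiple regions:

variable "region" {
  description = "AWS region to deploy into"
  default     = "ap-south-1"              # illustrative default
}

provider "aws" {
  region = var.region                     # Terraform 0.12+ variable reference
}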


Figure 6: Terraform showing the absolute changes made to the infrastructure

Maintain and reuse
Configuration templates, i.e., pre-made modules for infrastructure components, are written, reviewed, managed and reused after being stored under a VCS. Organisations can use public templates contributed by the community, as well as keep unique private templates in a central registry.

Init
The Terraform binary contains only the basic functionality; everything else is downloaded as and when required. The ‘terraform init’ step analyses the code, figures out the provider and downloads all the plugins (code) needed by that provider (here, it’s AWS). Provider plugins are responsible for interacting with the APIs exposed by the cloud platforms. They handle the life cycle of each resource, i.e., create, read, update and delete. Figure 3 shows the checking and downloading of the ‘aws’ provider plugin after scanning the configuration file.

Plan
‘terraform plan’ is a dry run for our changes. It builds the topology of all the resources and services needed and, in parallel, works out how dependent and non-dependent resources will be created. It analyses the previously recorded state and existing resources, using a resource graph to calculate the required modifications. This provides the flexibility of validating and reviewing infrastructure changes before provisioning, which would otherwise be risky. Figure 4 shows the generation of the plan and the changes to resources, with ‘+’ indicating new resources that will be added to the already existing ones (if any) and ‘-’ indicating deletions.

Validation
Administrators can validate and approve significant changes surfaced by the ‘terraform plan’ dry run. This prevents specific workspaces from exceeding predetermined thresholds, lowering costs and increasing productivity. In many organisations, standards or a governing policy for the cloud operating model have not yet been put in place; the policy is not codified, and certain practices are only known informally among teams. Terraform can enforce Sentinel policies as code before the provisioning workflow runs, minimising risk through active policy enforcement.

Apply
‘terraform apply’ executes exactly the provisioning plan defined in the previous step, once it has been reviewed. Terraform translates the configuration files (.tf) into the appropriate API calls to the cloud provider(s), automating resource creation seamlessly. This creates the resources (here, an EC2 server) on AWS in a flexible and straightforward manner. Figure 5 shows the resources that will be created, and Figure 6 confirms the changes that took place.

If we proceed to the AWS console to verify the instantiation, a new EC2 instance will be up and running, as shown in Figure 7.

Destroy
After resources are created, there may be a need to terminate them. As Terraform tracks all the resources, terminating them is also simple: all that is needed is to run ‘terraform destroy’. Again, Terraform will evaluate the changes and execute them once you confirm. A typical command sequence for the whole life cycle is shown below.
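Putting the stages together, the commands are run from the directory containing the .tf files (comments added for clarity):

terraform init       # download the provider plugins required by the configuration
terraform plan       # dry run: preview the changes that would be made
terraform apply      # create or update the resources after confirmation
terraform destroy    # tear down all tracked resources when they are no longer needed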


Figure 7: EC2 instance can be seen in the initialising state

Additional features of Terraform
1. Terraform Cloud/Enterprise provides a GUI to manage all the running services, along with an access control model based on organisations, teams and users. Its audit logging emits logs whenever a change (here, a change means a sensitive write to the existing IaC) happens in the infrastructure.

2. Existing pipeline integration: Terraform can be triggered from within most continuous integration/continuous deployment (CI/CD) DevOps pipelines, such as Travis, Circle, Jenkins and GitLab. This enables the provisioning workflow and Sentinel policies to be plugged into the CI/CD pipeline.

3. Terraform supports many providers (more at https://www.terraform.io/docs/providers/index.html), allowing users to easily manage resources no matter where they are located. Instances can be provisioned on cloud platforms such as AWS, Azure, GCP and OpenStack using APIs provided by the cloud service providers.

4. Terraform uses a declarative style, in which the desired end state is written directly. The tool is then responsible for figuring out how to achieve that end state by itself.

Let’s say we want to deploy five EC2 instances on AWS using Chef (Figure 8a) and Terraform (Figure 8b). Observing the scripts given in Figures 8a and 8b, one can see that both are equivalent and will produce the same results.

But let’s assume a festive sale is coming up. The expected traffic will increase, and the infrastructure must scale to keep our application responsive. Let’s say five more instances are required to handle the predicted traffic.

Because Chef’s language is procedural, setting the count to 10 would start 10 additional instances rather than adding the extra five, initiating a total of 15 instances. We must manually remember the previous count, as shown in Figure 9a; hence, we must write a completely new script, adding one more redundant code file.

Because Terraform’s language is declarative, setting the count to 10 (as can be seen in Figure 9b) will start only the additional five instances. We do not have to manually remember the previous count: we simply declare the end state and everything else is handled by Terraform itself. This avoids the problems that come with manual intervention, such as slowness and proneness to human error. A minimal sketch of the change is given below.
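The sketch below illustrates the declarative scale-up (the AMI ID is an illustrative placeholder): the only edit needed is the count, and Terraform works out that just five new instances must be added.

resource "aws_instance" "example" {
  count         = 10                      # previously 5; Terraform adds only the missing five
  ami           = "ami-0abcdef1234567890" # machine image ID (illustrative)
  instance_type = "t2.micro"
}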

Advantages of incorporating IaC using Terraform
1. Prevents configuration drift: Terraform, being provisioning software, encourages you to make changes in your image or container and only then deploy the new ones across every server, rather than modifying servers in place. This decouples server configuration from ad hoc dependencies, resulting in identical instances across our infrastructure.


Figure 8a and 8b: Sample code for instantiating five instances

2. Easy collaboration: The Terraform Registry (Terraform’s central, version-controlled module registry) enables teams to collaborate on infrastructure.

3. No separate documentation needed: The code written for the infrastructure becomes your documentation. By looking at the script, thanks to its declarative nature, we can figure out what is currently deployed and how it is configured.


Figure 9a and 9b: Sample code for instantiating ten instances

4. Flexibility: Terraform not only handles IaaS (AWS, Azure, etc) but also PaaS offerings (such as managed SQL databases and Node.js platforms). It can also store variables such as cloud tokens and passwords in encrypted form.

5. Masterless: Terraform is masterless by default, i.e., it does not need a master node to keep track of all the configuration and distribute updates. Terraform talks directly to the cloud providers’ APIs, saving the extra infrastructure and maintenance costs we would otherwise incur in running a master node.

Terraform is an open source tool that helps teams manage infrastructure in an efficient, automated and reusable manner. It has a simple modular syntax and supports multi-cloud infrastructure configuration. Enterprises can use Terraform in their DevOps methodology to construct, modify, manage and deliver infrastructure at a faster pace with less manual intervention.

Data Analytics: A Game-Changer In Healthcare Industry

Artificial intelligence and machine learning can be used to analyse both structured and unstructured data to support medical professionals in decision making, and related policy decisions by providing a holistic view of a particular situation.

With an increase in the number of users adopting technologies like artificial intelligence (AI), cloud computing and the Internet of Things (IoT), data is rapidly increasing in size and complexity. This is driving the growth of data analytics in various industries. It is also having a profound impact on healthcare services, where massive data from areas such as pharmaceuticals, medical bills and patients’ health histories can be used for prediction and prevention instead of treatment and response.

In a recent development, IIIT-Delhi signed a Memorandum of Understanding (MoU) with the Lords Education and Health Society (Wish Foundation) with an aim to carry out research on health data analytics. For the coronavirus outbreak too, institutions such as the World Health Organization (WHO) are utilising data analytics and machine learning (ML) to track and predict its spread so that better decisions can be made.

According to a report from Research and Markets, the global Big Data analytics market in healthcare is expected to reach US$ 105.08 billion, growing at a CAGR of 12.3 per cent during the forecast period of 2019 to 2027.

Data sources

In hospitals and other treatment centres, electronic health records (EHRs) are created in order to have separate comprehensive digital records for every patient containing such information as medical history, test reports, allergies and the like. These can be updated by a doctor even when the patient visits a different hospital and some new data needs to be added.

A large amount of this data consists of medical imaging reports, which can be analysed through algorithms that identify patterns in pixels based on previously available images to help with diagnosis. Last year, Google revealed its plans to provide search functionality to aid in performing tasks such as data entry and billing in these systems.

Apart from previous health data for statistics, real-time information related to vitals can be obtained from wearables and other health devices for population health management through predictive analytics. The IoT makes it possible to collect information using sensors in health equipment for remote monitoring. This is useful in identifying and prioritising people more at risk early on. The data integrated from various sources needs to be accurate for correct analysis.

What are we trying to achieve

Collecting and analysing large datasets and images through a data analytics solution saves time by quickly filtering huge amounts of data to find accurate and customised solutions for tough problems that are beyond the reach of researchers and medical staff working manually. The benefits are multifaceted and inter-related; some of the major ones are discussed next.

Improved patient satisfaction and engagement

The aim of data analytics in any field is to make better decisions. As medical data gets analysed, better decision making and informed strategic planning by health professionals lead to improved patient satisfaction and recovery. Tedious and monotonous operations, such as making appointments, get streamlined. Researchers can analyse the success rates of different treatments and filter out the ones that have better health outcomes. In the past, the University of Florida made use of Google Maps and public health data to prepare neighbourhood-level hotspot maps for efficient delivery of primary care in most areas.

Healthcare dashboards can be used by healthcare professionals to monitor key performance indicators of patient statistics in real time. Health tracking devices like fitness bands make patients more aware about their health so that they take the necessary precautions and medications to prevent their problem from escalating. When a patient’s parameters are alarming, the system immediately alerts the doctor to take necessary action(s) to deal with the situation.

Reduction in expenses

Big Data can be used to predict the admission rate of patients using ML. This is done by evaluating information such as the number of patients present at a time, the costs incurred and the total time of stay, among others, to get an overall view of the process. This can help in understanding the right number of hospital and clinic staff required, so that patients’ care is proper and quick, with lower wait times and without unnecessary costs.

By leveraging population health data and identifying people that need more attention, patients can be kept away from hospitals. This reduces hospital expenses due to reduction in readmission rates.

Preventing fraud in insurance companies

Insurance companies provide high cover for a small premium and are prone to fraud, such as in personal injury claims. Fraudsters often receive money into their accounts through identity theft. In other instances, they claim that records come from leading hospitals and show more costly treatments than were actually provided.

Using data to validate a pre-payment and analytics to perform checks against public record databases ensures the validity of information provided to insurers. After verifying medical records, claims can be processed quickly so that genuine cases get resolved easily and treatment institutions are paid faster.

Besides efficient insurance claims processing, payers can use predictive analytics to determine at-risk claims. They can also find the best providers for specific health conditions, enabling patients to get better returns on their claims.

Technologies to achieve the feat

Formulation of a well developed data analytics solution is crucial for preventive, predictive, precise and personalised healthcare. Cloud services can economically and easily manage huge volumes of data and share information across different systems, making data analytics possible.

For a scalable implementation, it is a good idea to process all the information to extract the important data without missing any detail, which can then be integrated into the system to gain meaningful insights. This is where decentralised processing, that is, edge cloud computing, comes into play. Unlike the traditional cloud, information is processed and stored locally, close to its source, improving network performance and reducing response times.

AI and ML can be used to analyse both structured (such as ontologies) and unstructured data to support medical professionals in decision making, and related policy decisions by providing a holistic view of a particular situation. The Pittsburgh Health Data Alliance started working with Amazon Web Services and utilising its ML research program to boost innovation and revolutionise disease treatment with data last year.

Apache Spark is a popular framework with built-in fault tolerance for iterative and interactive processes in Big Data analysis. It is written in Scala and can be integrated with Hadoop. The PySpark API allows using the Spark framework with Python, a popular language among developers for ML because of its flexibility, robustness and ease of implementation.

For predictive modeling, the inputs and outputs for the model can be discrete or continuous, and are decided based on the type of problem. Neural networks are trained with past data available to understand the patterns and trends in the gathered datasets, which is known as data mining. Based on similarities in characteristics with the data of former patients, new ones can be more effectively treated.

It is important to assess the performance of a model to understand the improvements that can be made.

Big data analytics architecture framework

Impact and challenges

One of the best ways to understand the impact of health data analytics is from the response of Taiwan to coronavirus pandemic (COVID-19). According to a Stanford health report, the state integrated its national health insurance database with its immigration and customs database to create Big Data for analytics. QR code scanning and online reporting were also allowed.

This allowed identification of susceptible cases who had recent travel history to high-risk areas and were visiting clinics with symptoms through real-time tracking. People who had travelled to high-risk areas were tracked through their phones to make sure they stayed at home during the quarantine period. This helped them nip the disease in the bud.

Healthcare data involves numerous complexities that need to be tackled. For instance, medical data comes from different sources and has to comply with regulations set by different state governments and other administrative departments. There has to be a common standard when it comes to sharing patient information.

In 2019, a former patient at University of Chicago Medical Center filed a lawsuit against the institution’s partnership with Google to improve predictive analysis, stating that it violated Health Insurance Portability and Accountability Act (HIPAA) passed in 1996 in the US by recording the entry and exit dates of patients. The lawsuit was dismissed later on.

Technically too, developing infrastructure to interface datasets from different data providers on such a large scale is not easy. Security loopholes in the system can have drastic implications. So, advanced security measures like firewalls and encryption algorithms are essential to avoid risks and maintain the trust of patients.

Due to such issues, cloud alternatives are considered by many companies to be a safe option for reducing vulnerability. Corporate Health International (CHI) is one such organisation; it utilises the computational efficiency provided by Intel’s hardware and software stack, with servers running Intel Xeon used for both data processing and AI development.

Despite realising its potential, healthcare organisations in many countries are still not actively using analytics because of these issues. Developments are ongoing to resolve such problems and unlock the untapped benefits of analytics that are already widespread in other industries.


Tuesday 4 August 2020

Udemy Free Courses AWS, JENKINS, ANSIBLE, GIT, MAVEN, DevOPS

Hi All,
Below is a collection of AWS and DevOps related courses available for 🆓 FREE 🆓 on Udemy. These courses are content-rich and should be good enough to get started with AWS and gain in-depth knowledge of DevOps tools and cloud technologies. So create a Udemy account if you don't have one already, and enrol in these courses before they are closed or converted to paid ones. Don't just beg for dumps and become a loser.. learn yourself, study hard, gain and share your knowledge with the community, help others and enjoy!
😊

➡️AWS

https://www.udemy.com/aws-concepts/
https://www.udemy.com/aws-certified-solutions-architect-associate-in-30-days/
https://www.udemy.com/cloud-computing-with-amazon-web-services-part-1/
https://www.udemy.com/learn-amazon-web-services-the-complete-introduction/
https://www.udemy.com/amazon-web-services-aws-v/
https://www.udemy.com/amazon-web-services-aws/
https://www.udemy.com/introduction-to-aws-cloud-computing/
https://www.udemy.com/mastering-aws-featuring-iam/
https://www.udemy.com/namrata-h-shah-aws-tutorials-dynamodb-and-database-migration-service/
https://www.udemy.com/linux-academy-aws-essentials-2019/
https://www.udemy.com/aws-get-started-with-load-balancing-and-auto-scaling-groups/
https://www.udemy.com/aws-certified-associate-exam-vpc-security-group-nacl/
https://www.udemy.com/getting-started-with-amazon-web-services/
https://www.udemy.com/amazon-web-services-aws-ec2-an-introduction/
https://www.udemy.com/introduction-to-serverless-aws/
https://www.udemy.com/amazon-web-services-aws-cloudformation/
https://www.udemy.com/introduction-to-cloud-amazon-web-services-ec2-instance/
https://www.udemy.com/aws-redshift/

➡️JENKINS

https://www.udemy.com/working-with-jenkins/
https://www.udemy.com/jenkins-beginner-tutorial-step-by-step/
https://www.udemy.com/jenkins-quick-start/
https://www.udemy.com/jenkins-devops-pipeline-as-code/
https://www.udemy.com/devops-crash-course-cicd-with-jenkins-pipelines-groovy-dsl/
https://www.udemy.com/jenkins-intro/

➡️ANSIBLE

https://www.udemy.com/ansible-essentials-simplicity-in-automation/
https://www.udemy.com/ansible-quick-start/
https://www.udemy.com/devops-series-server-automation-using-ansible/
https://www.udemy.com/devops-beginners-guide-to-automation-with-ansible/
https://www.udemy.com/just-enough-ansible/

➡️GIT

https://www.udemy.com/git-started-with-github/
https://www.udemy.com/learngit/
https://www.udemy.com/git-expert-4-hours/
https://www.udemy.com/git-and-github-crash-course-creating-a-repository-from-scratch/
https://www.udemy.com/git-and-github-step-by-step-for-beginners/
https://www.udemy.com/intro-to-git/
https://www.udemy.com/short-and-sweet-get-started-with-git-and-github-right-now/
https://www.udemy.com/the-ultimate-git-5-day-challenge/
https://www.udemy.com/git-quick-start/
https://www.udemy.com/gitandgithub/
https://www.udemy.com/git-basics/
https://www.udemy.com/git-bash/

➡️MAVEN
https://www.udemy.com/maven-quick-start/

➡️DevOPS

https://www.udemy.com/learn-devops/
https://www.udemy.com/devops-for-operations/
https://www.udemy.com/linux-academy-devops-essentials/
https://www.udemy.com/learn-devops-kubernetes-deployment-by-kops-and-terraform/
https://www.udemy.com/devops-series-setup-environment-using-virtual-machines/

Thanks,
✴💠Team AWS Cloud Group💠✴