CLOUD COMPUTING EXPERT

Wednesday, 23 December 2015

CLOUD COMPUTING SECURITY ISSUES, CHALLENGES AND SOLUTION

THREATS IN CLOUD COMPUTING

Cloud computing faces just as many security threats as existing computing platforms, networks, intranets, and the Internet do in enterprises. These threats, risks, and vulnerabilities come in various forms.

The Cloud Security Alliance (Cloud Computing Alliance, 2010) conducted research on the threats facing cloud computing and identified the following major threats:

o Failures in Provider Security

o Attacks by Other Customers

o Availability and Reliability Issues

o Legal and Regulatory Issues

o Perimeter Security Model Broken

o Integrating Provider and Customer Security Systems

o Abuse and Nefarious Use of Cloud Computing

o Insecure Application Programming Interfaces

o Malicious Insiders

o Shared Technology Vulnerabilities

o Data Loss/Leakage

o Account, Service & Traffic Hijacking

o Unknown Risk Profile

CLOUD COMPUTATION IMPLEMENTATION GUIDELINES

Steps to Cloud Security

Edwards (2009) stated that, as security risks and vulnerabilities in enterprise cloud computing continue to be discovered, enterprises that want to proceed with cloud computing should use the following steps to verify and understand the security offered by a cloud provider:

· Understand the cloud by realizing how the cloud's uniquely loose structure affects the security of data sent into it. This can be done by gaining an in-depth understanding of how cloud computing transmits and handles data.

· Demand Transparency by making sure that the cloud provider can supply detailed information on its security architecture and is willing to accept regular security audits. These audits should be carried out by an independent body or federal agency.

· Reinforce Internal Security by making sure that the cloud provider's internal security technologies and practices, including firewalls and user access controls, are very strong and mesh well with the cloud security measures.

· Consider the Legal Implications by knowing how laws and regulations will affect what you send into the cloud.

· Pay attention by constantly monitoring any development or changes in the cloud technologies and practices that may impact your data's security.

Information Security Principles C I A (Confidentiality, Integrity, Availability)

• Confidentiality: Prevent unauthorized disclosure

• Integrity: Preserve information integrity

• Availability: Ensure information is available when needed
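The integrity principle can be illustrated with a simple digest check. This is a minimal sketch, assuming the data owner records a SHA-256 digest before sending data to the cloud and compares it after retrieval. The function names `digest` and `is_intact` and the sample record are hypothetical, and the check is only meaningful if the recorded digest itself is kept somewhere trustworthy.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_digest: str) -> bool:
    """Compare the digest of retrieved data with the one recorded earlier."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(digest(data), expected_digest)


# Hypothetical sample record, used only for illustration.
original = b"customer record 42"
recorded = digest(original)  # kept by the owner before upload

print(is_intact(original, recorded))               # unchanged data
print(is_intact(b"customer record 43", recorded))  # modified data
```

A plain digest covers integrity only; confidentiality and availability need separate measures such as encryption and redundancy.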

Identify Assets & Principles

· Customer Data: Confidentiality, integrity, and availability.

· Customer Applications: Confidentiality, integrity, and availability.

· Client Computing Devices: Confidentiality, integrity, and availability.

ISSUES TO CLARIFY BEFORE ADOPTING CLOUD COMPUTING

Gartner, the world's leading information technology research and advisory company, has identified seven security concerns that an enterprise cloud computing user should address with cloud computing providers (Edwards, 2009) before adopting:

· User Access. Ask providers for specific information on the hiring and oversight of privileged administrators and the controls over their access to information. Major companies should demand and enforce their own hiring criteria for personnel who will operate their cloud computing environments.

· Regulatory Compliance. Make sure your provider is willing to submit to external audits and security certifications.

· Data Location. Enterprises should require that the cloud computing provider store and process data in specific jurisdictions and obey the privacy rules of those jurisdictions.

· Data Segregation. Find out what is done to segregate your data, and ask for proof that encryption schemes are deployed and are effective.

· Disaster Recovery Verification. Know what will happen if disaster strikes by asking whether your provider will be able to completely restore your data and service, and find out how long it will take.

· Investigative Support. Ask the provider for a contractual commitment to support specific types of investigations, such as the research involved in the discovery phase of a lawsuit, and verify that the provider has successfully supported such activities in the past. Without evidence, don't assume that it can do so.

· Long-term Viability. Ask prospective providers how you would get your data back if they were to fail or be acquired, and find out if the data would be in a format that you could easily import into a replacement application.

SOLUTION OF SECURITY ISSUES

1. Find Key Cloud Provider: The first solution is to find the right cloud provider. Different vendors have different approaches to cloud IT security and data management. A cloud vendor should be well established and experienced, and should follow recognized standards and regulations, so that there is little chance of the vendor closing down.

2. Clear Contract: The contract with the cloud vendor should be clear, so that the enterprise can make a claim if the vendor closes before the contract ends.

3. Recovery Facilities: Cloud vendors should provide very good recovery facilities, so that if data is fragmented or lost, it can be recovered and continuity of data can be maintained.

4. Better Enterprise Infrastructure: The enterprise must have infrastructure that facilitates the installation and configuration of hardware components such as firewalls, routers, servers, and proxy servers, and of software such as operating systems and thin clients. It should also have infrastructure that protects against cyber-attacks.

5. Use of Data Encryption for Security: Developers should build applications that encrypt data, so that additional security from the enterprise is not required and the security burden is placed on the cloud vendor. IT leaders must define a strategy and the key security elements to determine where data encryption is needed.

6. Prepare a Data Flow Chart: There should be a chart showing the flow of data, so that IT managers know where the data is at all times, where it is stored, and where it is shared. There should be a complete analysis of the data.

Monday, 27 July 2015

HADOOP : WORD COUNT PROBLEM

WHAT IS WORD COUNT

Word count is a typical problem that runs on the Hadoop Distributed File System with MapReduce. It is intended to count the number of occurrences of each word in the provided input file. The word count operation takes place in two phases:

1. Mapper phase: In this phase the text is first tokenized into words, and then a key/value pair is formed for each word, where the key is the word itself and the value is ‘1’. The mapper class runs over the entire data set, splitting it into words and forming the initial key/value pairs. Only after this entire process is completed does the reducer start.

2. Reducer phase: In the reduce phase the keys are grouped together and the values for identical keys are added. This gives the number of occurrences of each word in the input file. It acts as an aggregation phase for each key.

MAP REDUCE ALGORITHM

1. MapReduce is a programming model designed to process large volumes of data in parallel.

2. The map operation, written by the user, takes a set of input key/value pairs and produces a set of intermediate key/value pairs.

3. The reduce function, also written by the user, accepts an intermediate key and a set of values for that key. It merges these values together to form a possibly smaller set of values.

4. Map reduce operations are carried out in hadoop.

5. Hadoop is a distributed sorting engine.

Map and reduce algorithm for the word count problem

mapper(file-name, file-contents):
    for each word in file-contents:
        emit(word, 1)

reducer(word, values):
    sum = 0
    for each value in values:
        sum = sum + value
    emit(word, sum)
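The pseudocode above can be tried out on a single machine. The following is a minimal Python sketch, not Hadoop code: the in-memory grouping in `word_count` stands in for the shuffle and sort that Hadoop performs between the two phases, and the lowercasing of words, the function names, and the sample file contents are assumptions made for illustration.

```python
from collections import defaultdict


def mapper(file_name, file_contents):
    """Emit a (word, 1) pair for every word in the file contents."""
    for word in file_contents.split():
        yield (word.lower(), 1)


def reducer(word, values):
    """Sum all counts emitted for one word."""
    return (word, sum(values))


def word_count(files):
    """Run map, group-by-key, and reduce over a dict of {file_name: contents}."""
    grouped = defaultdict(list)
    for file_name, contents in files.items():
        for word, one in mapper(file_name, contents):
            grouped[word].append(one)  # stands in for Hadoop's shuffle and sort
    return dict(reducer(word, values) for word, values in sorted(grouped.items()))


counts = word_count({"a.txt": "the cat sat", "b.txt": "the dog sat down"})
print(counts)  # {'cat': 1, 'dog': 1, 'down': 1, 'sat': 2, 'the': 2}
```

On a real cluster the mapper calls run in parallel on different input splits, and each reducer only sees the keys assigned to it.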

DATA FLOW DIAGRAM

METHODOLOGY

MapReduce is a three-step approach to solving a problem:

Step 1:- Map

The purpose of the map step is to group or divide data into sets based on the desired value. While using the map function we need to be careful about three things:

1. How do we want to divide or group the data?

2. Which part of the data do we need, and which part is extraneous?

3. In what form or structure do we need our data?

Step 2:- Reduce

The reduce operation combines the different values for each given key using a user-defined function. It takes each key, picks up all the values created for it in the map step, and processes them one by one using custom-defined logic. It takes two parameters:

1. Key

2. Array of values

Step 3:- Finalize

It is used to do any required transformation on the final output of the reduce.
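The three steps can be sketched with a small average-per-key computation. This is an illustrative Python example under stated assumptions, not a Hadoop API: the names `map_step`, `reduce_step`, and `finalize_step` are hypothetical, and the temperature records are made-up sample values. Map picks the grouping key and drops the extraneous field, reduce combines the array of values for a key, and finalize transforms the reduce output into its final form.

```python
from collections import defaultdict


def map_step(record):
    """Keep only the fields we need and emit (key, value)."""
    city, _date, temperature = record  # the date is extraneous here
    return (city, temperature)


def reduce_step(key, values):
    """Combine all values for one key into a (sum, count) pair."""
    return (sum(values), len(values))


def finalize_step(key, reduced):
    """Transform the reduce output into its final form: the average."""
    total, count = reduced
    return round(total / count, 2)


# Made-up sample records: (city, date, temperature).
records = [
    ("Pune", "2015-07-01", 29.0),
    ("Pune", "2015-07-02", 31.0),
    ("Delhi", "2015-07-01", 38.0),
]

groups = defaultdict(list)
for record in records:
    key, value = map_step(record)
    groups[key].append(value)

result = {key: finalize_step(key, reduce_step(key, values))
          for key, values in groups.items()}
print(result)  # {'Pune': 30.0, 'Delhi': 38.0}
```

Keeping the sum and the count separate in the reduce step, and dividing only in finalize, means partial results from several reducers could still be merged correctly.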

FUTURE SCOPE

Because of its parallel processing, MapReduce is used in many applications such as document clustering, web link graph reversal, and inverted index construction. It increases the efficiency of handling big data, and it is used where data must be available at all times and security is needed. MapReduce-related works include:

· Yahoo!: The Web Map application uses Hadoop to create a database of information on all known web pages.

· Facebook: Uses Hadoop to provide its Hive data warehouse.

· Rackspace: Analyzes server log files and usage data using Hadoop.
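Inverted index construction, one of the applications mentioned above, fits the map and reduce pattern well. The following is a minimal single-machine Python sketch, assuming whitespace tokenization and lowercasing; the function names and the two sample documents are hypothetical. The map step emits (word, document id) pairs and the reduce step collects the documents that contain each word.

```python
from collections import defaultdict


def map_document(doc_id, text):
    """Emit (word, doc_id) for each distinct word in a document."""
    for word in set(text.lower().split()):
        yield (word, doc_id)


def reduce_postings(word, doc_ids):
    """Collect the sorted list of documents containing the word."""
    return (word, sorted(set(doc_ids)))


def build_inverted_index(documents):
    """Group the mapper output by word, then reduce each group."""
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for word, emitted_id in map_document(doc_id, text):
            grouped[word].append(emitted_id)
    return dict(reduce_postings(word, ids) for word, ids in grouped.items())


index = build_inverted_index({
    "doc1": "hadoop stores data",
    "doc2": "hadoop processes data in parallel",
})
print(index["hadoop"])    # ['doc1', 'doc2']
print(index["parallel"])  # ['doc2']
```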

HADOOP: OVERVIEW

Hadoop is an Apache open source framework written in Java that allows distributed processing of large datasets across clusters of computers using simple programming models. An application built on the Hadoop framework works in an environment that provides distributed storage and computation across clusters of computers. Hadoop is designed to scale up from a single server to thousands of machines, each offering local computation and storage.

Hadoop Architecture

The Hadoop framework includes the following four modules:

· Hadoop Common: These are Java libraries and utilities required by other Hadoop modules. These libraries provide file system and OS level abstractions and contain the necessary Java files and scripts required to start Hadoop.

· Hadoop YARN: This is a framework for job scheduling and cluster resource management.

· Hadoop Distributed File System (HDFS): A distributed file system that provides high-throughput access to application data.

· Hadoop MapReduce: This is a YARN-based system for parallel processing of large data sets.

We can use the following diagram to depict these four components available in the Hadoop framework.

MapReduce

Hadoop MapReduce is a software framework for easily writing applications that process large amounts of data in parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.

The term MapReduce actually refers to the following two different tasks that Hadoop programs perform:

· The Map Task: This is the first task, which takes input data and converts it into a set of data, where individual elements are broken down into tuples (key/value pairs).

· The Reduce Task: This task takes the output from a map task as input and combines those data tuples into a smaller set of tuples. The reduce task is always performed after the map task.

Typically both the input and the output are stored in a file system. The framework takes care of scheduling tasks, monitoring them, and re-executing the failed tasks.

The MapReduce framework consists of a single master JobTracker and one slave TaskTracker per cluster node. The master is responsible for resource management, tracking resource consumption and availability, and scheduling the jobs' component tasks on the slaves, monitoring them, and re-executing the failed tasks. The slave TaskTrackers execute the tasks as directed by the master and periodically provide task-status information to the master.

The JobTracker is a single point of failure for the Hadoop MapReduce service which means if JobTracker goes down, all running jobs are halted.
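The master's habit of re-executing failed tasks can be shown with a toy simulation. This is a deliberately simplified, sequential Python sketch and not how the JobTracker is implemented: the real master schedules tasks in parallel on TaskTrackers and learns of failures through their status reports. The names `run_job` and `flaky_execute`, the task names, and the limit of three attempts are all assumptions made for illustration.

```python
def run_job(tasks, execute, max_attempts=3):
    """Run every task, re-executing failures up to max_attempts times.

    `execute(task, attempt)` returns True on success and False on failure.
    Returns a dict mapping each task to the number of attempts it needed,
    or None if it never succeeded.
    """
    attempts_used = {}
    for task in tasks:
        attempts_used[task] = None
        for attempt in range(1, max_attempts + 1):
            if execute(task, attempt):
                attempts_used[task] = attempt
                break
    return attempts_used


def flaky_execute(task, attempt):
    """Hypothetical worker: task 'map-1' fails on its first attempt only."""
    return not (task == "map-1" and attempt == 1)


status = run_job(["map-0", "map-1", "reduce-0"], flaky_execute)
print(status)  # {'map-0': 1, 'map-1': 2, 'reduce-0': 1}
```

The sketch also hints at why a single master is a weak point: all of the retry bookkeeping lives in one place, so losing it loses the state of every running job.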

Hadoop Distributed File System

Hadoop can work directly with any mountable distributed file system such as Local FS, HFTP FS, S3 FS, and others, but the most common file system used by Hadoop is the Hadoop Distributed File System (HDFS).

The Hadoop Distributed File System (HDFS) is based on the Google File System (GFS) and provides a distributed file system that is designed to run on large clusters (thousands of computers) of small computer machines in a reliable, fault-tolerant manner.

HDFS uses a master/slave architecture where the master consists of a single NameNode that manages the file system metadata and one or more slave DataNodes that store the actual data.

A file in an HDFS namespace is split into several blocks and those blocks are stored in a set of DataNodes. The NameNode determines the mapping of blocks to the DataNodes. The DataNodes take care of read and write operations with the file system. They also take care of block creation, deletion, and replication based on instructions given by the NameNode.

HDFS provides a shell like any other file system, and a list of commands is available to interact with the file system. These shell commands will be covered in a separate chapter along with appropriate examples.
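The splitting of a file into blocks and the mapping of blocks to DataNodes can be sketched as follows. This is an illustrative Python model, not the HDFS implementation: the 128 MB block size, the replication factor of 3, the DataNode names, and the simple round-robin placement are assumptions chosen for the example, and the real NameNode uses a more elaborate placement policy.

```python
def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks a file of file_size bytes is split into."""
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last block may be smaller
    return sizes


def place_blocks(num_blocks, data_nodes, replication):
    """Assign each block to `replication` distinct DataNodes, round-robin."""
    if replication > len(data_nodes):
        raise ValueError("replication factor exceeds number of DataNodes")
    block_map = {}
    for block_id in range(num_blocks):
        block_map[block_id] = [
            data_nodes[(block_id + r) % len(data_nodes)]
            for r in range(replication)
        ]
    return block_map


MB = 1024 * 1024
sizes = split_into_blocks(300 * MB, 128 * MB)
block_map = place_blocks(len(sizes), ["dn1", "dn2", "dn3", "dn4"], replication=3)
print([s // MB for s in sizes])  # [128, 128, 44]
print(block_map[0])              # ['dn1', 'dn2', 'dn3']
```

In this model `block_map` plays the role of the metadata held by the NameNode, while the block contents themselves would live only on the DataNodes.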

How Does Hadoop Work?

Stage 1

A user or application can submit a job to Hadoop (through a Hadoop job client) for the required processing by specifying the following items:

· The location of the input and output files in the distributed file system.

· The Java classes, in the form of a jar file, containing the implementation of the map and reduce functions.

· The job configuration, set through different parameters specific to the job.

Stage 2

The Hadoop job client then submits the job (jar/executable etc) and configuration to the JobTracker which then assumes the responsibility of distributing the software/configuration to the slaves, scheduling tasks and monitoring them, providing status and diagnostic information to the job-client.

Stage 3

The TaskTrackers on different nodes execute the task as per MapReduce implementation and output of the reduce function is stored into the output files on the file system.

Monday, 20 July 2015

GREEN CLOUD COMPUTING VERSUS MOBILE CLOUD COMPUTING

Computing means any goal-oriented activity requiring, benefiting from, or creating computers. Thus, computing includes designing and building hardware and software systems for a wide range of purposes; processing, structuring, and managing various kinds of information; doing scientific studies using computers; making computer systems behave intelligently; creating and using communications and entertainment media; finding and gathering information relevant to any particular purpose.

GREEN COMPUTING

Green computing is the environmentally responsible and eco-friendly use of computers and their resources. In broader terms, it is also defined as the study of designing, manufacturing/engineering, using and disposing of computing devices in a way that reduces their environmental impact. Many IT manufacturers and vendors are continuously investing in designing energy efficient computing devices, reducing the use of dangerous materials and encouraging the recyclability of digital devices and paper. Green computing is also known as green information technology (green IT). Green computing, or green IT, aims to attain economic viability and improve the way computing devices are used. Green IT practices include the development of environmentally sustainable production practices, energy efficient computers and improved disposal and recycling procedures.

MOBILE COMPUTING

Mobile computing is human–computer interaction by which a computer is expected to be transported during normal usage. Mobile computing involves mobile communication, mobile hardware, and mobile software. Communication issues include ad-hoc and infrastructure networks as well as communication properties, protocols, data formats, and concrete technologies. Hardware includes mobile devices or device components. Mobile software deals with the characteristics and requirements of mobile applications. Thus, mobile computing is the ability to use computing capability without a pre-defined location and/or connection to a network to publish and/or subscribe to information. The purpose of this paper is to compare green cloud computing and mobile cloud computing, examine their security issues, and identify the security solutions they have in common.

GREEN CLOUD COMPUTING

Green cloud is a buzzword that refers to the potential environmental benefits that information technology (IT) services delivered over the Internet can offer society. The term combines the words green -- meaning environmentally friendly -- and cloud, the traditional symbol for the Internet and the shortened name for a type of service delivery model known as cloud computing.

Benefits of Green Cloud Computing

· Reduced Cost

· Automatic Updates

· Green Benefits of Cloud computing

· Remote Access

· Disaster Relief

· Self-service provisioning

· Scalability

· Reliability and fault-tolerance

· Ease of Use

· Skills and Proficiency

· Response Time

· Increased Storage

· Mobility

Security Issues in Green cloud computing

The chief concern in cloud environments is to provide security around multi-tenancy and isolation, giving customers more comfort than the “trust us” idea of clouds. Survey works have been reported that classify security threats in the cloud based on the nature of the service delivery models of a cloud computing system. However, security requires a holistic approach. The service delivery model is one of many aspects that need to be considered for a comprehensive survey on cloud security. Security at different levels, such as the network level, host level, and application level, is necessary to keep the cloud up and running continuously. In accordance with these different levels, various types of security breaches may occur.

Four types of issues arise when discussing the security of a cloud:

· Data Issues

· Privacy issues

· Infected Application

· Security issues

Solution to security issues in Green Cloud Computing

1) Control the consumer access devices: Be sure the consumer's access devices or points, such as personal computers, virtual terminals, gadgets, tablets, and mobile phones, are secure enough. The loss of an endpoint access device, or access to the device by an unauthorized user, can negate even the best security protocols in the cloud. Be sure the user computing devices are managed properly, secured against malware, and support advanced authentication features.

2) Monitor the Data Access: Cloud service providers have to be able to state who accessed what data, when, and for what purpose. For example, many websites and servers have received security complaints about snooping activities such as listening to voice calls and reading emails and personal data.

3) Share demanded records and verify the data deletion: If the user or consumer needs to report its compliance, the cloud service provider should share diagrams or other information, or provide audit records, to the consumer or user. Also verify the proper deletion of data from shared or reused devices. Many providers do not provide for the proper degaussing of data from drives each time the drive space is abandoned. Insist on a secure deletion process and have that process written into the contract.

4) Security check events: Ensure that the cloud service provider gives enough details about fulfillment of promises, breach remediation, and reporting contingency. These security events describe the responsibilities, promises, and actions of the cloud computing service provider.

MOBILE CLOUD COMPUTING

Mobile cloud computing is the combination of cloud computing and mobile networks to bring benefits to mobile users, network operators, and cloud providers. Cloud computing exists when tasks and data are kept on the Internet rather than on individual devices, providing on-demand access. Mobile apps may use the cloud for both app development and hosting. A number of unique characteristics of hosted apps make the mobile cloud different from regular cloud computing. Mobile apps may be more reliant upon the cloud to provide much of the computing, storage, and communication fault tolerance than regular cloud computing is.

Benefits of Mobile Cloud Computing

· Extending battery lifetime

· Improving data storage capacity and processing power

· Improving reliability

Security Issues in Mobile cloud Computing

Cloud computing, as opposed to standard computing, has several issues which can cause reluctance or fear in the user base. Some of these issues include concerns about privacy, data ownership, and security. Some of these concerns are especially relevant to mobile devices. In this section, the paper discusses some of these issues, including both incidents involving them and techniques used to combat them.

· Privacy

· Data Ownership

· Data Access and Security

Solution to Security issues in Mobile Cloud computing

Individuals and enterprises take advantage of the benefits of storing large amounts of data or applications on a cloud. However, issues of integrity, authentication, and digital rights must be taken care of.

1) Integrity: Every mobile cloud user must ensure the integrity of the information they store on the cloud. Every access they make must be authenticated and verified. Different approaches to preserving the integrity of information stored on the cloud have been proposed.

2) Authentication: Different authentication mechanisms that use cloud computing have been proposed to secure data access in ways suitable for mobile environments. Some use open standards and even support the integration of various authentication methods, for example access or log-in IDs, passwords or PINs, and authentication requests.

3) Digital rights management: Illegal distribution and piracy of digital content such as video, images, audio, e-books, and programs are becoming more and more common. Some solutions have been implemented to protect this content from illegal access, such as the provision of encryption and decryption keys. Encoding or decoding must then be done before any mobile user can access such digital content.
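Password or PIN based authentication, mentioned in point 2 above, is commonly implemented by storing a salted hash rather than the secret itself. The following is a minimal Python sketch using the standard library's PBKDF2 function. The function names, the iteration count of 100,000, and the sample PIN are assumptions for illustration, not a recommendation drawn from the source.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = 100_000) -> Tuple[bytes, int, bytes]:
    """Derive a salted hash; only (salt, iterations, hash) needs storing."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each new password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                  salt, iterations)
    return salt, iterations, derived


def verify_password(password: str, salt: bytes, iterations: int,
                    expected: bytes) -> bool:
    """Recompute the hash for a login attempt and compare in constant time."""
    _, _, derived = hash_password(password, salt, iterations)
    return hmac.compare_digest(derived, expected)


# Hypothetical sample PIN, used only for illustration.
salt, iterations, stored = hash_password("1234-example-PIN")
print(verify_password("1234-example-PIN", salt, iterations, stored))  # True
print(verify_password("wrong-PIN", salt, iterations, stored))         # False
```

Because only the salt, the iteration count, and the derived hash are stored, a leak of the stored values does not directly reveal the user's password or PIN.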