Virtualization

Virtualization is being used by a growing number of organizations. It simplifies operations in the IT sector and allows these organizations to respond faster to changing demands.

What do we mean by Virtualization?

Virtualization is the process of creating a virtual environment on an existing server to run a desired program, without interfering with any of the other services the server or host platform provides to other users.

In computing, virtualization refers to the process of creating a “virtual” version of something, such as an operating system (OS), a server, network resources, or a storage device.

So, what does Virtualization do?

It allows you to run multiple applications and operating systems on the same server: one physical server is shared among multiple machines, which provides efficient resource utilization and helps in reducing costs.


In other words, virtualization is a technique that allows a single physical instance of a resource or an application to be shared among multiple customers and organizations. It does this by assigning a logical name to a physical resource and providing a pointer to that physical resource when it is demanded.

Virtualization plays a very important role in cloud computing technology. Normally in cloud computing, users share the data present in the cloud, such as applications; with the help of virtualization, they share the underlying infrastructure as well.

The virtual environment can be a single instance or a combination of many, such as operating systems, network or application servers, computing environments, storage devices, and other such environments.

The machine on which the virtual machine is created is known as the host machine, and the virtual machine itself is referred to as a guest machine.

This virtual machine is managed by software or firmware known as a hypervisor.

The hypervisor acts as a link between the hardware and the virtual environment and distributes hardware resources, such as CPU time and memory, among the different virtual environments.


Working Of Virtualization 

With virtualization, an application, a guest OS, or data storage is separated from the underlying software or hardware. A thin software layer, known as a hypervisor, partitions the machine, or more specifically, abstracts and isolates these different OSes and applications from the underlying computer hardware. Therefore, it would not be incorrect to say that virtualization is enabled by the functions of the hypervisor.

 

After virtualization, different user applications managed by their own operating systems (guest OSes) can run on the same hardware, independent of the host OS. This is done by adding a layer of software called a virtualization layer, known as the hypervisor or virtual machine monitor (VMM). The VMs sit above this layer: applications run with their own guest OS over the virtualized CPU, memory, and I/O resources.

What this means is that the underlying hardware (known as the host machine) can independently operate and run one or more virtual machines (known as guest machines). The hypervisor also helps manage these independent virtual machines by distributing hardware resources such as memory allotment, CPU usage, network bandwidth, and more amongst them. It does this by creating pools of abstracted hardware resources, which it then allocates to virtual machines. It can also stop and start virtual machines when requested by the user.
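
To make the resource-pooling idea concrete, here is a minimal Python sketch of the bookkeeping a hypervisor performs; the class and method names are purely illustrative, not a real hypervisor's API.

    # Illustrative sketch: how a hypervisor tracks a pool of abstracted
    # hardware resources and allocates slices of the pool to VMs.
    class Hypervisor:
        def __init__(self, total_cpus, total_mem_gb):
            self.free_cpus = total_cpus    # unallocated CPU cores in the pool
            self.free_mem = total_mem_gb   # unallocated memory (GB) in the pool
            self.vms = {}                  # running guest machines

        def start_vm(self, name, cpus, mem_gb):
            # Refuse the request if the pool cannot satisfy it.
            if cpus > self.free_cpus or mem_gb > self.free_mem:
                raise RuntimeError(f"insufficient resources for {name}")
            self.free_cpus -= cpus
            self.free_mem -= mem_gb
            self.vms[name] = (cpus, mem_gb)

        def stop_vm(self, name):
            # Return the guest's share to the pool.
            cpus, mem_gb = self.vms.pop(name)
            self.free_cpus += cpus
            self.free_mem += mem_gb

    host = Hypervisor(total_cpus=16, total_mem_gb=64)
    host.start_vm("web-server", cpus=4, mem_gb=8)
    host.start_vm("database", cpus=8, mem_gb=32)
    host.stop_vm("web-server")   # its 4 CPUs and 8 GB return to the pool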

Another key responsibility of the hypervisor is ensuring that all the virtual machines stay isolated from one another, so that when a problem occurs in one virtual machine, the others remain unaffected. Finally, the hypervisor also handles communication among virtual machines over virtual networks, enabling VMs to connect with one another.

The most common form is known as Type 1, where the hypervisor sits directly on the hardware and virtualizes the hardware platform so that multiple virtual machines can utilize it. A Type 2 hypervisor, on the other hand, runs on a host operating system and uses it to create isolated guest virtual machines.

Each virtual server mimics the functionality of a dedicated server, with many such servers on one physical machine. Each virtual server is designated its own OS, software, and rebooting provisions, via root access. In a virtual server environment, website administrators and ISPs can have separate domain names, IP addresses, analytics, logs, file directories, email administration, and more. Security systems and passwords also function separately, as they would in a dedicated server environment.

Virtualization provides numerous benefits, including cost reduction, efficient resource utilization, better accessibility, and risk minimization, among others.

Features

  1. Partitioning: Multiple virtual servers can run on one physical server at the same time.
  2. Encapsulation of data: All data on the virtual server, including boot disks, is encapsulated in a file format.
  3. Isolation: The virtual servers running on the physical server are safely separated and do not affect each other.
  4. Hardware Independence: A running virtual server can be migrated to a different hardware platform.

Advantages

  • The number of physical servers is reduced by the use of virtualization.
  • It improves the capabilities of existing technology and hardware.
  • Business continuity is also improved by the use of virtualization.
  • It creates a mixed virtual environment (different OSes side by side).
  • It increases efficiency for development and test environments.
  • It lowers Total Cost of Ownership (TCO).

Cloud vs Virtualization

  1. There is an essential difference between the two terms, even though cloud technology requires the concept of virtualization. Virtualization is a technology, software that manipulates hardware, whereas cloud computing is a service that results from that manipulation.
  2. Virtualization is a foundational element of cloud computing, whereas cloud technology is the delivery of shared resources as a service on demand via the internet.
  3. The cloud is essentially built on the concept of virtualization.

Types of Virtualization

  1. Hardware Virtualization
  2. Software Virtualization 
  3. OS Virtualization
  4. Server Virtualization
  5. Storage Virtualization

#Hardware Virtualization

Virtualization means abstraction. Hardware virtualization is accomplished by abstracting the physical hardware layer by use of a hypervisor or VMM (Virtual Machine Monitor).

When virtual machine software, also called a virtual machine manager (VMM) or hypervisor, is installed directly on the hardware system, this is known as hardware virtualization.

The primary tasks of the hypervisor are process monitoring, memory management, and hardware control. After hardware virtualization is done, different operating systems can be installed and various applications can run on the platform.

Hardware virtualization, when done for server platforms, is also called server virtualization. Hardware virtualization is mainly done for the server platforms, because controlling virtual machines is much easier than controlling a physical server. 

Hardware virtualization is of three kinds.

  1. Full Virtualization: Here, the hardware architecture is completely simulated, so guest software does not need any modification to run.
  2. Emulation Virtualization: Here, the virtual machine simulates the hardware and is thus independent of it; the guest OS does not require modification.
  3. Para-Virtualization: Here, the hardware is not simulated; instead, the guest software runs in its own isolated domain and is typically modified to cooperate with the hypervisor.

Advantages 

  • Lower Cost: Because of server consolidation, cost decreases; multiple OSes can now coexist on a single piece of hardware. This minimizes rack space, reduces the number of servers, and eventually drops power consumption.
  • Efficient resource utilization: Physical resources are shared among virtual machines; resources left unused by one virtual machine can be used by another in case of need.
  • Increased IT flexibility: Virtualization makes quick provisioning of hardware resources possible, and the resources can also be managed consistently.
  • Advanced hardware virtualization features: Modern hypervisors support highly complex operations that maximize the abstraction of hardware and ensure maximum uptime; for example, a running virtual machine can be migrated dynamically from one host to another (live migration).

#Software Virtualization

Software virtualization, also called application virtualization, is the practice of running software from a remote server.

Software virtualization is similar to hardware virtualization, except that it abstracts the software installation procedure and creates virtual software installations.

Many applications & their distributions became typical tasks for IT firms and departments. The mechanism for installing an application differs. So virtualized software is introduced which is an application that will be installed into its self-contained unit and provide software virtualization. 

Some examples are VirtualBox, VMware, etc.


A DLL (Dynamic Link Library) redirects all of the virtualized program's calls to the server's file system. When the software is run from the server in this way, no changes are required on the local system.

Types

  • Operating System Virtualization – hosting multiple OS on the native OS
  • Application Virtualization – hosting individual applications in a virtual environment separate from the native OS
  • Service Virtualization – hosting specific processes and services related to a particular application

Advantages

  1. Ease of Client Deployment: Virtualized software makes deployment as easy as linking a file on the network or copying a file to the workstation.
  2. Software Migration: Before virtualization, shifting from one software platform to another was time-consuming and had a significant impact on end users. A software virtualization environment makes migration easier.
  3. Easy to Manage: Application updates become a simple task.

#Server Virtualization

Server virtualization is the division of a physical server into several virtual servers, done mainly to improve the utilization of server resources.

In other words, it is the masking of server resources, including the number and identity of processors, physical servers, and operating systems. This division of one physical server into multiple isolated virtual servers is done by the server administrator using software.

The virtual environments are sometimes called virtual private servers.

In this process, the server resources are kept hidden from the user. Partitioning the physical server into several virtual environments allows one virtual server to be dedicated to a single application or task.

For Server Virtualization, there are three popular approaches.

  • Virtual Machine model
  • Para-virtual Machine model
  • Operating System (OS) layer Virtualization

Server virtualization can be viewed as part of an overall virtualization trend in IT companies that includes network virtualization, storage virtualization, and workload management; this trend is driving the development of autonomic computing. Server virtualization can also be used to eliminate server sprawl (a situation in which many under-utilized servers take up more space or consume more resources than their workload justifies) and to use server resources efficiently.

  1. Virtual Machine model: based on the host-guest paradigm, where each guest runs on a virtual replica of the hardware layer. This technique allows the guest OS to run without modification. However, it requires real computing resources from the host, so a hypervisor (virtual machine monitor) is required to coordinate instructions to the CPU.
  2. Para-Virtual Machine model: also based on the host-guest paradigm, and it too uses a virtual machine monitor. In this model, however, the VMM modifies the guest operating system's code, a step called 'porting'. Like the virtual machine model, the para-virtual machine model is capable of executing multiple operating systems. The para-virtual model is used by both Xen and UML.
  3. Operating System Layer Virtualization: virtualization at the OS level works in a different way and is not based on the host-guest paradigm. In this model, the host runs a single operating system kernel as its core and exports its functionality to each of the guests; as a result, each guest must use the same operating system as the host. This architecture eliminates system calls between layers, which reduces CPU overhead. Each partition must remain strictly isolated from its neighbors, so that a failure or security breach in one partition cannot affect the others. OS-level virtualization never uses a hypervisor.
Advantages
  • Cost Reduction: Server virtualization reduces cost because less hardware is required.
  • Independent Restart: Each virtual server can be rebooted independently, and that reboot won't affect the working of the other virtual servers.
  • Disaster Recovery: One of the best advantages of server virtualization; data can be moved easily and quickly from one server to another, and it can be stored and retrieved from anywhere.
  • Faster deployment of resources: Server virtualization allows us to deploy our resources in a simpler and faster way.
  • Security: It allows users to store their sensitive data inside the data centers.

Disadvantages of Server Virtualization

Server virtualization has the following disadvantages:

  • The biggest disadvantage of server virtualization is that when the physical server goes offline, all the websites hosted on it also go down.
  • It is difficult to measure the performance of virtualized environments.
  • It consumes a large amount of RAM.
  • It is difficult to set up and maintain.
  • Some core applications and databases do not support virtualization.
  • It requires extra hardware resources.

Uses of Server Virtualization

A list of uses of server virtualization is given below:

  • Server Virtualization is used in the testing and development environment.
  • It improves the availability of servers.
  • It allows organizations to make efficient use of resources.
  • It reduces redundancy without purchasing additional hardware components.

#Storage Virtualization

Storage virtualization pools physical storage from different network storage devices and makes it appear to be a single storage unit handled from a single console. There has traditionally been a strong bond between a physical host and its locally installed storage devices, but with the change in paradigm, local storage is no longer needed, and more advanced storage with greater functionality has come to the market. Storage virtualization is a significant component of storage servers and facilitates management and monitoring of storage in a virtualized environment.

Storage virtualization helps the storage administrator back up, archive, and recover data more efficiently and in less time by masking the actual complexity of the SAN (Storage Area Network). The storage administrator can implement virtualization through software or hybrid appliances.

Storage virtualization is becoming more and more important in different forms such as:

  • Storage Tiering: This technique analyzes the most commonly used data and places it on the highest-performing storage pool, while the least used data is placed on the weakest-performing storage pool (a toy sketch of this placement logic follows this list).
  • WAN Environment: Instead of sending multiple copies of the same data over the WAN, a WAN accelerator is used to cache the data locally and present it at LAN speed, without impacting WAN performance.
  • SAN Storage: SAN technology presents the storage as block-level storage, and the storage is presented to the OS over the network.
  • File Server: The OS writes data to a remote server location to keep it separate and secure from local users.
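
A toy sketch of the tiering decision described above, in Python; the pool names and thresholds are invented for illustration, not drawn from any real storage product.

    # Toy tiering policy: hot data goes to the fastest pool, cold data to the slowest.
    def choose_tier(reads_per_day):
        if reads_per_day >= 100:
            return "ssd-pool"      # highest-performing storage pool
        if reads_per_day >= 10:
            return "sas-pool"      # middle tier
        return "archive-pool"      # weakest-performing storage pool

    datasets = {"orders.db": 500, "logs-2023.tar": 3, "reports.xlsx": 40}
    for name, reads in datasets.items():
        print(name, "->", choose_tier(reads))
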
Benefits

  • Data is stored in a more convenient location, away from any specific host; if a host fails, the data is not necessarily compromised.
  • Storage-level abstraction gives flexibility in how storage is provided, protected, partitioned, and used.
  • Storage devices can perform advanced functions such as disaster recovery, duplication/replication of data, and deduplication.
 
Desktop virtualization 

Desktop virtualization is a technology that lets users simulate a workstation load to access a desktop from a connected device, remotely or locally. This separates the desktop environment and its applications from the physical client device used to access it. Desktop virtualization is a key element of the digital workspace and depends on application virtualization.

How does desktop virtualization work?

Desktop virtualization can be achieved in a variety of ways, but the two most important types are distinguished by whether the operating system instance runs locally or remotely.

Local Desktop Virtualization

Local desktop virtualization means the operating system runs on a client device using hardware virtualization, and all processing and workloads occur on local hardware. This type of desktop virtualization works well when users do not need a continuous network connection and can meet application computing requirements with local system resources. However, because processing is done locally, local desktop virtualization cannot be used to share VMs or resources across a network with thin clients or mobile devices.

Remote Desktop Virtualization

Remote desktop virtualization is a common use of virtualization that operates in a client/server computing environment. This allows users to run operating systems and applications from a server inside a data center while all user interactions take place on a client device. This client device could be a laptop, thin client device, or a smartphone. The result is IT departments have more centralized control over applications and desktops, and can maximize the organization’s investment in IT hardware through remote access to shared computing resources.

What is virtual desktop infrastructure?

A popular type of desktop virtualization is virtual desktop infrastructure (VDI). VDI is a variant of the client-server model of desktop virtualization which uses host-based VMs to deliver persistent and nonpersistent virtual desktops to all kinds of connected devices. With a persistent virtual desktop, each user has a unique desktop image that they can customize with apps and data, knowing it will be saved for future use. A nonpersistent virtual desktop infrastructure allows users to access a virtual desktop from an identical pool when they need it; once the user logs out of a nonpersistent VDI, it reverts to its unaltered state. Some of the advantages of virtual desktop infrastructure are improved security and centralized desktop management across an organization.
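
The persistent/nonpersistent distinction comes down to whether a user's desktop state survives logout. Here is a hypothetical Python sketch; the names and structure are invented for illustration, not a real VDI product's API.

    import copy

    # Every nonpersistent session starts from this identical "golden" image.
    GOLDEN_IMAGE = {"apps": ["browser"], "files": []}

    class DesktopPool:
        def __init__(self, persistent):
            self.persistent = persistent
            self.saved = {}   # user -> customized desktop image

        def log_in(self, user):
            if self.persistent:
                # Persistent: each user keeps a unique image they can customize.
                return self.saved.setdefault(user, copy.deepcopy(GOLDEN_IMAGE))
            # Nonpersistent: hand out a fresh copy of the identical image.
            return copy.deepcopy(GOLDEN_IMAGE)

        def log_out(self, user, desktop):
            if self.persistent:
                self.saved[user] = desktop   # customizations are kept
            # Nonpersistent: the desktop is discarded, reverting the pool
            # to its unaltered state for the next login.

    pool = DesktopPool(persistent=False)
    d = pool.log_in("user1")
    d["files"].append("draft.txt")   # customization during the session
    pool.log_out("user1", d)         # discarded: next login is pristine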

What are the benefits of desktop virtualization?

  1. Resource Management:
    Desktop virtualization helps IT departments get the most out of their hardware investments by consolidating most of their computing in a data center. Desktop virtualization then allows organizations to issue lower-cost computers and devices to end users because most of the intensive computing work takes place in the data center. By minimizing how much computing is needed at the endpoint devices for end users, IT departments can save money by buying less costly machines.
  2. Remote work:
    Desktop virtualization helps IT admins support remote workers by giving IT central control over how desktops are virtually deployed across an organization’s devices. Rather than manually setting up a new desktop for each user, desktop virtualization allows IT to simply deploy a ready-to-go virtual desktop to that user’s device. Now the user can interact with the operating system and applications on that desktop from any location and the employee experience will be the same as if they were working locally. Once the user is finished using this virtual desktop, they can log off and return that desktop image to the shared pool.
  3. Security:
Desktop virtualization software gives IT admins centralized security control over which users can access which data and which applications. If a user's permissions change because they leave the company, desktop virtualization makes it easy for IT to quickly remove that user's access to their persistent virtual desktop and all its data, instead of having to manually uninstall everything from that user's devices. And because all company data lives inside the data center rather than on each machine, a lost or stolen device does not pose the same data risk. If someone steals a laptop that uses desktop virtualization, there is no company data on the actual machine and hence less risk of a breach.
 
Server Virtualization

What is Server Virtualization?

Server virtualization is used to mask server resources from server users. This can include the number and identity of operating systems, processors, and individual physical servers.

Server Virtualization Definition

Server virtualization is the process of dividing a physical server into multiple unique and isolated virtual servers by means of a software application. Each virtual server can run its own operating systems independently.

Key Benefits of Server Virtualization:

  • Higher server availability
  • Cheaper operating costs
  • Reduced server complexity
  • Increased application performance
  • Quicker workload deployment

Three Kinds of Server Virtualization:

  1. Full Virtualization: Full virtualization uses a hypervisor, a type of software that directly communicates with a physical server's disk space and CPU. The hypervisor monitors the physical server's resources and keeps each virtual server independent and unaware of the other virtual servers. It also relays resources from the physical server to the correct virtual server as it runs applications. The biggest limitation of using full virtualization is that a hypervisor has its own processing needs. This can slow down applications and impact server performance.
  2. Para-Virtualization: Unlike full virtualization, para-virtualization involves the entire network working together as a cohesive unit. Since each operating system on the virtual servers is aware of one another in para-virtualization, the hypervisor does not need to use as much processing power to manage the operating systems.
  3. OS-Level Virtualization: Unlike full and para-virtualization, OS-level virtualization does not use a hypervisor. Instead, the virtualization capability, which is part of the physical server's operating system, performs all the tasks of a hypervisor. However, all the virtual servers must run the same operating system in this server virtualization method.

Why Server Virtualization?

Server virtualization is a cost-effective way to provide web hosting services and effectively utilize existing resources in IT infrastructure. Without server virtualization, servers only use a small part of their processing power. This results in servers sitting idle because the workload is distributed to only a portion of the network’s servers. Data centers become overcrowded with underutilized servers, causing a waste of resources and power.

By having each physical server divided into multiple virtual servers, server virtualization allows each virtual server to act as a unique physical device. Each virtual server can run its own applications and operating system. This process increases the utilization of resources by making each virtual server act as a physical server and increases the capacity of each physical machine.



 
 

Data Stores / NoSQL Stores

Data storage topics covered: NoSQL stores, key-value stores, columnar stores, document stores, graph databases; case studies: HDFS, HBase, Hive, MongoDB, Neo4j.

Role of Data Scientist & Big Data Sources

Role of Data Scientist

“Start-ups are producing so much data that hiring has increased dramatically. Salaries are on the rise for data scientists who are able to work closely with developers to provide value to end users.”

  • The role of a data scientist is becoming more pivotal even to traditional organizations that didn't previously invest much of their budgets in technology positions.
  • Big data is changing the way old-school organizations conduct business and manage marketing, and the data scientist is at the center of that transformation.
  • Data scientists are often experts in technologies such as Hadoop, Pig, Python, and Java. Their jobs can focus on data management, analytics modeling, and business analysis. Because they tend to specialize in a narrow niche of data science, data scientists often work in teams within a company.
  • Data scientists can be real change-makers within an organization, offering insight that can illuminate the company’s trajectory toward its ultimate business goals. 
  • Data scientists are integral to supporting both leaders and developers in creating better products and paradigms. And as their role in big business becomes more and more important, they are in increasingly short supply.

Big Data Sources

The bulk of big data generated comes from three primary sources: social data, machine data and transactional data.

Social data comes from the Likes, Tweets & Retweets, Comments, Video Uploads, and general media that are uploaded and shared via the world’s favorite social media platforms. This kind of data provides invaluable insights into consumer behavior and sentiment and can be enormously influential in marketing analytics. The public web is another good source of social data, and tools like Google Trends can be used to good effect to increase the volume of big data.

Machine data is information generated by industrial equipment, by sensors installed in machinery, and even by web logs that track user behavior. This type of data is expected to grow exponentially as the Internet of Things becomes ever more pervasive and expands around the world. Sources such as medical devices, smart meters, road cameras, satellites, games, and the rapidly growing Internet of Things will deliver data of high velocity, value, volume, and variety in the very near future.

Transactional data is generated from all the daily transactions that take place both online and offline: invoices, payment orders, storage records, delivery receipts, and the like. Yet data alone is almost meaningless, and most organizations struggle to make sense of the data they are generating and how it can be put to good use.

Introduction to BigData

Every day, we create roughly 2.5 quintillion bytes (2.5 exabytes) of data:

  • 500 million tweets are sent
  • 294 billion emails are sent
  • 4 petabytes of data are created on Facebook
  • 4 terabytes of data are created from each connected car
  • 65 billion messages are sent on WhatsApp
  • 5 billion searches are made

Big Data is a collection of data that is huge in volume, yet growing exponentially with time.

It is data of such large size and complexity that no traditional data management tool can store or process it efficiently.

Data: The quantities, characters, or symbols on which operations are performed by a computer, which may be stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media. 

Examples Of Big Data

* The New York Stock Exchange generates about one terabyte of new trade data per day.

* Social Media: Statistics show that 500+ terabytes of new data get ingested into the databases of the social media site Facebook every day. This data is mainly generated from photo and video uploads, message exchanges, comments, etc.

* A single jet engine can generate 10+ terabytes of data in 30 minutes of flight time. With many thousands of flights per day, data generation reaches many petabytes.
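
A quick sanity check on that claim in Python; the daily flight count is an assumed round number, purely for illustration:

    tb_per_flight_half_hour = 10      # 10+ TB per 30 minutes of flight time
    flights_per_day = 25_000          # assumed round number, not a sourced figure
    pb_per_day = tb_per_flight_half_hour * flights_per_day / 1000
    print(f"~{pb_per_day:.0f} PB generated per day")   # ~250 PB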

Characteristics Of Big Data

Big data can be described by the following characteristics:

  • Volume (Scale of data)
  • Variety (Forms of data)
  • Velocity (Speed of data flow)
  • Veracity (Uncertainty of data)
  • Value (Usefulness of data)

Volatility, sometimes referred to as another “V” of big data, is the rate of change and lifetime of the data; unfortunately, it isn't always within our control.

Volume  

  • The name Big Data itself is related to a size which is enormous.
  • The size of data plays a very crucial role in determining its value.
  • Whether particular data can actually be considered Big Data or not also depends upon its volume.
  • Hence, 'Volume' is one characteristic which needs to be considered while dealing with Big Data.

As an example of a high-volume dataset, think about Facebook. The world’s most popular social media platform now has more than 2.2 billion active users, many of whom spend hours each day posting updates, commenting on images, liking posts, clicking on ads, playing games, and doing a zillion other things that generate data that can be analyzed. This is high-volume big data in no uncertain terms.

Variety 

  • Variety refers to heterogeneous sources and the nature of data, both structured and unstructured. 
  • During earlier days, spreadsheets and databases were the only sources of data considered by most of the applications. 
  • Nowadays, data in the form of emails, photos, videos, monitoring devices, PDFs, audio, etc. are also being considered in the analysis applications. 
  • This variety of unstructured data poses certain issues for storing, mining, and analyzing the data.

Facebook, of course, is just one source of big data. Imagine just how much data can be sourced from a company’s website traffic, from review sites, social media (not just Facebook, but Twitter, Pinterest, Instagram, and all the rest of the gang as well), email, CRM systems, mobile data, Google Ads – you name it. All these sources (and many more besides) produce data that can be collected, stored, processed and analyzed. When combined, they give us our second characteristic – variety. Variety, indeed, is what makes it really, really big. Data scientists and analysts aren’t just limited to collecting data from just one source, but many. And this data can be broken down into three distinct types – structured, semi-structured, and unstructured.

Velocity 

  • The term 'velocity' refers to the speed of generation of data. 
  • How fast the data is generated and processed to meet demands determines the real potential of the data.
  • Big Data Velocity deals with the speed at which data flows in from sources like business processes, application logs, networks, and social media sites, sensors, Mobile devices, etc. 
  • The flow of data is massive and continuous.

Facebook messages, Twitter posts, credit card swipes and ecommerce sales transactions are all examples of high velocity data.

Veracity 

This refers to the inconsistency which can be shown by the data at times, thus hampering the process of being able to handle and manage the data effectively. 

Veracity refers to the quality, accuracy, and trustworthiness of the data that's collected. As such, veracity is not necessarily a distinctive characteristic of big data (even little data needs to be trustworthy), but due to the high volume, variety, and velocity, high reliability is of paramount importance if a business is to draw accurate conclusions from the data. High-veracity data is the truly valuable stuff that contributes in a meaningful way to overall results, and it needs to be high quality.

Value
Value sits right at the top of the pyramid and refers to an organization's ability to transform those tsunamis of data into real business value. With all the tools available today, pretty much any enterprise can get started with big data.

Types Of Big Data

Following are the types of Big Data:

  1. Structured
  2. Unstructured
  3. Semi-structured

Structured

By structured data, we mean data that can be processed, stored, and retrieved in a fixed format. It refers to highly organized information that can be readily and seamlessly stored and accessed from a database by simple search engine algorithms. For instance, the employee table in a company database will be structured as the employee details, their job positions, their salaries, etc., will be present in an organized manner. 

Unstructured

Unstructured data refers to the data that lacks any specific form or structure whatsoever. This makes it very difficult and time-consuming to process and analyze unstructured data. Email is an example of unstructured data. Structured and unstructured are two important types of big data.

Semi-structured

Semi-structured is the third type of big data. Semi-structured data contains both of the formats mentioned above, that is, structured and unstructured data. To be precise, it refers to data that has not been classified under a particular repository (database), yet contains vital information or tags that segregate individual elements within the data.
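
A small illustration of the three types in Python; the records are invented:

    import json

    # Structured: fixed columns with fixed types, as in a database table row.
    employee_row = ("E101", "Asha", "Engineer", 75000)

    # Unstructured: free text with no schema at all.
    note = "Met the client today; they seemed happy with the demo."

    # Semi-structured: no rigid table schema, but tags ("id", "skills")
    # still segregate the individual elements within the data.
    record = json.loads('{"id": "E101", "skills": ["Hadoop", "Python"]}')
    print(record["skills"])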

 Importance

The ability to process Big Data brings multiple benefits, such as:

Businesses can utilize outside intelligence while making decisions: access to social data from search engines and sites like Facebook and Twitter enables organizations to fine-tune their business strategies.

Improved customer service- Traditional customer feedback systems are getting replaced by new systems designed with Big Data technologies. In these new systems, Big Data and natural language processing technologies are being used to read and evaluate consumer responses.

Early identification of risks to products/services, if any.

Better operational efficiency- Big Data technologies can be used for creating a staging area or landing zone for new data before identifying what data should be moved to the data warehouse. In addition, such integration of Big Data technologies and data warehouse helps an organization to offload infrequently accessed data.


Advantages of Big Data (Features)

    • Opportunities to make better decisions
    • Increasing productivity and efficiency
    • Reducing costs
    • Improving customer service and customer experience
    • Fraud and anomaly detection
    • Greater agility and speed to market

(Questionable data quality and heightened security risks, often listed alongside these, are challenges of Big Data rather than advantages.)
     
  • One of the biggest advantages of Big Data is predictive analysis. Big Data analytics tools can predict outcomes accurately, thereby, allowing businesses and organizations to make better decisions, while simultaneously optimizing their operational efficiencies and reducing risks.
  • By harnessing data from social media platforms using Big Data analytics tools, businesses around the world are streamlining their digital marketing strategies to enhance the overall consumer experience. Big Data provides insights into the customer pain points and allows companies to improve upon their products and services.
  • Big Data combines relevant data from multiple sources to produce highly actionable insights. Almost 43% of companies lack the necessary tools to filter out irrelevant data, which eventually costs them millions of dollars to hash useful data out of the bulk. Big Data tools can help reduce this, saving you both time and money.
  • Big Data analytics can help companies generate more sales leads, which naturally means a boost in revenue. Businesses use Big Data analytics tools to understand how well their products/services are doing in the market and how customers are responding to them. Thus, they can better understand where to invest their time and money.
  • With Big Data insights, you can always stay a step ahead of your competitors. You can screen the market to know what kind of promotions and offers your rivals are providing, and then you can come up with better offers for your customers. Also, Big Data insights allow you to learn customer behavior to understand the customer trends and provide a highly ‘personalized’ experience to them.   

Applications

1) Healthcare 

  • Big Data has already started to create a huge difference in the healthcare sector. 
  • With the help of predictive analytics, medical professionals and HCPs are now able to provide personalized healthcare services to individual patients. 
  • Apart from that, fitness wearables, telemedicine, remote monitoring – all powered by Big Data and AI – are helping change lives for the better. 

2) Academia 

  • Big Data is also helping enhance education today.  
  • Education is no longer limited to the physical bounds of the classroom; there are numerous online educational courses to learn from.
  • Academic institutions are investing in digital courses powered by Big Data technologies to aid the all-round development of budding learners. 

 3) Banking 

  • The banking sector relies on Big Data for fraud detection. 
  • Big Data tools can efficiently detect fraudulent acts in real time, such as misuse of credit/debit cards, archival of inspection tracks, faulty alterations in customer stats, etc.
 
 4) Manufacturing 
  • According to TCS Global Trend Study, the most significant benefit of Big Data in manufacturing is improving the supply strategies and product quality. 
  • In the manufacturing sector, Big data helps create a transparent infrastructure, thereby, predicting uncertainties and incompetencies that can affect the business adversely.
 
5) IT  
  • One of the largest users of Big Data, IT companies around the world are using Big Data to optimize their functioning, enhance employee productivity, and minimize risks in business operations. 
  • By combining Big Data technologies with ML and AI, the IT sector is continually powering innovation to find solutions even for the most complex of problems. 
 
6) Retail 
  • Big Data has changed the way of working in traditional brick and mortar retail stores. 
  • Over the years, retailers have collected vast amounts of data from local demographic surveys, POS scanners, RFID, customer loyalty cards, store inventory, and so on. 
  • Now, they’ve started to leverage this data to create personalized customer experiences, boost sales, increase revenue, and deliver outstanding customer service.
  • Retailers are even using smart sensors and Wi-Fi to track the movement of customers, the most frequented aisles, for how long customers linger in the aisles, among other things. They also gather social media data to understand what customers are saying about their brand, their services, and tweak their product design and marketing strategies accordingly.  

7) Transportation

Big Data Analytics holds immense value for the transportation industry. In countries across the world, both private and government-run transportation companies use Big Data technologies to optimize route planning, control traffic, manage road congestion, and improve services. Additionally, transportation services even use Big Data for revenue management, driving technological innovation, enhancing logistics, and, of course, gaining the upper hand in the market.

Use cases: Fraud detection patterns

Traditionally, fraud discovery has been a tedious manual process. Common methods of discovering and preventing fraud consist of investigative work coupled with computer support. Computers can help with alerts using very simple means, such as flagging all claims that exceed a pre-specified threshold. The obvious goal is to avoid large losses by paying close attention to the larger claims.

Big Data can help you identify patterns in data that indicate fraud, so that you know when and how to react.

Big Data analytics and data mining offer a range of techniques that go well beyond computer monitoring and can identify suspicious cases based on patterns that are suggestive of fraud. These patterns fall into three categories.

  1. Unusual data. For example, unusual medical treatment combinations, unusually high sales prices with respect to comparables, or an unusual number of accident claims for a person.
  2. Unexplained relationships between otherwise seemingly unrelated cases. For example, real estate sales involving the same group of people, or different organizations with the same address.
  3. Generalizing characteristics of fraudulent cases. For example, intense shopping or calling behavior for items/locations that has not happened in the past, or a doctor's billing for treatments and procedures that he rarely billed for in the past.

These patterns and their discovery are detailed in the following sections. It should be mentioned that most of these approaches attempt to deal with "non-revealing" fraud, which is the common case. Only in cases of "self-revealing" fraud (such as stolen credit cards) will it become known at some time in the future that certain transactions had been fraudulent. At that point only a reactive approach is possible, since the damage has already occurred; this may, however, also set the basis for attempting to generalize from these cases and help detect fraud when it re-occurs in similar settings (see Generalizing Characteristics of Fraud, below).

Unusual data

Unusual data refers to three different situations: unusual combinations of otherwise quite acceptable entries, a value that is unusual with respect to a comparison group, and a value that is unusual in and of itself. The latter case is probably the easiest to deal with and is an example of "outlier analysis". We are interested here only in outliers that are unusual but still acceptable values; an entry of a negative number for the number of staplers that a procurement clerk purchased would simply be a data error, and presumably bear no relationship to fraud. An unusually high value can be detected simply by employing descriptive statistics tools, such as the mean and standard deviation or a box plot; for categorical values, the same measures applied to the frequency of occurrence are a good indicator.
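
A minimal sketch of this kind of outlier analysis in Python, using only the mean and standard deviation; the claim amounts are invented, and the two-standard-deviation threshold is an assumption, not a rule:

    from statistics import mean, stdev

    claims = [1200, 950, 1100, 1300, 1050, 980, 15500, 1150]
    mu, sigma = mean(claims), stdev(claims)

    for amount in claims:
        z = (amount - mu) / sigma     # distance from the mean, in std devs
        if abs(z) > 2:
            # Unusual but still acceptable values: flagged for investigation,
            # not taken as automatic proof of fraud.
            print(f"unusual claim: {amount} (z-score {z:.1f})")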

Somewhat more difficult is the detection of values that are unusual only with respect to a reference group. In the case of a real estate sales price, the price as such may not be very high, but it may be high for dwellings of the same size and type in a given location and economic market. The judgment that the value is unusual only becomes apparent through specific data analysis techniques.

Unexplained Relationships

Unexplained relationships may involve two or more seemingly unrelated records unexpectedly having the same values for some of their fields. As an example, in a money laundering scheme funds may be transferred between two or more companies; it would be unexpected if some of the companies in question had the same mailing address. Assuming that the stored transactions consist of hundreds of variables and that there may be a large number of transactions, the detection of such commonalities is unlikely if left uninvestigated. When applying this technique to many variables and/or variable combinations, an automated tool is indispensable. Again, positive findings do not necessarily indicate fraud, but they are suggestive and warrant further investigation.
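
A sketch of this commonality check in Python; the company records are invented for illustration:

    from collections import defaultdict

    companies = [
        ("Acme Traders",   "14 Harbor Rd"),
        ("Blue Ridge LLC", "92 Oak Ave"),
        ("Nova Exports",   "14 Harbor Rd"),
        ("Crest Finance",  "14 Harbor Rd"),
    ]

    by_address = defaultdict(list)
    for name, address in companies:
        by_address[address].append(name)

    for address, names in by_address.items():
        if len(names) > 1:
            # Positive findings are suggestive, not conclusive.
            print(f"shared address {address}: {names}")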

Generalizing Characteristics of Fraud

Once specific cases of fraud have been identified, we can use them to find features and commonalities that may help predict which other transactions are likely to be fraudulent. These other transactions may already have happened and been processed, or they may occur in the future. In both cases, this type of analysis is called "predictive data mining".

The potential advantage of this method over the alternatives previously discussed is that its reliability can be statistically assessed and verified. If the reliability is high, then most of the investigative effort can be concentrated on handling the actual fraud cases, rather than on screening many cases which may or may not be fraudulent.
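
A hedged sketch of predictive data mining on known fraud cases, assuming scikit-learn is available; the features and labels are invented for illustration:

    from sklearn.linear_model import LogisticRegression

    # Features per transaction: [amount, claims_this_year, is_new_customer]
    X_train = [[120, 1, 0], [90, 0, 0], [5000, 6, 1], [4500, 5, 1], [150, 2, 0]]
    y_train = [0, 0, 1, 1, 0]   # 1 = a previously confirmed fraud case

    model = LogisticRegression().fit(X_train, y_train)

    # Score new transactions; reliability should be assessed on held-out data
    # before concentrating investigative effort on the flagged cases.
    new_cases = [[4800, 7, 1], [110, 1, 0]]
    print(model.predict(new_cases))         # predicted labels
    print(model.predict_proba(new_cases))   # estimated fraud probabilities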

Introduction to Cloud Computing & its Characteristics

  • The term cloud refers to a network or the internet.
  • Cloud computing is a technology that uses remote servers on the internet to store, manage, and access data online rather than on local drives.
  • The data can be anything, such as files, images, documents, audio, video, and more.
  • Cloud computing provides shared services as opposed to local servers or storage resources; it enables access to information from most web-enabled hardware and allows for cost savings through reduced facility, hardware/software, and support investments.

We can perform the following operations using cloud computing:

  • Developing new applications and services
  • Storage, back up, and recovery of data
  • Hosting blogs and websites
  • Delivery of software on demand
  • Analysis of data
  • Streaming video and audio

Why Cloud Computing?

Small as well as large IT companies traditionally provide their own IT infrastructure, which means every IT company needs a server room.

In that server room there are database servers, mail servers, networking equipment, firewalls, routers, modems, switches, QPS capacity (Queries Per Second, i.e., how many queries or how much load the server can handle), configurable systems, high-speed connectivity, and maintenance engineers.

Establishing such IT infrastructure requires spending a lot of money. Cloud computing came into existence to overcome these problems and to reduce IT infrastructure cost.

Let's look at some of the most common reasons to use the cloud.
  • File storage: You can store all types of information in the cloud, including files and email.
  • File sharing: The cloud makes it easy to share files with several people at the same time.
  • Backing up data: You can also use the cloud to protect your files.

Characteristics of Cloud Computing

1. On-demand self-service- A consumer can unilaterally (on their own) provision computing capabilities, such as server time and network storage, as needed, automatically, without requiring human interaction with each service provider.
 

2. Broad network access- Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, tablets, laptops, and workstations).

3. Resource pooling- The provider’s computing resources are pooled (grouped together to maximize advantage and minimize risk for users) to serve multiple consumers. Resources can be dynamically assigned and reassigned according to customer demand. Customers generally may not care where the resources are physically located, but they should be aware of the risks if resources are located offshore.

4. Rapid elasticity- Capabilities can be expanded or released automatically (e.g., more CPU power, or the ability to handle additional users). To the customer, this appears seamless, limitless, and responsive to their changing requirements.
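
A toy sketch of the elasticity decision in Python; the thresholds are assumptions for illustration only:

    # Expand capacity under load, release it when idle.
    def autoscale(replicas, cpu_utilization):
        if cpu_utilization > 0.80:
            return replicas + 1              # provision more capacity
        if cpu_utilization < 0.20 and replicas > 1:
            return replicas - 1              # release unneeded capacity
        return replicas

    replicas = 2
    for load in [0.35, 0.85, 0.90, 0.15]:
        replicas = autoscale(replicas, load)
        print(f"load {load:.0%} -> {replicas} replica(s)")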

5. Measured service- Customers are charged for the services they use. There is a metering concept where customer resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the utilized service.
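
A minimal sketch of metered, pay-per-use billing in Python; the unit prices are invented for illustration:

    # Usage is metered per resource and charged per unit consumed.
    RATES = {"vm_hours": 0.05, "storage_gb_month": 0.02, "egress_gb": 0.09}
    usage = {"vm_hours": 720, "storage_gb_month": 100, "egress_gb": 50}

    bill = sum(RATES[item] * amount for item, amount in usage.items())
    print(f"monthly charge: ${bill:.2f}")   # transparent to provider and consumer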

1) Agility

The cloud works in a distributed computing environment. It shares resources among users and works very fast.

2) High availability and reliability

The availability of servers is high and more reliable because the chances of infrastructure failure are minimum.

3) High Scalability

Cloud offers "on-demand" provisioning of resources on a large scale, without having engineers for peak loads.

4) Multi-Sharing

With the help of cloud computing, multiple users and applications can work more efficiently with cost reductions by sharing common infrastructure.

5) Device and Location Independence

Cloud computing enables the users to access systems using a web browser regardless of their location or what device they use e.g. PC, mobile phone, etc. As infrastructure is off-site (typically provided by a third-party) and accessed via the Internet, users can connect from anywhere.

Advantages of Cloud Computing

1) Back-up and restore data

Once data is stored in the cloud, it is easier to back it up and restore it using the cloud.

2) Improved collaboration

Cloud applications improve collaboration by allowing groups of people to quickly and easily share information in the cloud via shared storage.

3) Excellent accessibility

The cloud allows us to quickly and easily access stored information from anywhere in the world, at any time, using an internet connection. An internet cloud infrastructure increases organizational productivity and efficiency by ensuring that our data is always accessible.

4) Low maintenance cost

Cloud computing reduces both hardware and software maintenance costs for organizations. Maintenance of cloud computing applications is easier, since they do not need to be installed on each user's computer and can be accessed from different places. So, it reduces cost as well.

5) Mobility

Cloud computing allows us to easily access all cloud data via mobile.

6) Services in the pay-per-use model

Cloud computing offers Application Programming Interfaces (APIs) through which users access services on the cloud and pay charges as per their usage of the service.

7) Unlimited storage capacity

The cloud offers a huge amount of storage capacity for keeping our important data, such as documents, images, audio, and video, in one place.

8) Data security

Data security is one of the biggest advantages of cloud computing. Cloud offers many advanced features related to security and ensures that data is securely stored and handled.

Disadvantages of Cloud Computing

1) Internet Connectivity

As you know, in cloud computing all data (images, audio, video, etc.) is stored in the cloud, and we access it over an internet connection. If you do not have good internet connectivity, you cannot access this data; there is no other way to access data from the cloud.

2) Vendor lock-in

Vendor lock-in is the biggest disadvantage of cloud computing. Organizations may face problems when transferring their services from one vendor to another. As different vendors provide different platforms, that can cause difficulty moving from one cloud to another.

3) Limited Control

As we know, cloud infrastructure is completely owned, managed, and monitored by the service provider, so the cloud users have less control over the function and execution of services within a cloud infrastructure.

4) Security

Cloud service providers implement the best security standards to store important information. Even so, before adopting cloud technology you should be aware that you will be sending all of your organization's sensitive information to a third party, i.e., a cloud computing service provider. While sending data to the cloud, there is a chance that your organization's information could be hacked.

 Evolution of CC

Cloud computing has its roots as far back as the 1950s, when mainframe computers came into existence.

Five technologies played a vital role in making cloud computing what it is today: distributed systems and their peripherals, virtualization, Web 2.0, service orientation, and utility computing.

Further reading:

  • The Evolution of Cloud Computing – Where's It Going Next? (ITChronicles)
  • https://www.geeksforgeeks.org/evolution-of-cloud-computing/
  • https://www.javatpoint.com/history-of-cloud-computing
  • https://en.wikipedia.org/wiki/Cloud_computing

 

