AWS re:Invent – New services and features

Amazon Web Services announced many new services and features at the recently held AWS re:Invent 2016. In this blog post I am writing about the new announcements related to storage and data migration.

AWS Snowmobile:

AWS Snowmobile is nothing but a beast that carries petabytes of your data to the AWS cloud on a truck. Yes, on a truck, I am not kidding 🙂


This secure data truck stores up to 100 PB of data and can help you move exabytes to AWS in a matter of weeks. Snowmobile attaches to your data center network and appears as a local NFS-mounted volume. It includes a network cable connected to a high-speed switch capable of supporting 1 Tb/sec of data transfer spread across multiple 40 Gb/sec connections. As per the AWS official blog, Snowmobile is available in all AWS Regions. You need to contact the AWS Sales team to use this service.
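
To see why a truck beats the wire at this scale, here is a quick back-of-the-envelope sketch (decimal units, overhead ignored, so the numbers are only rough):

```python
# Rough comparison: moving 100 PB over a network link vs. loading a Snowmobile.
PETABYTE = 10**15  # bytes (decimal)

data_bits = 100 * PETABYTE * 8            # 100 PB expressed in bits

# Over a dedicated 1 Gb/s line:
seconds_1gbps = data_bits / 10**9
years_1gbps = seconds_1gbps / (3600 * 24 * 365)

# Over Snowmobile's on-site 1 Tb/s switch (just loading the truck):
seconds_1tbps = data_bits / 10**12
days_1tbps = seconds_1tbps / (3600 * 24)

print(f"1 Gb/s link : ~{years_1gbps:.1f} years")
print(f"1 Tb/s load : ~{days_1tbps:.1f} days")
```

Loading the truck takes on the order of days, and even with shipping time added that is how "exabytes in weeks" becomes possible, while the same transfer over a typical dedicated line would take decades.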

AWS Snowball Edge:

The new Snowball Edge appliance has all the features of its twin brother Snowball, which was launched in 2015.


AWS Snowball Edge is a petabyte-scale data transport appliance with on-board storage and compute. It arrives with your S3 buckets, Lambda code, and clustering configuration pre-installed. You can execute AWS Lambda functions and process data locally on the AWS Snowball Edge. You can order a Snowball Edge with just a few clicks in the AWS Management Console.

Amazon EFS – New Feature

Amazon Elastic File System (Amazon EFS) provides simple file storage for use with Amazon EC2 instances in the AWS Cloud. With this new feature, we can mount Amazon EFS file systems on our on-premises data center servers when connected to an Amazon VPC with AWS Direct Connect.

S3 Storage Management with new features

S3 Object Tagging:

S3 Object tags are key-value pairs applied to S3 objects which can be created, updated, or deleted at any time during the lifetime of the object. With these tags, users have the ability to create IAM policies, set up lifecycle policies, and customize storage metrics.
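
As a small sketch of what those key-value pairs look like on the wire, here is the `TagSet` structure that boto3's `put_object_tagging` expects (the bucket, key, and tag names below are made up for illustration; the actual AWS call is left commented out so the sketch runs without credentials):

```python
# Build the TagSet structure the S3 tagging API expects from a plain dict.
def build_tag_set(tags):
    """Convert {key: value} into {"TagSet": [{"Key": ..., "Value": ...}, ...]}."""
    return {"TagSet": [{"Key": k, "Value": v} for k, v in tags.items()]}

tagging = build_tag_set({"project": "phoenix", "classification": "confidential"})

# With credentials configured, this would apply the tags:
# import boto3
# boto3.client("s3").put_object_tagging(
#     Bucket="my-example-bucket", Key="reports/2016/q4.csv", Tagging=tagging)

print(tagging)
```

The same tag keys can then be referenced from IAM policy conditions and lifecycle rule filters.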

S3 Analytics, Storage Class Analysis:

This new S3 Analytics feature automatically identifies the optimal lifecycle policy to move less frequently accessed storage to S3 Standard – Infrequent Access. Users can configure a storage class analysis policy to monitor an entire bucket, a prefix, or an object tag. Once an infrequent access pattern is observed, we can easily create a new lifecycle age policy based on the results.
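
For a feel of what such a policy looks like, here is a sketch of the analytics configuration that boto3's `put_bucket_analytics_configuration` takes (bucket names, the `logs/` prefix, and the configuration Id are all made up; the AWS call is commented out so the sketch runs offline):

```python
# Sketch of a storage class analysis configuration: watch objects under logs/
# and export daily CSV results to a second bucket.
analytics_config = {
    "Id": "logs-analysis",
    "Filter": {"Prefix": "logs/"},  # analyze only objects under this prefix
    "StorageClassAnalysis": {
        "DataExport": {
            "OutputSchemaVersion": "V_1",
            "Destination": {
                "S3BucketDestination": {
                    "Format": "CSV",
                    "Bucket": "arn:aws:s3:::my-analytics-results",
                    "Prefix": "results/",
                }
            },
        }
    },
}

# With credentials configured:
# import boto3
# boto3.client("s3").put_bucket_analytics_configuration(
#     Bucket="my-example-bucket", Id=analytics_config["Id"],
#     AnalyticsConfiguration=analytics_config)
```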

S3 CloudWatch Metrics :

This helps us understand and improve the performance of applications that use Amazon S3 by monitoring and alarming on 13 new S3 CloudWatch metrics. Users can receive 1-minute CloudWatch metrics, set alarms, and access dashboards to view real-time operations and performance, such as bytes downloaded from their Amazon S3 storage.
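
As an illustration, here is a sketch of the parameters for a 1-minute alarm on one of the new S3 request metrics via boto3's `put_metric_alarm` (the bucket name, metrics filter Id, threshold, and SNS topic are made up for the example; the AWS call is commented out so the sketch runs offline):

```python
# Alarm when a bucket sees too many client errors in a 1-minute window.
alarm = {
    "AlarmName": "s3-high-4xx-errors",
    "Namespace": "AWS/S3",
    "MetricName": "4xxErrors",           # one of the per-request S3 metrics
    "Dimensions": [
        {"Name": "BucketName", "Value": "my-example-bucket"},
        {"Name": "FilterId", "Value": "EntireBucket"},
    ],
    "Statistic": "Sum",
    "Period": 60,                         # the new 1-minute granularity
    "EvaluationPeriods": 5,
    "Threshold": 100,
    "ComparisonOperator": "GreaterThanThreshold",
    "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
}

# With credentials configured:
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
```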

You can read my blog post related to data migration to AWS by clicking the link below. Thanks, and happy reading 🙂

Cloud data migration with AWS

 


Cloud data migration with AWS

Data migration is a key challenge in any cloud migration, and as a storage admin it has always fascinated me to understand the effort it takes to migrate petabytes of data to the public cloud. In this post I will try to give a brief outline of 3 of the 8 ways in which we can migrate data to Amazon Web Services.

AWS Direct Connect: With AWS Direct Connect you have a dedicated network connection from your data center premises to AWS. With the highly available speeds you can either directly copy data from any of your servers to an S3 bucket using CLI commands or do a host-based migration to any EC2 instance with a sufficient number of EBS volumes. Multiple connections can be used simultaneously for increased bandwidth or redundancy. We can also use the AWS Partner Network in case an AWS Direct Connect location is not available near your data centers.
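
As a rough sketch of the host-side copy over such a link, here is one way to walk a local directory and map each file to an S3 key with boto3 (the bucket name and local path are made up; the actual upload call is commented out so the sketch runs without credentials):

```python
# Plan a directory-to-bucket copy: one (local file, S3 key) pair per file.
from pathlib import Path

def plan_uploads(local_dir, bucket, prefix=""):
    """List (local_path, s3_key) pairs for every file under local_dir."""
    root = Path(local_dir)
    return [
        (str(p), f"{prefix}{p.relative_to(root).as_posix()}")
        for p in sorted(root.rglob("*")) if p.is_file()
    ]

# With credentials configured, the copy itself would be:
# import boto3
# s3 = boto3.client("s3")
# for path, key in plan_uploads("/data/export", "my-migration-bucket", "export/"):
#     s3.upload_file(path, "my-migration-bucket", key)
```

In practice `aws s3 sync` does the same job from the command line; the point is that over Direct Connect this is a plain S3 upload, nothing exotic.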

[Image: AWS Direct Connect]

 

AWS Import/Export Snowball: AWS Import/Export Snowball is a petabyte-scale storage migration solution. AWS will ship a storage device, shown below, to your data center, which can hold 50 or 80 TB of data.

[Image: AWS Snowball]

Once you receive a Snowball, you plug it in, connect it to your network, configure the IP address, and install the AWS Snowball client. Use the client to identify the directories you want to copy. Data is encrypted while copying to the Snowball and decrypted when AWS offloads it to S3. As per AWS, it takes 21 hours to copy 80 TB of data from your data source to a Snowball using a 10 Gbps connection at 80 percent network utilization. AWS has also shown a use case where a customer was able to migrate 1 PB of data in 1 week using multiple Snowballs in parallel.
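
That 21-hour figure can be sanity-checked with simple arithmetic (decimal units, protocol overhead ignored, so this sketch lands slightly above AWS's quoted number):

```python
# How long to fill an 80 TB Snowball over 10 Gbps at 80% utilization?
TERABYTE = 10**12  # bytes, decimal

data_bits = 80 * TERABYTE * 8              # 80 TB in bits
effective_bps = 10 * 10**9 * 0.80          # 10 Gbps at 80% utilization

hours = data_bits / effective_bps / 3600
print(f"~{hours:.1f} hours")               # in the same ballpark as AWS's 21 hours
```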

AWS Storage Gateway: Storage Gateway is installed on a local host in your data center. It creates an on-premises virtual appliance that provides seamless and secure integration between your on-premises applications and AWS's storage infrastructure. It can create iSCSI volumes that store all or recently accessed data on premises for faster response, while asynchronously uploading this data to Amazon S3 in the background.

[Image: AWS Storage Gateway]

A combination of AWS Snowball with either Direct Connect or Storage Gateway will help make migration much faster and easier. We can do a one-time migration of data using Snowball and later make differential data updates using Direct Connect or Storage Gateway. Hope this has given you some basic idea of migrations with AWS solutions. Thanks for reading.

You can also read my blog post on a next-gen SSD interface, NVMe. Click on the link below. Happy reading. 🙂

https://sskanth.com/2016/04/20/nvme-nextgen-sd-interface/

 

 

 

NVMe – NextGen SSD Interface

What is NVMe?

Let me first start with NVM. NVM stands for non-volatile memory, which means all the flash drives and SSDs that have revolutionized our storage world. NVMe is a protocol to write and access data on NVM. As of now, NVMe is promoted by a group of companies that includes Cisco, Dell, EMC, HGST, Intel, Micron, Microsoft, NetApp, Oracle, PMC-Sierra, Samsung, SanDisk and Seagate.

Why do I say it is the next-generation SSD interface? What are we using right now?

The Small Computer System Interface (SCSI) has been the most used standard for physically connecting and transferring data for hard disk drives and tape drives for more than a decade. We still use the same SCSI interface even for flash drives, which is becoming a bottleneck in utilizing flash to its full potential. The SCSI protocol is great for HDDs, but it is losing steam when it comes to SSDs. That is why we are looking at NVMe, which is developed exclusively for flash technology.

Can you explain more about how NVMe is different from SCSI?

Sure, first let us try to understand the difference between an HDD and an SSD.

HDD: A hard drive stores data on a series of spinning magnetic disks called platters. There's an actuator arm with read/write heads attached to it. This arm positions the read/write heads over the correct area of the drive to read or write information. Because the drive heads must align over an area of the disk in order to read or write data (and the disk is constantly spinning), there's a wait time before data can be accessed. The drive may need to read from multiple locations in order to launch a program or load a file, which means it may have to wait for the platters to spin into the proper position multiple times before it can complete the command. If a drive is asleep or in a low-power state, it can take several seconds more for the disk to spin up to full power and begin operating.


SSD:

Solid-state drives are called that specifically because they don't rely on moving parts or spinning disks. Instead, data is saved to a pool of NAND flash. Because SSDs have no moving parts, they can operate at speeds far above those of a typical HDD.


So to access data from an HDD we use the SCSI protocol. SCSI sends commands one at a time and waits for the platter to rotate under the actuator arm and fetch the data back. We use the same SCSI protocol for SSDs too, which diminishes their performance: an SSD can serve more I/O at the same time since it has no rotational component, but SCSI commands are processed one at a time. Here comes NVMe. NVMe parallelizes instructions. NVMe is designed to have up to 64 thousand queues, and each of those queues in turn can have up to 64 thousand commands simultaneously. That is, at the same time. In short, NVMe is exclusively developed to leverage SSD technology.
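
To put those queue numbers in perspective, here is the arithmetic next to the legacy AHCI/SATA interface, which supports a single command queue of 32 commands:

```python
# Outstanding-command capacity: NVMe vs. the legacy AHCI/SATA interface.
nvme_queues, nvme_depth = 64 * 1024, 64 * 1024   # up to 64K queues x 64K commands
ahci_queues, ahci_depth = 1, 32                  # AHCI: one queue, 32 commands

nvme_total = nvme_queues * nvme_depth
ahci_total = ahci_queues * ahci_depth

print(f"NVMe: {nvme_total:,} commands in flight")   # about 4.3 billion
print(f"AHCI: {ahci_total} commands in flight")
```

That is roughly a factor of 134 million more commands that can be in flight, which is exactly the kind of parallelism a bank of NAND chips can absorb and a spinning platter cannot.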

Do you have any metrics to support your claims about NVMe?

Yes, you can see the graphs below, published by SNIA (the Storage Networking Industry Association), which show the difference of NVMe when compared to the SAS and SATA protocols for random and sequential workloads.

[Image: NVMe vs. SAS and SATA, random and sequential workloads]

Where can I get more information about NVMe?

You can get more information about the latest developments in NVMe from the official site http://www.nvmexpress.org/.

Anything else?

This may be the first time you are hearing about NVMe, but I bet you it won't be the last. NVMe is for sure here to stay as flash technology comes to realize its full potential. Hope you enjoyed reading about it 🙂

 

What is Software Defined Storage?


Anyone following trends in IT will definitely see a new approach in data centers towards a software-defined model. The new buzzword for this is SDDC, the software-defined data center. In an SDDC all infrastructure components are virtualized and delivered as a service. A similar trend can be observed in storage, the heart of the DC, called SDS, software-defined storage.

The exact definition of SDS is still evolving, but the generally accepted definition is that software-defined storage is "where the management and intelligence of the storage system is decoupled from the underlying physical hardware."


The following are the areas where the implementation of SDS will make a difference.

Administration: As a storage admin I follow different processes to achieve the same task (provisioning, reclamation) on arrays manufactured by different vendors like EMC, NetApp, and Hitachi. With the implementation of SDS we can manage all storage infrastructure in the data center from a single pane and also follow the same steps for tasks irrespective of the manufacturer. SDS makes extensive use of APIs to communicate with the arrays. In data centers with private cloud implementations, SDS will definitely help in improving automation and orchestration. An example would be EMC ViPR.

Use of commodity hardware: With any new storage array we buy, we end up buying licenses for a similar set of features like snapshots, cloning, replication, data mobility, encryption, and thin provisioning. In SDS, since the intelligence of the storage system is decoupled from the underlying physical hardware, we can save costs on these repetitive features. We can also turn any x86 commodity hardware into robust enterprise storage with the help of SDS solutions like DataCore or Nexenta.

Cloud integration: Any new software or hardware solution is not complete without integration with the public cloud, and the same is the case for SDS. SDS can be used to pool resources from the cloud and also manage both in-house and public cloud assets under a single pane. Since SDS abstracts storage from the underlying physical hardware, it is useful for seamless data transfer between private and public clouds and vice versa.

In short, software-defined storage solutions are a fundamental component of the software-defined data center, providing a range of scale-out solutions to meet rapidly growing and changing data demands.

 

 

An AWSomeday

The AWS event drew a lot of craze among IT folks in Hyderabad. Almost 400-odd IT engineers turned up for this event on Nov 17, 2015 at ITC Kakatiya, Begumpet, Hyderabad. I know most folks registered for the event, but delegates were invited based on a selection process by AWS. This post gives a glimpse of how exactly the event happened and the takeaways from it.

The event started with a welcome note from Chandra Balani, Head of Business Development. He gave us a quick view of AWS history. Chandra said that AWS has been available to the public since 2006, but prior to that they used this technology to run the Amazon.com site for almost a decade. So AWS has a total of 20 years of experience in the cloud industry.


The tech part started with Harshith taking charge. He is a great guy; I remember him from last year's AWSomeday event. He carries the entire event on his shoulders with all the technical stuff. He gave a deeper understanding of AWS core and application services and showed how to deploy and automate your infrastructure on the AWS Cloud. We were given a student guide with information on AWS storage, compute, network and applications.

 

The event had its fun part too. There were lots of contests on Twitter with the hashtag #AWSomday. There were stalls by AWS partners & experts. The AWS experts were really nice in answering most of the questions from delegates. The event ended with goodies presented to lucky winners and participation certificates for the delegates.

 

 

 

Learning about all-flash arrays

I started learning about flash arrays as all-flash arrays started occupying data center floors, and the trend is going to continue for the next 5 to 10 years. Gartner predicts in a report that "By 2019, 20% of traditional high-end storage arrays will be replaced by dedicated solid-state arrays (SSA)." Let me start this flash series by explaining the new glossary that is mostly used by all-flash array (AFA) vendors.

1) PE cycles
2) Different types of flash available now
3) Over-provisioning
4) Compression and dedupe
5) Garbage collection

PE cycles: The life expectancy of a flash drive is expressed in program/erase (PE) cycles. Flash cells wear out a little every time they are erased or programmed. This is similar to erasing the same spot on paper with an eraser multiple times, which may eventually tear the paper.

Different types of flash available now: Below is a picture of the different types of flash available as of now and their differences.

[Image: Types of flash and their differences]

Over-provisioning: This is the inclusion of extra storage capacity in a solid-state drive. That extra capacity is not visible to the host as available storage. It is like under-promising and over-delivering: the vendor gives you more hidden capacity, which helps in distributing the total number of writes and erases over more flash cells. This increases the life expectancy of the drive.
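
Over-provisioning is usually quoted as the hidden capacity relative to the user-visible capacity. A quick sketch with illustrative numbers (not taken from any specific drive):

```python
# Over-provisioning percentage = (physical - usable) / usable * 100.
def overprovisioning_pct(physical_gib, usable_gib):
    """Hidden spare capacity as a percentage of what the host sees."""
    return (physical_gib - usable_gib) / usable_gib * 100

# A drive with 128 GiB of raw NAND exposing 100 GiB to the host:
print(f"{overprovisioning_pct(128, 100):.0f}% over-provisioned")
```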

Compression and dedupe: There is a thin line of difference between dedupe and compression which most people fail to understand (including me; it took me reading multiple websites to understand it 🙂).

Dedupe: Dedupe occurs at the file level. Suppose you sent a mail with a 1 MB attachment to 10 people. The exchange server will save only one copy of the attachment and mark the rest as duplicates. This saves a lot of space, as we used just 1 MB instead of 10 MB.

Compression: With compression you use some algorithm or other to reduce the size of a particular file by eliminating redundant bits. But if your users or applications have stored the same file multiple times, then no matter how good your compression method is, your storage will end up with multiple copies of the compressed files.
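
The email-attachment example above can be sketched in a few lines: dedupe keys storage on a content hash, so identical files collapse to one copy, while compression only shrinks each copy individually:

```python
import hashlib
import zlib

attachment = b"quarterly report " * 60_000   # ~1 MB of repetitive data

# Dedupe: 10 recipients, but identical content hashes to one stored copy.
store = {}
for _ in range(10):
    store[hashlib.sha256(attachment).hexdigest()] = attachment
print("copies stored after dedupe:", len(store))

# Compression: shrinks a single copy, but 10 stored copies stay 10 copies.
compressed = zlib.compress(attachment)
print("one copy:", len(attachment), "->", len(compressed), "bytes")
```

Real arrays do this at the block level rather than the file level, but the hash-before-store idea is the same.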

Garbage collection: It took me some time to understand this. I will put it my way and provide you links to blogs where I understood it better.

Flash writes data in a different way. When we try to update data, instead of rewriting the old cells, it writes to new free cells. This makes the old cells' data invalid, or stale. Garbage collection is the process by which this old stale data is erased, making those stale cells available for the next use. You can find better diagrammatic explanations on Wikipedia and also in some tech blogs at the links below.

Link to understand garbage collection
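
The out-of-place write behavior described above can be sketched as a toy model (a deliberately simplified illustration, not how any particular controller actually works):

```python
# Toy flash model: updates go to fresh pages, old pages turn stale, GC reclaims.
pages = {}          # logical block address -> physical page
state = {}          # physical page -> "valid" or "stale"
next_free = 0

def write(lba):
    """Out-of-place write: use a new page, mark the old copy stale."""
    global next_free
    if lba in pages:
        state[pages[lba]] = "stale"      # invalidate the previous copy
    pages[lba] = next_free
    state[next_free] = "valid"
    next_free += 1

def garbage_collect():
    """Erase stale pages so they can serve future writes."""
    reclaimed = [p for p, s in state.items() if s == "stale"]
    for p in reclaimed:
        del state[p]
    return len(reclaimed)

write(0); write(1); write(0); write(0)   # LBA 0 updated twice -> 2 stale pages
print("stale before GC:", sum(s == "stale" for s in state.values()))
print("pages reclaimed:", garbage_collect())
```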