AWS re:Invent -New services and features

December 11, 2016 Srik DevOps & Cloud amazon, aws, efs, re:Invent, snowball, Snowmobile

Amazon Web Services announced many new services and features at the recently held AWS re:Invent 2016. In this blog post I am writing about the new announcements related to storage and data migration.

AWS Snowmobile:

AWS Snowmobile is nothing but a beast that carries petabytes of your data to the AWS cloud on a truck. Yes, on a truck, I am not kidding 🙂


This secure data truck stores up to 100 PB of data and can help you move exabytes to AWS in a matter of weeks. Snowmobile attaches to your data center network and
appears as a local NFS-mounted volume. It includes a network cable connected to a high-speed switch capable of supporting 1 Tb/sec of data transfer spread across
multiple 40 Gb/sec connections. According to the official AWS blog, Snowmobile is available in all AWS Regions. To use this service, you need to contact the AWS Sales team.

AWS Snowball Edge:

The new Snowball Edge appliance has all the features of its twin brother Snowball, which was launched in 2015.


AWS Snowball Edge is a petabyte-scale data transport appliance with on-board storage and compute. It arrives with your S3 buckets, Lambda code, and clustering configuration pre-installed. You can execute AWS Lambda functions and process data locally on the AWS Snowball Edge. You can order a Snowball Edge with just a few clicks in the AWS Management Console.

Amazon EFS – New Feature

Amazon Elastic File System (Amazon EFS) provides simple file storage for use with Amazon EC2 instances in the AWS Cloud. With the new feature we can mount Amazon EFS file systems on our on-premises data center servers when they are connected to an Amazon VPC with AWS Direct Connect.

S3 Storage Management with new features

S3 Object Tagging:

S3 object tags are key-value pairs applied to S3 objects, which can be created, updated, or deleted at any time during the lifetime of the object. With
these, users can create IAM policies, set up lifecycle policies, and customize storage metrics.

S3 Analytics, Storage Class Analysis:

This new S3 Analytics feature automatically identifies the optimal lifecycle policy for moving less frequently accessed storage to S3 Standard – Infrequent Access. Users
can configure a storage class analysis policy to monitor an entire bucket, a prefix, or an object tag. Once an infrequent access pattern is observed, we can easily create
a new lifecycle age policy based on the results.
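
As a rough sketch of how an object tag can drive such a lifecycle policy, the snippet below builds a rule dictionary in the shape accepted by boto3's put_bucket_lifecycle_configuration. The tag key and value, the rule ID, and the 30-day age are illustrative assumptions, not values from the announcement; in practice the age would come from the storage class analysis results.

```python
def build_ia_rule(tag_key: str, tag_value: str, days: int) -> dict:
    """Build one lifecycle rule that moves tagged objects to Standard-IA."""
    return {
        "ID": f"move-{tag_key}-{tag_value}-to-ia",  # hypothetical rule name
        "Filter": {"Tag": {"Key": tag_key, "Value": tag_value}},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": "STANDARD_IA"}],
    }


# Hypothetical tag and age threshold, for illustration only.
lifecycle_config = {"Rules": [build_ia_rule("project", "archive", 30)]}
print(lifecycle_config)
```

The resulting dictionary would be passed as the LifecycleConfiguration argument, together with the bucket name, when calling the S3 client.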

S3 CloudWatch Metrics:

These help you understand and improve the performance of applications that use Amazon S3 by monitoring and alarming on 13 new S3 CloudWatch metrics. Users can receive 1-minute CloudWatch metrics, set alarms, and access dashboards to view real-time operations and performance, such as bytes downloaded from their Amazon S3 storage.

You can read my blog post on data migration to AWS by clicking the link below. Thanks and happy reading 🙂

Cloud data migration with AWS

June 12, 2016 Srik DevOps & Cloud amazon, aws, cloud, migration, snowball

Data migration is a key challenge in any cloud migration, and as a storage admin I have always been fascinated by the effort it takes to migrate petabytes of data to the public cloud. In this post I will try to give a brief outline of 3 out of the 8 ways in which we can migrate data to Amazon Web Services.

AWS Direct Connect: With AWS Direct Connect we have a dedicated network connection from our data center premises to AWS. With the high speeds available, you can either copy data directly from any of your servers to an S3 bucket using CLI commands or do a host-based migration to an EC2 instance with a sufficient number of EBS volumes.
Multiple connections can be used simultaneously for increased bandwidth or redundancy. We can also use the AWS Partner Network in case an AWS Direct Connect location is not available near our data centers.

AWS Import/Export Snowball: AWS Import/Export Snowball is a petabyte-scale storage migration solution. AWS will ship a storage device to your data center, which can hold 50 or 80 TB of data.

Once you receive a Snowball, you plug it in, connect it to your network, configure the IP address, and install the AWS Snowball client. Use the client to identify the directories you want to copy. Data is encrypted while it is copied to the Snowball and decrypted when AWS offloads it to S3. According to AWS, it takes 21 hours to copy 80 TB of data from your data source to a Snowball over a 10 Gbps connection at 80 percent network utilization. AWS has also shown a use case where a customer was able to migrate 1 PB of data in one week using multiple Snowballs in parallel.
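
As a back-of-the-envelope check on that figure, the sketch below estimates the copy time from data size, link speed, and utilization. It assumes decimal units (1 TB = 10^12 bytes) and ignores protocol and encryption overhead, so it is only an approximation. Under these assumptions it lands at a little over 22 hours, in the same range as the 21 hours AWS quotes.

```python
def transfer_hours(data_tb: float, link_gbps: float, utilization: float) -> float:
    """Estimate hours needed to push data_tb terabytes over a network link."""
    total_bits = data_tb * 1e12 * 8                  # assumption: 1 TB = 10^12 bytes
    effective_bps = link_gbps * 1e9 * utilization    # usable bits per second
    return total_bits / effective_bps / 3600


snowball_hours = transfer_hours(80, 10, 0.8)
print(f"80 TB over 10 Gbps at 80% utilization: {snowball_hours:.1f} hours")
```

The same function can be used to compare a Snowball against pushing the data over an existing WAN link, simply by changing the link speed.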

AWS Storage Gateway: Storage Gateway is installed on a local host in your data center. It creates an on-premises virtual appliance that provides seamless and secure integration between your on-premises applications and AWS’s storage infrastructure. It can create iSCSI volumes that keep all or recently accessed data on premises for faster response while asynchronously uploading this data to Amazon S3 in the background.

A combination of AWS Snowball with either Direct Connect or Storage Gateway will help make the migration much faster and easier. We can do a one-time migration of the data using Snowball and later apply differential data updates using Direct Connect or Storage Gateway. I hope this has given you a basic idea of migrations with AWS solutions. Thanks for reading.

You can also read my blog post on the next-generation SSD interface, NVMe. Click on the link below. Happy reading. 🙂

https://sskanth.com/2016/04/20/nvme-nextgen-sd-interface/

NVMe -NextGen SSD Interface

April 20, 2016 (updated April 22, 2016) Srik ALL FLASH ARRAYS flash, nvm, nvme

What is NVMe?

Let me first start with NVM. NVM is non-volatile memory, which means all the flash drives and SSDs that have revolutionized our storage world. NVMe is a protocol for writing and accessing data on NVM. As of now NVMe is promoted by a group of companies that includes Cisco, Dell, EMC, HGST, Intel, Micron, Microsoft, NetApp, Oracle, PMC-Sierra, Samsung, SanDisk and Seagate.

Why did you say it is the next-generation SSD interface? What are we using right now?

The Small Computer System Interface (SCSI) has been the most widely used standard for physically connecting and transferring data for hard disk drives and tape drives for more than a decade. We still use the same SCSI interface even for flash drives, which is becoming a bottleneck in utilizing flash to its full potential. The SCSI protocol works well for HDDs, but it is losing steam when it comes to SSDs. That is why we are looking at NVMe, which was developed exclusively for flash technology.

Can you explain more on how NVMe is different from SCSI?

Sure. First let us try to understand the difference between HDD and SSD.

HDD: A hard drive stores data on a series of spinning magnetic disks, called platters. There’s an actuator arm with read/write heads attached to it. This arm positions the read-write heads over the correct area of the drive to read or write information. Because the drive heads must align over an area of the disk in order to read or write data (and the disk is constantly spinning), there’s a wait time before data can be accessed. The drive may need to read from multiple locations in order to launch a program or load a file, which means it may have to wait for the platters to spin into the proper position multiple times before it can complete the command. If a drive is asleep or in a low-power state, it can take several seconds more for the disk to spin up to full power and begin operating.


SSD:

Solid-state drives are called that specifically because they don’t rely on moving parts or spinning disks. Instead, data is saved to a pool of NAND flash. Because SSDs have no moving parts, they can operate at speeds far above those of a typical HDD.


To access data on an HDD we use the SCSI protocol. SCSI sends commands one at a time and waits for the platter to rotate under the actuator arm and fetch the data back. We are using the same SCSI protocol for SSDs as well, which limits their performance. An SSD can serve more I/O at the same time because it has no rotating components, but we are driving it with SCSI commands that are processed one at a time. This is where NVMe comes in. NVMe parallelizes instructions: it is designed to have up to 64 thousand queues, and each of those queues can in turn hold up to 64 thousand commands simultaneously. In short, NVMe was developed specifically to leverage SSD technology.
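
To put those queue numbers in perspective, here is a small arithmetic sketch of the theoretical number of commands that could be outstanding at once. It assumes "64 thousand" means 64 × 1024; real devices and drivers typically expose far fewer queues than this upper limit.

```python
# Assumption: "64 thousand" is read as 64 * 1024 for both limits.
QUEUES = 64 * 1024
COMMANDS_PER_QUEUE = 64 * 1024

# Theoretical ceiling on commands in flight across all queues.
max_outstanding = QUEUES * COMMANDS_PER_QUEUE
print(f"{max_outstanding:,} commands could be in flight at once")
```

Compare that ceiling of roughly 4.3 billion with the one-command-at-a-time model described above.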

Do you have any metrics to support your claims about NVMe?

Yes, you can see the graphs below, published by SNIA (Storage Networking Industry Association), which show how NVMe compares to the SAS and SATA protocols for random and sequential workloads.

[Graphs: random workloads, sequential workloads]

Where can I get more information about NVMe?

You can get more information about the latest developments in NVMe from the official site http://www.nvmexpress.org/.

Anything else?

This may be the first time you are hearing about NVMe, but I bet it won’t be the last. NVMe is surely here to stay as flash technology comes to realize its full potential. Hope you enjoyed reading about it 🙂

What is Software Defined Storage?

March 12, 2016 Srik virtualization sddc, sds, software defined data center, Software Defined Storage


Anyone following trends in IT will definitely see a new approach in data centers moving towards a software-defined model. The new buzzword for this is SDDC, the software defined data center. In an SDDC all infrastructure components are virtualized and delivered as a service. A similar trend can be observed in storage, which is the heart of the data center, and it is called SDS, software defined storage.

The exact definition of SDS is still evolving, but the generally accepted definition is that software defined storage is “where the management and intelligence of the storage system is decoupled from the underlying physical hardware.”


The following are the areas where implementing SDS will make a difference.

Administration: As a storage admin I follow different processes to achieve the same task (provisioning, reclamation) on arrays from different vendors like EMC, NetApp and Hitachi. With SDS we can manage all the storage infrastructure in the data center from a single pane and follow the same steps for a task irrespective of the manufacturer. SDS makes extensive use of APIs to communicate with the arrays. In data centers with a private cloud implementation, SDS will definitely help in improving automation and orchestration. An example would be EMC ViPR.

Use of commodity hardware: With every new storage array we buy, we end up buying licenses for a similar set of features like snapshots, cloning, replication, data mobility, encryption, and thin provisioning. In SDS, since the intelligence of the storage system is decoupled from the underlying physical hardware, we can save the cost of these repeated features. We can also turn any x86 commodity hardware into robust enterprise storage with the help of SDS solutions like DataCore and Nexenta.

Cloud integration: No new software or hardware solution is complete without integration with the public cloud, and the same is the case for SDS. SDS can be used to pool resources from the cloud and to manage both in-house and public cloud assets under a single pane. Since SDS abstracts storage from the underlying physical hardware, it is useful for seamless data transfer between private and public cloud in either direction.

In short, software-defined storage is a fundamental component of the software defined data center, providing a range of scale-out solutions to meet rapidly growing and changing data demands.

2015 in review

January 3, 2016 Srik Uncategorized

The WordPress.com stats helper monkeys prepared a 2015 annual report for this blog.

Here’s an excerpt:

A San Francisco cable car holds 60 people. This blog was viewed about 1,600 times in 2015. If it were a cable car, it would take about 27 trips to carry that many people.


learning about all flash arrays

November 1, 2015 Srik ALL FLASH ARRAYS flash, purestorage

I started learning about flash arrays as all-flash arrays started occupying data center floors, a trend that is going to continue for the next 5 to 10 years. Gartner predicts in a report that “By 2019, 20% of traditional high-end storage arrays will be replaced by dedicated solid-state arrays (SSA).” Let me start this flash series by explaining the new terms most commonly used by all-flash array (AFA) vendors.

1)PE cycles
2)different types of flash available now
3)over provisioning
4)compression and Dedupe
5)garbage collection

PE cycles: The life expectancy of a flash drive is expressed in program/erase (PE) cycles. Flash cells wear out a little every time they are erased or programmed. This is similar to erasing the same spot on a sheet of paper with an eraser many times, which may eventually tear the paper.

Different types of flash available now: Below is a picture of the different types of flash available as of now and their differences.

Over provisioning: This is the inclusion of extra storage capacity in a solid-state drive. That extra capacity is not visible to the host as available storage. It is like under-promising and over-delivering. The vendor gives you additional hidden capacity, which helps distribute the total number of writes and erases over a larger number of flash cells. This increases the life expectancy of the drive.

Compression and Dedupe: There is a thin line of difference between dedupe and compression which most people fail to understand (including me; it took me reading multiple websites to understand it 🙂).

Dedupe: Dedupe occurs at the file level. Suppose you send a mail with a 1 MB attachment to 10 people. The Exchange server will save only one copy of the attachment and mark the rest as duplicates. This saves a lot of space, as we use just 1 MB instead of 10 MB.

Compression: With compression you are using some algorithm or other to reduce the size of a particular file by eliminating redundant bits. But if your users or applications have stored the same file multiple times, then no matter how good your compression method is your storage will end up with multiple copies of the compressed files.
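
The difference can be sketched in a few lines of Python. This is only an illustration of the idea, assuming content hashing (SHA-256) for dedupe and zlib for compression; the sample attachment and the in-memory "store" are hypothetical and not how any particular mail server or array implements it.

```python
import hashlib
import zlib

# Hypothetical attachment of roughly 1 MB, sent to 10 mailboxes.
attachment = b"quarterly report line\n" * 50_000
mailboxes = [attachment] * 10

# Dedupe: keep a single copy per unique content hash.
dedupe_store = {}
for blob in mailboxes:
    dedupe_store.setdefault(hashlib.sha256(blob).hexdigest(), blob)
deduped_bytes = sum(len(blob) for blob in dedupe_store.values())

# Compression only: every copy shrinks, but all 10 copies are still stored.
compressed = [zlib.compress(blob) for blob in mailboxes]
compressed_bytes = sum(len(c) for c in compressed)

print("copies kept after dedupe:", len(dedupe_store))
print("copies kept after compression:", len(compressed))
```

In practice the two techniques are complementary: a system can dedupe first and then compress the single remaining copy.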

Garbage collection: It took me some time to understand this. I will put it my way and point you to the blogs that helped me understand it better.

Flash writes data in a different way. When we update data, instead of rewriting the old cells, it writes to new free cells. This makes the data in the old cells invalid, or stale. Garbage collection is the process by which this stale data is erased so that those cells become available for the next use. You can find a better diagrammatic explanation on the wiki and in some of the tech blogs linked below.
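
Here is a toy model of that behaviour, just to make the idea concrete. It is a deliberately simplified sketch: the class and method names are made up for illustration, and real flash erases whole blocks (relocating any still-valid pages first) rather than freeing individual pages as this model does.

```python
class FlashSketch:
    """Toy flash model: updates go to a new page, old pages turn stale."""

    def __init__(self, page_count: int):
        self.pages = ["free"] * page_count
        self.mapping = {}  # logical address -> physical page index

    def write(self, logical_addr: str) -> None:
        new_page = self.pages.index("free")  # raises ValueError if no free page
        old_page = self.mapping.get(logical_addr)
        if old_page is not None:
            self.pages[old_page] = "stale"   # old data becomes invalid
        self.pages[new_page] = "valid"
        self.mapping[logical_addr] = new_page

    def garbage_collect(self) -> int:
        """Erase stale pages so they can be reused; return how many were freed."""
        reclaimed = 0
        for i, state in enumerate(self.pages):
            if state == "stale":
                self.pages[i] = "free"
                reclaimed += 1
        return reclaimed


flash = FlashSketch(4)
flash.write("A")   # first write lands in page 0
flash.write("A")   # update lands in page 1, page 0 turns stale
print(flash.pages)
print(flash.garbage_collect(), flash.pages)
```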

Link to understand garbage collection

Why does a front-end director port WWN not change after replacement in Symmetrix?

April 19, 2015 Srik EMC EMC, FA WWN

Why does a front-end director port WWN not change after replacement in Symmetrix?

This question used to bother me for a long time. I used to tell myself that there must be some mechanism EMC uses to give the new director the same port WWN as the old one after a replacement. Recently I found a primus which helped me understand FA WWNs in depth.

I will follow primus emc223285 and try to decode Symmetrix World Wide Names (WWNs) on a VMAX.

We will be using the following WWN as an example: 50000972081349AD

5000097: these hexadecimal digits are assigned by the IEEE and are the vendor unique ID of the Symmetrix V-Max, so these
digits are the same for any FA WWN of a VMAX array.

Now let us break down the remaining part of the WWN, 2081349AD.
Start by converting the hexadecimal digits of the WWN into binary.

2    0    8    1    3    4    9    A    D
0010 0000 1000 0001 0011 0100 1001 1010 1101

bit35 <—————————————–< bit 0

Starting from left to right, number the bits from 35 on down to 0.

Follow the screenshot below along with the description to understand it well.

Bit arrangement description:

Bits 35 through 33: Bits 35 through 33 deal with the build location of the array. For any given WWN, one of the 3 bits will be set; the other two bits will not be set. If bit 35 is set, the array was made in China and the Serial Number starts with CN49xxxxxxx. If bit 34 is set, the array was made in Europe and the Serial Number starts with CK29xxxxxxx. Lastly, if bit 33 is set, the array was made in the USA and the Serial Number starts with HK19xxxxxxx. In the example provided, bit 33 is set, which indicates the example WWN has a Serial Number starting with HK19xxxxxxx.

Bits 32 through 26: Bits 32 through 26 deal with the Symmetrix model type.
Refer to the following chart for a breakdown of the bits.

In the example WWN, the breakdown of bits 32 through 26 is 0 0 0 0 0 1 0, which indicates that the Serial Number starts with HK1926xxxxx.

Bits 25 through 10:Bits 25 through 10 encode the last 5 digits of the Symmetrix Serial Number. Take those bits and place them into a scientific calculator set to binary, convert the binary into decimal, and you will receive the last 5 digits of the Symmetrix Serial Number. In the example, bits 26 through 10 come out to be 000 0001 0011 0100 10. Placing this into a calculator and converting from binary to decimal yields 1234. If the yield comes out to be a 4-digit number, front pad the number with a 0 to make 5 digits (i.e., 01234). The full Serial Number for the Symmetrix in the example WWN is HK192601234.

Bits 9 through 6: Bits 9 through 6 hold the encoding for the processor letter (CPU letter) of the director. Match the bit pattern against the chart below.

Bits 5 through 2: Bits 5 through 2 hold the encoding for the director number. Match the bit pattern against the chart below.

The last two bits: Lastly, bits 1 and 0 hold the encoding for the director port. If the bits are 00, then it is the 0 or A port. If the bits are 01, then it is the 1 or B port. In the example, it is port 1 or B.

So, finally, WWN 50000972081349AD decodes to VMAX / VMAX 20K HK192601234 director 12g port 1/B 🙂
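
The bit slicing above can be sketched in Python. This is a minimal illustration, not an EMC tool: the function name is made up, the model bits are returned raw because the model chart is not reproduced here, and the processor-letter and director-number mappings (0 = a, value + 1 = director) are assumptions inferred only from the worked example (12g), not taken from the primus charts.

```python
def decode_vmax_wwn(wwn: str) -> dict:
    """Split a VMAX FA WWN into the bit fields described in this post."""
    wwn = wwn.strip().upper()
    if len(wwn) != 16 or not wwn.startswith("5000097"):
        raise ValueError("expected a 16-digit WWN starting with 5000097")
    bits = int(wwn[7:], 16)  # remaining 9 hex digits = 36 bits

    def field(high: int, low: int) -> int:
        """Return bits high..low (inclusive) as an integer."""
        return (bits >> low) & ((1 << (high - low + 1)) - 1)

    prefixes = {0b100: "CN49", 0b010: "CK29", 0b001: "HK19"}
    return {
        "serial_prefix": prefixes.get(field(35, 33), "unknown"),
        "model_bits": format(field(32, 26), "07b"),  # look up in the model chart
        "serial_last5": f"{field(25, 10):05d}",
        "processor": chr(ord("a") + field(9, 6)),    # assumption: 0 maps to 'a'
        "director": field(5, 2) + 1,                 # assumption: value + 1
        "port": field(1, 0),
    }


print(decode_vmax_wwn("50000972081349AD"))
```

For the example WWN this yields the HK19 prefix, model bits 0000010, serial digits 01234, processor g, director 12 and port 1, matching the manual decode.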

what is Inquiry utility (INQ)?

Aside March 20, 2015 Srik EMC

So what is INQ?

Inquiry utility (INQ) is a command-line troubleshooting utility that displays information on storage devices, typically Symmetrix. By default, INQ returns the device name, Symmetrix ID, Symmetrix LUN, and capacity. This utility will operate independently of any other EMC software. Use the INQ Utility to collect system information to provide to EMC Global Services for problem troubleshooting.

But we generally use the EMC Grab report for that?

Yes, INQ is also one of several tools bundled and run as part of the host grab utilities (EMC Grab and EMCReports).

Can we analyze INQ output?

Luckily we can analyze INQ output. Below is the process. Generally we see two types of INQ output, depending on the Enginuity level.

For older versions, INQ output will be in the format below.

When running inq or syminq, you’ll see a column titled Ser Num. This column has quite a bit of information hiding in it.

      Device                   Product                Device
----------------------  ------------------------  -----------------
Name              Type  Vendor  ID          Rev   Ser Num   Cap(KB)
----------------------  ------------------------  -----------------
/dev/dsk/c1t0d0         EMC     SYMMETRIX   5265  73009150  459840
/dev/dsk/c1t4d0   BCV   EMC     SYMMETRIX   5265  73010150  459840
/dev/dsk/c1t5d0   GK    EMC     SYMMETRIX   5265  73019150  2880
/dev/dsk/c2t6d0   GK    EMC     SYMMETRIX   5265  7301A281  2880

Using the first and last serial numbers as examples, the serial number is broken out as follows:

73 Last two digits of the Symmetrix serial number

009 Symmetrix device number

15 Symmetrix director number. If <= 16, using the A processor

0 Port number on the director

If Device Serial Number = “71018000”, Legend = SSVVVDDP

SS = Last 2 digits of system S/N
VVV = Volume number (000 – FFF)
DD = Director number (01 – 16 is A director, 17 – 32 is B director)
P = Port (0 – 3)
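
The SSVVVDDP legend maps directly onto string slices. The sketch below is a minimal illustration with a made-up function name; it assumes the DD field is decimal, as the 01 – 32 range in the legend suggests.

```python
def decode_ser_num(ser_num: str) -> dict:
    """Split an older-style INQ 'Ser Num' value using the SSVVVDDP legend."""
    if len(ser_num) != 8:
        raise ValueError("expected an 8-character serial number (SSVVVDDP)")
    director = int(ser_num[5:7])  # assumption: DD is a decimal field
    return {
        "symm_serial_last2": ser_num[0:2],
        "device": ser_num[2:5],                        # hex volume number
        "director": director,
        "processor": "A" if director <= 16 else "B",
        "port": int(ser_num[7]),
    }


# First and last rows of the sample output above.
print(decode_ser_num("73009150"))
print(decode_ser_num("7301A281"))
```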

In newer INQ outputs we can generally find columns for the array serial number, device ID and device WWN directly (I don’t have INQ output from a newer version, so I am not posting it here).

What if I don’t have the INQ utility and can get only multipathing output?

Yes. If the multipathing output is from PowerPath, then we can directly get most of the details.

#powermt display dev=all ====> Display All Attached LUNs

We mostly run this powermt command, which displays all the logical devices attached to the server.

Pseudo name=disk915
Symmetrix ID=000290103691
Logical device ID=06B8
state=alive; policy=SymmOpt; priority=0; queued-IOs=0;
==============================================================================
--------------- Host ---------------   - Stor -   -- I/O Path --  -- Stats ---
###  HW Path               I/O Paths    Interf.   Mode    State   Q-IOs Errors
==============================================================================
3 0/4/0/0/0/1.0x5006048c52a862e7.0x40a6000000000000 c14t4d6 FA 8cB active alive 0 2
3 0/4/0/0/0/1.0x5006048c52a862f7.0x40a6000000000000 c15t4d6 FA 8dB active alive 0 2
5 0/5/0/0/0/1.0x5006048c52a862e8.0x40a6000000000000 c16t4d6 FA 9cB active alive 0 2
5 0/5/0/0/0/1.0x5006048c52a862f8.0x40a6000000000000 c17t4d6 FA 9dB active alive 0 2

What if I have only native multipathing?

Then we have to use a bit of technique to decode the device WWN, which is part of the CTD addressing.

General EMC device WWN: 60000970000192605542533030363338

You can break the above WWN up this way: 60000970000 192605542 5330 30363338

192605542 is the serial number of the array in decimal.

The last 8 digits, 30363338, are the Symmetrix device number encoded in ASCII.

Hex to ASCII

30=0

36=6

33=3

38=8

so device is 0638

So now we can say the device is 0638 on the array whose serial number ends in 5542.
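
The same decode can be done in a couple of lines of Python. This is a small sketch with a made-up function name; it assumes the 32-digit layout shown above (11-digit prefix, 9-digit array serial, 4 more digits, then 8 hex digits of ASCII-encoded device number).

```python
def decode_device_wwn(wwn: str) -> tuple:
    """Return (array serial, Symmetrix device) from a 32-digit device WWN."""
    wwn = wwn.strip()
    if len(wwn) != 32:
        raise ValueError("expected a 32-digit device WWN")
    array_serial = wwn[11:20]                          # serial number in decimal
    device = bytes.fromhex(wwn[24:]).decode("ascii")   # e.g. 30 36 33 38 -> 0638
    return array_serial, device


serial, device = decode_device_wwn("60000970000192605542533030363338")
print(f"device {device} on array {serial}")
```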

Where exactly can I use this INQ or multipathing output?

These outputs help us know which storage devices the server is able to see from the SAN side. This can help us resolve missing-device or path-down tickets.

Where do I get the INQ utility?

Below is the FTP link

INQ utility

What else?

Yup, I am done :). If you have more info or any corrections, please feel free to mail me.

SAN QUES SET 01

Aside March 7, 2015 Srik Interview Ques & Exp

1) What is an FC controller, a disk controller, and a disk array controller?
A. An FC controller is nothing but an HBA (Host Bus Adapter).
The disk controller is the circuit which enables the CPU to communicate with a hard disk, floppy disk or other kind of disk drive.
A disk array controller is a device which manages the physical disk drives and presents them to the computer as logical units. It almost always implements hardware
RAID, thus it is sometimes referred to as RAID controller. It also often provides additional disk cache.

A disk array controller name is often improperly shortened to a disk controller. The two should not be confused as they provide very different functionality.

2) Difference and usage of RAID 0+1 and RAID 1+0
A) I don’t want to write the answer again. There are already so many blogs explaining the different RAID levels and
their performance.

3) Default ID for SCSI HBA?
A) In a typical parallel SCSI subsystem, each device is assigned a unique numerical ID. As a rule, the host adapter appears as SCSI ID 7, which gives it the
highest priority on the SCSI bus (priority descends as the SCSI ID descends; on a 16-bit or “wide” bus, ID 8 has the lowest priority, a feature that maintains
compatibility with the priority scheme of the 8-bit or “narrow” bus). So the answer is 7.

4) Highest and lowest priority in SCSI?
A) Each SCSI device is addressed on the bus via a specific number. For narrow SCSI (which allows up to 8 total devices), these are numbered 0 through 7; for wide SCSI
(16 devices) the numbering is 0 through 15. The priority that a device has on the SCSI bus is based on its ID number. For the first 8 IDs, higher numbers have higher
priority, so 7 is the highest and 0 the lowest. For Wide SCSI, the additional IDs from 8 to 15 again have the highest number as the highest priority, but the entire
sequence is lower priority than the numbers from 0 to 7. So the overall priority sequence for wide SCSI is 7, 6, 5, 4, 3, 2, 1 , 0, 15, 14, 13, 12, 11, 10, 9, 8.
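
The priority order described above can be generated with a short helper, shown here purely as an illustration (the function name is made up):

```python
def scsi_priority_order(wide: bool = True) -> list:
    """Return SCSI IDs from highest to lowest bus priority."""
    order = list(range(7, -1, -1))        # 7 down to 0
    if wide:
        order += list(range(15, 7, -1))   # then 15 down to 8
    return order


print(scsi_priority_order(wide=False))
print(scsi_priority_order(wide=True))
```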

5) SAN TOPOLOGY?
A) We have different types of topologies like core edge and mesh topology.

6) How many disks are minimum for Raid 5?
A) 3 disks are minimum for RAID 5.

7) Can a hot spare be assigned for RAID 0?
A) RAID 0 has no parity, so there is no point in assigning a hot spare, since we cannot rebuild the data from the failed disk.

8) What is HA?
A) HA means High Availability. We do a lot of things to maintain high availability in a SAN, like maintaining redundant fabrics, RAID systems, and hot spares.
All these things help us avoid single points of failure and allow the system to tolerate faults and failures.

SSKANTH

all about storage,cloud and DevOps
