Cisco Tidal – Riding the (Hadoop) Elephant

So, now that I’ve done my best to explain the origins of Hadoop and where it is today, as well as what it can do for us tomorrow, I’d like to spend a bit of time talking about the potential Cisco sees in Hadoop, and what they’re doing to put that power in their users’ hands.

The first thing any server manufacturer who wants to cash in on the Hadoop hype needs to do is to choose a Hadoop distribution to support. Seeing that Cisco and EMC are already tightly aligned in several areas within the data center, it was a natural fit for Cisco to piggyback onto the EMC strategy and utilize their expertise in the area.
The Cisco Hadoop solution consists of Cisco’s C-series rack servers, built into half rack and full rack combinations (for both performance and for capacity approaches). Cisco then layers on the EMC Greenplum MR stack (which is an OEM version of MapR’s Hadoop distribution) and some added treats to help accelerate performance and add extra value to the MapR stack.

Cisco UCS has shown that it can provide some of the industry’s finest density, performance and management without sacrificing anything along the way, so coupling the platform with a tried and tested Hadoop solution is a natural step in Cisco’s evolution within the server market.

All very fun stuff, but wait… there’s more!

Cisco has taken a look at where many organizations struggle when it comes to running large Hadoop environments, and has decided that workload scheduling is the major roadblock to enterprise-wide Hadoop adoption. The reason for this is that most Hadoop implementations start really small and are used for a specific problem a company is looking to solve. As the company realizes Hadoop can be used in many different areas, it starts to deploy many small Hadoop clusters, each a silo focused on a specific area. As the technology gains momentum internally, and as users learn how to apply the Hadoop approach to a broader cross section of the questions they want answered, the number of Hadoop nodes expands quickly from 10 to 100 and beyond!
At this point, managing workload scheduling and execution for many applications and hundreds of Hadoop jobs is an unruly task and one that requires extra tools, as well as a centrally-managed approach versus a departmental management approach.
To facilitate this kind of control, Cisco is leaning on its acquisition of Tidal Software, and more specifically on the Tidal Enterprise Scheduler, which is a comprehensive workload scheduler and management system. With the 6.1 release the software can now schedule and control complex workloads of up to around 40,000 jobs per month, and is certified to do so on the Cloudera, MapR and EMC/Greenplum distributions.
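To make the scheduling problem a little more concrete, here is a minimal sketch in Python of the core idea behind any workload scheduler: jobs depend on other jobs, and the scheduler has to work out a safe order in which to run them. This is not Tidal's API or how the Tidal Enterprise Scheduler is implemented; the job names and the `execution_order` helper are made up purely for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical Hadoop jobs. Each job maps to the set of jobs that must
# finish before it is allowed to start.
jobs = {
    "load_weblogs": set(),
    "load_sales": set(),
    "sessionize": {"load_weblogs"},
    "join_sales_sessions": {"sessionize", "load_sales"},
    "daily_report": {"join_sales_sessions"},
}


def execution_order(dependencies):
    """Return one valid run order in which every job follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(jobs)
print(order)
```

With five jobs this is easy to do by hand. With hundreds of jobs spread across departmental clusters it is not, which is the argument for a centrally managed scheduler.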

 

Cisco is obviously leading the Tier-1 charge towards providing a comprehensive and easy-to-manage solution for clients looking to dive into the exciting world of Hadoop. Not only that, but they have taken it a step further with the ability to manage the Tidal Enterprise Scheduler from an iPhone. Yes, that’s right: you can now run thousands of Hadoop jobs while sipping a cold beer hundreds or thousands of miles away from the physical iron. Now that’s convenience, and with the Enterprise in mind.

Posted in Uncategorized | Leave a comment

Wulfman’s Hadoop for Dummies

It seems like innovation and change in the IT industry come in waves, and currently there are a number of areas in which we are seeing massive changes in approach, opinion and architecture. Whether it’s the “Bring your own device (BYOD)” revolution, solid state technology as the new golden child of storage, or finding new ways to manipulate and drive value from data, there are big changes going on all around us.

One area that shows a ton of promise, but thus far has not been adopted or seriously considered by most organizations, is Hadoop technology, which focuses on that third point mentioned above: the ability to find new ways to analyze and crunch data in order to derive more actionable value from it.

Many people reading this blog may be asking themselves something like “I’ve heard of Hadoop, but what exactly is it? And why would it be important?”

I’ll do my best to answer the above questions by breaking it down into a few sections…

What is Hadoop?

Hadoop is an open source project built around a few core modules, based on techniques first pioneered by Google as a way to crunch massive amounts of data in an efficient, distributed and scalable way. The core modules are:

  • Hadoop Common – the shared libraries and utilities that the other modules rely on
  • MapReduce – the programming model that allows Hadoop to crunch massive (think Google) sized data sets
  • HDFS – the Hadoop Distributed File System, which is the repository where all the data to be crunched sits
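
For those who like to see things in code, here is a toy, single-machine sketch of the MapReduce idea in Python, using the classic word count example. It is not Hadoop code: the function names are my own, and a real cluster would spread the map and reduce steps across many nodes reading from HDFS. It only shows the shape of the model: map each input to key/value pairs, group the pairs by key, then reduce each group to a result.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all the counts for one word into a total."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data", "Big elephant"])
print(counts)
```

Because each document is mapped independently and each word is reduced independently, both steps can be farmed out to as many machines as you have, which is where the scalability comes from.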

You may ask how this differs from traditional databases and applications that use an RDBMS – Relational Database Management System (think SQL Server, Oracle, MySQL, DB2), and at a glance the two may seem very similar. But the real problem is that RDBMSs in their current form are not built to deal with huge data sets that need to be pulled in from many different sources and processed very quickly.

So in essence, Hadoop allows companies to ingest data sets that are too large and unruly for their existing RDBMS instances, and crunch that data to get intelligence and information they can act on within their respective markets. The idea is that with Hadoop, companies can crunch a lot more external data (think web-based data, location information, tweets… which can have tons of value, but are usually too large and unruly to be analyzed by an RDBMS), and correlate it against their own internal databases, which are smaller and more focused. This will yield market intelligence and allow clients to solve all types of market challenges that in the past they couldn’t solve in an easy and economically viable way.

The Who’s Who in Hadoop’land

Many of you must now be saying to yourselves “why hasn’t everyone started using Hadoop?” or “why isn’t everyone at least talking about or planning around Hadoop?”

While many people are talking about Hadoop, the movement towards using it at any real scale has not happened, and the reason for that is the same as with many new open source projects. Usually open source projects are the brainchildren of highly innovative companies, in this case Google and to a lesser extent Yahoo!, and once released into the wild, these projects need companies to pick them up, add to their code and find ways to bring the value of the innovative technology to the masses.

Currently Hadoop has very few experts in the field adept at building and managing its complex requirements and differing approach to information management and ingestion. So what we are seeing is a grassroots movement where Hadoop is being used in smaller, specialized instances to solve specific business problems.

For Hadoop to really gain the momentum many hope it will, there needs to be a strong and committed ecosystem of partners who make the technology accessible and usable by normal IT outfits, who may not have the expertise or time to fully re-train and re-organize the way they approach crunching and analyzing data.

Thus far there have been a few formal distribution companies that have signed onto the Hadoop project and have put solid, stable and viable Hadoop products into the marketplace, along with the support structure and in some cases enough ease of use features so that non-Hadoop users feel comfortable with the technology. Below is a quick rundown of each and a bit about where they currently are in their development and client engagement cycle:

1)      Cloudera – Which is by far the most established Hadoop distro, with the largest number of clients and application development partners, the longest time in the market, the largest number of Hadoop developers and the largest number of Hadoop code updates submitted. So you may be asking why I am even going to cover any other vendors, as this one seems to have the most solid footing and longest-standing history on the landscape. The “gotcha” moment with Cloudera is that they are not 100% open source, as their “Cloudera Manager” is proprietary and fully controlled by Cloudera, and for some purists that may not be the route they want to take. Cloudera’s partners also include names like HP, Cisco, IBM, Dell, NetApp, Oracle and many other large Tier-1 OEMs.

2)      Hortonworks – Known as the #2 for basically ALL of the above mentioned areas, and has been in the game almost as long as Cloudera but with a true open source product, unlike the “freemium” option that Cloudera operates on. Their banner partners include HP, NetApp and Symantec.

3)      MapR – Which has been getting some recent press around its EMC partnership, but which has been producing less revenue and fewer clients than the above two companies. Up until this point they have also been a lot less of a heavy hitter due to their lack of developers releasing and pushing Hadoop code out into the wild for clients to use. EMC and Cisco are really their only big partners, with EMC using their code to run its Hadoop solution on its Greenplum appliances.

The next few years will see a race between these three companies, their partners and respective ISVs to see who can dominate the upcoming Hadoop craze and drive the most end user value. It will be an exciting time, and one I look forward to providing updates on.

 

 


Private Clouds – not just about the Automation and Orchestration

This week, I finished up a series of road-shows and it was reminiscent of “engagements” that I did years ago…

Customer Event 14Nov12 Atlanta

As I sat down and started building my message and the presentation, I remembered how important Data Center Optimization really was, and how often it was left behind.  The framework that I created years ago still lives on as a guidepost to the areas that must be addressed if you are going to build a true cloud strategy and achieve the results that so many desire.  Ironically enough, I can’t stop myself from thinking, what’s old is new again!

 

Let’s start this conversation with what is relevant to every IT manager, at any level (C,V,D,M), and that is always top of mind…COSTS.  Cost constraints in IT are a constant, unless you are in the business of technology/technology is your business.  “Do more with less, do more with what we have…just do more, but with no additional budget…” these are the conversations that we commonly hear throughout the year, especially during budget planning seasons.

The inverse of these conversations, which I am still glad to hear happen, is “IT – keep us competitive, give us speed to market, allow our users the ability and freedom to roam and work on their time to support our customers…”  All while maintaining security, reliability, and up-time.  These are the conversations we should all love to have, because they support the health and growth of the customer, beyond the day to day cost and operational conversations.

Conversations like this bring cloud, specifically PRIVATE CLOUD, to the front table as companies look at ways to provide the business with the flexibility they so badly desire.  Confused about Private Cloud? (I find sales reps are…)  Private Cloud is the “intersection of virtualization and cloud technologies…” Tom Bittman, VP at Gartner, has a great video to frame this…


“It’s a Private Cloud we desire, but it’s a legacy infrastructure that holds us back…” Frank Ball

With the advent of High-Availability and Fault Tolerance in Virtualization tools, more and more organizations are migrating physical work-loads to virtual.  As they “cross this chasm,” they are looking forward into the future for Automation, Orchestration, and self-service models to support the business (aka – Private Cloud).

But herein lies the problem.  Most companies jumped feet first into this pool called “virtualization,” and they forgot a lot of things along the way.  Left behind were critical areas of savings that could have benefited the organization and potentially funded the acceleration of the “Journey to the Cloud…”  Combine that with aged infrastructure that is not “Cloud Ready,” and IT services, processes and even people that aren’t aligned appropriately, and we seem to have one foot stuck in the past as we try to look forward to the future.  Not only is it costing us a LOT of money to support all of this “stuff,” but we aren’t really giving the business what they asked us for to begin with.

The 7 areas of Optimization?

1 – Applications
2 – Operating Systems
3 – Hypervisor Selection
4 – Platform Selection
5 – Networking and Security (virtualization and consolidation)
6 – Storage (virtualization and consolidation)
7 – Business Continuity and Disaster Recovery Planning

These areas need to be addressed when we first set out on the path to virtualization.  Not just for cost savings, but to give us the foundation to be able to provide true self-service models through Automation and Orchestration.

But most of us grabbed a hypervisor and started “playing” with workloads to solve a specific problem… Data Centers/Infrastructure were growing at an alarming rate, and we were running out of space… So we looked to virtualization to solve our “platform consolidation” problems.  Although servers were a big part of the equation, they weren’t the only piece to consider.  We still needed to address the “pipes, plumbing, and hot water heater.”

Network Consolidation and virtualization can contribute a 25% cost savings by migrating from “old to new.”

Storage Consolidation and virtualization can contribute a 50% cost savings by migrating from “old to new.”

What’s old is new again – and there are more efficient ways of running and managing our data centers, and Private Clouds provide a great opportunity for the future.

Standby – more to follow!


Adcap’s Journey to the Cloud Event

November 13th – Ft Lauderdale, FL


A Personal and Professional Journey

Moments of well-deserved introspection are rare; one should seize them with a passion not to be matched!

Many of you may be wondering what it is I mean when I write “well-deserved introspection”, as one can easily be introspective anytime one wishes. But if you think about it a bit more carefully, in order to really be introspective you need to have a strong base of history and experience on which to reflect, analyze and assess.

If you had met me five years ago you would never have thought that I had a techy bone in my body. However, the more I worked with clients and saw how virtualization allowed them to transform their businesses, the more enthusiastic I became about diving into the technology side of things head first!  While I am by no means a card-carrying expert, I have major experience with virtualization (and have been known to draw the odd diagram on a whiteboard)!

Over the past half-decade I have come to love what the mantra “virtualize it” can accomplish when approached in the right way, and when acted upon as often as possible.

Since 2008 I’ve dedicated myself at times to specific technologies (for those who don’t know me: HP and VMware), but in the end I have come back to focusing on my clients’ business outcomes and on deploying the technologies and vendors which make the most sense to support their business.

Be that as it may, in many cases these days I find myself and my clients deploying a vast array of technologies within their environments.  I also see a trend in certain areas.

Companies that are “technology-driven” are focused on finding ways to use IT as a competitive advantage and as a tool to create new revenue streams, while companies that are “technology-enabled” are focused on using IT to drive costs down and react to new pressures. There are companies who see IT as a burden and/or nothing more than a cost center, but for the most part I won’t be talking about those companies.

Technology-driven can be linked to “bleeding edge” technology, while technology-enabled can be linked to “leading edge”.

What I am seeing more and more within these areas of the marketplace is a growing focus on efficient virtual machine management and rapid provisioning. In order to facilitate both of these activities at the same time, one must have an infrastructure that has the deepest integrations possible with the underlying hypervisor, at the most granular layers, to ensure it can work in lockstep with the bare metal. But there also need to be tools to manage hundreds or thousands of servers at a very high level, so that quick decisions can be made to ensure rapid deployment of new services without affecting existing application service levels.

All of the major infrastructure providers in the industry have strong solutions for virtualization, cloud computing and general data center use cases. But other than one company, they are all still using a “1.0” approach (well, maybe some are up to a “1.5”). By this I mean that they have adequate solutions for their own company, but for those technology-driven/enabled companies, there is something left for want; it was that “want” that forced a moment of introspection for me. That moment of introspection opened my eyes to a new technology in the marketplace.

“What is that technology?”, you may be asking yourself. Well, it’s none other than Cisco UCS (Unified Computing System), a Virtualization 2.0 platform. I’ve seen what the ability to easily and efficiently manage 10,000 virtual machines can do for an organization, especially when we’re talking about driving revenue!

Honestly, this has been a bit of a journey for me, and I hope to share those experiences with you all!!

Think of UCS as the chef of your kitchen: irreplaceable, because they understand how to work quickly, they’re easy to deal with, and they are able to provide the results you always expect: a GREAT end result!

That being said, with every meal one needs great sides, which means I’ll be filling these pages with all the amazing things you can do with Cisco’s other solutions, and the incredible alliance of partners they are developing solutions alongside.

Coming up next: Hadoop for dummies, and what Cisco is doing to help bring this exciting technology to the masses!

 

 

 


Welcome to our world!!!

Welcome friends — Standby for Brilliance!!!
