
Apache Hadoop is a popular open source software framework for developing distributed computing applications. It offers real potential for creating a new business edge, and companies across many domains are starting to integrate it into their existing infrastructures.

However, because Hadoop has strong roots in the open source community, there is a general perception that the only cost associated with the framework is the time it takes to learn how to use it. While this assumption holds for just getting started with Hadoop, the question of cost becomes more complex as you move toward building the software into a product.

During my recent consultations with product development companies around incorporating Hadoop as part of their product offerings, I noticed that the majority of the discussion revolved around workforce costs, support costs, integration overheads, etc. Although Hadoop is a free software framework, there are various “hidden costs” behind adopting it. Below is a brief checklist to help you identify whether integrating Hadoop into a sustainable product offering is really cost-effective:

  • Cost of Search: A significant cost is incurred during the upfront evaluation study and proof-of-concept implementation for Hadoop. These tasks need to be done by expert resources (which can be costly), since your decision of whether or not to use Hadoop will be based on their results.

  • Cost of Acquisition: Although Hadoop itself is free, most businesses prefer to partner with a Hadoop-focused development partner to integrate the software into their existing solution. Having a partnership with a Hadoop vendor also ensures that if there is an issue with the software, you don’t have to wait for the Hadoop open source community to come up with a fix (which could take months or even years). There are also associated support and customization costs for specific business requirements, along with the cost of integrating Hadoop with the company’s existing platform.

  • Cost of Workforce: Regardless of whether you develop the solution in-house or outsource it, you will need to identify people with the right skill sets at a reasonable cost. You must also account for the costs of training your team, migrating your existing application and users, and developing processes and best practices.

  • Cost of Maintenance: In addition to general product support costs, companies must also continuously upgrade and maintain their product in order to keep up with the many disruptive innovations in the market.

So instead of looking at Hadoop as just free software, you should view it as a technology investment, weighing the immediate costs against the long-term benefits for overcoming the limitations imposed by your existing infrastructure.

About the Author

Navneet Kumar is a Technical Architect at GlobalLogic with over eight years of experience. He specializes in architecting and building custom Big Data Hadoop solutions for data ingestion, transformation, and analysis.

One of the fundamental elements of Agile is its focus on delivering a testable or demonstrable end-to-end functional slice that provides business value. This approach is the key catalyst for behavioral, cultural, and structural change.

As a product owner, I am more interested in the working functionality of a product than in the individual goals of team members. For instance, just the front-end of a user-story is of no use to me. Even if the developers finish coding, I cannot use the product if it still needs to be tested. This implicitly means that cross-functional team members have to collaborate and help each other to attain Sprint goals.

One of the key values that resonates time and again whenever we talk about teams and collaboration is, “How can I help you?” In other words, how can I help you finish your (or our, to be precise) task or user-story? Asking how you can help your teammates is one of the most important unwritten guidelines for an Agile team, which is why, as an Agile coach, I want it to be prominently displayed in the team area. It should be promoted as the most important component of a team’s DNA.

So how does Agile collaboration work in practice? Below are some examples:

  • Taking the time to fix a build that is preventing the whole team from using Continuous Integration before working on your own task

  • Pairing up with your colleague to help fix something that is blocking him/her

  • Instead of taking up a new user-story from the board, taking up one of the remaining tasks of the existing user-story so that it reaches the tester faster and therefore moves to the DONE column faster

Looks interesting and doable, doesn’t it? It pays rich dividends in attaining Sprint goals, as well. However, Agile is a cultural change and therefore requires a change in mindset, and any kind of cultural shift is tough to implement. For instance, even though most Agile teams know that collaboration and helping each other is the “soul” of the Agile method, in reality most teams continue to work in silos for many reasons.

One important factor in making these behavioral changes effective has to do with executive support. If the management team doesn’t embrace the collaborative approach, teams will continue to work in silos. For instance, without making fundamental changes in the way management defines individual KPIs, taking a collaborative approach may backfire on the team.

I recently met someone at an Agile conference who shared just this type of experience. He said, “I worked as a tester in an Agile team at my previous organization. As a team, we were collaborating together. We did dev-box testing (i.e., identifying issues on a developer’s machine itself before committing). I used to sit with developers to define acceptance criteria and test cases. All those things helped us in delivering great software applications with minimal bugs. However, when the time came for performance appraisals, my boss told me that I didn’t do my work well because I didn’t find enough bugs as part of testing.”

So although this tester’s team embraced the idea of a collaborative way of working, his organization still continued to focus on individual performances, which obviously doesn’t work.

The philosophy of “how can I help you?” is immensely important for the success of any Agile project. Because it is a fundamental element of team collaboration, it requires management support as well.

Avid is one of GlobalLogic’s largest clients. Its products include audio and video production solutions, media content storage, and broadcasting and news production tools, many of which our engineers create and support in Kyiv. A large number of technical specialists also work in Avid’s offices in the US, Germany, and Canada.

Recently, GlobalLogic and Avid celebrated the seventh anniversary of their partnership. On this occasion we talked to people who launched the first Avid projects and asked them to recall how it all started in 2008.

After Avid decided to move some of its engineering outside the company, its management started to consider different countries and partners. ‘A work group was created within the company, and it communicated with all the potential vendors, including GlobalLogic,’ remembers Igor Byeda, Senior Vice President and Managing Director of GlobalLogic Ukraine.

‘Communication with Avid took about six months, but eventually we won the contract, getting ahead of the largest Ukrainian IT companies,’ Igor says. ‘A strong team with deep technical expertise, as well as our ability to build good relations with the client, were the main factors that led us to success.’ GlobalLogic’s expertise in Agile had a notable impact as well. ‘The client was very interested in this practice and looked for a partner that could implement an Agile transformation across all of the company’s engineering,’ adds Yuliya Dubova, GlobalLogic Avid Lab Head.

As a result, Avid’s management chose two countries: Ukraine and China. For a few years the company worked simultaneously with two partners: GlobalLogic in Ukraine and Dextris in China.

At first, Avid planned a very limited scope of collaboration. ‘Initially, our task was to help develop a new framework for audio plug-ins and port completed plug-ins to the new framework and a new hardware architecture,’ recalls Volodymyr Vorobyov, who started working with Avid as a Tech Lead. ‘We took Motorola plug-ins, partially coded in assembler, rewrote them in C/C++, and optimized them for the new architecture.’

‘We managed to exceed our client’s expectations and get off to a good start,’ Igor Byeda adds, ‘which is why a new project with Avid started soon thereafter. It consisted of creating a new version of Pinnacle Studio, a home video editor that Avid had acquired earlier.’ The third important project, which involved porting the widely used Media Composer from a 32-bit architecture to a 64-bit one, took place in 2008 as well.

‘Porting Media Composer took a long time and involved a lot of difficulties that were skillfully resolved by our engineers. As a result, we got the opportunity to work on the most elaborate part of this product, namely the support and, later, the development of video codecs. Today, all the company’s expertise in video compression is concentrated in Kyiv, which highlights the confidence our client has in us,’ says Artem Kharchenko, Avid Video and Storage Head.

‘The extension of our collaboration developed gradually, in small steps. At every stage we proved that we could exceed our client’s expectations…’ Volodymyr remembers. ‘Eventually, new projects in video and related systems were launched.’

‘Our collaboration continued this way for more than a year,’ Igor adds. In the autumn of 2009, Avid made a strategic decision to significantly increase its engineering operations at GlobalLogic. Since then, active development of many of the company’s products has taken place in Kyiv. In fact, GlobalLogic got access to all of Avid’s solutions except the ones Dextris was working on.

‘I joined the team five years ago. At that time we took on a brand new product line: Media Enterprise,’ Yuliya recalls. ‘It includes media data storage and processing solutions for large broadcasting companies and news agencies. HBO, NBC, and National Geographic are among Avid’s clients.’

‘In 2010 we planned to expand very aggressively, and a lot of people doubted whether we would be able to find that many engineers in one city at the time,’ Igor says. Growth in volume was followed by a qualitative transformation. ‘All the new teams that the customer launched with us worked with Agile from the start, trying various structures and configurations of Scrum teams,’ Yuliya Dubova says. ‘Over time, more and more teams within Avid itself began using Agile after working with us.’

Following its successful collaboration with Avid’s R&D department, GlobalLogic started working with other branches of the client, such as Professional Services and Customer Support. ‘We got these projects because we already had extensive engineering expertise in the development of the Interplay MAM product, which had to be customized for specific broadcaster needs,’ Yuliya explains. ‘In a similar way, we started to work with Avid’s IT and Marketing departments.’

After a few years of parallel collaboration with the two countries, Avid transferred all of its offshore engineering to Ukraine. ‘Today many of Avid’s products are mostly developed in Kyiv,’ Volodymyr Vorobyov summarizes.

‘After Avid decided to move all the development to Kyiv, we were confronted with a very serious and interesting task: to build a lab to house a huge number of servers and remote data storage systems (ISIS), transfer them from China, absorb our competitor’s knowledge, and quickly form a team in Ukraine to deliver real results. We succeeded completely thanks to the professionalism of the IT team and well-coordinated cooperation with Avid’s management,’ Artem Kharchenko remembers.

These days Avid is going through a new stage of transformation. ‘The current trend in the media market is that people are reluctant to pay for software, yet they are ready to pay for media content,’ Yuliya Dubova comments. This is why all of Avid’s projects are being transferred to a new service-oriented architecture. A cloud platform called Avid Everywhere is being created that allows users to produce, store, and sell their media content. ‘For instance, Avid recently moved its Pro Tools to the cloud and made it accessible for free. Traditionally, it was a very expensive product developed for professional musicians,’ Yuliya adds.

According to Igor, the partnership with Avid is an important stage in the history of GlobalLogic. First, the company gained a very large and growing lab, which boosted GlobalLogic’s growth in Ukraine. Next, strong business relations were built with Avid’s management, which is now ready to recommend GlobalLogic to new customers. And finally, the company gained expertise in Digital Media. As a result, GlobalLogic has won a contract with Harmonic, which has also set up a large lab in Ukraine.

‘If we compare our business to a house built of bricks, then collaboration with Avid is a massive part of this house’s foundation,’ Igor Byeda sums up.

It’s a well-known fact that physical Scrum Boards provide many benefits over their electronic counterparts. With physical boards, the current sprint state is transparently visible to anyone on the team and to the stakeholders. As a team member, you are no longer required to explain to someone what exactly the team is focusing on right now, as anyone can look at the physical board at any point in time. Also, during standup, story-card and sprint progress get more attention than individual progress. You can set up your physical board the way you want, and you don’t have to work around the limitations of any electronic tool.

scrum-board-1

I would need an entire separate blog post to continue listing the advantages of the physical Scrum Board. However, I’d like to focus instead on how to overcome the practical issues of setting up a physical Scrum board. Believe it or not, it can sometimes be difficult just to obtain the right stationery for index cards. Also, teams sometimes hesitate to hand-write cards since not everyone’s handwriting is legible.

While working on a practical (and cheaper) solution, I thought of automatically printing index cards on colored A4 paper by exporting the sprint backlog from Jira to Excel and using the Word mail-merge feature on Windows. As you can see, the result turned out to be awesome!
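
If you prefer scripting to the Word mail merge, the same idea can be automated end to end. Below is a minimal sketch in Python that reads a sprint backlog exported from Jira as CSV and renders one card per issue as a printable HTML page. The column names ("Issue key", "Summary", "Assignee", "Story Points") follow a typical Jira export but are assumptions here; adjust them to whatever your own export produces.

```python
import csv
import html

# Simple card layout; print the resulting HTML from a browser onto colored A4 paper.
CARD_TEMPLATE = """
<div style="width:12cm;height:7cm;border:2px solid #333;margin:0.5cm;
            padding:0.5cm;float:left;font-family:sans-serif;page-break-inside:avoid;">
  <h2 style="margin:0">{key}</h2>
  <p style="font-size:18pt">{summary}</p>
  <p><b>Assignee:</b> {assignee} &nbsp;&nbsp; <b>Points:</b> {points}</p>
</div>
"""

def render_cards(csv_path: str, out_path: str) -> None:
    """Read a Jira backlog export (CSV) and write an HTML page of index cards."""
    cards = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            cards.append(CARD_TEMPLATE.format(
                key=html.escape(row.get("Issue key", "")),
                summary=html.escape(row.get("Summary", "")),
                assignee=html.escape(row.get("Assignee", "unassigned")),
                points=html.escape(row.get("Story Points", "?")),
            ))
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("<html><body>" + "".join(cards) + "</body></html>")

if __name__ == "__main__":
    render_cards("sprint_backlog.csv", "sprint_cards.html")
```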

scrumb-board-2

Below are my observations on the process:

  1. Printing index cards happens only at the beginning of the sprint. If new subtasks or bugs come up mid-sprint, team members must hand-write those index cards and add them to the board.

  2. The team may think that they now have to maintain two boards (i.e., electronic and physical) simultaneously. That’s not the case, as the physical cards only need to be updated during standup, which shouldn’t be an issue for anybody.

  3. Depending upon your needs, you can use whatever index card template you want.

  4. To stick the paper index cards to the board, we used Blu-Tack, which is available almost everywhere.

About the Author

Shrikant Vashishtha is the Director of Engineering for GlobalLogic’s CTO department. He has over 15 years of experience in the IT industry and is currently based in Noida, India.

This blog is taken from an interview conducted by Omniscope.

Could you please tell us about your background and current focus in the cloud space?

I have been associated with GlobalLogic Inc. for the last 9 years and have been fortunate to work on some of the most complex and interesting solutions across the telecom, retail, commerce, social, mobile, stocks/trading, geospatial analytics, image processing, and project management domains. In the past I have worked for Microsoft and HCL Technologies.

My transition into a cloud fanatic has been a logical one over time. As a core application developer and solutions architect, choosing the right hosting strategy was one of the important aspects of working on deployment architectures. Around 8-9 years back (practically historical from a technology perspective), shared or dedicated hosting with third-party providers was the prime choice for doing away with high-maintenance in-house deployment strategies. But it still demanded meticulous effort to stay on top of various quality attributes, viz. availability, scalability, reliability, maintainability, security, etc.

This was more of a data center approach, and the cloud was still in its inception. Around the same time, the emergence of the cloud was evident from the heavy focus it was getting early on from Amazon and Microsoft. Being closer to the Microsoft stack, I personally started with some interesting POCs on SQL Azure in GlobalLogic’s Innovation and Technology Group. Soon realizing the value, we formed a study/POC group where we started envisioning and designing interesting ideas into cloud-based solutions. It was still centered on Microsoft’s implementation of the cloud, but the group definitely dissected cloud solutioning in depth from the perspective of best practices and application development patterns.

We also got into some large-scale cloud-based products and expanded the scope to AWS. The domain was application development on top of PaaS-based clouds. It was equally interesting to get into OpenStack, working for one of the biggest OpenStack contributors, where I got the chance to work on extending the PaaS layer. It was a step deeper into the cloud stack (SaaS, PaaS, IaaS, virtualization, OS), where along with the platform services, the work involved infrastructure and virtualization as well.

The journey into non-Microsoft, non-proprietary, Linux/Python-based open-source development was an eye-opener indeed. Over the same timeline, it was equally rewarding to participate in the organization’s cloud group and in advisory activities across the organization.

From the organizational perspective, I am part of some interesting cloud-centric use cases across the stack, along with consultancy and development on cloud-based products and advisory activities. Cloud computing and the taxonomy around it is not a special competency these days; it is more of a default approach. It is an uber standard with which all solutions have to comply, and any people or solutions that don’t should be ready to embrace extinction.

Which companies do you think are the leaders in the cloud space currently? What makes them tick with the customers?

The landscape is huge and categorized into Scope (public to private to hybrid) and Abstraction (IaaS to PaaS).

In the public space, the big 3 (Amazon, Microsoft, and Google) take the top spots. Amazon AWS, being a pure-play IaaS, is far ahead in the game due to the lead time it got by starting early and the exceptional catalogue of services it provides; it is surely the cloud to beat. Microsoft Azure is the biggest PaaS player, with a hand in IaaS as well; the well-integrated Microsoft development ecosystem and huge Microsoft community offer it great benefits in terms of acceptability; it is the developers’ delight. Google Compute Engine/App Engine is going to focus heavily on the IaaS side early on while polishing its PaaS side in parallel; Google is the provider to watch because they are pretty hands-on and experts at scale.

The level to which AWS and Azure have penetrated and adapted to customer behavior gives them a huge edge. Google, starting afresh with best practices already laid out by the well-accepted AWS and Azure, is surely going to be a prime contender. All have fairly flexible cost structures with bulk-usage plans, startup plans, and free limited developer access plans. The availability and uptime guarantee SLAs are also comparable. Generally, the PaaS model introduces cloud lock-in (while offering a well-integrated developer experience for rapid application and solution development), and IaaS allows for cloud portability to an extent (while bringing in some extra overhead of creating a platform); the selection pretty much depends on the end customer’s priorities.

Private cloud usage increased significantly from 2013 to 2014, though not at the cost of public cloud usage; there is a clear segment that demands security and performance over the theoretically infinite pool of resources on the public cloud. The key contributors to the private space are VMware and OpenStack (open source), with the latter giving the long-time leader VMware a tough time. Being open source, OpenStack does bring in some integration and support overhead, but the OpenStack community is quite mature and responsive.

To combine in-house security and performance with the near-infinite capacity of the public cloud, hybrid clouds have been gaining huge traction. An in-house private cloud cooperating with and well integrated into the public cloud is a highly desirable use case. AWS and Microsoft have already made headway into the hybrid cloud space with the introduction of Direct Connect and ExpressRoute, respectively. These services enable private connections between the providers’ data centers and in-house infrastructure (bypassing the public internet), offering reliability, performance, and high security.

As a note, a multi-cloud strategy is the one best suited to more resilient, large-scale solutions. Even the big players have seen downtimes, which have affected customers badly.

What’s your view about Google’s recent massive price drop for its Cloud platform? Is that likely to alter the Cloud landscape (now or in future)?

Amazon AWS did respond by slashing the prices of its compute and storage services, I think the very next day. Google has all the ammunition to be innovative and creative enough to break into the public cloud game. On the other hand, Amazon has a huge customer base that can back it through and through.

It’ll be interesting to see how they get more creative and innovative at managing their data centers and virtualization strategies on one hand, and play with their profit margins on the other, to maximize their market share. In my opinion it is going to be an everlasting battle, with the end customers gaining big time. It is all about who has the best supply chain efficiency and the best approach to managing the costs of maintaining the data centers.

What selection criteria do you think customers (IT decision makers at companies) typically consider while choosing a particular Cloud company? Do these differ by size/type of company and/or by the specific workload being considered? Who are the typical people/titles involved in the decision-making?

This may sound like a digression, but it will converge later. For business-critical solutions (which almost every solution is in the current cut-throat market), a multi-cloud and hybrid cloud strategy is absolutely a must from a continuity perspective. There have been bad times for all of the providers, and they leave customers badly scarred. Even well within availability SLAs of three, four, or even five 9s, what customers need to realize is that downtime is a reality and Murphy’s Law applies equally everywhere. In one of my solutions with one of the big cloud providers, there was a downtime of around 2 hours just before the press release; in my early days, I wasn’t mature enough to have thought this through, along with many other mistakes (delivery next month, they had said). Very important…

This brings in working out deployments across multiple clouds. Even if not from the perspective of full solution availability, designing a neat failover mechanism to a feature-degraded backup deployment on another cloud is something the architects need to think through. I still haven’t gotten to your question, but bear with me… This brings the cloud portability perspective into the game, i.e., designing solutions in a cloud-agnostic fashion to avoid maintainability and sustainability issues.

This is an important point: once you have your tradeoffs and priorities in place, you can identify the providers best suited to your kind of solution. For instance, HP and Rackspace are OpenStack-based public cloud providers and are working hard to beef up their PaaS offerings. Having a single solution based on OpenStack platform components gives you all the positives of a PaaS cloud, and you get cloud portability out of the box. You can surely achieve the same thing using a third-party platform component on top of IaaS clouds (AWS and Microsoft IaaS), but it really depends on your cloud strategy.
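
To make the cloud-portability point concrete, here is a minimal sketch in Python of the kind of thin abstraction layer an architect might put between the application and its providers, with a simple failover wrapper. The provider classes are placeholders rather than real SDK calls; in practice you would back them with each provider's SDK or with an abstraction library such as Apache Libcloud.

```python
from abc import ABC, abstractmethod

class ObjectStore(ABC):
    """Cloud-agnostic interface the application codes against."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

class PrimaryCloudStore(ObjectStore):
    """Placeholder for the primary provider (e.g., an OpenStack Swift adapter)."""
    def __init__(self):
        self._data = {}  # stand-in for real provider calls
    def put(self, key, data):
        self._data[key] = data
    def get(self, key):
        return self._data[key]

class BackupCloudStore(PrimaryCloudStore):
    """Placeholder for a second provider used as a degraded-mode backup."""

class FailoverStore(ObjectStore):
    """Writes go to both clouds; reads fall back to the backup on failure."""
    def __init__(self, primary: ObjectStore, backup: ObjectStore):
        self.primary, self.backup = primary, backup
    def put(self, key, data):
        for store in (self.primary, self.backup):
            try:
                store.put(key, data)
            except Exception:
                pass  # a real implementation would log and queue for retry
    def get(self, key):
        try:
            return self.primary.get(key)
        except Exception:
            return self.backup.get(key)

store = FailoverStore(PrimaryCloudStore(), BackupCloudStore())
store.put("report.pdf", b"...")
print(store.get("report.pdf"))
```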

Moving on, you definitely need to think through the other criteria by category: Compute, Storage, Network, Cross-Service, Support, SLA, Management, and Billing/Metering requirements, as well as how the cloud providers you have shortlisted map to your long-term roadmap.

In some cases, the customer has a key role to play in selecting a particular cloud provider. Existing partnerships and relations drive the choice in a considerable number of instances, but a thorough investigation and analysis by the participating architects, under full guidance from the CTO, is surely done before really embracing a particular provider or group of providers.

What role does Open Source play in cloud-related decisions for companies? Do you foresee the importance of Open Source increasing or decreasing in the future? How?

Open source and community-driven initiatives are taking a key role in driving the market. OpenStack, CloudStack, and Eucalyptus are the key projects. OpenStack in particular, which is backed by companies like HP, Rackspace, IBM, Dell, AT&T, Ubuntu, Red Hat, and Nebula as platinum members, along with hundreds of other supporting companies and more than 17,000 developers contributing from across 140 countries, is being embraced wholeheartedly.

Public clouds for companies like HP are implemented using OpenStack, and products like Helion, which provides a well-integrated, one-click-deployable private cloud installer, are getting amazing traction. I have worked on the OpenStack Trove project myself, and the experience has been amazing so far. I was impressed by the maturity of the community and its processes for continuous development and integration, especially given that they are driven by developers across different countries.

From the end customer’s perspective, the benefits are that these projects are community driven and have key players backing them. The other important benefit is the cloud-agnostic, cloud-portable strategy you get out of the box when you work on an OpenStack-based cloud (public or private). For instance, imagine deciding on Rackspace as the base cloud; you then don’t really have to make breaking changes to implement a multi-cloud strategy with HP as the other provider. Both are based on OpenStack, so there are no maintainability issues. Secondly, if you decide on a private/hybrid cloud strategy down the line for some of your secure on-premise deployments, you can bring a private OpenStack cloud into your on-premise ecosystem and the solution is in place from day one.

The importance will surely increase for some time in the near future, but the big players are catching up. As mentioned earlier, solutions like ExpressRoute and Direct Connect may turn out to be private cloud killers, and open-source-based solutions right now play heavily on the private cloud side only. Public clouds based on open source exist, but the niche so far has been the private cloud. It’ll be interesting to see the amalgamation of an open-source private cloud connected to a proprietary public cloud using these private connections.

What are the key challenges that companies face in embracing cloud technologies? Will security always be an obstacle to the growth of cloud services? Are these challenges unique to large vs. small companies?

The cloud providers are dealing with a completely unique set of challenges and are trying to be innovative. Their problems span how to maintain their data centers, which virtualization technologies to use for better performance and isolation, how to provide 100 percent availability across their data centers, how to keep developers and businesses interested in them, how to stay abreast of the latest technology trends and make them available as services on their particular cloud (mainly for PaaS players), and how to provide the integration points so that their cloud solution fits into enterprises seamlessly.

Cloud business users are worried about uptime, availability, performance, and security challenges for their business-critical, cloud-targeted applications. They demand solutions to these very basic challenges from the technology specialists, application architects, and developers. For the cloud technology brigade, the biggest challenge is thinking in terms of an infinitely decoupled solution and layering the thought process of cloud portability on top of it. As a very basic example, the majority of the developer community thinks of the public cloud as an infinite resource pool, but they need to realize that it comes at a pay-by-use cost, and they still need to make the most of the resources they are already using rather than depend purely on the scaling features. The challenge is spreading a cloud-centric thought process and devising strategies and patterns around it.

Security is just one aspect, though surely the most important one. Private and hybrid cloud strategies have solved it to an extent. As mentioned earlier, with private connections between the on-premise infrastructure pool (whether a private cloud or not) and the public clouds, the security issues are being well taken care of. But if solutions are not architected with a cloud-centric, multi/hybrid cloud strategy and an immense focus on cost utilization and cloud portability, all these advancements are still not going to solve the challenges.

What’s your view on the future of the Cloud over the next 5-10 years (e.g., in terms of competition, new business models, technology, mobile, etc.)?

It is bound to go the way other consumer software has gone. Microsoft Office was available on OS X and is now on iOS. Is Apple worried about usage of Pages, Numbers, Keynote, etc. taking a hit as a result? Absolutely not, because this is what consumers want from an iOS device from a usability and coherence perspective.

How about taking the multi-cloud decisions away from end consumers and providing them as a cloud platform capability? A consumer could go to a particular cloud and select which other cloud providers he or she wants to use for their application deployment. How this turns out in reality is beyond the scope of imagination, and millions of baby steps may be needed to provide customers with such an amazing experience.

For the near future, in my opinion it’ll be about hybrid/multi-cloud strategies and the features the big cloud providers build around them. This brings cloud brokering products into the picture, and their importance increases as hybrid/multi-cloud strategies are recognized as the best options. Competition-wise, Google is definitely going to change the way the cloud operates right now and is bound to bring in some innovative options for end users. Open source is going to keep running at pace for the times to come and is definitely going to play an important role in the overarching hybrid cloud strategy.

It is definitely time for the IaaS-centric providers to think about cloud-portable PaaS and SaaS aspects, because rapid application development needs are going to drive usage and acceptability in the times to come. Some of the IaaS providers have already started beefing up their platform capabilities and are investing heavily on the PaaS side. Private cloud usage will suffer because of the products around private connectivity between on-premise infrastructure and public cloud data centers, but this is still at an early stage, with prime questions around how it is going to operate across data centers that are distributed worldwide.

Big Data analytics is another key area that is going to play a huge role in the consumer and enterprise technology landscape. Provisioning it and running it well within the cloud platform as a managed service has also been a key area where the cloud providers are building capabilities.

The mix of Big Data analytics (both the ongoing use cases and the emerging ones from the Internet of Things) and the cloud is already getting huge traction and will continue to do so over the next 5-10 years.

As a global software engineering firm with offices all around the world — and therefore 24×7 operations — we have a serious love affair with coffee. In fact, we estimate that our Kyiv office alone consumes more than seven tons of coffee per year! Not only does it help our engineers concentrate, but the coffee machine often becomes the focal point for social interaction and exchanging ideas. Of course, as in any office, no one wants to refill the water reservoir. Even though it only takes a minute, it’s a minute that could be spent more productively. Thankfully, our offices are filled with innovative people who are motivated to solve inefficiencies in any form — especially those that affect their coffee consumption.

coffee-1

Andrii Zakharov, a software engineer in Kyiv, took it upon himself to create a solution that automatically refills the coffee machine’s water reservoir. Standard coffee machines track water levels by using a magnetized float in the water reservoir. Once the float sinks to the bottom and touches a magnetic reed relay, the machine stops making coffee until the reservoir is refilled. To bypass this system, Andrii inserted another detector between the float and the bottom relay. When the float touches the new detector, an Arduino microcontroller transmits a signal to an electric pump, which then fills the reservoir directly from a 20-liter tank via a flexible pipe. Andrii also used a 3D printer to create a plastic cover that fully fits over the coffee machine’s water reservoir. He even included LED indicators to display information about the device’s status and current operational mode!
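
The article doesn't include Andrii's firmware, but the control logic is simple enough to sketch. The Python snippet below simulates the idea: when the float reaches the extra detector, the pump runs for a while and then stops. The sensor read, pump control, and fixed refill time are all stand-ins for the real Arduino I/O rather than the actual implementation.

```python
import time

PUMP_RUN_S = 20  # assumed fixed refill time; a full-level sensor could be used instead

def water_low() -> bool:
    """Stand-in for reading the extra low-level detector (digital input)."""
    return False  # replace with the real sensor read

def set_pump(on: bool) -> None:
    """Stand-in for driving the relay that powers the electric pump."""
    print("pump", "ON" if on else "OFF")

def control_loop() -> None:
    while True:
        if water_low():
            set_pump(True)         # pull water from the 20-liter supply tank
            time.sleep(PUMP_RUN_S)
            set_pump(False)
        time.sleep(1)              # poll the detector about once a second

if __name__ == "__main__":
    control_loop()
```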

coffee-2

This impressive solution is already installed in one of our Kyiv office blocks, serving over 100 people.  If we decide to add similar systems throughout our 1,400-person office in Kyiv, we could potentially save more than 1,000 hours per year that were previously dedicated just to refilling water reservoirs! It’s this sort of attention to detail that makes our engineers so special. Andrii took a seemingly minor problem and created a solution that makes a major impact on GlobalLogic’s productivity. I tip my hat to Andrii and look forward to seeing what he comes up with next!

In today’s technology environment of instant gratification, there is no “good enough.” With so many competitors in a single market, users expect their devices to have it all: reliable software, slick user interfaces, and frequent new feature rollouts. To meet these expectations, product companies often implement innovation programs to tap into the creativity of their employees. As a software services company, GlobalLogic is no different. Even though the products we develop belong to our customers, we are constantly thinking of ways to innovate. One tool for doing this is a process called “product review engineering.”

Product review engineering is simply a process by which employees review an existing product (usually within the first six months of launch) to identify requirements gaps and to suggest improvements. At GlobalLogic, we typically create a team that consists of developers and QA engineers who actually worked on the product (i.e., the knowledge holders) and a group who has no experience with the product (i.e., the fresh eyes). Together they review the current features of the product leveraging their own unique knowledge bases and even conduct market research to identify future development opportunities, from new market trends to new security requirements. Below is a sample workflow of a product review engineering process.

product-review-engineering-diagram

The great thing about product review engineering is that it doesn’t require the use of a specific tool such as Scrum or Jira. You simply write down your observations and then work with your team to publish a formal report at the end of the review. Another benefit is that it substantially increases the knowledge base of the team. Not only do the “freshers” learn about a new product, but they are also potentially exposed to new domain and technology knowledge. Even the developers and QA engineers who initially worked on the product get to expand their domain knowledge by conducting market research. As a result of product review engineering, employees can build their skill sets and offer innovative ideas on future projects.

Although the product review engineering process can be extensive, we have not found it very difficult to attract volunteers. Many times when you are working on a project, you are completely engaged in your particular tasks, whether it’s writing code or searching for bugs. Product reviews give employees an opportunity to look at the bigger picture and take part in activities outside of their typical duties. It also empowers them to make a real impact on a product or (in our case) a customer relationship. The prospect of being personally responsible for a product’s future evolution is very exciting.

Finally, product review engineering is beneficial for services companies such as GlobalLogic. Since our primary goal is to help our customers be successful through long-term partnerships, we are always looking for new ways to innovate. By proactively reviewing a customer’s launched products, we can offer them more than just maintenance services. The process also aligns the Delivery and Account Management teams, ensuring that everyone is on the same page regarding a customer’s products and future opportunities. This multi-team approach is crucial to smoothly managing a customer’s long-term product roadmap.

While product review engineering is still a relatively new concept, it has significant potential to improve both an organization’s product management and employee innovation programs. Not only is it a relatively simple process, but it can easily be tailored to any organization’s requirements. I have personally participated in several product reviews at GlobalLogic, and I really believe that it is the answer to helping product companies exceed user expectations.

Vaibhav Pathak has over six years of software engineering experience and is currently a senior lead test engineer at GlobalLogic. His areas of interest include RCS and synchronization-based applications, OEM layer and OEM integrated applications, device self-services applications, and next-generation network applications.

Making fast, context-aware decisions

In my last post in this series, we worked through a real-life example of an “Internet of Things” system, using Uber as a case study. We saw that the key characteristic of an IoT system is its ability to use the observations and requests it receives from sensors and people to make quick, context-aware decisions and then act on them. In this article, we’ll explore the heart of such a system and describe how such quick decision-making can be automated while taking into account the full “context” of the situation.

iot-diagram1

Figure 1: The Heart of the basic IoT System Architecture—Data Analytics

One widely accepted architecture for making fast, context-aware decisions is called the “Lambda Architecture”[1]. This architecture was originally proposed by former Twitter engineer and “Storm”[2] author Nathan Marz in a 2011 blog post[3], and is expanded upon in his book “Big Data”[4].

The concept behind Marz’s Lambda architecture is that there are two layers of analytics involved in rapid decision-making. One, called the “Batch Layer”, uses traditional big-data analytics techniques such as map-reduce to continuously mine multiple data sources for information, relationships and insights. This activity is termed “batch” because—while it may happen frequently—it is generally done on a scheduled basis, rather than on-demand in response to an external request or event. While big data analytic tools and data manipulation capabilities have improved enormously over the last several years, it still frequently takes minutes, tens of minutes, or even hours to run complex analysis tasks over large, heterogeneous data sets. This makes a “batch-oriented” big data analytics approach by itself impractical for rapid decision-making on the scale we need to support the Internet of Things: that is, action taken in tenths of seconds (hundreds of milliseconds) or less. While it can be done in other ways, the batch layer corresponds to what we call “Deep Insights” in our basic IoT system architecture (Figure 1).
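
As a toy illustration of what the batch layer produces, the Python snippet below recomputes a view (trip counts per pickup zone) over a full master dataset, the way a scheduled map-reduce or similar job would at far larger scale. The data and field names are invented for the example.

```python
from collections import Counter

# Stand-in for the immutable master dataset that the batch layer scans on a schedule.
master_dataset = [
    {"pickup_zone": "zone_airport", "fare": 52.0},
    {"pickup_zone": "zone_airport", "fare": 48.5},
    {"pickup_zone": "zone_downtown", "fare": 17.0},
]

def recompute_batch_view(records):
    """Full recomputation over all records, as a scheduled batch job would do."""
    view = Counter()
    for rec in records:
        view[rec["pickup_zone"]] += 1
    return dict(view)

batch_view = recompute_batch_view(master_dataset)
print(batch_view)  # {'zone_airport': 2, 'zone_downtown': 1}
```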

The second layer in Marz’s Lambda approach is called the “Speed Layer”. This is the layer responsible for rapid decision-making. As any engineer or manager knows, the key to quickly making good decisions is having all the relevant information available when you need it. In the Lambda architecture, the output of the “Batch” layer is continuously processed and the most recent results made available to the speed layer; technically, these are provided in a view. The speed layer uses this pre-processed information whenever a new request comes in, enabling it to make an informed decision, in context, in literally the blink of an eye [5]. While it’s not the only approach, the speed layer corresponds to what we call “Fast Decision Making” in our basic IoT system architecture (Figure 1).
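
Continuing the toy example, the sketch below shows how a speed layer might answer a query by combining the most recent batch view with events that arrived after that view was computed. Again, the names and numbers are invented; this only illustrates the merge pattern.

```python
from collections import defaultdict

# Most recent output of the batch layer (e.g., the view from the previous sketch),
# refreshed each time the scheduled batch job completes.
batch_view = {"zone_airport": 2, "zone_downtown": 1}

# Events received since that batch view was produced, held in memory by the speed layer.
recent_events = defaultdict(int)

def record_event(zone: str) -> None:
    """The speed layer ingests each new trip event as it arrives."""
    recent_events[zone] += 1

def trips_for_zone(zone: str) -> int:
    """Answer a query by merging the precomputed batch view with the real-time delta."""
    return batch_view.get(zone, 0) + recent_events[zone]

record_event("zone_airport")
record_event("zone_airport")
print(trips_for_zone("zone_airport"))  # 4: two from the batch view plus two real-time events
```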

Other techniques besides Lambda are used when even faster results are required; for example, in automated stock trading, or bidding in ad networks. These techniques can deliver decisions on the order of tens of milliseconds, with the sacrifice of some versatility. For interacting with humans, however, the Lambda approach gives a good combination of speed and configurability.

As we saw in our Uber example in Part 2 of this series, the decision on which driver and car are best to assign to a particular passenger pick-up request depends on a number of factors. The considerations might include:

  • Are the car and driver available? That is, is the driver carrying another passenger right now, or on a break?
  • How long will it take the car to get from where it is now to where the passenger is waiting to be picked up?
  • Where does the passenger want to go? If they haven’t told us explicitly, what can we guess about where they are likely to go based on past trips, or past trips of other people whose behavior is similar to them?
  • If we don’t know and can’t guess where this particular passenger wants to go, what is the most likely destination or travel time of passengers being picked up from that location?
  • How good a customer is the passenger requesting a ride? Should he or she be given special treatment? For example, if multiple people are waiting for rides in the same area, should they be picked up out of turn?
  • How much revenue does the driver generate for our company? Should this driver be given the pick of the likely best (largest) fares based on our policy?
  • Of the possible drivers who are close, how much has each one made in fares so far today? Should we try to even this out, or reward the best drivers?

Again, a reminder that I have no inside knowledge of Uber’s algorithms; the factors Uber actually considers may be totally different from the ones I describe—in fact, they probably are different. However these are representative of the kinds of information a similar service might want to include when assigning a car.

Of the factors mentioned above, some you would probably want to pre-compute in “batch” mode include:

  • Based on pre-assigned behavior categories, what is the likely destination of each category based on a given type of pickup location in a given area? For example, an analysis might find a person of behavior category “frequent business traveler” arriving at a pickup location of type “airport away from home” outside of regular business hours is most likely to want to go to a location of type “hotel” in a nearby metropolitan area. When arriving at “home airport” in the same situation, their most likely destination may be “home”.
  • Without knowing anything about a particular passenger, what is the most likely destination and/or length of trip for each type of pickup location in a given area? For example, a pickup in a particular shopping area in Manhattan might have the most common drop-off location in another shopping area, while a pickup at an office building may have a different typical destination.
  • Best customers (or customer ranking).
  • Best drivers (or driver ranking).
  • Aggregated total fares paid per driver since their current shift started.

In the Lambda architecture, these factors and others would be analyzed by the “Batch Layer” on a scheduled basis (or accumulated and stored as they happen), and made available to the “Speed Layer” for its consideration when a passenger requests a pickup. The schedule of how often to update this information depends on how frequently each factor is likely to change, how costly each is to compute, and how important the up-to-the-minute accuracy of each piece of information is to the operation of the business. Factors that are quick to calculate or that change frequently (like the location of a car) are not computed by the batch layer, but instead are looked up or computed by the speed layer at the time a request comes in. As computers and analytics grow ever faster, the line between what can be computed “on the fly” and what is best computed in the batch layer will continue to shift. However, the key objective remains to make sure that at the time each decision is made, all the information needed is available to ensure the decision supports the goals of the system and organization that deployed it.
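
To make this concrete, here is a small Python sketch of how a speed layer might score candidate drivers for one pickup request, mixing values looked up from precomputed batch views (driver ranking, fares so far today) with a factor computed on the fly (distance to the passenger). The factors, weights, and data are purely illustrative and are not Uber's algorithm.

```python
import math

# Precomputed by the batch layer on a schedule.
driver_rank = {"d1": 0.9, "d2": 0.7}          # 0..1, higher is better
fares_today = {"d1": 310.0, "d2": 120.0}      # aggregated fares per driver this shift

# Reported by the location sensors just now, used on the fly.
driver_pos = {"d1": (40.751, -73.993), "d2": (40.741, -74.001)}

def distance(a, b):
    """Rough planar distance; good enough for ranking nearby cars."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def score(driver: str, pickup) -> float:
    """Higher score wins the dispatch. Weights are illustrative only."""
    closeness = -distance(driver_pos[driver], pickup)  # nearer is better
    fairness = -fares_today[driver] / 1000.0           # spread fares across drivers
    return 5.0 * closeness + 1.0 * driver_rank[driver] + 0.5 * fairness

def assign(pickup):
    """Pick the best-scoring available driver for this pickup location."""
    return max(driver_pos, key=lambda d: score(d, pickup))

print(assign((40.748, -73.985)))
```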

A human operator or decision-maker would also, ideally, make decisions in a very similar way to the one we just outlined. Each decision would be aligned with the current, overall goals of the company, and would consider the full context. In the case of a taxi dispatcher, you would expect the very best ones to understand and account for these same factors whenever dispatching a cab. For example, if Mrs. Jones was an important customer—or the owner of the cab company—a human dispatcher would know this and would make sure she had the best experience possible. If Bob was their best driver and a lucrative fare came along, the human dispatcher might send Bob to pick that person up, to encourage his continued loyalty to the company. This type of contextual information is what you learn from experience, and accounting for it is what makes a human decision maker good at his or her job.

The Lambda architecture is one means of letting machines exhibit the same type of “intelligence” that you would hope to see in your best human employees when performing a similar quick decision-making task, consistently and at scale. In reality, of course, the machines are executing a program—more specifically, an analytic and decision-making algorithm—and not exhibiting intelligence in the same sense that a human would. In particular, the machines are not bringing to bear their general knowledge of the way the world works, and applying that general knowledge to a specific business situation. A human, for example, would probably not need to be taught that if Mrs. Jones owns the taxi company, then whenever she calls and urgently requests a cab you give her the one closest to her, even if that means another customer needs to wait a little longer. The human’s general social intelligence and life experience would, hopefully, make that the default behavior.

With the machine algorithm, the actual intelligence needs to come from the people developing the algorithm—both the programmers and the business people who work with them to craft the company goals. This hybrid combination of business, computer science, data analysis and algorithmic knowledge is an emerging field often called “data science”. Because the algorithms behind these decision-making systems directly drive business results—and, in a real sense, are the business—we are no longer talking about conventional IT systems. Conventional systems provide information to human decision-makers, but next-generation IoT systems actually are the decision makers. The staff developing the algorithms behind these systems needs to embody the business acumen and decision-making prowess that was formerly in human hands, and put these into algorithms and systems.

In one sense this is what computer science has always done: to automate formerly human tasks and, to an extent, decision-making. However the scope and impact of decision-making that is now practical and economically feasible based on current technology is a step-change. To address these upcoming business challenges, a whole new level of thinking is required, demanding business-oriented developers, and development-oriented business people. Over time I believe we will see these people become as important to next-generation internet businesses as “quants” have become to Wall Street.

Our goal in this blog has been to show you how the “heart” of an IoT system—its analytics capability—can deliver context-aware decisions in fractions of a second. In our next post in this series, we’ll talk about the technical and economic factors that will make the Internet of Things a practical reality.

Dr. Jim Walsh is CTO at GlobalLogic, where he leads the company’s innovation efforts. With a Ph.D. in Physics and over 30 years of experience in the IT sector, Dr. Walsh is a recognized expert in cutting-edge technologies such as cloud and IoT.

References

[1] http://lambda-architecture.net

[2] http://en.wikipedia.org/wiki/Storm_(event_processor)

[3] http://nathanmarz.com/blog/how-to-beat-the-cap-theorem.html

[4] http://manning.com/marz/

[5] A normal human eye blink takes in the range of 100 to 400ms; http://en.wikipedia.org/wiki/Blink.

Accurately estimating a software development project’s total effort is an essential step to providing your customer with a competitive proposal in the commercial phase of a new opportunity. However, both traditional and Agile-based techniques have drawbacks that can negatively impact the process. In this white paper, we will demonstrate a new approach to project estimation that integrates the best features of both traditional and Agile-based techniques.

Uber is Driving to Where IoT is Headed

In my previous blog, we discussed the concept of IoT. Now let’s look at a current real-life example of an “Internet of Things” application so we can think about how such systems will work in the future. Most IoT examples are along the lines of the “smart coffee cup” example in the previous blog, or intelligent appliances. While these are certainly valid examples, they are also very narrowly focused on “Things”, which is only one part of IoT. The real game-changing aspect of IoT is not so much the “Things”, as it is the systems that reason about things and that cause those things to act. Over time, these full-featured IoT applications will impact the way we live and work just as profoundly as “Web 2.0” applications like Facebook, Twitter and other social media apps do today. And as we’ll see in our example, this is already starting to happen.

Uber is a mobile application that connects people with taxis and cars for hire. Uber may not be an obvious candidate to be considered an “Internet of Things” application, but I would argue that it is a very good example of what IoT really is, and where it is headed. It also has the advantage of being a current real-world, functioning, revenue-generating system–unlike the rather silly (though hopefully evocative) “smart coffee cup” example above.

By the way, I’m intentionally picking an example of which I personally have no inside or proprietary knowledge; that way, I don’t disclose anybody’s secrets. Note this also means that I am speculating on some of the implementation details, and my selected application—Uber (uber.com)—may implement something differently than the way I describe it. Nonetheless, my guess is I’m not too far off.

If you haven’t used Uber before, here’s how it works. Download the app on your mobile device and register with Uber using your credit card.

  1. When you need a ride, you launch the Uber app on your mobile device.
  2. The Uber app shows your current location on a map, together with the current location of all nearby cars for hire that are currently available. It also shows you the time it would take for the next available car to get to your location. In my case, that time was 2 minutes—but my office is near an airport!
  3. As you watch the map, the locations of the nearby cars change—you can see them all moving around in “real time”.
  4. You can input your destination and get a price quote before you call a car. In my case, the quote was $23 to $31 for a car to take me home—a 15-mile (25 km) drive that typically takes about 25 minutes.
  5. When you decide you want to call a car, you press a button to confirm where you want to be picked up—in case you want to be picked up in a different location. If you want to be picked up where you currently are (or, more specifically, where your phone is), you press one more button to call a car to your current location; otherwise you indicate your preferred pick up place by touching a location on the map. In case you haven’t been counting, this is two button presses—and no data entry—to bring a car to your current location (if you don’t check the price).
  6. As soon as you choose to call a car, you receive the name of the driver and a description of the car (license plate number, color and make of the car) that is coming to pick you up. You can also see your car instantly change course on the map as it comes to get you, which is pretty cool. Social ratings (e.g. “5 stars”) for the car and driver together with comments from previous customers are shown for you to review during the time you have to wait for the car to arrive (2 minutes in my case).
  7. As the driver gets closer, you get a countdown timer and can see the car’s current location. There’s also an option to exchange messages with the driver while he or she is on the way. That’s handy if you need to say, for example, “I’m waiting inside the main lobby door.” You also get an automatic notification that your car has arrived.
  8. When the car arrives, you hop in. If you got a price quote (I always do), the driver already knows where to take you without asking, because you entered the destination address to get the quote. The driver also gets turn-by-turn directions to your destination on his or her phone without having to enter any data.
  9. When you arrive at your destination, you hop out and say thanks. You do not need to pay, leave a tip or even take your phone out of your pocket. The tip is included in the rate (20% by default, though you can change it), and the cost of the ride is automatically deducted from your credit card. A receipt is automatically emailed to you with details of your route as a reminder.

That’s all there is to it. I can summon a car to take me home with a couple of key presses, and without taking money or a credit card out of my pocket. There are various wrinkles and refinements. For example, you can choose the type of car you want (e.g. taxi, limo, SUV, or personal car) which will be reflected in the price you pay; you can rate the driver and car; you can split fares; and various other passenger conveniences. There are also numerous features for the driver, including the ability to rate passengers, and the best places to go to look for new passengers (based on current demand plus movie and show times, seasonal patterns, and so on). Every driver and passenger I’ve talked with agrees this is a very cool system. It’s also a fairly lucrative one. Uber earns its money by taking a 20% cut of each fare, which yielded an estimated $220M in revenues in 2013, on estimated total fares collected of $1.1B.

Now, cool as this may be, why is it a good example of IoT? First, note the central role played by the locations of both the car and the passenger, and the near real-time nature of the information that is exchanged. The GPS (global positioning system) chip, along with the other location-determining systems that run on each person’s smartphone, provides the location information. While these location sensors and services happen to be contained in our mobile devices, they are typical of any sensor in any “thing” connected to the Internet of Things. In the Uber case, the “thing” is actually the set of sensors contained in the passenger’s and the driver’s smart mobile devices, not literally the passenger, the driver, or the car. The “Internet” piece is the connection of each sensor, through the mobile device, to Uber’s “brain” and “memory”, which are hosted in the cloud.

I think it’s quite revealing that few people I’ve talked with actually think of Uber as an IoT application even though, as we shall see, it’s a great example. Other than perhaps the car, I didn’t really talk much about “Things” when I described Uber, and I think most users would describe it the same way. Instead we talk primarily of the people involved; that is, passengers and drivers. This is partly because our mobile devices and the sensors they contain have become shorthand for ourselves. We also think of our human needs for transportation and, perhaps, for comfort; those needs are clearly more important to us than the car itself, as a thing. And unless we’re technically oriented, we never give a thought to the sensors contained inside our smartphones.

Our tendency to focus on people and human needs will continue, even in the Internet of Things era. If we ever have that sensor-enabled coffee cup I described in part 1 of this series, I bet we will think “I need more coffee” rather than “my cup says it’s time to refill it”. Our needs and ourselves will remain at the center, no matter how smart our devices become—at least for a very long time.

Continuing our analysis of Uber: once the app is launched, the location sensors associated with both the passenger’s and the driver’s mobile devices (the actual “things” being monitored) regularly broadcast their location to a “back end” system that Uber hosts in the “Internet Cloud”. While it seems to us like we are directly summoning the car closest to us, that’s not exactly what happens. What happens instead is that our mobile device sends a message to Uber’s cloud service saying that person “x” (you, as identified by your physical phone) at such-and-such a geographic location (determined by the sensor inside your phone) wants a car sent to that location, or to another location nearby.
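
As a rough illustration of that exchange, here is a minimal sketch of what the “person x at location y wants a car” message might look like. The payload fields, identifiers, and helper function are assumptions made for illustration; they are not Uber’s actual API.

```python
# Hypothetical sketch of the "request a car" message a rider's phone might
# send to a cloud back end. Field names are assumptions, not Uber's API.
import json
import time

def build_ride_request(rider_id, current_lat, current_lon,
                       pickup_lat=None, pickup_lon=None):
    """Package the rider's identity and GPS fix into a request payload."""
    payload = {
        "rider_id": rider_id,  # person "x", identified by the physical phone
        "current_location": {"lat": current_lat, "lon": current_lon},
        # The default pick-up point is the phone's current location, unless
        # the rider chose a different spot on the map.
        "pickup_location": {
            "lat": pickup_lat if pickup_lat is not None else current_lat,
            "lon": pickup_lon if pickup_lon is not None else current_lon,
        },
        "requested_at": int(time.time()),
    }
    return json.dumps(payload)

print(build_ride_request("rider-42", 47.4502, -122.3088))
```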

When it receives such a request, Uber’s cloud service uses near-real-time analytics to determine which car is the best fit to service the request. I am not privy to Uber’s specific algorithms (and could not talk about them even if I were, as they are core to Uber’s business proposition), but presumably they weigh factors such as geographic proximity and time to arrive, the loyalty and lifetime value of a particular driver to Uber, the driver’s lifetime revenue (to break ties if two people request service), customer ratings, how lucrative the fare is likely to be, how recently a given driver has been given a fare, the expected duration of the trip, and maybe the driver’s work schedule. Some of this information will be computed on the fly from the most recent information available, such as the current location of each car. Other information, such as the average fare paid by pick-ups in a given location at a particular time, or the total lifetime value of a particular driver to Uber, is likely to be pre-computed on a scheduled basis, in batch mode.
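
To illustrate the kind of trade-off such an assignment might involve, here is a minimal sketch that scores a handful of candidate drivers on a few of the factors speculated about above and picks the highest scorer. The factors, weights, and names are purely illustrative assumptions, not Uber’s real algorithm.

```python
# Hypothetical driver-selection sketch. The factors and weights loosely
# mirror the speculation in the text; they are not Uber's real algorithm.
from dataclasses import dataclass

@dataclass
class Driver:
    driver_id: str
    minutes_to_pickup: float        # computed on the fly from current location
    rating: float                   # customer rating, 0-5
    lifetime_value: float           # pre-computed in batch, normalized 0-1
    minutes_since_last_fare: float  # used to spread fares across drivers

def score(driver: Driver) -> float:
    """Higher is better; the weights are illustrative only."""
    return (
        -2.0 * driver.minutes_to_pickup          # closer is better
        + 1.0 * driver.rating                    # prefer well-rated drivers
        + 1.5 * driver.lifetime_value            # reward high-value drivers
        + 0.1 * driver.minutes_since_last_fare   # avoid starving anyone
    )

def assign(candidates):
    """Pick the best-scoring driver for the incoming request."""
    return max(candidates, key=score)

candidates = [
    Driver("d1", minutes_to_pickup=2, rating=4.9, lifetime_value=0.8, minutes_since_last_fare=12),
    Driver("d2", minutes_to_pickup=5, rating=4.7, lifetime_value=0.9, minutes_since_last_fare=40),
]
print(assign(candidates).driver_id)  # "d1" wins on proximity in this toy data
```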

The net result is that a car is assigned to the passenger very quickly, probably within about a tenth of a second after the system receives your request. Because complex calculations like “lifetime value” are pre-computed, sophisticated metrics can be used to improve the value of business decisions even while making them very quickly. In theory, this allows much better split-second decision making than all but the very best human dispatchers could manage. Finally, the result of a decision is “actuated”; that is, put into effect. In the case of Uber, this actuation takes the form of sending a notification to a driver’s phone asking him or her to pick up a certain passenger at a certain location. In addition to this instant response, sometimes a more complex process may be initiated or advanced to its next step as well. For example, perhaps Uber rewards drivers for reaching a certain number of miles, and a particular journey may trigger the process that produces and mails the driver a trophy or gift card.
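
As a small sketch of that actuation step, the following hypothetical code puts the decision into effect by notifying the chosen driver, and shows how the same event might also advance a slower, multi-step process such as a mileage reward. All names and thresholds are assumptions for illustration.

```python
# Hypothetical actuation sketch: notify the assigned driver, and let the
# same event advance a slower process (a mileage reward). All names and
# thresholds are illustrative assumptions.

def notify_driver(driver_id, pickup):
    # Stand-in for a push notification to the driver's phone.
    print(f"Notify {driver_id}: pick up rider at {pickup['lat']}, {pickup['lon']}")

def check_mileage_reward(driver_id, lifetime_miles, trip_miles):
    # A longer-running process triggered by the same event, e.g. producing
    # and mailing a trophy or gift card when a mileage threshold is crossed.
    threshold = 10_000
    if lifetime_miles < threshold <= lifetime_miles + trip_miles:
        print(f"Start reward fulfilment for {driver_id}")

def actuate(driver_id, pickup, lifetime_miles, trip_miles):
    notify_driver(driver_id, pickup)
    check_mileage_reward(driver_id, lifetime_miles, trip_miles)

actuate("d1", {"lat": 47.4502, "lon": -122.3088},
        lifetime_miles=9_995, trip_miles=15)
```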

Figure 1: Basic IoT System Architecture—assigning meaning to data

Other IoT applications tend to follow an architecture pattern similar to the one we’ve just outlined for Uber (a minimal sketch of this pattern follows the list):

  1. Requests, as well as data from sensors and human observers, are transmitted to a cloud-based system;
  2. A “fast decision-making” subsystem quickly processes the incoming observations and requests, for example “I need a car”;
  3. The fast decision-making subsystem draws on “context” provided by a system that analyzes multiple sources of data; this context informs its quick decisions;
  4. Based on the context, the fast decision-making subsystem triggers some action, for example sending a message to a driver to pick up a particular individual at a particular location;
  5. More complex actions can also be triggered by events and analysis, for example releasing a driver from service if his or her ratings are too low, or sending a reward for traveling a certain number of miles.
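
Putting the five steps together, here is a minimal end-to-end sketch of the pattern: observations and requests stream into a handler, a fast path decides what to do using slowly refreshed “context”, and the decision is actuated. All names and data here are illustrative assumptions.

```python
# Minimal sketch of the five-step pattern above. All names and data are
# illustrative assumptions.

# Step 3's "context": refreshed in batch, read on the fast path.
CONTEXT = {"avg_fare_by_zone": {"airport": 28.0, "downtown": 14.0}}

def fast_decision(event, context):
    # Step 2: quickly process the incoming observation or request.
    if event["type"] == "ride_request":
        zone = event["zone"]
        # Step 3: enrich the quick decision with pre-computed context.
        return {"action": "dispatch", "zone": zone,
                "expected_fare": context["avg_fare_by_zone"].get(zone, 0.0)}
    return {"action": "ignore"}

def actuate(decision):
    # Steps 4-5: turn the decision into an effect on the outside world.
    if decision["action"] == "dispatch":
        print(f"Dispatching a driver in {decision['zone']}; "
              f"expected fare ${decision['expected_fare']:.2f}")

def handle(event):
    # Step 1: each incoming event flows through decision and actuation.
    actuate(fast_decision(event, CONTEXT))

handle({"type": "ride_request", "zone": "airport"})
```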

While well suited to the Internet of Things, this architecture pattern is useful in many other situations as well: whenever quick, “contextualized” action is needed in response to a stream of incoming data and requests. We’ve found elements of this “IoT” architecture very well suited to situations as apparently diverse as mobile advertising and information security, for example.

The key to IoT is the ability to put current observations and requests in a “context”, and then respond to them intelligently. When these observations come from sensors and the response is delivered through mechanical or electronic actuators, the “Internet of Things” label most obviously applies. But the heart of any IoT system is its ability to respond intelligently to events by taking autonomous action. In the Uber case, the heart of the business is the system’s ability to intelligently assign drivers to passengers in a way that encourages the loyalty of both; it is not just the sensors and actuators in the phones, important as those are. When we talk about the Internet of Things, it’s important to remember that it is not just the things but the context-driven intelligence, or analytics, that will determine the success of the next generation of Internet applications.

The next post in this series describes how an Internet of Things application can quickly make context-aware decisions.

Dr. Jim Walsh is CTO at GlobalLogic, where he leads the company’s innovation efforts. With a Ph.D. in Physics and over 30 years of experience in the IT sector, Dr. Walsh is a recognized expert in cutting-edge technologies such as cloud and IoT.

References

http://www.forbes.com/sites/aswathdamodaran/2014/06/10/a-disruptive-cab-ride-to-riches-the-uber-payoff/
